CN116521496A - Method, system, computer device and storage medium for verifying server performance - Google Patents
- Publication number
- CN116521496A (application CN202310432839.1A)
- Authority
- CN
- China
- Prior art keywords
- error
- hardware
- scheme
- real
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3476—Data logging
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A10/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
- Y02A10/40—Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a server performance verification method, system, computer device and storage medium. The method comprises: receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction; sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the scheme; acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information; and processing the real errors based on the log information, the hardware real error occurrence scheme and the register information to verify the server performance. Because real hardware errors of various kinds actually occur and are identified by the server, and the hardware real error occurrence scheme, the register information and the log information are all captured, the efficiency of server performance testing is improved.
Description
Technical Field
The present invention relates to the field of server testing, and in particular, to a method, a system, a computer device, and a storage medium for verifying server performance.
Background
In the prior art, server performance is usually verified by running an error injection test through the XDP (extended debug port) interface on the server motherboard. The injected error is a simulated error, or an error that occurs only at the hardware interface and does not actually occur in the hardware. The PEI card and MEI card provided by Intel can inject only one error at a time or only errors of a single type, so the accuracy and efficiency of the error injection test are limited.
Disclosure of Invention
The object of the invention is to provide a server performance verification method, system, computer device, and storage medium.
The technical scheme of the invention is as follows: in a first aspect, the present invention provides a server performance verification method, the method comprising:
receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction;
transmitting the hardware real error occurrence scheme to target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme;
acquiring log information of the real errors, and acquiring a hardware real error occurrence scheme and register information;
the real error is processed based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance.
In a preferred embodiment, the receiving the error injection instruction and generating the hardware real error occurrence scheme based on the error injection instruction includes:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location;
generating a hardware real error generation scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error generation scheme at least comprises an error list, an error priority and an error type.
In a preferred embodiment, before the receiving the error injection instruction and generating the hardware real error occurrence scheme based on the error injection instruction, the method further includes:
displaying a preset error injection scheme on a human-computer interaction interface for a user to select to input an error injection instruction, wherein the preset error injection scheme comprises any one of a user-defined error, a random error and fault flood.
In a preferred embodiment, in response to the error injection scheme selected by the user being a custom error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
Receiving user-defined hardware type information, user-defined error occurrence frequency information and error position information which are input by a user;
generating a hardware true error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
In a preferred embodiment, the generating the hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information in response to the error injection scheme selected by the user being a random error includes:
and randomly generating a hardware true error occurrence scheme based on the server hardware configuration information.
In a preferred embodiment, in response to the error injection scheme selected by the user being a fault flood, the generating the hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information includes:
and randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware true error generation scheme.
In a preferred embodiment, before the sending the hardware real error occurrence scheme to the target hardware for the target hardware to generate the real error based on the hardware real error occurrence scheme, the method further includes:
Analyzing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIE hardware errors and other errors;
analyzing the hardware true error generation scheme to obtain target hardware layer information corresponding to the hardware true error category information;
analyzing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the sending the hardware real error occurrence scheme to the target hardware comprises the following steps:
and sending the target hardware layer information and the target error type information to target hardware based on the hardware true error type information.
In a preferred embodiment, the parsing the hardware true error occurrence scheme to obtain hardware true error category information includes:
and analyzing the hardware true error occurrence scheme based on the error priority and the error list to obtain hardware true error category information.
In a preferred embodiment, the analyzing the hardware true error occurrence scheme further generates a first error random number when obtaining the target error type information corresponding to the target hardware layer information;
Before the analyzing the hardware real error occurrence scheme obtains the target error type information corresponding to the target hardware layer information, the method further comprises:
and correcting the target hardware layer information based on the first error random number.
In a preferred embodiment, the analyzing the hardware true error occurrence scheme further generates a second error random number when obtaining the target error type information corresponding to the target hardware layer information.
Before the sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information, the method further includes:
and correcting the target error type information based on the second error random number.
In a preferred embodiment, the sending the hardware true error occurrence scheme to the target hardware for the target hardware to generate the true error based on the hardware true error occurrence scheme includes:
and issuing the target hardware layer information and the target error type information to target hardware in the form of an error packet or an error stream so as to generate a real error.
In a preferred embodiment, the obtaining the log information of the real error includes:
Capturing log information of the real errors through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface; the log information of the real error is reported by the target hardware, through the UEFI system, to the OS kernel and the BMC after the real error occurs.
In a preferred embodiment, the acquiring the hardware true error occurrence scheme and the register information includes:
acquiring a hardware true error occurrence scheme;
and acquiring the register information set when error transfer and handling are triggered, wherein the register information at least comprises SMI link information and CSMI link information.
In a preferred embodiment, the acquiring the register information set when error transfer and handling are triggered includes:
acquiring the register information that is set in the UEFI system when the error is transferred after it occurs.
In a preferred embodiment, the processing the real error to verify the server performance based on the log information of the real error, the hardware real error occurrence scheme, and the register information includes:
judging whether the log of the real error is consistent with the hardware real error occurrence scheme or not, and judging whether the register information is consistent with the hardware real error occurrence scheme or not;
And if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, verifying that the server performance is qualified.
In a preferred embodiment, the determining whether the log of the real error is consistent with the hardware real error occurrence scheme includes:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises error types, error generation positions, error times and error generation time.
In a preferred embodiment, the determining whether the register information is consistent with the hardware true error occurrence scheme includes:
and judging whether the error type in the register information is consistent with the error type in the hardware true error occurrence scheme.
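The consistency check described in the preceding embodiments can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `ErrorRecord` fields, the `time_tolerance` parameter and the function names are assumptions introduced here. The source only states that the log is compared on error type, location, count and time, and that the register information is compared on error type.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ErrorRecord:
    """One planned or logged error (hypothetical field names)."""
    error_type: str   # e.g. "single_bit"
    location: str     # e.g. "DIMM0"
    count: int        # number of occurrences
    timestamp: int    # time of occurrence, arbitrary integer units

def log_matches(planned: ErrorRecord, logged: ErrorRecord,
                time_tolerance: int = 0) -> bool:
    """Compare type, location, count and time of the log entry with the plan."""
    return (planned.error_type == logged.error_type
            and planned.location == logged.location
            and planned.count == logged.count
            and abs(planned.timestamp - logged.timestamp) <= time_tolerance)

def register_matches(planned: ErrorRecord, register_error_type: str) -> bool:
    """Compare only the error type recorded in the register with the plan."""
    return planned.error_type == register_error_type

def verify(planned: ErrorRecord, logged: ErrorRecord,
           register_error_type: str, time_tolerance: int = 0) -> bool:
    """The server passes only if both the log and the register agree with the plan."""
    return (log_matches(planned, logged, time_tolerance)
            and register_matches(planned, register_error_type))
```

The time tolerance is an assumption added because a logged timestamp rarely equals the planned one exactly; with the default of 0 the comparison is strict.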
In a second aspect, the present invention also provides a server performance verification system, the system comprising:
the receiving and generating module is used for receiving the error injection instruction and generating a hardware true error generation scheme based on the error injection instruction;
the sending module is used for sending the hardware real error occurrence scheme to target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme;
The acquisition module is used for acquiring the log information of the real errors and acquiring a hardware real error occurrence scheme and register information;
and the processing module is used for processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information so as to verify the server performance.
In a third aspect, the present invention also provides a computer apparatus comprising:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the server performance verification method of any one of the first aspects.
In a fourth aspect, the present invention also provides a computer-readable storage medium storing computer instructions that cause the computer to perform the server performance verification method according to any one of the first aspects.
The advantages of the invention are as follows. A server performance verification method, system, computer device and storage medium are provided, the method comprising: receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction; sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the scheme; acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information; and processing the real errors based on the log information, the hardware real error occurrence scheme and the register information to verify the server performance. Because real hardware errors of various kinds actually occur and are identified by the server, and the hardware real error occurrence scheme, the register information and the log information are all captured, the efficiency of server performance testing is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a server performance verification architecture in the present application;
FIG. 2 is a flow chart of a method for verifying server performance provided by the present application;
FIG. 3 is a block diagram of server performance verification provided herein;
FIG. 4 is a block diagram of a server performance verification system provided herein;
FIG. 5 is a schematic diagram of a computer device provided in the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
As described in the background, in the prior art the server motherboard has an XDP (extended debug port) interface, a JTAG (Joint Test Action Group) type interface that follows an international standard test protocol (IEEE 1149.1 compliant) and is mainly used for error injection testing. In the prior art, the injected error is a simulated error, or an error that occurs only at the hardware interface and does not actually occur in the hardware. The physical PEI card or MEI card provided by Intel can inject only one error at a time and only errors of a single type, and cannot continuously and randomly generate a large number of hardware errors.
To solve the above problems, the present application proposes a server performance verification method, system, computer device and storage medium, where the method includes: receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction; sending the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the scheme; acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information; and processing the real errors based on the log information, the hardware real error occurrence scheme and the register information to verify the server performance. Because real hardware errors of various kinds actually occur and are identified by the server, and the hardware real error occurrence scheme, the register information and the log information are all captured, the efficiency of server performance testing is improved.
The following describes the aspects of the present application in detail with reference to the drawings and various embodiments.
Embodiment one: the embodiment describes a server performance verification architecture in the present application.
Specifically, referring to fig. 1, the architecture includes:
the server hardware and hardware interfaces, which comprise a CPU (central processing unit) hardware interface, a memory hardware interface, a PCIE (PCI Express, a bus and interface standard) hardware interface, a PCH (Platform Controller Hub, the platform's south bridge) hardware interface, and the corresponding hardware integrated and deployed in the server. Other hardware interfaces may also be included according to actual needs. All of these hardware interfaces are visible and interactive. The hardware receives and executes, through its hardware interface, the errors issued by the error verification system;
the architecture further includes a man-machine interaction interface and log capture interfaces. The man-machine interaction interface is connected with the error verification system and is used by the user to input an error injection instruction; it also displays error injection schemes for the user to select, monitors the progress of error occurrence, shows the error occurrence history and the error occurrence scheme, and displays the verification result after verification. The log capture interfaces include, but are not limited to, a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface; they capture logs after fault injection and provide the log information to the error processing mechanism;
The architecture also comprises an error generating device, in which the error verification system is deployed and which contains an error generating module and an error diagnosing module. The error generating module generates a hardware real error occurrence scheme according to the error injection instruction input by the user and parses the scheme into error occurrence instructions that are sent to the hardware. The error diagnosing module analyzes the logs captured by the log capture interfaces and judges whether the errors that actually occurred match the hardware real error occurrence scheme.
Embodiment two: based on the server performance verification architecture described in embodiment one, this embodiment describes the server performance verification process of the present application with reference to fig. 2 and fig. 3.
Specifically, referring to fig. 2 and fig. 3, the present application provides a server performance verification method, where the method includes:
S210, receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction.
Specifically, the system is connected with the man-machine interaction interface, a user inputs an error injection instruction on the man-machine interaction interface, and the system generates a hardware true error generation scheme after receiving the error injection instruction transmitted by the man-machine interaction interface.
In one embodiment, the receiving the error injection instruction and generating the hardware true error occurrence scheme based on the error injection instruction includes:
S211, displaying a preset error injection scheme on a man-machine interaction interface for a user to select to input an error injection instruction, wherein the preset error injection scheme comprises any one of self-defined errors, random errors and fault flooding.
Specifically, in order to improve the usability of the user, when the user inputs the error injection command through the input/output function of the man-machine interface, a plurality of error injection schemes are provided on the man-machine interface, including but not limited to: custom errors (user can customize hardware type, error type, number of error occurrences, error location, etc.), random errors (errors will be randomly generated according to server hardware configuration), fault flooding (a large number of hardware errors are continuously and randomly generated according to server hardware configuration).
S212, receiving an error injection instruction.
The system receives the preset error injection scheme selected by the user together with its specific content; the user can select any one of custom errors, random errors and fault flooding. If the preset error injection schemes include further options, the user may select those as well; the selection is not limited to the custom error, random error and fault flooding described above.
S213, reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, number of hardware, memory capacity, and memory location.
Specifically, the system reads the hardware configuration of the server, such as the hardware model, hardware quantity, memory capacity and memory location, and confirms which configured hardware is currently available; this information is used to formulate the hardware real error occurrence scheme.
S214, generating a hardware real error generation scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error generation scheme at least comprises an error list, an error priority and an error type.
Specifically, (1) in response to the error injection scheme selected by the user being a custom error, the generating a hardware true error generation scheme based on the error injection instruction and the server hardware configuration information comprises the following steps:
S2141, receiving user-defined hardware type information, user-defined error occurrence frequency information and error position information input by a user;
S2142, generating a hardware true error occurrence scheme based on the custom hardware type information, custom error occurrence frequency information, error position information and the server hardware configuration information.
If the custom hardware type information, custom error occurrence frequency information or error position information input by the user conflicts with the server hardware configuration information that was read, the conflicting content is sent to the man-machine interaction interface and displayed as a reminder. For example, if the custom hardware type input by the user is the CPU hardware type but the server hardware configuration information shows that CPU hardware is currently unavailable, the user's injection instruction conflicts with the configuration and the error cannot be executed; the conflict content, namely that the CPU hardware is unavailable, is sent to the man-machine interaction interface and displayed as a reminder.
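The conflict check between a custom request and the server configuration might look like the following Python sketch. The dictionary keys (`hardware_type`, `location`, `occurrences`), the shape of `available_hardware` and the message wording are all assumptions made for illustration; the source does not define a data format.

```python
from typing import Dict, List

def find_conflicts(request: Dict[str, object],
                   available_hardware: Dict[str, List[str]]) -> List[str]:
    """Return conflict messages for display; an empty list means the request can run."""
    conflicts = []
    hw_type = request["hardware_type"]
    # Locations (slots, sockets) that the configuration reports for this hardware type.
    locations = available_hardware.get(hw_type, [])
    if not locations:
        conflicts.append(f"{hw_type} hardware is unavailable")
    elif request["location"] not in locations:
        conflicts.append(f"{hw_type} location {request['location']} is not present")
    if request["occurrences"] < 1:
        conflicts.append("error occurrence count must be at least 1")
    return conflicts

# Example configuration: no usable CPU, two memory modules.
available = {"CPU": [], "MEMORY": ["DIMM0", "DIMM1"]}
cpu_request = {"hardware_type": "CPU", "location": "CPU0", "occurrences": 1}
messages = find_conflicts(cpu_request, available)
```

In this sketch the returned messages stand in for the conflict content that is sent to the man-machine interaction interface.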
(2) In response to the error injection scheme selected by the user being a random error, the generating the hardware true error generation scheme based on the error injection instruction and the server hardware configuration information comprises the following steps:
S2143, randomly generating a hardware true error generation scheme based on the server hardware configuration information.
(3) In response to the error injection scheme selected by the user being fault flooding, the generating the hardware true error generation scheme based on the error injection instruction and the server hardware configuration information comprises the following steps:
S2144, randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware true error occurrence scheme.
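The random and fault-flood modes can be sketched as follows, under stated assumptions: the error-type labels, the configuration layout and the `rate_per_s` parameter are hypothetical. The source only says that errors are generated randomly from the hardware configuration and, for a fault flood, for a preset duration; here the duration is modeled as a fixed number of draws rather than a real-time loop.

```python
import random
from typing import Dict, List

# Hypothetical error-type labels; the source does not enumerate them at this step.
ERROR_TYPES = ["single_bit", "multi_bit", "crc"]

def random_error(config: Dict[str, List[str]], rng: random.Random) -> dict:
    """Draw one error on hardware that the configuration reports as present."""
    # Only (hardware, location) pairs that exist in the configuration are eligible.
    candidates = [(hw, loc) for hw, locs in config.items() for loc in locs]
    hardware, location = rng.choice(candidates)
    return {"hardware": hardware, "location": location,
            "error_type": rng.choice(ERROR_TYPES)}

def fault_flood(config: Dict[str, List[str]], rng: random.Random,
                duration_s: float, rate_per_s: float) -> List[dict]:
    """Build the error list for a flood of duration_s seconds at rate_per_s errors per second."""
    return [random_error(config, rng) for _ in range(int(duration_s * rate_per_s))]

config = {"CPU": ["CPU0"], "MEMORY": ["DIMM0", "DIMM1"], "PCIE": []}
rng = random.Random(0)  # seeded so the sketch is repeatable
single = random_error(config, rng)
flood = fault_flood(config, rng, duration_s=2.0, rate_per_s=50)
```

Hardware types with no listed locations (PCIE in this example) never receive an error, which mirrors the requirement that the scheme be based on the hardware actually configured. If no hardware is present at all, `rng.choice` raises `IndexError`; a caller would need to handle that case.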
SA10, analyzing the hardware true error occurrence scheme to obtain hardware true error category information, wherein the hardware true error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIE hardware errors and other errors.
Preferably, the step includes: and analyzing the hardware true error occurrence scheme based on the error priority and the error list to obtain hardware true error category information.
Specifically, after the hardware real error occurrence scheme is generated, the system grabs the hardware real error occurrence scheme to analyze step by step. Because the server is integrated and deployed with a CPU hardware interface, a memory hardware interface, a PCIE hardware interface, a PCH hardware interface and corresponding hardware, and also comprises other hardware interfaces which are set according to actual needs, a hardware analysis module corresponding to each hardware and the hardware interface is deployed in the system. After the system captures the real error occurrence scheme of the hardware, the real error occurrence scheme is analyzed in the corresponding hardware analysis module according to the error priority and the error list.
For example, if the error list in the hardware real error occurrence scheme includes a CPU hardware error and a memory hardware error, the CPU hardware error priority is higher than the memory hardware error priority based on the error priority, and the CPU hardware error and the memory hardware error are sequentially parsed in the CPU hardware parsing module and the memory parsing module.
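The priority-ordered dispatch in this example can be sketched in Python as below. The convention that a lower number means a higher priority, the `category` key and the parser callables are assumptions made for illustration.

```python
from typing import Callable, Dict, List

def dispatch_by_priority(error_list: List[dict],
                         priority: Dict[str, int],
                         parsers: Dict[str, Callable[[dict], str]]) -> List[str]:
    """Parse each error in its category's parser, highest priority first.

    Assumes a lower priority number means a higher priority. sorted() is
    stable, so errors of the same category keep their order in the list.
    """
    ordered = sorted(error_list, key=lambda e: priority[e["category"]])
    return [parsers[e["category"]](e) for e in ordered]

# Stand-ins for the CPU and memory hardware parsing modules.
parsers = {
    "CPU": lambda e: "cpu:" + e["id"],
    "MEMORY": lambda e: "mem:" + e["id"],
}
errors = [
    {"category": "MEMORY", "id": "a"},
    {"category": "CPU", "id": "b"},
    {"category": "MEMORY", "id": "c"},
]
result = dispatch_by_priority(errors, {"CPU": 0, "MEMORY": 1}, parsers)
```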
SA20, analyzing the hardware true error generation scheme to obtain target hardware layer information corresponding to the hardware true error category information.
Specifically, after the error category to which the real error generation scheme of the hardware belongs is identified and analyzed, the hardware layer information of the error is continuously analyzed and acquired.
For example, after the hardware real error occurrence scheme is parsed as belonging to the CPU hardware error category, the corresponding hardware layer information is further parsed. The hardware layers comprise a transmission layer, a data layer and a processing layer. The transmission layer is responsible for sending and receiving data; taking PCIE hardware as an example, its receive and transmit paths are conventionally called RX (receive) and TX (transmit). The data layer stores a large amount of data, and the processing layer handles the processing and computation of data. The deeper the hardware layer at which an error occurs, the more severe the error.
When the hardware real error occurrence scheme is parsed to obtain the target hardware layer information, a first error random number is also generated.
SA30, correcting the target hardware layer information based on the first error random number.
The first error random number is generated while the hardware layer is parsed, and it corrects the error after the target hardware layer information has been parsed. The purpose is to prevent the error distribution from being shaped by the algorithm into a fixed pattern, such as the common normal distribution, when the next level is parsed. The random number lets errors cover the whole range in a controllable way and reduces the risk of uneven coverage.
SA40, analyzing the hardware true error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information.
And generating a second error random number when the real error occurrence scheme of the hardware is analyzed to obtain the target error type information corresponding to the target hardware layer information.
Specifically, after the error has been parsed down to a hardware layer, it is refined further within that layer to obtain the target error type information. Examples are single-bit errors and bit jumps, which can occur at any hardware layer and are caused by the flipping of one or more bits in binary data. Many other error types are also involved, such as CRC errors and read-write process errors.
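As an illustration of the bit-flip error type mentioned above, the following minimal sketch inverts one or more bits in a byte string with an XOR mask. The function name and the bit-numbering convention are assumptions chosen for the example; the application does not prescribe how the flip is performed on real hardware.

```python
def flip_bits(data: bytes, bit_positions) -> bytes:
    """Return a copy of `data` with the given bit positions inverted.

    Bit 0 is the least significant bit of the first byte.
    """
    buf = bytearray(data)
    for pos in bit_positions:
        byte_index, bit_index = divmod(pos, 8)
        buf[byte_index] ^= 1 << bit_index  # XOR with a one-bit mask flips that bit
    return bytes(buf)


original = bytes([0b00001111, 0b10100000])
single = flip_bits(original, [0])      # single-bit error
multi = flip_bits(original, [1, 15])   # multi-bit error across two bytes
```

Applying the same flip twice restores the original data, which is convenient when a test needs to undo an injected error.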
SA50, correcting the target error type information based on the second error random number.
The second error random number is generated while the error type is parsed, and it corrects the error after the target error type information has been parsed. The purpose is to prevent the error distribution from being shaped by the algorithm into a fixed pattern, such as the common normal distribution. The random number lets errors cover the whole range in a controllable way and reduces the risk of uneven coverage.
S220, sending the hardware real error occurrence scheme to the target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme.
Specifically, the sending the hardware real error occurrence scheme to the target hardware includes:
S221, sending the target hardware layer information and the target error type information to target hardware based on the hardware true error type information.
More specifically, the target hardware layer information and the target error type information are issued to the target hardware in the form of an error packet or an error stream to generate a real error.
After the error type has been parsed, the errors are issued to the hardware in the form of error packets or error streams. Hardware errors manifest in different ways, roughly divided into: hardware physical layer failure, in which a certain physical layer of the hardware is completely paralyzed and cannot work; link failure, in which one segment or all segments of the data transmission link fail, so that data cannot be read or added to the calculation sequence; data errors, in which data becomes unusable because of misplacement, deletion, compilation errors and the like during reading, writing and transmission; and other types of errors that cause the hardware to run incorrectly.
After receiving the target hardware layer information and the target error type information issued by the system, the target hardware executes them to generate a real error.
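One way to picture the "error packet" and "error stream" forms is sketched below. The application does not define a packet layout, so the `ErrorPacket` fields and the JSON encoding are purely hypothetical and serve only to show how layer and type information could travel together.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ErrorPacket:
    # All field names are illustrative assumptions, not taken from the application.
    category: str    # hardware category, e.g. "PCIE"
    layer: str       # target hardware layer, e.g. "transmission"
    error_type: str  # target error type, e.g. "CRC"
    count: int = 1   # how many times the error should be produced


def encode_packet(packet: ErrorPacket) -> bytes:
    """Serialize one packet to bytes (JSON is an arbitrary choice here)."""
    return json.dumps(asdict(packet), sort_keys=True).encode("utf-8")


def decode_packet(raw: bytes) -> ErrorPacket:
    """Rebuild a packet on the receiving side."""
    return ErrorPacket(**json.loads(raw.decode("utf-8")))


def error_stream(packets):
    """Yield encoded packets one after another, i.e. an error stream."""
    for packet in packets:
        yield encode_packet(packet)


pkt = ErrorPacket("PCIE", "transmission", "CRC", 2)
frames = list(error_stream([pkt, pkt]))
```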
S230, acquiring log information of the real errors, and acquiring a hardware real error generation scheme and register information.
In one embodiment, the obtaining the log information of the real error includes:
S231, capturing log information of the real error through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP (extensible debug port) interface, an IPMI (Intelligent Platform Management Interface) interface, a Redfish interface and an SSH interface; the log information of the real error is reported by the target hardware, after the real error occurs, through a UEFI (Unified Extensible Firmware Interface) system to an OS kernel (operating system kernel) and a BMC (baseboard management controller).
Specifically, the target hardware executes and generates an error, and the error is passed to the UEFI system, which processes it: the error may be rejected and corrected again by the hardware, or, once errors have accumulated to a certain value, it is treated as a real error and triggers the registers. The target hardware reports the error to the UEFI system, the UEFI system reports it to the OS kernel and the BMC, and these generate the log information (including but not limited to messages, dmesg, etc.).
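The accumulate-then-report behaviour mentioned above can be illustrated with a simple counter. This is only a sketch under assumptions: the threshold value, the class name and the reset-after-report policy are invented for the example, and real firmware may behave differently.

```python
class ErrorAccumulator:
    """Count errors and signal when a hypothetical threshold is reached."""

    def __init__(self, threshold: int):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.count = 0

    def record(self) -> bool:
        """Record one error; return True when it should be treated as real."""
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0  # assumed policy: start counting again after reporting
            return True
        return False


acc = ErrorAccumulator(threshold=3)
results = [acc.record() for _ in range(4)]  # third error crosses the threshold
```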
In one embodiment, the acquiring the hardware real error occurrence scheme and the register information includes:
S232, acquiring a hardware true error occurrence scheme.
S233, acquiring register information set in the registers triggered by error transfer and processing, wherein the register information at least comprises SMI (system management interrupt) link information and CSMI link information.
Specifically, acquiring the register information set in the registers triggered by error transfer and processing includes:
acquiring the register information set in the registers that are triggered when the error is passed to the UEFI system after it occurs.
After the error occurs, it is passed to the UEFI system; the transfer and processing of the error trigger the register settings, and the register information (including but not limited to an SMI link, a CSMI link and the like) is passed to the system for error processing.
S240, processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information to verify the server performance.
In one embodiment, the processing the real error to verify the server performance based on the log information of the real error, the hardware real error occurrence scheme, and the register information includes:
Judging whether the log of the real error is consistent with the hardware real error occurrence scheme or not, and judging whether the register information is consistent with the hardware real error occurrence scheme or not;
and if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, verifying that the server performance is qualified.
Preferably, the determining whether the log of the real error is consistent with the hardware real error occurrence scheme includes:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises error types, error generation positions, error times and error generation time.
The judging whether the register information is consistent with the hardware true error occurrence scheme comprises the following steps:
and judging whether the error type in the register information is consistent with the error type in the hardware true error occurrence scheme. The register information comprises an error type list, and whether the register information is consistent with the hardware true error occurrence scheme is judged by comparing whether the error type list in the register information is matched with the error type list in the hardware true error occurrence scheme.
In other words, the check passes when the error types in the hardware real error occurrence scheme are consistent with the register information.
The error processing flow pulls the hardware real error occurrence scheme, the register information and the log information and carries out fault processing. Error processing is divided into manual checking and automatic checking; its main purpose is to compare whether the hardware real error occurrence scheme matches the real error log and whether it matches the register information, so as to verify whether the RAS (reliability, availability and serviceability) function of the server is normal. Automatic checking executes the automated test cases, and the final result is presented on the man-machine interface. Manual checking displays the captured information directly to an operator for manual testing.
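The two consistency checks used for automatic checking can be sketched as below. The record fields follow the error information listed in the text (error type, generation position, number of errors, generation time); the class name, function names and sample values are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorRecord:
    error_type: str  # e.g. "CRC"
    location: str    # where the error was generated (hypothetical naming)
    count: int       # number of occurrences
    time: str        # generation time, kept as text for simplicity


def log_matches_scheme(scheme, log) -> bool:
    """True when the log contains exactly the records the scheme planned."""
    return Counter(scheme) == Counter(log)


def registers_match_scheme(scheme, register_types) -> bool:
    """True when the register error-type list matches the scheme's error types."""
    return {record.error_type for record in scheme} == set(register_types)


def server_passes(scheme, log, register_types) -> bool:
    """Both checks must succeed for the verification to pass."""
    return log_matches_scheme(scheme, log) and registers_match_scheme(scheme, register_types)


planned = [
    ErrorRecord("CRC", "PCIE slot 1", 2, "t0"),
    ErrorRecord("single-bit", "DIMM A0", 1, "t1"),
]
ok = server_passes(planned, list(reversed(planned)), ["single-bit", "CRC"])
bad = server_passes(planned, planned[:1], ["CRC"])  # one planned error never logged
```

Comparing multisets rather than lists means the order in which the log reports errors does not affect the verdict.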
According to the server performance verification method provided by this embodiment, real hardware errors are generated so that various real hardware errors occur and are identified by the server; the hardware real error occurrence scheme, the register information and the log information are captured, which improves the efficiency of server performance testing.
Further, a variety of real hardware is integrated and deployed on the server, so that a variety of real hardware errors occur and are identified by the server.
Furthermore, an error occurrence scheme that matches the configuration of the server under test is formulated from the server configuration information that is read, so the method adapts well to various servers.
Further, large-scale, continuous injection of real hardware errors is realized.
Further, by parsing the errors, the hardware errors are refined step by step, so that a large number of varied and complex hardware errors are generated, closely approximating the real environment in which customers use the server.
Embodiment three: corresponding to the above-described embodiments one to two, the server performance verification system provided in the present application will be described below with reference to fig. 4. The system may be implemented in hardware or software, or may be implemented in a combination of hardware and software, which is not limited in this application.
In one example, the present application provides a server performance verification system comprising:
a receiving and generating module 410, configured to receive an error injection instruction and generate a hardware real error occurrence scheme based on the error injection instruction;
a sending module 420, configured to send the hardware real error occurrence scheme to a target hardware, so that the target hardware generates a real error based on the hardware real error occurrence scheme;
an acquiring module 430, configured to acquire log information of the real error, and acquire a hardware real error occurrence scheme and register information;
a processing module 440, configured to process the real error based on the log information of the real error, the hardware real error occurrence scheme and the register information to verify the server performance.
In one embodiment, the receiving and generating module 410 includes:
a receiving unit 411 configured to receive an error injection instruction;
a reading unit 412, configured to read server hardware configuration information, where the server hardware configuration information at least includes: hardware model, hardware quantity, memory capacity and memory location;
the generating unit 413 is configured to generate a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information, where the hardware real error occurrence scheme includes at least an error list, an error priority, and an error type.
Preferably, the system further comprises:
a display module 450, configured to, before the receiving and generating module 410 receives the error injection instruction and generates the hardware real error occurrence scheme based on the error injection instruction, display preset error injection schemes on the man-machine interface so that a user can select one and thereby input the error injection instruction, where the preset error injection schemes include any one of a custom error, a random error and a fault flood.
More preferably, the generating unit 413 is specifically configured to: in response to the error injection scheme selected by the user being the custom error, receive custom hardware type information, custom error occurrence frequency information and error position information input by the user;
and generate a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
More preferably, the generating unit 413 is specifically configured to: in response to the error injection scheme selected by the user being the random error, randomly generate a hardware real error occurrence scheme based on the server hardware configuration information.
More preferably, the generating unit 413 is specifically configured to: in response to the error injection scheme selected by the user being the fault flood, randomly generate errors for a preset duration based on the server hardware configuration information to generate a hardware real error occurrence scheme.
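The three preset injection schemes handled by the generating unit can be sketched as a single function. Everything here is an assumption for illustration: the mode strings, the dictionary keys, the `rate_per_s` parameter and the representation of the server configuration as a plain list of hardware names are not specified by the application.

```python
import random


def generate_scheme(mode, hardware, rng, custom=None, duration_s=0, rate_per_s=1):
    """Build a list of planned errors for one of the three preset modes."""
    if mode == "custom":
        # Keep only user requests that target hardware present on this server.
        return [dict(req) for req in (custom or []) if req["hardware"] in hardware]
    if mode == "random":
        count = rng.randint(1, len(hardware))
        return [{"hardware": rng.choice(hardware)} for _ in range(count)]
    if mode == "fault_flood":
        # Spread randomly chosen errors over the whole preset duration.
        total = int(duration_s * rate_per_s)
        offsets = sorted(rng.uniform(0, duration_s) for _ in range(total))
        return [{"hardware": rng.choice(hardware), "offset_s": t} for t in offsets]
    raise ValueError(f"unknown mode: {mode}")


hardware = ["CPU", "MEMORY", "PCIE"]  # stands in for the read server configuration
rng = random.Random(0)                # seeded so that a run can be reproduced
requests = [
    {"hardware": "CPU", "frequency": 5, "location": "core0"},
    {"hardware": "GPU", "frequency": 1, "location": "slot1"},  # not on this server
]
custom_plan = generate_scheme("custom", hardware, rng, custom=requests)
random_plan = generate_scheme("random", hardware, rng)
flood_plan = generate_scheme("fault_flood", hardware, rng, duration_s=10, rate_per_s=2)
```

Passing the random generator in as a parameter keeps a random or flood plan reproducible, which helps when a failing run has to be compared against its injection scheme afterwards.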
Preferably, the system further comprises:
a first parsing module 460, configured to, before the sending module 420 sends the hardware real error occurrence scheme to the target hardware so that the target hardware generates a real error based on the hardware real error occurrence scheme, parse the hardware real error occurrence scheme to obtain hardware real error category information, where the hardware real error category information includes at least one of a CPU hardware error, a memory hardware error, a PCIE hardware error, and other errors;
The second parsing module 470 is configured to parse the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information;
a third parsing module 480, configured to parse the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the sending module 420 is specifically configured to: and sending the target hardware layer information and the target error type information to target hardware based on the hardware true error type information.
More preferably, the first parsing module 460 is specifically configured to:
and analyzing the hardware true error occurrence scheme based on the error priority and the error list to obtain hardware true error category information.
More preferably, the first parsing module 460 is further configured to generate a first error random number when parsing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the system further comprises:
a correction module 490, configured to correct the target hardware layer information based on the first error random number before the second parsing module 470 parses the hardware real error occurrence scheme to obtain the target hardware layer information corresponding to the hardware real error category information.
More preferably, the second parsing module 470 is further configured to generate a second error random number when parsing the hardware real error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information;
the correction module 490 is further configured to correct the target error type information based on the second error random number before the sending module 420 sends the target hardware layer information and the target error type information to target hardware based on the hardware real error type information.
More preferably, the sending module 420 is specifically configured to: and issuing the target hardware layer information and the target error type information to target hardware in the form of an error packet or an error stream so as to generate a real error.
More preferably, the acquiring module 430 includes:
a capturing unit 431, configured to capture the log information of the real error through a preset interface, where the preset interface includes at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface, and an SSH interface; the log information of the real error is reported by the target hardware, after the real error occurs, through a UEFI system to an OS kernel and a BMC.
More preferably, the acquiring module 430 further includes:
a first acquiring unit 432, configured to acquire a hardware real error occurrence scheme;
a second acquiring unit 433, configured to acquire register information set in the registers triggered by error transfer and processing, where the register information includes at least SMI link information and CSMI link information.
More preferably, the second acquiring unit 433 is specifically configured to: acquire the register information set in the registers that are triggered when the error is passed to the UEFI system after it occurs.
More preferably, the processing module 440 is specifically configured to determine whether the log of the real error is consistent with the hardware real error occurrence scheme, and determine whether the register information is consistent with the hardware real error occurrence scheme;
if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, the processing module 440 verifies that the server performance is acceptable.
More preferably, the processing module 440 includes:
the first determining unit 441 is configured to determine whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, where the error information includes an error type, an error generation position, an error number and an error generation time.
More preferably, the processing module 440 further includes:
a second judging unit 442, configured to judge whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme.
Embodiment four: corresponding to the first to third embodiments, a description will be given below of a computer device provided in the present application with reference to fig. 5. As shown in fig. 5, in one example, the present application provides a computer device comprising:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the following:
receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction;
transmitting the hardware real error occurrence scheme to target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme;
acquiring log information of the real errors, and acquiring a hardware real error occurrence scheme and register information;
the real error is processed based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location;
generating a hardware real error generation scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error generation scheme at least comprises an error list, an error priority and an error type.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
displaying a preset error injection scheme on a human-computer interaction interface for a user to select to input an error injection instruction, wherein the preset error injection scheme comprises any one of a user-defined error, a random error and fault flood.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
receiving user-defined hardware type information, user-defined error occurrence frequency information and error position information which are input by a user;
generating a hardware true error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and randomly generating a hardware true error occurrence scheme based on the server hardware configuration information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware true error generation scheme.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
analyzing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIE hardware errors and other errors;
analyzing the hardware true error generation scheme to obtain target hardware layer information corresponding to the hardware true error category information;
analyzing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
the sending the hardware real error occurrence scheme to the target hardware comprises the following steps:
and sending the target hardware layer information and the target error type information to target hardware based on the hardware true error type information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and analyzing the hardware true error occurrence scheme based on the error priority and the error list to obtain hardware true error category information.
The method comprises the steps that a first error random number is also generated when the real error occurrence scheme of the hardware is analyzed to obtain target error type information corresponding to the target hardware layer information;
the program instructions, when read for execution by the one or more processors, further perform the operations of:
and correcting the target hardware layer information based on the first error random number.
And generating a second error random number when the real error occurrence scheme of the hardware is analyzed to obtain the target error type information corresponding to the target hardware layer information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and correcting the target error type information based on the second error random number.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and issuing the target hardware layer information and the target error type information to target hardware in the form of an error packet or an error stream so as to generate a real error.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
capturing log information of the real error through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface; the log information of the real error is reported by the target hardware, after the real error occurs, through a UEFI system to an OS kernel and a BMC.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
acquiring a hardware true error occurrence scheme;
and acquiring register information for transmitting and processing the trigger register setting, wherein the register information at least comprises SMI link information and CSMI link information.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and acquiring the register information which is transmitted to the trigger register setting in the UEFI system after the error occurs.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
judging whether the log of the real error is consistent with the hardware real error occurrence scheme or not, and judging whether the register information is consistent with the hardware real error occurrence scheme or not;
And if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, verifying that the server performance is qualified.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises error types, error generation positions, error times and error generation time.
The program instructions, when read for execution by the one or more processors, further perform the operations of:
and judging whether the error type in the register information is consistent with the error type in the hardware true error occurrence scheme.
The program instructions, when read and executed by the one or more processors, may further perform operations corresponding to the steps in the foregoing method embodiments, and reference may be made to the foregoing description, which is not repeated herein. With reference to FIG. 5, an exemplary architecture for a computer device is shown, which may include a processor 510, a video display adapter 511, a disk drive 512, an input/output interface 513, a network interface 514, and a memory 520. The processor 510, the video display adapter 511, the disk drive 512, the input/output interface 513, the network interface 514, and the memory 520 may be communicatively coupled via a communication bus 530.
The processor 510 may be implemented by a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc., for executing relevant programs to implement the technical solutions provided herein.
Memory 520 may be implemented in the form of read-only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), static storage devices, dynamic storage devices, or the like. The memory 520 may store an operating system 521 for controlling the operation of the computer device 500, and a Basic Input Output System (BIOS) 522 for controlling the low-level operation of the computer device 500. In addition, a web browser 523, data storage management 524, and an icon font processing system 525, etc. may also be stored. The icon font processing system 525 may be an application program that specifically implements the operations of the foregoing steps in the embodiments of the present application. In general, when the technical solutions provided in the present application are implemented by software or firmware, relevant program codes are stored in the memory 520 and invoked by the processor 510 to be executed.
The input/output interface 513 is used for connecting with an input/output module to realize information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
The network interface 514 is used to connect communication modules (not shown) to enable communication interactions of the device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 530 includes a path to transfer information between components of the device (e.g., processor 510, video display adapter 511, disk drive 512, input/output interface 513, network interface 514, and memory 520).
In addition, the computer device 500 may also obtain information of specific acquisition conditions from the virtual resource object acquisition condition information database 541 for making condition judgment, and so on.
It should be noted that although the above-described computer device 500 illustrates only a processor 510, a video display adapter 511, a disk drive 512, an input/output interface 513, a network interface 514, a memory 520, a bus 530, etc., the computer device may include other components necessary to achieve proper operation in an implementation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the present application, and not all the components shown in the drawings.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and include several instructions to cause a computer device (which may be a personal computer, a cloud server, or a network device, etc.) to perform the methods described in the various embodiments or some parts of the embodiments of the present application.
Embodiment five: corresponding to the first to fourth embodiments described above, a computer-readable storage medium provided in the present application will be described below. In one example, the present application provides a computer-readable storage medium storing computer instructions that cause the computer to:
receiving an error injection instruction and generating a hardware true error occurrence scheme based on the error injection instruction;
Transmitting the hardware real error occurrence scheme to target hardware so that the target hardware can generate real errors based on the hardware real error occurrence scheme;
acquiring log information of the real errors, and acquiring a hardware real error occurrence scheme and register information;
the real error is processed based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance.
The computer instructions cause the computer to further perform the operations of:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location;
generating a hardware real error generation scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error generation scheme at least comprises an error list, an error priority and an error type.
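A minimal sketch of one possible data layout for this step follows. The class and field names (`HardwareConfig`, `PlannedError`, `ErrorScheme`, `capacity_gb`) and the rule that the order of the requested error types sets the priority are assumptions made for illustration; the application only states which kinds of information are involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HardwareConfig:
    """Hypothetical container for the server hardware configuration."""
    model: str
    quantity: int
    capacity_gb: int
    memory_locations: List[str]


@dataclass
class PlannedError:
    """One entry of the error list, carrying its type and priority."""
    error_type: str
    priority: int
    location: str


@dataclass
class ErrorScheme:
    error_list: List[PlannedError]


def build_scheme(requested_types: List[str], config: HardwareConfig) -> ErrorScheme:
    """Plan one error per requested type and memory location.

    Assumption: the position of a type in the request is its priority
    (0 is the highest).
    """
    planned = [
        PlannedError(error_type=t, priority=rank, location=loc)
        for rank, t in enumerate(requested_types)
        for loc in config.memory_locations
    ]
    return ErrorScheme(error_list=sorted(planned, key=lambda e: e.priority))
```

For example, requesting two error types on a configuration with two memory locations yields a four-entry error list ordered by priority.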
The computer instructions cause the computer to further perform the operations of:
displaying preset error injection schemes on a human-computer interaction interface so that a user can select one and thereby input an error injection instruction, wherein the preset error injection schemes comprise any one of a custom error, a random error and a fault flood.
The computer instructions cause the computer to further perform the operations of:
receiving custom hardware type information, custom error occurrence frequency information and error position information which are input by a user;
generating a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
The computer instructions cause the computer to further perform the operations of:
randomly generating a hardware real error occurrence scheme based on the server hardware configuration information.
The computer instructions cause the computer to further perform the operations of:
randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware real error occurrence scheme.
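The three preset schemes (custom error, random error, fault flood) could be dispatched roughly as in the sketch below. All function names, option names and the dictionary layout are hypothetical. The fault flood is modelled, as an assumption, by drawing one random error per fixed interval over the preset duration, so the result is computed up front rather than timed against a real clock.

```python
import random
from typing import Dict, List

Plan = List[Dict[str, str]]  # hypothetical: one dict per planned error


def custom_scheme(hardware_type: str, occurrences: int, location: str) -> Plan:
    """Custom error: the user fixes the hardware type, count and location."""
    return [{"hardware": hardware_type, "location": location} for _ in range(occurrences)]


def random_scheme(config: Dict[str, List[str]], rng: random.Random, count: int) -> Plan:
    """Random error: pick hardware and location from the server configuration."""
    plan: Plan = []
    for _ in range(count):
        hardware = rng.choice(sorted(config))
        plan.append({"hardware": hardware, "location": rng.choice(config[hardware])})
    return plan


def flood_scheme(config: Dict[str, List[str]], rng: random.Random,
                 duration_ms: int, interval_ms: int) -> Plan:
    """Fault flood: one random error per interval for the preset duration."""
    return random_scheme(config, rng, duration_ms // interval_ms)


def generate(mode: str, config: Dict[str, List[str]], rng: random.Random, **options) -> Plan:
    if mode == "custom":
        return custom_scheme(options["hardware_type"], options["occurrences"], options["location"])
    if mode == "random":
        return random_scheme(config, rng, options.get("count", 1))
    if mode == "fault_flood":
        return flood_scheme(config, rng, options["duration_ms"], options["interval_ms"])
    raise ValueError(f"unknown error injection scheme: {mode}")
```

Passing a seeded `random.Random` makes a random or flood run repeatable, which helps when a failed verification has to be reproduced.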
The computer instructions cause the computer to further perform the operations of:
analyzing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, PCIE hardware errors and other errors;
analyzing the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information;
analyzing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
wherein the sending of the hardware real error occurrence scheme to the target hardware comprises:
sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information.
The computer instructions cause the computer to further perform the operations of:
analyzing the hardware real error occurrence scheme based on the error priority and the error list to obtain the hardware real error category information.
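One way to picture this analysis is a function that walks the error list in priority order and resolves each entry to a (category, hardware layer, error type) triple. The sketch below is an assumption about the data layout: the keys `hardware`, `layer`, `error_type` and `priority` and the lookup table are hypothetical, and a lower number is taken to mean a higher priority.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical lookup from a hardware name to the error category.
CATEGORIES = {"cpu": "CPU", "memory": "memory", "pcie": "PCIE"}


def parse_scheme(error_list: List[Dict[str, Any]]) -> List[Tuple[str, str, str]]:
    """Return (category, hardware layer, error type) triples in priority order."""
    ordered = sorted(error_list, key=lambda e: e["priority"])
    return [
        (
            CATEGORIES.get(str(e["hardware"]).lower(), "other"),  # category
            str(e["layer"]),                                     # target hardware layer
            str(e["error_type"]),                                # target error type
        )
        for e in ordered
    ]
```

Entries whose hardware is not in the table fall into the "other" category, mirroring the "other errors" class named in the text.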
A first error random number is further generated when the hardware real error occurrence scheme is analyzed to obtain the target error type information corresponding to the target hardware layer information.
The computer instructions cause the computer to further perform the operations of:
correcting the target hardware layer information based on the first error random number.
A second error random number is further generated when the hardware real error occurrence scheme is analyzed to obtain the target error type information corresponding to the target hardware layer information.
The computer instructions cause the computer to further perform the operations of:
correcting the target error type information based on the second error random number.
The computer instructions cause the computer to further perform the operations of:
issuing the target hardware layer information and the target error type information to the target hardware in the form of an error packet or an error stream so as to generate a real error.
The computer instructions cause the computer to further perform the operations of:
capturing the log information of the real errors through a preset interface, wherein the preset interface comprises at least one of a serial port, an XDP interface, an IPMI interface, a Redfish interface and an SSH interface; and wherein, after a real error occurs, the target hardware reports the log information of the real error to the OS kernel and the BMC through the UEFI system.
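Because the log may arrive over any of several interfaces, one possible arrangement is a small registry that maps an interface name to a reader function. The sketch below is only an assumption about how such a capture layer might be organised; `LogCapture` and its methods are hypothetical, and the readers are plain callables rather than real serial, IPMI, Redfish or SSH clients.

```python
from typing import Callable, Dict, List

LogReader = Callable[[], List[str]]  # hypothetical: returns raw log lines


class LogCapture:
    """Collects error logs from whichever preset interfaces are registered."""

    SUPPORTED = ("serial", "xdp", "ipmi", "redfish", "ssh")

    def __init__(self) -> None:
        self._readers: Dict[str, LogReader] = {}

    def register(self, interface: str, reader: LogReader) -> None:
        # Only the interfaces named in the text are accepted.
        if interface not in self.SUPPORTED:
            raise ValueError(f"unsupported interface: {interface}")
        self._readers[interface] = reader

    def capture(self) -> Dict[str, List[str]]:
        """Read every registered interface and group the lines by interface."""
        return {name: reader() for name, reader in self._readers.items()}
```

Keeping the lines grouped by interface lets a later step compare, for instance, what the OS kernel reported with what the BMC recorded.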
The computer instructions cause the computer to further perform the operations of:
acquiring the hardware real error occurrence scheme;
acquiring register information of the error transfer and handling trigger register settings, wherein the register information at least comprises SMI link information and CSMI link information.
The computer instructions cause the computer to further perform the operations of:
acquiring, after the error occurs, the register information of the error transfer and handling trigger register settings in the UEFI system.
The computer instructions cause the computer to further perform the operations of:
judging whether the log of the real error is consistent with the hardware real error occurrence scheme, and judging whether the register information is consistent with the hardware real error occurrence scheme;
if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, determining that the server performance is qualified.
The computer instructions cause the computer to further perform the operations of:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises the error type, the error location, the number of errors and the error occurrence time.
The computer instructions cause the computer to further perform the operations of:
judging whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme.
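The consistency checks described above can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: each scheme, log and register entry is a dictionary with hypothetical keys `type`, `location` and `time`, and occurrence times are treated as matching when they differ by no more than an assumed tolerance.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Entry = Dict[str, Any]  # hypothetical keys: "type", "location", "time"


def _times_by_key(entries: List[Entry]) -> Dict[Tuple[str, str], List[float]]:
    """Group occurrence times by (error type, error location)."""
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for e in entries:
        grouped[(e["type"], e["location"])].append(float(e["time"]))
    return {key: sorted(times) for key, times in grouped.items()}


def log_consistent(scheme: List[Entry], logs: List[Entry], tolerance_s: float = 1.0) -> bool:
    """Compare type, location, count and occurrence time of logs against the scheme."""
    planned, seen = _times_by_key(scheme), _times_by_key(logs)
    if planned.keys() != seen.keys():
        return False  # an error type or location differs
    for key, times in planned.items():
        if len(times) != len(seen[key]):
            return False  # the number of errors differs
        if any(abs(a - b) > tolerance_s for a, b in zip(times, seen[key])):
            return False  # an occurrence time is outside the tolerance
    return True


def registers_consistent(scheme: List[Entry], registers: List[Entry]) -> bool:
    """Only the error type is compared for the register information."""
    return {e["type"] for e in scheme} == {r["type"] for r in registers}


def server_qualified(scheme: List[Entry], logs: List[Entry], registers: List[Entry]) -> bool:
    """Qualified only if both the log and the register information match the scheme."""
    return log_consistent(scheme, logs) and registers_consistent(scheme, registers)
```

A missing log entry, a wrong location or an unexpected register error type each makes `server_qualified` return `False`, which corresponds to a failed verification.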
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment mainly describes its differences from the other embodiments. In particular, since the device embodiments are substantially similar to the method embodiments, their description is relatively brief, and reference is made to the description of the method embodiments for the relevant points. The apparatus embodiments described above are merely illustrative: the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, i.e., they may be located in one place or distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue effort.
In addition, it is to be understood that the terms "first," "second," "third," and "fourth" are used herein for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first," "second," "third," or "fourth" may explicitly or implicitly include one or more such features.
The above embodiments merely illustrate the technical concept and features of the present invention so that those skilled in the art can understand and implement it; they are not intended to limit the scope of protection of the present invention. All modifications made according to the spirit of the main technical solution of the invention shall fall within the scope of protection of the invention.
Claims (20)
1. A method for verifying server performance, the method comprising:
receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction;
transmitting the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the hardware real error occurrence scheme;
acquiring log information of the real errors, and acquiring the hardware real error occurrence scheme and register information;
processing the real errors based on the log information of the real errors, the hardware real error occurrence scheme, and the register information to verify the server performance.
2. The server performance verification method according to claim 1, wherein the receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction comprises:
receiving an error injection instruction;
reading server hardware configuration information, wherein the server hardware configuration information at least comprises: hardware model, hardware quantity, memory capacity and memory location;
generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information, wherein the hardware real error occurrence scheme at least comprises an error list, an error priority and an error type.
3. The server performance verification method according to claim 2, wherein before the receiving an error injection instruction and generating a hardware real error occurrence scheme based on the error injection instruction, the method further comprises:
displaying preset error injection schemes on a human-computer interaction interface so that a user can select one and thereby input an error injection instruction, wherein the preset error injection schemes comprise any one of a custom error, a random error and a fault flood.
4. The server performance verification method according to claim 3, wherein, in response to a user-selected error injection scheme being a custom error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information comprises:
receiving custom hardware type information, custom error occurrence frequency information and error position information which are input by a user;
generating a hardware real error occurrence scheme based on the custom hardware type information, the custom error occurrence frequency information, the error position information and the server hardware configuration information.
5. The server performance verification method according to claim 3, wherein, in response to the user-selected error injection scheme being a random error, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information comprises:
randomly generating a hardware real error occurrence scheme based on the server hardware configuration information.
6. The server performance verification method according to claim 3, wherein, in response to the user-selected error injection scheme being a fault flood, the generating a hardware real error occurrence scheme based on the error injection instruction and the server hardware configuration information comprises:
randomly generating errors based on the server hardware configuration information for a preset duration to generate a hardware real error occurrence scheme.
7. The method according to any one of claims 2-6, wherein before the sending the hardware real error occurrence scheme to the target hardware for the target hardware to generate the real error based on the hardware real error occurrence scheme, the method further comprises:
analyzing the hardware real error occurrence scheme to obtain hardware real error category information, wherein the hardware real error category information comprises at least one of CPU hardware errors, memory hardware errors, bus and interface standard hardware errors and integrated south bridge errors;
analyzing the hardware real error occurrence scheme to obtain target hardware layer information corresponding to the hardware real error category information;
analyzing the hardware real error occurrence scheme to obtain target error type information corresponding to the target hardware layer information;
wherein the sending the hardware real error occurrence scheme to the target hardware comprises:
sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information.
8. The server performance verification method according to claim 7, wherein the parsing the hardware real error occurrence scheme to obtain hardware real error category information includes:
analyzing the hardware real error occurrence scheme based on the error priority and the error list to obtain the hardware real error category information.
9. The server performance verification method according to claim 8, wherein a first error random number is further generated when the hardware real error occurrence scheme is analyzed to obtain the target error type information corresponding to the target hardware layer information;
before the analyzing the hardware real error occurrence scheme to obtain the target error type information corresponding to the target hardware layer information, the method further comprises:
correcting the target hardware layer information based on the first error random number.
10. The server performance verification method according to claim 9, wherein a second error random number is further generated when the hardware real error occurrence scheme is analyzed to obtain the target error type information corresponding to the target hardware layer information;
before the sending the target hardware layer information and the target error type information to the target hardware based on the hardware real error category information, the method further comprises:
correcting the target error type information based on the second error random number.
11. The server performance verification method according to claim 10, wherein the sending the hardware real error occurrence scheme to target hardware for the target hardware to generate a real error based on the hardware real error occurrence scheme comprises:
issuing the target hardware layer information and the target error type information to the target hardware in the form of an error packet or an error stream so as to generate a real error.
12. The server performance verification method according to claim 11, wherein the obtaining the log information of the real error includes:
capturing the log information of the real errors through a preset interface, wherein the preset interface comprises at least one of a serial port, an extended debugging port interface, an intelligent platform management interface, a Redfish interface and an SSH interface; and wherein, after the real errors occur, the target hardware reports the log information of the real errors to an operating system kernel and a baseboard management controller through a unified extensible firmware interface system.
13. The server performance verification method according to claim 12, wherein the acquiring hardware real error occurrence scheme and register information includes:
acquiring the hardware real error occurrence scheme;
acquiring register information of the error transfer and handling trigger register setting, wherein the register information at least comprises serial management interface link information and CSMI link information.
14. The server performance verification method according to claim 13, wherein the acquiring register information of the error transfer and handling trigger register setting comprises:
acquiring, after the error occurs, the register information of the error transfer and handling trigger register setting in the unified extensible firmware interface system.
15. The server performance verification method according to claim 14, wherein the processing the real error based on the log information of the real error, the hardware real error occurrence scheme, and the register information to verify the server performance includes:
judging whether the log of the real error is consistent with the hardware real error occurrence scheme, and judging whether the register information is consistent with the hardware real error occurrence scheme;
if the log of the real error is consistent with the hardware real error occurrence scheme and the register information is consistent with the hardware real error occurrence scheme, determining that the server performance is qualified.
16. The server performance verification method according to claim 15, wherein the judging whether the log of the real error is consistent with the hardware real error occurrence scheme comprises:
judging whether the log of the real error is consistent with error information in the hardware real error occurrence scheme, wherein the error information comprises the error type, the error location, the number of errors and the error occurrence time.
17. The server performance verification method according to claim 15, wherein the judging whether the register information is consistent with the hardware real error occurrence scheme comprises:
judging whether the error type in the register information is consistent with the error type in the hardware real error occurrence scheme.
18. A server performance verification system, the system comprising:
a receiving and generating module, configured to receive an error injection instruction and generate a hardware real error occurrence scheme based on the error injection instruction;
a sending module, configured to send the hardware real error occurrence scheme to target hardware so that the target hardware generates real errors based on the hardware real error occurrence scheme;
an acquisition module, configured to acquire log information of the real errors and to acquire the hardware real error occurrence scheme and register information; and
a processing module, configured to process the real errors based on the log information of the real errors, the hardware real error occurrence scheme and the register information so as to verify the server performance.
19. A computer device, the computer device comprising:
one or more processors;
and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the server performance verification method of any one of claims 1-17.
20. A computer-readable storage medium storing computer instructions that cause a computer to perform the server performance verification method of any one of claims 1-17.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310432839.1A CN116521496A (en) | 2023-04-21 | 2023-04-21 | Method, system, computer device and storage medium for verifying server performance |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310432839.1A CN116521496A (en) | 2023-04-21 | 2023-04-21 | Method, system, computer device and storage medium for verifying server performance |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116521496A (en) | 2023-08-01 |
Family
ID=87393405
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310432839.1A Pending CN116521496A (en) | 2023-04-21 | 2023-04-21 | Method, system, computer device and storage medium for verifying server performance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116521496A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||