CN110175096B - GPU (graphics processing Unit) pressurization test method, system, terminal and storage medium - Google Patents
- Publication number
- CN110175096B (application CN201910425730.9A; also published as CN110175096A)
- Authority
- CN
- China
- Prior art keywords
- nbody
- command
- gpu
- duration
- refreshing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2205—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
- G06F11/2236—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test CPU or processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2273—Test methods
Abstract
The invention provides a GPU pressurization test method, a system, a terminal and a storage medium, comprising the following steps: setting a refresh period of an nbody command; collecting the number of GPU cards and generating a corresponding number of nbody commands according to the number of the GPU cards; acquiring the duration of the nbody command; and refreshing the nbody command according to the duration and the refreshing period. The invention can automatically generate the nbody commands which are in one-to-one correspondence with the GPU cards to simultaneously pressurize the GPU cards, and can automatically update the nbody commands, thereby avoiding the interruption of pressurization and simultaneously saving human resources and test time.
Description
Technical Field
The invention belongs to the technical field of server testing, and particularly relates to a GPU (graphics processing unit) pressurization testing method, a system, a terminal and a storage medium.
Background
With the development of artificial intelligence, GPU servers are becoming increasingly popular. To test the stability and reliability of GPU cards, the cards must undergo a long (generally more than 24 h), heavy-load pressurization test. The nbody tool that ships with CUDA can apply such a heavy load to a GPU card, but a typical nbody pressurization run lasts only about 30-40 min. In addition, one nbody command can test only one GPU card. As a result, pressurization has to be restarted manually during GPU testing, so GPU tests have a low degree of automation and consume time and human resources.
Disclosure of Invention
In view of the above-mentioned deficiencies of the prior art, the present invention provides a GPU pressure test method, system, terminal and storage medium to solve the above-mentioned technical problems.
In a first aspect, the present invention provides a GPU pressurization test method, including:
setting an nbody command refresh period, wherein the nbody command refresh period is set to 30 min;
collecting the number of GPU cards and generating a corresponding number of nbody commands according to the number of the GPU cards, wherein the method comprises the following steps: collecting all GPU card identification codes; generating an nbody command corresponding to the GPU card one by one according to the identification code;
acquiring the duration of the nbody command;
refreshing the nbody command according to the duration and the refresh period, comprising: judging whether the duration of the nbody command has reached the refresh period: if so, regenerating the nbody command for the GPU card to which it belongs, according to that card's identification code; otherwise, continuing to cyclically acquire and monitor the duration of the nbody command.
The method further comprises the following steps: starting a GPU state monitoring program, and monitoring error reporting information; and outputting the monitoring result in the form of a test log.
In a second aspect, the present invention provides a GPU pressurization test system, comprising:
the period setting unit is configured for setting a refresh period of the nbody command;
the command generation unit is configured to collect the number of GPU cards and generate a corresponding number of nbody commands according to the number of the GPU cards, and comprises: the information acquisition module is configured for acquiring all GPU card identification codes; the command generation module is configured to generate the nbody commands corresponding to the GPU cards one by one according to the identification codes;
the time acquisition unit is configured to acquire the duration of the nbody command;
the command refreshing unit is configured to refresh the nbody command according to the duration and the refresh period, and comprises: the time judgment module, configured to judge whether the duration of the nbody command has reached the refresh period; the regeneration module, configured to regenerate the nbody command corresponding to the identification code according to the identification code of the GPU card to which the nbody command belongs; and the cyclic acquisition module, configured to cyclically acquire and monitor the duration of the nbody command.
In a third aspect, a terminal is provided, including:
a processor, a memory, wherein,
the memory is used for storing a computer program which,
the processor is used for calling and running the computer program from the memory, so that the terminal executes the method described above.
In a fourth aspect, a computer storage medium is provided having stored therein instructions that, when executed on a computer, cause the computer to perform the method of the above aspects.
The beneficial effects of the invention are as follows:
The GPU pressurization test method, system, terminal and storage medium provided by the invention set a refresh period for the nbody command, generate nbody commands corresponding to the number of GPU cards, acquire the duration of each nbody command in real time, and promptly update any nbody command whose duration has reached the refresh period, which prevents a pressurization interruption from affecting the test result. The invention can automatically generate nbody commands in one-to-one correspondence with the GPU cards so that the cards are pressurized simultaneously, and can automatically update those commands, thereby avoiding interruptions in pressurization while saving human resources and test time.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention.
FIG. 2 is a schematic block diagram of a system of one embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiment of the present invention will be clearly and completely described below with reference to the drawings in the embodiment of the present invention, and it is obvious that the described embodiment is only a part of the embodiment of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following explains key terms appearing in the present invention.
GPU: a Graphics Processing Unit, also called a display core, visual processor, or display chip, is a microprocessor dedicated to image operations on personal computers, workstations, game consoles, and some mobile devices (such as tablet computers and smartphones).
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention. The method of FIG. 1 may be executed by a GPU pressurization test system.
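As shown in FIG. 1, the method 100 includes:
setting a refresh period of the nbody command;
collecting the number of GPU cards and generating a corresponding number of nbody commands according to the number of GPU cards;
acquiring the duration of the nbody command;
and step 140, refreshing the nbody command according to the duration and the refresh period.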
Optionally, as an embodiment of the present invention, the setting an nbody command refresh period includes:
the nbody command refresh period is set to 30 min.
Optionally, as an embodiment of the present invention, the acquiring the number of GPU cards and generating a corresponding number of nbody commands according to the number of GPU cards includes:
collecting all GPU card identification codes;
and generating the nbody commands corresponding to the GPU cards one by one according to the identification codes.
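The identification-code step above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's script: it assumes the identification codes come from `nvidia-smi --query-gpu=index,uuid --format=csv,noheader`, and that the CUDA sample `nbody` binary is started with the `-benchmark`, `-numbodies` and `-device` options. The function names, the `./nbody` path, and the body count are hypothetical.

```python
from typing import Dict, List

# Assumed query for collecting GPU identification codes (index and UUID).
NVIDIA_SMI_QUERY = ["nvidia-smi", "--query-gpu=index,uuid", "--format=csv,noheader"]


def parse_gpu_ids(smi_output: str) -> Dict[int, str]:
    """Map GPU index -> UUID from 'index, uuid' CSV lines."""
    gpus: Dict[int, str] = {}
    for line in smi_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        index, uuid = (field.strip() for field in line.split(",", 1))
        gpus[int(index)] = uuid
    return gpus


def build_nbody_commands(
    gpus: Dict[int, str],
    nbody_path: str = "./nbody",   # hypothetical location of the nbody binary
    num_bodies: int = 1_000_000,   # hypothetical load size
) -> Dict[int, List[str]]:
    """Return one nbody argument list per GPU index (one command per card)."""
    return {
        index: [nbody_path, "-benchmark", f"-numbodies={num_bodies}", f"-device={index}"]
        for index in sorted(gpus)
    }
```

On a real server the input string could come from `subprocess.run(NVIDIA_SMI_QUERY, capture_output=True, text=True, check=True).stdout`; keeping the parsing separate from the query makes the mapping easy to check without GPU hardware.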
Optionally, as an embodiment of the present invention, the refreshing the nbody command according to the duration and the refresh period includes:
judging whether the duration of the nbody command reaches the refresh period:
if yes, regenerating an nbody command corresponding to the identification code according to the identification code of the GPU card to which the nbody command belongs;
otherwise, continuing to cyclically acquire and monitor the duration of the nbody command.
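The refresh decision described above reduces to comparing each command's elapsed time with the refresh period. A minimal Python sketch, assuming start times are kept per GPU as seconds on a common clock (the names `REFRESH_PERIOD_S` and `commands_due` are hypothetical):

```python
from typing import Dict, List

REFRESH_PERIOD_S = 30 * 60  # 30 min, the refresh period used in the embodiment


def commands_due(
    start_times: Dict[int, float],
    now: float,
    period_s: float = REFRESH_PERIOD_S,
) -> List[int]:
    """Return the GPU indices whose nbody command has run for at least period_s."""
    return sorted(
        gpu for gpu, started in start_times.items() if now - started >= period_s
    )
```

Because the check is made per command, commands that were started at different times are refreshed independently, and commands started together all come due at the same moment.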
Optionally, as an embodiment of the present invention, after the refreshing the nbody command according to the duration and the refresh period, the method further includes:
starting a GPU state monitoring program, and monitoring error reporting information;
and outputting the monitoring result in the form of a test log.
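The monitoring step above can be illustrated with a small log filter. This sketch assumes that error reports can be recognized by keywords in the pressurization log; the keyword list, function names, and file handling are hypothetical, since the patent does not specify the monitoring program or the log format.

```python
import re
from typing import Iterable, List

# Hypothetical keywords; a real deployment would tune these to its own logs.
ERROR_PATTERN = re.compile(r"error|fail|xid", re.IGNORECASE)


def find_error_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like error reports."""
    return [line.rstrip("\n") for line in lines if ERROR_PATTERN.search(line)]


def append_to_result_file(errors: Iterable[str], result_path: str) -> None:
    """Append error lines to the result file used for later analysis."""
    with open(result_path, "a", encoding="utf-8") as result:
        for line in errors:
            result.write(line + "\n")
```

Calling `find_error_lines` on each newly read chunk of the log and passing the result to `append_to_result_file` yields a test log that contains only the suspicious lines.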
In order to facilitate understanding of the present invention, the GPU pressurization test method provided by the present invention is further described below with reference to the principle of the GPU pressurization test method of the present invention and the process of pressurizing the GPU in the embodiment.
Specifically, the GPU pressurization test method includes:
S1. Because an nbody pressurization run generally lasts only about 30-40 min, this embodiment sets the refresh period of the nbody command to 30 min to avoid a possible interruption in pressurization.
S2. A script reads the number of GPU cards in the test server and the identification code of each GPU card, establishes nbody commands in one-to-one correspondence with the GPU cards according to their identification codes, and drives nbody to issue those commands through the automatic test script.
S3. The duration of each nbody command is acquired cyclically: the generation time of each nbody command is recorded when the command is generated, and its duration is calculated as the current time minus the generation time. The duration of each nbody command is updated every 2 s.
S4. Whether the duration acquired in step S3 has reached the refresh period (30 min) is judged. If all nbody commands have the same duration and have reached the refresh period, all of them are updated, that is, re-established. If the durations are not synchronized, then once the duration of a given nbody command reaches the refresh period, the nbody command is re-established for the GPU card identification code to which it belongs, which achieves a targeted refresh of that command.
S5. While each GPU card of the test server is being pressurized, a monitoring program is started to check whether the log file generated during pressurization contains error messages. Any error message found is immediately written to a result file so that testers can analyze it later.
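Steps S1 to S4 can be combined into one control loop, sketched below in Python. This is a hypothetical illustration rather than the embodiment's script: the `launch` callable, the `Handle` protocol, and `run_pressure_test` are invented names, and the clock and sleep functions are injectable so the loop can be exercised without GPUs or real waiting.

```python
import time
from typing import Callable, Dict, Iterable, Protocol


class Handle(Protocol):
    """Anything that can be stopped, e.g. a subprocess.Popen object."""

    def terminate(self) -> None: ...


def run_pressure_test(
    gpu_ids: Iterable[str],
    launch: Callable[[str], Handle],   # starts one nbody command for a GPU id
    total_s: float,                    # overall test length, e.g. 24 h
    period_s: float = 30 * 60,         # S1: refresh period
    poll_s: float = 2.0,               # S3: duration update interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Keep one command alive per GPU; return how many launches each GPU had."""
    ids = list(gpu_ids)
    handles: Dict[str, Handle] = {}
    started: Dict[str, float] = {}
    launches: Dict[str, int] = {gpu: 0 for gpu in ids}

    def start(gpu: str) -> None:
        handles[gpu] = launch(gpu)
        started[gpu] = clock()         # S3: record the generation time
        launches[gpu] += 1

    begin = clock()
    for gpu in ids:                    # S2: one command per GPU card
        start(gpu)
    while True:
        sleep(poll_s)
        now = clock()
        if now - begin >= total_s:
            break
        for gpu in ids:                # S4: targeted refresh per GPU
            if now - started[gpu] >= period_s:
                handles[gpu].terminate()
                start(gpu)
    for handle in handles.values():    # stop the remaining commands
        handle.terminate()
    return launches
```

In practice `launch` could wrap `subprocess.Popen` with the nbody command line for the given card, since `Popen` objects provide `terminate()`. Terminating the old command before relaunching is a design choice of this sketch; the text above only states that the command is re-established.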
The specific contents of the automatic test script used in this embodiment are as follows (taking 8 GPU cards in the test server as an example):
As shown in FIG. 2, the system 200 includes:
a period setting unit 210, wherein the period setting unit 210 is used for setting an nbody command refresh period;
a command generating unit 220, wherein the command generating unit 220 is used for collecting the number of GPU cards and generating a corresponding number of nbody commands according to the number of GPU cards;
a time obtaining unit 230, wherein the time obtaining unit 230 is configured to obtain a duration of the nbody command;
a command refresh unit 240, said command refresh unit 240 configured to refresh said nbody command according to said duration and refresh period.
Optionally, as an embodiment of the present invention, the command generating unit includes:
the information acquisition module is configured for acquiring all GPU card identification codes;
and the command generation module is configured to generate the nbody commands corresponding to the GPU cards one by one according to the identification codes.
Optionally, as an embodiment of the present invention, the command refresh unit includes:
the time judgment module is configured to judge whether the duration time of the nbody command reaches the refresh period;
the regeneration module is configured to regenerate the nbody command corresponding to the identification code according to the identification code of the GPU card to which the nbody command belongs;
and the cyclic acquisition module is configured to cyclically acquire and monitor the duration of the nbody command.
FIG. 3 is a schematic structural diagram of a terminal system 300 according to an embodiment of the present invention, where the terminal system 300 may be used to execute the GPU pressurization test method according to the embodiment of the present invention.
The terminal system 300 may include a processor 310, a memory 320, and a communication unit 330. These components communicate via one or more buses. Those skilled in the art will appreciate that the structure shown in the figure does not limit the invention: it may be a bus structure or a star structure, and it may include more or fewer components than shown, combine certain components, or arrange the components differently.
The memory 320 may be used for storing instructions executed by the processor 310, and the memory 320 may be implemented by any type of volatile or non-volatile storage device or combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk. The executable instructions in the memory 320, when executed by the processor 310, enable the terminal system 300 to perform some or all of the steps in the method embodiments described above.
The processor 310 is the control center of the terminal. It connects the various parts of the terminal through various interfaces and lines, and performs the terminal's functions and/or processes data by running or executing software programs and/or modules stored in the memory 320 and calling data stored in the memory. The processor may consist of integrated circuits (ICs), for example a single packaged IC, or multiple packaged ICs with the same or different functions connected together. For example, the processor 310 may include only a central processing unit (CPU). In the embodiment of the present invention, the CPU may have a single operation core or multiple operation cores.
The communication unit 330 is configured to establish a communication channel so that the terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to them.
The present invention also provides a computer storage medium that may store a program; when executed, the program may perform some or all of the steps in the embodiments provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Therefore, the method and the device avoid the problem that the test result is influenced by pressurization interruption by setting the refreshing period of the nbody commands, generating the corresponding nbody commands according to the number of the GPU cards, then acquiring the duration of the nbody commands in real time, and updating the nbody commands with the duration reaching the refreshing period in time. The invention can automatically generate the nbody command which is in one-to-one correspondence with the multiple GPU cards to simultaneously pressurize the multiple GPU cards, and can automatically update the nbody command, thereby avoiding pressurization interruption, simultaneously saving human resources and test time.
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus the necessary general-purpose hardware platform. On this understanding, the technical solutions in the embodiments of the present invention may be embodied as a software product. The software product is stored in a storage medium capable of storing program code, such as a USB disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, and includes instructions that enable a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal, or the like) to perform all or part of the steps of the method in the embodiments of the present invention.
The same and similar parts in the various embodiments in this specification may be referred to each other. Especially, for the terminal embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and the relevant points can be referred to the description in the method embodiment.
In the embodiments provided by the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division into units is only one logical division of functions, and other divisions are possible in practice. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, systems, or units, and may be electrical, mechanical, or of another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
Although the present invention has been described in detail with reference to the drawings and the preferred embodiments, it is not limited thereto. Those skilled in the art can make various equivalent modifications or substitutions to the embodiments without departing from the spirit and scope of the present invention, and any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein falls within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (8)
1. A GPU pressurization test method is characterized by comprising the following steps:
setting a refresh period of an nbody command;
collecting the number of GPU cards and generating a corresponding number of nbody commands according to the number of the GPU cards;
acquiring the duration of the nbody command;
refreshing the nbody command according to the duration and the refresh period; wherein the refreshing the nbody command according to the duration and the refresh period comprises: judging whether the duration of the nbody command has reached the refresh period; if so, regenerating the nbody command corresponding to the identification code according to the identification code of the GPU card to which the nbody command belongs; if not, continuing to cyclically acquire the duration of the nbody command.
2. The GPU pressurization test method of claim 1, wherein said setting an nbody command refresh period comprises:
the nbody command refresh period is set to 30 min.
3. A GPU pressure test method according to claim 1, wherein the collecting the number of GPU cards and generating a corresponding number of nbody commands according to the number of GPU cards comprises:
collecting all GPU card identification codes;
and generating the nbody commands corresponding to the GPU cards one by one according to the identification codes.
4. A GPU stress testing method according to claim 1, wherein after refreshing the nbody commands according to the duration and refresh period, the method further comprises:
starting a GPU state monitoring program, and monitoring error reporting information;
and outputting the monitoring result in the form of a test log.
5. A GPU pressurization test system, comprising:
the period setting unit is configured for setting a refresh period of the nbody command;
the command generation unit is configured to collect the number of GPU cards and generate a corresponding number of nbody commands according to the number of GPU cards;
the time acquisition unit is configured to acquire the duration of the nbody command;
the command refreshing unit is configured to refresh the nbody command according to the duration and the refreshing period; the command refresh unit includes: the time judgment module is configured to judge whether the duration time of the nbody command reaches the refresh period; the regeneration module is configured to regenerate the nbody command corresponding to the identification code according to the identification code of the GPU card to which the nbody command belongs if the duration of the nbody command reaches the refresh period; and the cycle acquisition module is configured for acquiring the duration time of the nbody command in a cycle manner if the duration time of the nbody command does not reach the refreshing period.
6. The GPU pressurization test system of claim 5, wherein the command generation unit comprises:
the information acquisition module is configured for acquiring all GPU card identification codes;
and the command generation module is configured to generate the nbody commands corresponding to the GPU cards one by one according to the identification codes.
7. A terminal, comprising:
a processor;
a memory for storing instructions for execution by the processor;
wherein the processor is configured to perform the method of any one of claims 1-4.
8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910425730.9A CN110175096B (en) | 2019-05-21 | 2019-05-21 | GPU (graphics processing Unit) pressurization test method, system, terminal and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910425730.9A CN110175096B (en) | 2019-05-21 | 2019-05-21 | GPU (graphics processing Unit) pressurization test method, system, terminal and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110175096A (en) | 2019-08-27 |
CN110175096B (en) | 2020-02-07 |
Family
ID=67691787
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201910425730.9A, CN110175096B (en) | Active | 2019-05-21 | 2019-05-21 | GPU (graphics processing Unit) pressurization test method, system, terminal and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110175096B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111338862B (en) * | 2020-02-16 | 2022-07-19 | 苏州浪潮智能科技有限公司 | GPU mode switching stability test method, system, terminal and storage medium |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8539438B2 (en) * | 2009-09-11 | 2013-09-17 | International Business Machines Corporation | System and method for efficient creation and reconciliation of macro and micro level test plans |
CN102063354A (en) * | 2009-11-18 | 2011-05-18 | 英业达股份有限公司 | Pressure test method of server |
CN102279782B (en) * | 2011-04-01 | 2014-02-19 | 奇智软件(北京)有限公司 | Pressure and device for testing hardware pressure |
CN104679615A (en) * | 2013-11-26 | 2015-06-03 | 英业达科技有限公司 | Bus pressure test system and method thereof |
CN103984612B (en) * | 2014-05-28 | 2017-11-10 | 浪潮电子信息产业股份有限公司 | A kind of method of the unattended pressure test based on HPL instruments |
CN104375914A (en) * | 2014-11-24 | 2015-02-25 | 浪潮电子信息产业股份有限公司 | Automatic testing method for internal pressure changes of server |
CN104615523A (en) * | 2015-03-05 | 2015-05-13 | 浪潮电子信息产业股份有限公司 | Fatigue testing method of BMC management module based on IPMI protocol |
CN107423183A (en) * | 2017-04-25 | 2017-12-01 | 郑州云海信息技术有限公司 | A kind of GTX series video card calculates the applied voltage test method of performance |
CN109086184A (en) * | 2018-07-18 | 2018-12-25 | 郑州云海信息技术有限公司 | The monitoring method of GPU pressure test under a kind of server Linux system |
CN109522173A (en) * | 2018-11-02 | 2019-03-26 | 郑州云海信息技术有限公司 | A kind of OPA network card testing method, device, terminal and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110175096A (en) | 2019-08-27 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111338862B (en) | GPU mode switching stability test method, system, terminal and storage medium | |
CN110175096B (en) | GPU (graphics processing Unit) pressurization test method, system, terminal and storage medium | |
CN111966551A (en) | Method, system, terminal and storage medium for verifying remote command execution result | |
CN111475106A (en) | RAID customization creating method, system, terminal and storage medium | |
CN111147331A (en) | Server network card interaction test method, system, terminal and storage medium | |
CN110569154A (en) | Chip interface function testing method, system, terminal and storage medium | |
CN110554917A (en) | method, system, terminal and storage medium for efficiently traversing large data volume set | |
CN111176917B (en) | Method, system, terminal and storage medium for testing stability of CPU SST-BF function | |
CN109992420B (en) | Parallel PCIE-SSD performance optimization method and system | |
CN111176924A (en) | GPU card dropping simulation method, system, terminal and storage medium | |
CN112214384A (en) | Hard disk serial number management method, system, terminal and storage medium | |
CN109117406B (en) | PCIE hot plug test method, device, terminal and storage medium | |
CN111949518A (en) | Method, system, terminal and storage medium for generating fault detection script | |
CN110543394A (en) | server sensor information consistency testing method, system, terminal and storage medium | |
CN112463504B (en) | Double-control storage product testing method, system, terminal and storage medium | |
CN112463195B (en) | Method, system, terminal and storage medium for cluster grouping online upgrade | |
CN115129249A (en) | SAS link topology identification management method, system, terminal and storage medium | |
CN110703988B (en) | Storage pool creating method, system, terminal and storage medium for distributed storage | |
CN109800114B (en) | BMC visual test method, device, terminal and storage medium | |
CN113076111A (en) | Customized cluster configuration method, system, terminal and storage medium | |
CN110543459A (en) | Method, system, terminal and storage medium for acquiring file lock state under NFS | |
CN111858198A (en) | Multi-scheme memory plugging test method, system, terminal and storage medium | |
CN112463473B (en) | Method, system, terminal and storage medium for testing storage data stream unit | |
CN111475349B (en) | Method, system, terminal and storage medium for testing stability of cluster DPDK | |
CN109920466B (en) | Hard disk test data analysis method, device, terminal and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |