CN115203019A - Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium

Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium

Info

Publication number
CN115203019A
Authority
CN
China
Prior art keywords
test
gpu
performance
gpu server
testing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210713603.0A
Other languages
Chinese (zh)
Inventor
邱红飞
郑文武
王海霞
黄植勤
李先绪
朱海云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202210713603.0A priority Critical patent/CN115203019A/en
Publication of CN115203019A publication Critical patent/CN115203019A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/36 - Preventing errors by testing or debugging software
    • G06F 11/3664 - Environments for testing or debugging software
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/36 - Preventing errors by testing or debugging software
    • G06F 11/3668 - Software testing
    • G06F 11/3672 - Test management
    • G06F 11/3688 - Test management for test execution, e.g. scheduling of test suites

Abstract

The disclosure provides a performance test method, apparatus, device and storage medium for a GPU (graphics processing unit) server, and relates to the field of computer technology. The method comprises the following steps: acquiring a mirror image data packet, wherein the mirror image data packet comprises a test tool for testing the performance of the GPU server and library software supporting the normal operation of the GPU server; starting the mirror image data packet to obtain a container instance containing the test environment of the GPU server, wherein the test environment is composed of the test tool and the library software; testing the performance of the GPU server according to the container instance; and generating a test result according to a log generated by testing the performance of the GPU server. Because the test environment for testing the GPU server is deployed as a container, there is no need to install a large amount of software, dependency packages and the like when deploying the test environment, so the deployment of the test environment is simpler and less prone to errors.

Description

Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a performance testing method, apparatus, device, and storage medium for a GPU server.
Background
With the development of computer technology, the performance of GPU (graphics processing unit) servers continues to improve, and GPU servers with different performance levels are used to meet different image processing requirements. To determine the performance of a GPU server, a performance test needs to be run on that GPU server.
In the related art, when testing a GPU server, a technician deploys a corresponding test environment according to the type of the GPU server, specifically, deploys library software and a driver for supporting normal operation of the GPU server, and a test tool for testing the GPU server, and then tests the performance of the GPU server according to the test tool.
However, different types of GPU servers correspond to different test environments, that is, to library software, drivers, and test tools of different types and versions. As a result, the test environments are difficult to deploy, and errors are prone to occur during deployment.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure provides a performance testing method, apparatus, device and storage medium for a GPU server, which at least to some extent overcomes the problem of difficulty in deploying a GPU server testing environment in the related art.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to one aspect of the disclosure, a performance testing method of a GPU server is provided, which includes:
acquiring a mirror image data packet, wherein the mirror image data packet comprises a test tool for testing the performance of a graphics processing unit (GPU) server and library software for supporting the normal operation of the GPU server;
starting the mirror image data packet to obtain a container instance containing a test environment for the GPU server, wherein the test environment is composed of the test tool and the library software;
testing the performance of the GPU server according to the container instance;
and generating a test result according to a log generated by testing the performance of the GPU server.
In one embodiment of the disclosure, the test tool comprises an inference test model for testing inference performance of the GPU server, and/or a training test model for testing training performance of the GPU server; the testing the performance of the GPU server according to the container instance comprises: determining a target test model from the test tool according to the performance to be tested of the GPU server, wherein the performance to be tested is inference performance or training performance, and the target test model is the inference test model or the training test model; and testing the performance to be tested of the GPU server according to the container example after the target test model is determined.
In an embodiment of the present disclosure, before the testing the performance to be tested of the GPU server according to the container instance after the target test model is determined, the method further includes: mapping a test data set corresponding to the target test model in the container instance; acquiring test parameters, wherein the test parameters comprise the number of target GPU cards used for testing in GPU cards included in the GPU server, and the batch size and batch number corresponding to the target GPU cards; the step of testing the performance to be tested of the GPU server according to the container example after the target test model is determined comprises the following steps: and testing the performance to be tested of the GPU server according to the container example, the test data set and the test parameters after the target test model is determined.
In an embodiment of the present disclosure, said mapping, in the container instance, the test data set corresponding to the target test model includes: mapping a set of inference data in the container instance if the target test model is the inference test model; in the case where the target test model is the training test model, a training data set is mapped in the container instance.
In an embodiment of the present disclosure, the test parameters further include a first parameter and/or a second parameter and/or a third parameter, where the first parameter is used to indicate the target GPU cards, the second parameter is used to indicate the central processing units CPU bound to each target GPU card, and the third parameter is used to indicate the memory bound to each target GPU card.
In an embodiment of the present disclosure, before generating a test result according to a log generated by testing performance of the GPU server, the method further includes: acquiring hardware information of the GPU server; the generating a test result according to the log generated by testing the performance of the GPU server comprises the following steps: and generating the test result according to the hardware information of the GPU server and the performance data and/or the energy consumption data recorded in the log.
In one embodiment of the present disclosure, the method further comprises: and installing an operating system, a GPU card driver corresponding to the GPU server and Docker software.
According to another aspect of the present disclosure, there is provided a performance testing apparatus of a GPU server, including:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a mirror image data packet, and the mirror image data packet comprises a test tool for testing the performance of a GPU server of a graphic processor and library software for supporting the GPU server to normally work;
the processing module is used for starting the mirror image data packet to obtain a container example of a test environment comprising the GPU server, wherein the test environment is composed of the test tool and the library software;
the testing module is used for testing the performance of the GPU server according to the container example;
and the generating module is used for generating a test result according to a log generated by testing the performance of the GPU server.
In one embodiment of the disclosure, the test tool comprises an inference test model for testing inference performance of the GPU server, and/or a training test model for testing training performance of the GPU server; the test module is used for determining a target test model from the test tool according to the performance to be tested of the GPU server, wherein the performance to be tested is inference performance or training performance, and the target test model is the inference test model or the training test model; and testing the performance to be tested of the GPU server according to the container example after the target test model is determined.
In one embodiment of the present disclosure, the apparatus further comprises: a mapping module to map a test data set corresponding to the target test model in the container instance; the acquisition module is further configured to acquire test parameters, where the test parameters include the number of target GPU cards used for testing in the GPU cards included in the GPU server, and the batch size and batch number corresponding to the target GPU cards; and the test module is used for testing the performance to be tested of the GPU server according to the container example, the test data set and the test parameters after the target test model is determined.
In an embodiment of the present disclosure, the mapping module is configured to map an inference data set in the container instance if the target test model is the inference test model; in the case where the target test model is the training test model, a training data set is mapped in the container instance.
In an embodiment of the present disclosure, the test parameters further include a first parameter and/or a second parameter and/or a third parameter, where the first parameter is used to indicate the target GPU cards, the second parameter is used to indicate the central processing units CPU bound to each target GPU card, and the third parameter is used to indicate the memory bound to each target GPU card.
In an embodiment of the present disclosure, the obtaining module is further configured to obtain hardware information of the GPU server; and the generating module is used for generating the test result according to the hardware information of the GPU server and the performance data and/or the energy consumption data recorded in the log.
In one embodiment of the present disclosure, the apparatus further comprises: and the installation module is used for installing an operating system, a GPU card driver corresponding to the GPU server and Docker software.
According to still another aspect of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the above-described methods of performance testing of the GPU server via execution of the executable instructions.
According to yet another aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the performance testing method of the GPU server described above.
According to yet another aspect of the present disclosure, a computer program product is provided, which includes a computer program or computer instructions, which is loaded and executed by a processor, so as to make a computer implement the performance testing method of the GPU server described above.
The technical scheme provided by the embodiment of the disclosure at least has the following beneficial effects:
according to the technical scheme, the container example is obtained by obtaining the mirror image data packet (comprising the testing tool for testing the performance of the GPU server and the library software supporting the normal work of the GPU server) and starting the mirror image data packet, and the testing tool and the library software included in the container example form the testing environment of the GPU server. The test environment for testing the GPU server is deployed in a container mode, so that the situation that more software, dependence packages and the like need to be installed when the test environment for testing the GPU server is deployed is avoided, the deployment of the test environment is simpler and more convenient, and mistakes are not easy to occur. And then, testing the performance of the GPU server according to the container example, and generating a test result according to a log generated in the test.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 shows a schematic diagram of a system architecture in an embodiment of the present disclosure;
FIG. 2 is a flow chart of a method for testing performance of a GPU server according to one embodiment of the disclosure;
FIG. 3 illustrates a flow diagram for making mirrored packets in a blank GPU server in one embodiment of the present disclosure;
FIG. 4 illustrates a flow diagram for testing inference performance or training performance of a GPU server in one embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating a related art method for testing the performance of a GPU server;
FIG. 6 is a flow diagram that illustrates testing performance of a GPU server in one embodiment of the present disclosure.
FIG. 7 is a schematic diagram illustrating a performance testing apparatus for a GPU server according to an embodiment of the present disclosure;
fig. 8 shows a block diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
In the field of computer technology, the performance test result of a GPU server is an important technical index of the GPU server. Testing the same GPU server with different test models yields different test results, so comparing the performance of different GPU servers requires testing them with the same test model; test results obtained by testing different GPU servers with different test models cannot be meaningfully compared. The type of a GPU server is determined by the type of GPU card it carries, and because there are many types of GPU cards on the market, each with its own corresponding test models, the test results of different GPU servers are difficult to compare horizontally, and the performance test results of GPU servers lack a unified standard.
On the other hand, before testing a GPU server, a corresponding test environment needs to be deployed on it. Deploying the test environment of a GPU server mainly involves deploying library software that supports the normal operation of the GPU server and a test tool for testing its performance. The library software and test tools differ for different types of servers, the test tools comprise various pieces of software and dependency packages, software versions differ across platforms, and the dependencies are complex. The test environment of a GPU server is therefore difficult to deploy, and when a technician deploys it by directly downloading the corresponding software, dependency packages and library software, it is easy to end up with a test environment that cannot actually be used to test the GPU server.
In view of the above, the embodiment of the present disclosure provides a performance testing method for a GPU server. According to the method, by means of containerization deployment of the test environment of the GPU server, the situation that more software, dependence packages and the like need to be installed when the test environment of the GPU server is deployed is avoided, and the test environment is more convenient to deploy and is not prone to making mistakes. Moreover, the test environment deployed by the container can comprise various test models, so that the same test model can be selected for testing when different GPU servers are tested, test results can be transversely compared, and the standardization of performance test results of the GPU servers is promoted.
Fig. 1 is a schematic diagram illustrating an exemplary system architecture of a performance testing method of a GPU server or a performance testing apparatus of a GPU server, which may be applied to the embodiments of the present disclosure.
As shown in fig. 1, the system architecture may include a server 101, a GPU server 102.
In some embodiments of the present disclosure, the server 101 stores therein a mirror data packet, which includes a test tool for testing the performance of the GPU server 102 and library software supporting normal operation of the GPU server 102, and the GPU server 102 can obtain the mirror data packet from the server 101. In some embodiments, the server 101 further stores therein an installation package of an operating system, a GPU card driver, and Docker software corresponding to the GPU server 102. The GPU server 102 may also obtain the installation package of the operating system, the GPU card driver, and the Docker software from the server 101.
The server 101 and the GPU server 102 may be communicatively connected through a network, which may be a wired network or a wireless network.
In some embodiments of the present disclosure, the wireless network or wired network described above uses standard communication technologies and/or protocols. The network is typically the Internet, but may be any network, including but not limited to a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired or wireless network, a private network, or any combination of virtual private networks. In some embodiments, data exchanged over the network is represented using technologies and/or formats including HyperText Markup Language (HTML), Extensible Markup Language (XML), and the like. All or some of the links may also be encrypted using conventional encryption technologies such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), Internet Protocol Security (IPsec), and so on. In other embodiments, custom and/or dedicated data communication technologies may also be used in place of, or in addition to, the data communication technologies described above.
In another embodiment of the present disclosure, an exemplary system architecture to which the performance testing method or the performance testing apparatus of a GPU server of the embodiments of the present disclosure may be applied may include only the GPU server 102. In this case, the mirror image data packet and the installation packages of the operating system, the GPU card driver, and the Docker software may be stored on a mobile hard disk, and the GPU server then obtains them from the mobile hard disk.
The server 101 may be a server providing various services, in some embodiments, the server 101 may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), and a big data and artificial intelligence platform. The GPU server 102 may be a server configured with any GPU card, which is not limited in this disclosure.
The present exemplary embodiment will be described in detail below with reference to the drawings and examples.
The embodiment of the disclosure provides a performance test method of a GPU server, which can be executed by any electronic device; for example, the electronic device can be the GPU server itself.
Fig. 2 is a flowchart illustrating a method for testing the performance of a GPU server in an embodiment of the present disclosure. As shown in fig. 2, the method for testing the performance of a GPU server in the embodiment of the present disclosure includes the following steps S201 to S204.
S201, the GPU server acquires a mirror image data packet, wherein the mirror image data packet comprises a test tool for testing the performance of the GPU server of the graphics processor and library software for supporting the normal work of the GPU server.
The mirror image data packet is an executable, self-contained software package that may include everything required to run a given piece of software; in the embodiment of the present disclosure, the mirror image data packet acquired by the GPU server includes the contents required for testing the performance of the GPU server. In some embodiments, the test tool for testing the performance of the GPU server in the mirror image data packet acquired by the GPU server includes: deep learning framework software (e.g., TensorFlow, PyTorch, Caffe2, MXNet, etc.), Python software, and test models. The library software supporting the normal operation of the GPU server in the mirror image data packet comprises: the library software corresponding to the GPU server.
In some embodiments, the mirror data package may be stored in a server or in a removable hard disk, which is not limited by this disclosure. When the mirror image data packet is stored in the server, the GPU server is operated manually, an instruction for acquiring the mirror image data packet from the server is sent to the GPU server through the related control, and after the GPU server receives the instruction, the mirror image data packet is acquired from the server for storing the mirror image data packet. In other embodiments, when the mirror image data packet is stored in the mobile hard disk, the mobile hard disk is manually inserted into a corresponding slot of the GPU server, and an instruction for downloading the mirror image data packet from the mobile hard disk is manually sent to the GPU server; and the GPU server downloads the mirror image data packet from the mobile hard disk after receiving the instruction. For example, when inference performance test needs to be performed on the GPU server, an instruction for acquiring a mirror image data packet which can be used for inference performance test is manually issued to the GPU server through the related control, and after receiving the instruction, the GPU server acquires a manually-specified mirror image data packet which can be used for inference performance test from the mobile hard disk or the server. The mirror image data packet used for reasoning performance test is the mirror image data packet comprising the reasoning test model.
In some embodiments, if the GPU server has not yet installed the operating system, the GPU card driver, and the Docker software, the method further includes, before the GPU server obtains the mirror image data packet: installing, in the GPU server, an operating system, a GPU card driver corresponding to the GPU card configured in the GPU server, and Docker software. The Docker software is used to start the mirror image data packet in the GPU server to obtain the container instance and to run the container instance.
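As a small, hedged illustration of checking these prerequisites before deploying the mirror image data packet (the tool names are vendor-specific assumptions, not requirements stated in the disclosure):

```python
import shutil
import subprocess

# Quick sanity checks that the prerequisites described above are in place.
# nvidia-smi is specific to NVIDIA GPU card drivers; other vendors ship
# their own query tools, so this check is only an illustration.
def check_prerequisites() -> None:
    if shutil.which("docker") is None:
        raise RuntimeError("Docker software is not installed")
    subprocess.run(["docker", "--version"], check=True)

    if shutil.which("nvidia-smi") is not None:
        # Lists the GPU cards visible to the installed GPU card driver.
        subprocess.run(["nvidia-smi", "-L"], check=True)
    else:
        print("No nvidia-smi found; verify the GPU card driver another way")

if __name__ == "__main__":
    check_prerequisites()
```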
S202, the GPU server starts the mirror image data packet to obtain a container instance containing the test environment of the GPU server, wherein the test environment is composed of the test tool and the library software.
After the GPU server obtains the mirror image data packet, the mirror image data packet may be started (e.g., via a load instruction) using the Docker software. After the mirror image data packet is started, a container instance of the mirror image data packet is obtained on the GPU server, and the container instance provides a test environment for the GPU server. The test environment provided by the container instance is composed of the test tool and the library software; the test tool and library software included in the container instance are the same as those included in the mirror image data packet from which the container instance was obtained.
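A minimal sketch of this step, driving the Docker CLI from Python, is shown below; the archive name, image tag and mount path are assumptions made for illustration, and NVIDIA hosts additionally need the NVIDIA container toolkit for the --gpus flag.

```python
import subprocess

# Assumed file/tag names for illustration; the actual mirror image data
# packet name and image tag are not specified in the disclosure.
ARCHIVE = "gpu_test_image.tar"
IMAGE_TAG = "gpu-perf-test:latest"

# Load the mirror image data packet into the local Docker image store.
subprocess.run(["docker", "load", "-i", ARCHIVE], check=True)

# Start a container instance from the loaded image in the background,
# exposing all GPU cards to the container and mapping a host directory
# that will later hold the test data set.
subprocess.run(
    [
        "docker", "run", "-d", "--rm",
        "--gpus", "all",
        "-v", "/data:/data",          # assumed host path for test data sets
        "--name", "gpu-perf-test",
        IMAGE_TAG,
        "sleep", "infinity",          # keep the container instance running
    ],
    check=True,
)
```

The later steps can then run the test tool inside this container instance with docker exec, as sketched after the parameter discussion below.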
S203, the GPU server tests the performance of the GPU server according to the container instance.
Testing the performance of the GPU server according to the container instance means testing the performance of the GPU server in the test environment provided by the container instance.
In some embodiments, the test tool in the container instance comprises an inference test model for testing the inference performance of the GPU server, and/or a training test model for testing the training performance of the GPU server. Testing the performance of the GPU server according to the container instance then comprises: determining a target test model from the test tool according to the performance to be tested of the GPU server, wherein the performance to be tested is inference performance or training performance, and the target test model is the inference test model or the training test model; and testing the performance to be tested of the GPU server according to the container instance after the target test model is determined.
In some embodiments, a code instruction for determining the target test model is manually input into the container instance according to the performance to be tested of the GPU server; after receiving the code instruction, the GPU server executes it in the container instance, thereby determining the target test model from the test tools included in the container instance. For example, if the performance to be tested is inference performance, the test tool in the container instance includes an inference test model and a training test model, and the inference test model corresponds to a plurality of neural network models such as VGG16 and ResNet50, the instruction for determining the target test model from the test tool may be: model=TL-VGG16, where TL-VGG16 is the name of the model file corresponding to the inference test model VGG16. For another example, if the performance to be tested is training performance and the training test model corresponds to multiple neural network models such as VGG16 and ResNet50, the instruction for determining the target test model from the test tool may be: model=XL-VGG16, where XL-VGG16 is the name of the model file corresponding to the training test model VGG16.
In some embodiments, the container instance does not include the test data set corresponding to the target test model, while testing the performance to be tested of the GPU server requires the corresponding test data set. Before testing the performance to be tested of the GPU server according to the container instance after the target test model is determined, the method therefore further comprises: mapping the test data set corresponding to the target test model in the container instance; and acquiring test parameters, wherein the test parameters comprise the number of target GPU cards used for testing among the GPU cards included in the GPU server, and the batch size and batch number corresponding to the target GPU cards.
In some embodiments, the mirror image data packet has default test parameters, and if the test parameters are not obtained before the performance to be tested of the GPU server is tested according to the container instance after the target test model is determined, the performance of the GPU server is tested by using the default test parameters in the mirror image data packet.
Mapping the test data set corresponding to the target test model in the container instance comprises: specifying, in the container instance, the storage location of the test data set corresponding to the target test model on the GPU server and the name of the test data set. After the test data set corresponding to the target test model is manually stored on the GPU server, a corresponding instruction specifying the location and name of the test data set is input into the GPU server; after the GPU server receives the instruction, the storage location of the test data set is specified in the container through the instruction. For example, if the test data set is in the tf_imagenet directory under the /data directory of the GPU server, and the name of the test data set is imagenet, the instruction indicating the storage location and name of the test data set may be: data_dir=/data/tf_imagenet, data_name=imagenet. In some embodiments, the data format of the test data set corresponding to the target test model is a data format supported by the neural network framework software installed in the container instance; for example, when the neural network framework software is TensorFlow, the supported data format may be the TF_Record format.
In some embodiments, mapping a test data set corresponding to a target test model in a container instance includes: mapping the inference data set in the container instance under the condition that the target test model is the inference test model; in the case where the target test model is a training test model, a training data set is mapped in the container instance.
In some embodiments, the test parameters are manually input into the container instance of the GPU server, and acquisition of the test parameters is complete once the GPU server receives them. The number of target GPU cards to be used for testing among the GPU cards included in the GPU server may be specified as num_gpus=&lt;number of target GPU cards&gt;; for example, num_gpus=8 indicates that 8 target GPU cards are used. The batch size corresponding to the target GPU cards may be specified as batch_size=&lt;batch size&gt;; for example, batch_size=128 indicates a batch size of 128, that is, one target GPU card processes 128 data items per batch, and when the test data set is composed of pictures, batch_size=128 means that a batch consists of 128 pictures. The number of batches corresponding to the target GPU cards may be specified as num_batches=&lt;number of batches&gt;; for example, num_batches=1000 indicates that each target GPU card processes 1000 batches of data when the performance of the GPU server is tested.
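Purely as an illustration of how these parameters might be passed to the test tool inside the container instance, the sketch below assembles a command line in the style of the examples above; the script name run_benchmark.py and the docker exec invocation are assumptions, not details given in the disclosure.

```python
import subprocess

# Test parameters as described above; values follow the examples in the text.
params = {
    "model": "XL-VGG16",              # training test model VGG16 (file name from the example)
    "data_dir": "/data/tf_imagenet",  # mapped test data set location
    "data_name": "imagenet",
    "num_gpus": 8,                    # number of target GPU cards
    "batch_size": 128,                # batch size per target GPU card
    "num_batches": 1000,              # batches processed per target GPU card
}

# Build the command line for a hypothetical test script inside the container
# instance and execute it there with `docker exec`.
cmd = ["docker", "exec", "gpu-perf-test", "python", "run_benchmark.py"]
cmd += [f"--{key}={value}" for key, value in params.items()]
subprocess.run(cmd, check=True)
```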
In some embodiments, the test parameters further include a first parameter and/or a second parameter and/or a third parameter, where the first parameter indicates the target GPU cards, the second parameter indicates the central processing unit (CPU) bound to each target GPU card, and the third parameter indicates the memory bound to each target GPU card. When the test parameters include the first parameter, the target GPU cards used by the GPU server for testing can be specified explicitly; the first parameter may be given as hip_visible_devices=&lt;first parameter&gt;. For example, when the number of target GPU cards is 4, hip_visible_devices=0,1,2,3 indicates that the 4 target GPU cards are the 0th, 1st, 2nd, and 3rd GPU cards of the GPU server. With the first parameter set, even if the GPU server does not automatically select the corresponding number of target GPU cards, the performance test can still be performed on the target GPU cards specified by the first parameter, which guarantees the stability of the performance test.
In some embodiments, a GPU card typically forwards data it cannot process to a CPU for processing and, after receiving the result, continues to process subsequent data based on that result; for example, GPU cards are often unable to process certain floating-point data. The second parameter binds each target GPU card to a CPU to ensure that data the target GPU card cannot process is handled promptly by the bound CPU, avoiding the situation in which offloaded data has to queue for a CPU that cannot keep up with the rate at which the GPU card processes data, and thereby making the test result of the GPU server performance test more meaningful. The second parameter may be specified as cpunodebind=&lt;second parameter&gt;; for example, when the 4 target GPU cards are the 0th, 1st, 2nd, and 3rd GPU cards, cpunodebind=4,5,6,7 means that the 0th target GPU card is bound to the 4th CPU, the 1st target GPU card to the 5th CPU, the 2nd target GPU card to the 6th CPU, and the 3rd target GPU card to the 7th CPU.
In some embodiments, the third parameter may be specified as membind=&lt;third parameter&gt;; for example, when the 4 target GPU cards are the 0th, 1st, 2nd, and 3rd GPU cards, membind=4,5,6,7 indicates that the 0th target GPU card is bound to the 4th memory, the 1st target GPU card to the 5th memory, the 2nd target GPU card to the 6th memory, and the 3rd target GPU card to the 7th memory.
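The cpunodebind/membind-style binding described above is reminiscent of what the Linux numactl utility offers. As a hedged illustration only (the disclosure does not specify the exact mechanism), the sketch below launches one benchmark process per target GPU card and pins each process to an assumed CPU/memory node; run_benchmark.py, the HIP_VISIBLE_DEVICES variable and the node numbers are assumptions.

```python
import os
import subprocess

# One (target GPU card, NUMA node) pairing per test process, following the
# example bindings in the text above; all node numbers are illustrative.
bindings = [
    {"gpu": 0, "node": 4},
    {"gpu": 1, "node": 5},
    {"gpu": 2, "node": 6},
    {"gpu": 3, "node": 7},
]

procs = []
for b in bindings:
    env = dict(os.environ, HIP_VISIBLE_DEVICES=str(b["gpu"]))  # restrict to one GPU card
    cmd = [
        "numactl",
        f"--cpunodebind={b['node']}",  # CPU node bound to this target GPU card
        f"--membind={b['node']}",      # memory node bound to this target GPU card
        "python", "run_benchmark.py",  # hypothetical per-GPU test script
        "--model=XL-VGG16", "--num_gpus=1",
        "--batch_size=128", "--num_batches=1000",
    ]
    procs.append(subprocess.Popen(cmd, env=env))

for p in procs:
    p.wait()
```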
The first parameter, the second parameter and the third parameter can reduce the influence of a CPU and a memory of the GPU server on a test result when the performance of the GPU server is tested, so that the test result is more effective.
In some embodiments, after the test data set corresponding to the target test model is mapped in the container instance and the test parameters are obtained, testing the performance to be tested of the GPU server according to the container instance after the target test model is determined includes: testing the performance to be tested of the GPU server according to the container instance, the test data set and the test parameters after the target test model is determined. Testing the performance to be tested of the GPU server means, in the test environment provided by the container instance and under the settings corresponding to the test parameters, processing the test data set through the target test model using the computing power of the target GPU cards of the GPU server.
When the target test model is a training test model, testing the GPU server means training the neural network model corresponding to the training test model with the test data set in the container instance; during training, the accuracy of the neural network model and the data processing rate and energy consumption of the GPU server are recorded in the log of the GPU server. When the target test model is an inference test model, testing the GPU server means processing the data in the test data set with the neural network model corresponding to the inference test model in the container instance; during this processing, the data processing rate and energy consumption of the GPU server are recorded in the log of the GPU server.
And S204, the GPU server generates a test result according to the log generated by testing the performance of the GPU server.
In some embodiments, generating the test result from the log generated by testing the performance of the GPU server comprises: generating the test result according to the data processing rate of the GPU server recorded in the log during the test. In some embodiments, the data processing rate reflected in the test result is the steady-state rate at which the GPU server processes data during the performance test. When the performance to be tested is training performance, the steady-state rate is the rate at which the GPU server processes data after the accuracy of the neural network model corresponding to the training test model has stabilized. For example, when the test data set consists of images, the steady-state data processing rate may be 1320.3 images/s (images per second).
In some embodiments of the present disclosure, before generating the test result, the method further includes: acquiring hardware information of the GPU server. In this case, generating the test result according to the log generated by testing the performance of the GPU server includes: generating the test result according to the hardware information of the GPU server and the performance data and/or energy consumption data recorded in the log. The performance data recorded in the log reflect the rate at which the GPU server processes data; the energy consumption of the GPU server is also an important indicator of its performance: for example, at the same data processing rate, a GPU server with lower energy consumption performs better. In some embodiments, the hardware information of the GPU server is stored in the storage of the GPU server, and acquiring the hardware information of the GPU server includes: acquiring the hardware information of the GPU server from the storage of the GPU server.
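One way such a report could be assembled is sketched below, under explicit assumptions: the log line format, the use of nvidia-smi to query hardware and power data, and the output layout are illustrative choices, not details given in the disclosure.

```python
import re
import subprocess

def steady_state_rate(log_path: str, tail: int = 20) -> float:
    """Average the last `tail` throughput samples found in the test log.

    Assumes log lines containing e.g. '... 1320.3 images/sec ...', which is
    an assumed format, not one defined by the disclosure.
    """
    rates = []
    with open(log_path) as f:
        for line in f:
            m = re.search(r"([\d.]+)\s*images/sec", line)
            if m:
                rates.append(float(m.group(1)))
    samples = rates[-tail:] if rates else []
    return sum(samples) / len(samples) if samples else 0.0

def gpu_hardware_and_power() -> str:
    """Query GPU card model and current power draw (NVIDIA hosts only)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,power.draw", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

if __name__ == "__main__":
    rate = steady_state_rate("benchmark.log")   # assumed log file name
    print("GPU hardware / power:", gpu_hardware_and_power())
    print(f"Steady-state throughput: {rate:.1f} images/s")
```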
The hardware information of the GPU server is embodied in the test result, so that the test result can more intuitively embody the GPU server under which hardware configuration the test result corresponds to.
In other embodiments, when the performance of two different GPU servers needs to be compared, the test environments of the different GPU servers may be consistent (including consistent test model, consistent test parameter, consistent test data set, and consistent test framework software) by deploying the same mirror image data package (the mirror image data package includes library software corresponding to the two different GPU servers), so that the test results have contrast.
According to the technical scheme, the container example is obtained by obtaining the mirror image data packet (comprising the testing tool for testing the performance of the GPU server and the library software supporting the normal work of the GPU server) and starting the mirror image data packet, and the testing tool and the library software included in the container example form the testing environment of the GPU server. The test environment for testing the GPU server is deployed in a container mode, so that the situation that more software, dependence packages and the like need to be installed when the test environment for testing the GPU server is deployed is avoided, the deployment of the test environment is simpler and more convenient, and mistakes are not easy to occur. And then, testing the performance of the GPU server according to the container instance, and generating a test result according to a log generated in the test.
In some embodiments, the mirror image data packet may be generated on any GPU server. The generation process is described below taking as an example a blank GPU server, i.e., one on which the operating system, the GPU card driver and other drivers have not yet been installed. In some embodiments of the present disclosure, as shown in fig. 3, the process of making the mirror image data packet on the blank GPU server includes S301 to S309.
S301: an operating system is installed in the blank GPU server, for example, the operating system is a Linux operating system.
S302: drivers including other components such as a GPU card driver, a network card driver, and a disk driver are installed in the GPU server.
S303: and installing Docker software in the GPU server, wherein the Docker software is used for creating a container and packaging the container into a mirror image data packet after the container configuration is finished.
S304: creating a container based on Docker software, and installing library software corresponding to the GPU card in the container, wherein in some embodiments of the disclosure, the library software corresponding to the GPU card installed in the container comprises library software corresponding to the GPU card; in other embodiments of the present disclosure, the library software corresponding to the GPU card installed in the container includes library software corresponding to a plurality of GPU cards, and as to which kind of library software corresponding to the GPU card is specifically included in the library software corresponding to the plurality of GPU cards, the embodiments of the present disclosure are not limited, and may be determined according to the type of the GPU card configured in the GPU server to be tested. For example, if the GPU card configured in the GPU server under test is an england GPU card, the library software corresponding to the multiple GPU cards installed in the container may include: library software corresponding to engida GPU cards (including CUDA (computer Unified Device Architecture) and cuDNN (a GPU acceleration library for deep neural networks)) and other types of GPU card corresponding library software.
S305: deep learning framework software (including application programs and dependency packages corresponding to the deep learning framework software) is installed in the container created in S304, and the deep learning framework provided by the deep learning framework software can simplify development and training of a deep learning model, and in an embodiment of the present disclosure, the deep learning framework software is used for providing the deep learning framework for the test model deployed in S307. For example, the deep learning framework software can be TensorFlow, or Pyorch, or Caffe2, or MXNet.
S306: python software (including an application program and a dependent package corresponding to the Python software) is installed in the container created in S304, and the Python software is used for writing and debugging a test script.
S307: a test model is deployed in the container created in S304, which may include a training test model and/or an inference test model. In some embodiments of the present disclosure, when the testing the model includes training the testing model, deploying the testing model in the container created in S304 may include: the training test model is written by Python software, and the initial neural network model with the parameters of the training test model not adjusted is trained.
In some embodiments of the present disclosure, when the test model comprises an inference test model, deploying the test model in the container created in S304 may include: importing a forward propagation graph (an initial inference test model without parameters) into the container created in S304, which generates, at a specified location in the container, a file in the format corresponding to the deep learning framework software installed in S305; acquiring a parameter file corresponding to the forward propagation graph in the format corresponding to that deep learning framework software; and solidifying the forward propagation graph with the parameter file, that is, importing the parameter file into the file corresponding to the forward propagation graph so that the parameters contained in the parameter file are loaded into the forward propagation graph, thereby obtaining the inference test model.
Taking TensorFlow as the deep learning framework software installed in S305 as an example, the file generated after the forward propagation graph is imported into the container is a pb file (a file format supported by TensorFlow), and the parameter file is a ckpt file (a file format supported by TensorFlow).
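As a hedged sketch of what solidifying the forward propagation graph with the parameter file can look like in TensorFlow 1.x-style code, the snippet below restores a checkpoint into an imported graph and freezes it into a single pb file; the checkpoint path, output node name and file names are assumptions made for illustration.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

CKPT_PREFIX = "/models/vgg16/model.ckpt"     # assumed parameter (ckpt) file prefix
FROZEN_PB = "/models/vgg16/frozen_vgg16.pb"  # assumed output pb file
OUTPUT_NODES = ["vgg16/predictions"]         # assumed output node name

with tf.Session() as sess:
    # Import the forward propagation graph and restore its parameters
    # from the checkpoint (the "parameter file" in the text above).
    saver = tf.train.import_meta_graph(CKPT_PREFIX + ".meta")
    saver.restore(sess, CKPT_PREFIX)

    # Freeze ("solidify") the graph: fold the restored variables into
    # constants so the inference test model becomes a single pb file.
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, OUTPUT_NODES)

with tf.io.gfile.GFile(FROZEN_PB, "wb") as f:
    f.write(frozen_graph_def.SerializeToString())
```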
In the case that the test model deployed in S307 includes a training test model, the embodiments of the present disclosure do not limit which neural network model serves as the training test model; it may be chosen empirically. For example, the neural network model deployed into the container as a training test model may be VGG16 (a Visual Geometry Group network, where 16 relates to the number of layers), ResNet50 (a Residual Network, where 50 relates to the number of layers), NMT (Neural Machine Translation), AlexNet, etc.
In the case that the deployed test model in S307 includes an inference test model, which is a specific neural network model, the embodiments of the present disclosure are not limited and may be determined empirically. For example, the neural network model that can be deployed into the container as an inferential test model can be VGG16, or ResNet50, or NMT, or AlexNet. It should be noted that, although the inference test model and the training test model may be the same neural network model, the inference test model is a neural network model with parameters adjusted to a certain degree, and the training test model is an initial neural network model with parameters not adjusted.
In some embodiments, where the test model comprises a trained test model and/or an inferred test model, the trained test model and/or the inferred test model comprises a plurality of neural network models. Namely, a plurality of training test models and/or a plurality of reasoning test models are deployed in the container.
S308: a test data set is mapped in the container created in S304, the data format of the test data set corresponds to the deep learning framework software installed in S305, and taking the deep learning software installed in S305 as tensrfow as an example, the data format of the test data set is TF _ Record (a data format supported by tensrfow). The test data set comprises a training data set and/or an inference data set, the test data set corresponds to the test model deployed in S307, and when the test model comprises the inference test model, the test data set comprises the inference data set; when the test model comprises a training test model, the test data set comprises a training data set. What data is specifically included in the test data set is determined by the test model deployed in S307. For example, when the test model includes a training test model and/or an inference test model, and the training test model and/or the inference test model is VGG16, or ResNet50, or AlexNet, the test data set includes a training data set and/or an inference data set composed of pictures; when the training test model and/or the inference test model is NMT, the test data set includes a training data set and/or an inference data set composed of text.
In some embodiments, S308 is not performed, and S309 is performed directly after S307 is performed.
S309: and submitting the container package into a mirror image data package.
In some embodiments, the successfully-made mirror image data packet may include library software of one GPU server, and may also include library software of multiple GPU servers, which is not limited in this disclosure. Meanwhile, the test model included in the mirror image data packet may include only the inference test model or the training test model, or include both the inference test model and the training test model. When the test model comprises an inference test model, the inference test model in the mirror image data packet can correspond to one or more neural network models which can be used as the inference test model; when the test model includes a training test model, the training test model in the mirror image data packet may correspond to one or more neural network models that may be used as training test models.
After the mirror image data packet is manufactured, the mirror image data packet may be stored in a server or a mobile hard disk, which is not limited by the present disclosure.
By deploying the test environment of the GPU server in the container and packaging and submitting the container into a mirror image data packet, the test environment of the GPU server can be directly deployed according to the mirror image data packet when the GPU server is tested subsequently. And when the mirror image data packet comprises the library software corresponding to the various GPU servers, the mirror image data packet can be multiplexed in the various GPU servers to carry out the deployment of the test environment, so that the test environment is not required to be respectively configured when the performance of the various GPU servers is tested, and the deployment efficiency of the test environment is improved.
In order to facilitate understanding of the performance test method of the GPU server provided by the embodiment of the present disclosure, an inference performance test process and a training performance test process of the GPU server are described below by taking an example in which a test tool in a mirror image data packet includes an inference test model and a training test model, and the inference test model and the training test model both correspond to a plurality of neural network models. In some embodiments, the process of testing the inference performance or training performance of the GPU server based on the mirror data packet is shown in fig. 4.
S401: and installing drivers of other components such as an operating system, a GPU card driver and the like and Docker software.
S402: and deploying the mirror image data packet, and starting the mirror image data packet to obtain a container instance.
S403: selecting a test model in the container instance, where the test model may be any one of the neural network models corresponding to the inference test model, or may be any one of the neural network models corresponding to the training test model, for example, the neural network model corresponding to the training test model includes: VGG16, alexNet, NMT, etc.; the neural network model corresponding to the reasoning test model comprises: resNet50, NMT, VGG16, and the like. For example, the selected test model may be VGG16 in a training test model, the instruction for selecting a model may be model = XL-VGG16, and XL-VGG16 is a file name corresponding to VGG16 in the training test model; or the selected test model can be the VGG16 in the inference test model, the instruction for selecting the model can be model = TL-VGG16, and the TL-VGG16 is the file name corresponding to the VGG16 in the inference test model.
S404: the test parameters are input into the container example, and the content specifically included in the test parameters is already described in S203 of the embodiment corresponding to fig. 2, and is not described again here. The specific values of the various parameters included in the test parameters may be set empirically. For example, when the selected test model is the VGG16 in the training test model, the input test parameters may be: batch _ size =128 (batch size 128), num _ batches =1000 (batch number 1000), num _ GPUs =8 (number of target GPU cards used is 8). For example, when the selected test model is the VGG16 in the inferential test model, the input test parameters may be: batch _ size =64, num \ubatches =1000, num \ugpus =8.
S405: mapping S404 the test data set corresponding to the selected test model in the container instance, where the instruction for mapping the test data set may be: data _ dir =/data/tf _ imagenet (storage location of test data set), data _ name = imagenet (name of test data set).
S406: executing training or reasoning performance test, and when the selected test model is a reasoning test model, testing the reasoning performance of the GPU server; and when the selected test model is a training test model, testing the training performance of the GPU server.
S407: in some embodiments, the test is terminated, where the test is terminated, that is, the data of the batch quantity and the batch size in the test parameters are processed completely, for example, when the training performance test is performed, the test data set is composed of images, the batch quantity is 1000, and the batch size is 128, and then after 128 × 1000=128000 images are all used for training the training test model, the test is considered to be terminated.
S408: the process of generating the test result, and the content included in the test result are already described in S204 of the embodiment corresponding to fig. 2, and are not described herein again.
When the test tool in the mirror image data packet comprises both the inference test model and the training test model, a test environment for testing the inference performance of the GPU server and a test environment for testing the training performance of the GPU server can be deployed in the GPU server simultaneously through the mirror image data packet. In this way, the difficulty and the workload of separately deploying test environments when both the inference performance and the training performance of the GPU server need to be tested are reduced.
According to the performance testing method of the GPU server provided in the related art, the process of testing the performance of the GPU server may be as shown in fig. 5, and includes:
s501: installing an operating system in a GPU server;
s502: installing drivers of other components such as a GPU card driver and the like in the GPU server;
s503: installing, in the GPU server, the library software corresponding to the GPU server;
s504: installing Python software and related dependency packages in the GPU server, wherein the Python software is used for writing the test model and the dependency packages are used for supporting the normal operation of the Python software;
s505: installing neural network framework software, such as TensorFlow, in the GPU server, wherein the neural network framework software is used for providing a deep learning framework for the test model in S507;
s506: preparing a test script in a GPU server;
s507: preparing a test model in the GPU server, wherein the test model can be an inference test model or a training test model;
s508: preparing a test data set in a GPU server;
s509: configuring test parameters in a GPU server;
s510: performing inference or training performance tests in the GPU server;
s511: and the GPU server outputs a report.
According to the performance testing method of the GPU server provided in the related art, when the performance of the GPU server is tested, a test environment needs to be deployed in the GPU server and the test needs to be performed in the GPU server. The deployment of the test environment comprises installing various kinds of software and dependency packages in the GPU server, preparing test scripts, test models and test data sets, and configuring test parameters. In the related art, the deployment process of the test environment is complex, and the versions of the software and corresponding dependency packages that need to be installed are numerous and differ across platforms, so that the deployment of the test environment is prone to errors. Moreover, the test environment needs to be redeployed for each GPU server to be tested, and because the deployment of the test environment does not follow a uniform standard, the test environments of different GPU servers may differ, which makes it impossible to compare the test results of different GPU servers. According to the performance testing method of the GPU server provided by the embodiment of the present disclosure, the process of testing the performance of the GPU server may be as shown in fig. 6, and includes:
s601: installing an operating system in a GPU server;
s602: installing drivers of other components such as a GPU card driver and the like in the GPU server;
s603: installing Docker software in the GPU server, wherein the Docker software is used for loading a mirror image data packet in the GPU server and starting the mirror image data packet;
s604: loading a mirror image data packet in a GPU server, wherein the mirror image data packet has default test parameters;
s605: mapping a test data set to the mirror image data packet;
s606: performing inference or training performance tests in the container instance;
s607: and the GPU server outputs a report.
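By way of illustration only, steps s603 to s607 could be driven by a short host-side script such as the Python sketch below; the package file name gpu-perf-test.tar, the image tag and the host data path are assumptions rather than names given by the disclosure, and the test runs with the default test parameters carried in the mirror image data packet.

    import subprocess

    PACKAGE = "gpu-perf-test.tar"     # assumed file name of the mirror image data packet
    IMAGE = "gpu-perf-test:latest"    # assumed tag of the image contained in the packet
    HOST_DATA = "/datasets/imagenet"  # assumed host location of the test data set

    # s604: load the mirror image data packet into the local Docker image store.
    subprocess.run(["docker", "load", "-i", PACKAGE], check=True)

    # s605/s606: start a container instance with the test data set mapped in and run the
    # inference or training performance test with the default test parameters; the --gpus
    # flag assumes the NVIDIA container toolkit is installed (s601-s603).
    result = subprocess.run(
        ["docker", "run", "--rm", "--gpus", "all",
         "-v", f"{HOST_DATA}:/data/tf_imagenet",
         IMAGE],
        capture_output=True, text=True, check=True,
    )

    # s607: the captured log is the basis of the output report.
    print(result.stdout)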
As can be seen from the comparison between the performance test process of the GPU server corresponding to fig. 5 and that corresponding to fig. 6, when deploying a test environment, the performance test method of the GPU server provided in the related art needs to install various software and dependency packages in the GPU server, prepare a test script, a test model and a test data set, and configure test parameters. Compared with the deployment of the test environment in the related art, the performance test method of the GPU server provided by the embodiment of the disclosure deploys the test environment by loading the mirror image data packet (containerization), which is simpler and easier; moreover, when the mirror image data packet comprises the library software corresponding to different GPU servers, the mirror image data packet can be directly multiplexed into different GPU servers to deploy the test environment when testing the performance of those GPU servers. Furthermore, the same mirror image data packet can be multiplexed to conveniently deploy, in different GPU servers, test environments with the same test framework software, test model, test data and test parameters, so that the performance test results of different GPU servers can be compared. In addition, a simple test environment deployment mode is beneficial to promoting the unification of GPU server test standards.
Based on the same inventive concept, the embodiment of the present disclosure further provides a performance testing apparatus for a GPU server, as described in the following embodiments. Because the principle by which the apparatus embodiment solves the problem is similar to that of the method embodiment, for the implementation of the apparatus embodiment reference may be made to the implementation of the method embodiment, and repeated descriptions are omitted.
Fig. 7 is a schematic diagram illustrating a performance testing apparatus for a GPU server in an embodiment of the present disclosure, as shown in fig. 7, the apparatus includes:
an obtaining module 701, configured to obtain a mirror image data packet, where the mirror image data packet includes a test tool for testing the performance of a graphics processing unit (GPU) server and library software for supporting the normal operation of the GPU server;
a processing module 702, configured to start a mirror image data packet, to obtain a container instance of a test environment including a GPU server, where the test environment is composed of a test tool and library software;
the test module 703 is configured to test the performance of the GPU server according to the container instance;
the generating module 704 is configured to generate a test result according to a log generated by testing the performance of the GPU server.
In one embodiment of the disclosure, the test tool comprises an inference test model for testing the inference performance of the GPU server, and/or a training test model for testing the training performance of the GPU server; the test module 703 is configured to determine a target test model from the test tool according to the performance to be tested of the GPU server, where the performance to be tested is the inference performance or the training performance, and the target test model is the inference test model or the training test model; and to test the performance to be tested of the GPU server according to the container instance after the target test model is determined.
In one embodiment of the present disclosure, the apparatus further comprises: a mapping module 705 for mapping a test data set corresponding to the target test model in the container instance; the obtaining module 701 is further configured to obtain test parameters, where the test parameters include the number of target GPU cards used for testing in the GPU cards included in the GPU server, and the batch size and batch number corresponding to the target GPU cards; the test module 703 is configured to test the performance to be tested of the GPU server according to the container instance, the test data set, and the test parameters after the target test model is determined.
In one embodiment of the present disclosure, the mapping module 705 is configured to map the inference data set in the container instance if the target test model is an inference test model; in the case where the target test model is a training test model, the training data set is mapped in the container instance.
In an embodiment of the present disclosure, the test parameters further include a first parameter and/or a second parameter and/or a third parameter, where the first parameter is used to indicate the target GPU cards, the second parameter is used to indicate the central processing units CPU bound to each target GPU card, and the third parameter is used to indicate the memory bound to each target GPU card.
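Purely as an illustrative sketch of how such binding parameters might be applied, the Python fragment below launches one test worker per target GPU card and pins it to its bound CPU cores and memory node using CUDA_VISIBLE_DEVICES and numactl; the binding table, the worker command run_benchmark.py, and the specific core and node numbers are assumptions introduced here, not values given by the disclosure.

    import os
    import subprocess

    # Hypothetical binding table: for each target GPU card (first parameter), the CPU cores
    # (second parameter) and the NUMA memory node (third parameter) bound to it.
    GPU_BINDINGS = {
        0: {"cpus": "0-7",  "mem_node": "0"},
        1: {"cpus": "8-15", "mem_node": "0"},
    }

    WORKER_CMD = ["python", "run_benchmark.py", "--model=XL-VGG16"]  # assumed worker command

    def launch_bound_worker(gpu_id, cpus, mem_node):
        """Launch one test worker restricted to gpu_id and pinned to its bound CPUs and memory node."""
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
        cmd = [
            "numactl",
            f"--physcpubind={cpus}",   # second parameter: CPUs bound to this target GPU card
            f"--membind={mem_node}",   # third parameter: memory node bound to this target GPU card
        ] + WORKER_CMD
        return subprocess.Popen(cmd, env=env)

    if __name__ == "__main__":
        workers = [launch_bound_worker(g, b["cpus"], b["mem_node"]) for g, b in GPU_BINDINGS.items()]
        for w in workers:
            w.wait()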
In an embodiment of the present disclosure, the obtaining module 701 is further configured to obtain hardware information of a GPU server; and the generating module is used for generating a test result according to the hardware information of the GPU server and the performance data and/or the energy consumption data recorded in the log.
In one embodiment of the present disclosure, the apparatus further comprises: an installation module 706, configured to install the operating system, the GPU card driver corresponding to the GPU server, and the Docker software.
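For orientation, the module division of fig. 7 can be pictured as the skeleton below; it is an illustrative Python sketch with hypothetical method names and does not prescribe any particular implementation of the apparatus.

    class GpuServerPerformanceTestApparatus:
        """Illustrative skeleton mapping the modules of fig. 7 to methods; names are hypothetical."""

        def obtain_image_package(self):
            """Obtaining module 701: obtain the mirror image data packet (test tool + library software)."""

        def start_container_instance(self, image_package):
            """Processing module 702: start the packet to obtain a container instance of the test environment."""

        def test_performance(self, container_instance):
            """Test module 703: test the inference or training performance of the GPU server."""

        def generate_test_result(self, log):
            """Generating module 704: build the test result from the log (and hardware information)."""

        def map_test_data_set(self, container_instance, target_test_model):
            """Mapping module 705: map the inference or training data set into the container instance."""

        def install_prerequisites(self):
            """Installation module 706: install the operating system, GPU card driver and Docker software."""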
It should be noted that although several modules of the device for performing actions are mentioned in the above detailed description, this division is not mandatory. Indeed, the features and functionality of two or more of the modules described above may be embodied in one module, in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module described above may be further divided into a plurality of modules to be embodied.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
An electronic device 800 according to this embodiment of the disclosure is described below with reference to fig. 8. The electronic device 800 shown in fig. 8 is only an example and should not bring any limitations to the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 8, electronic device 800 is in the form of a general purpose computing device. The components of the electronic device 800 may include, but are not limited to: the at least one processing unit 810, the at least one memory unit 820, and a bus 830 that couples various system components including the memory unit 820 and the processing unit 810.
Wherein the storage unit stores program code that can be executed by the processing unit 810, so that the processing unit 810 performs the steps according to various exemplary embodiments of the present disclosure described in the "detailed description" section above of this specification.
The storage unit 820 may include readable media in the form of volatile memory units such as a random access memory unit (RAM) 8201 and/or a cache memory unit 8202, and may further include a read only memory unit (ROM) 8203.
The storage unit 820 may also include a program/utility 8204 having a set (at least one) of program modules 8205, such program modules 8205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment.
Bus 830 may be any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 800 may also communicate with one or more external devices 840 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 800, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 800 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 850. Also, the electronic device 800 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 860. As shown in FIG. 8, the network adapter 860 communicates with the other modules of the electronic device 800 via the bus 830. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 800, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium, which may be a readable signal medium or a readable storage medium. On which a program product capable of implementing the above-described method of the present disclosure is stored. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure as described in the "detailed description" section above of this specification, when the program product is run on the terminal device.
More specific examples of the computer-readable storage medium in the present disclosure may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In the present disclosure, a computer-readable signal medium may include a propagated data signal with readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Alternatively, program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
In an exemplary embodiment of the present disclosure, there is also provided a computer program product, which includes a computer program or computer instructions that, when loaded and executed by a processor, cause a computer to implement the performance testing method of the GPU server.
In particular implementations, program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In situations involving remote computing devices, the remote computing devices may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to external computing devices (e.g., through the internet using an internet service provider).
It should be noted that although several modules or units of the device for performing actions are mentioned in the above detailed description, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Moreover, although the steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a mobile terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope of the disclosure being indicated by the following claims.

Claims (10)

1. A performance testing method of a GPU server is characterized by comprising the following steps:
acquiring a mirror image data packet, wherein the mirror image data packet comprises a test tool for testing the performance of a graphics processing unit (GPU) server and library software for supporting the normal operation of the GPU server;
starting the mirror image data packet to obtain a container instance of a test environment comprising the GPU server, wherein the test environment is composed of the test tool and the library software;
testing the performance of the GPU server according to the container instance;
and generating a test result according to a log generated by testing the performance of the GPU server.
2. The method according to claim 1, wherein the test tool comprises an inference test model for testing inference performance of the GPU server, and/or a training test model for testing training performance of the GPU server;
the testing the performance of the GPU server according to the container instance comprises:
according to the performance to be tested of the GPU server, determining a target test model from the test tool, wherein the performance to be tested is inference performance or training performance, and the target test model is the inference test model or the training test model;
and testing the performance to be tested of the GPU server according to the container instance after the target test model is determined.
3. The method according to claim 2, wherein before the testing the performance to be tested of the GPU server according to the container instance after the target test model is determined, the method further comprises:
mapping a test data set corresponding to the target test model in the container instance;
acquiring test parameters, wherein the test parameters comprise the number of target GPU cards used for testing in GPU cards included in the GPU server, and the batch size and batch number corresponding to the target GPU cards;
the step of testing the performance to be tested of the GPU server according to the container example after the target test model is determined comprises the following steps:
and testing the performance to be tested of the GPU server according to the container example, the test data set and the test parameters after the target test model is determined.
4. The method of claim 3, wherein mapping the test data set corresponding to the target test model in the container instance comprises:
mapping a set of inference data in the container instance if the target test model is the inference test model;
in the case where the target test model is the training test model, a training data set is mapped in the container instance.
6. The method according to claim 3, wherein the test parameters further include a first parameter and/or a second parameter and/or a third parameter, the first parameter is used for indicating the target GPU cards, the second parameter is used for indicating the central processing unit (CPU) bound to each target GPU card, and the third parameter is used for indicating the memory bound to each target GPU card.
6. The method according to any of claims 1-5, wherein before generating the test result from the log generated by testing the performance of the GPU server, the method further comprises:
acquiring hardware information of the GPU server;
generating a test result according to a log generated by testing the performance of the GPU server, wherein the test result comprises the following steps:
and generating the test result according to the hardware information of the GPU server and the performance data and/or the energy consumption data recorded in the log.
7. The method of any of claims 1-5, further comprising:
and installing an operating system, a GPU card driver corresponding to the GPU server and Docker software.
8. A performance testing apparatus for a GPU server, the apparatus comprising:
an acquisition module, which is used for acquiring a mirror image data packet, wherein the mirror image data packet comprises a test tool for testing the performance of a graphics processing unit (GPU) server and library software for supporting the normal operation of the GPU server;
a processing module, which is used for starting the mirror image data packet to obtain a container instance of a test environment comprising the GPU server, wherein the test environment is composed of the test tool and the library software;
a testing module, which is used for testing the performance of the GPU server according to the container instance;
and a generating module, which is used for generating a test result according to a log generated by testing the performance of the GPU server.
9. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the performance testing method of the GPU server of any of claims 1-7 via execution of the executable instructions.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the performance testing method of a GPU server of any of claims 1 to 7.
CN202210713603.0A 2022-06-22 2022-06-22 Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium Pending CN115203019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210713603.0A CN115203019A (en) 2022-06-22 2022-06-22 Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210713603.0A CN115203019A (en) 2022-06-22 2022-06-22 Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium

Publications (1)

Publication Number Publication Date
CN115203019A true CN115203019A (en) 2022-10-18

Family

ID=83575927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210713603.0A Pending CN115203019A (en) 2022-06-22 2022-06-22 Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium

Country Status (1)

Country Link
CN (1) CN115203019A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117472672A (en) * 2023-12-26 2024-01-30 四川弘智远大科技有限公司 Cloud computing hardware acceleration test system and method based on GPU integration
CN117472672B (en) * 2023-12-26 2024-03-01 四川弘智远大科技有限公司 Cloud computing hardware acceleration test system and method based on GPU integration

Similar Documents

Publication Publication Date Title
US20110154305A1 (en) System and method for remotely compiling multi-platform native applications for mobile devices
US20080209405A1 (en) Distributed debugging for a visual programming language
CN111026439B (en) Application program compatibility method, device, equipment and computer storage medium
CN110647332A (en) Software deployment method and device based on container cloud
CN108614767A (en) A kind of remote debugging method and device
CN108008962A (en) The collocation method and system of PXE server under a kind of linux system
CN113687858B (en) Configuration file checking method and device, electronic equipment and storage medium
JP2012104150A (en) Customizing space in network environment
CN111740948A (en) Data packet issuing method, dynamic updating method, device, equipment and medium
CN107239309A (en) Patch generation method and device, update method, electronic equipment, storage medium
CN111857685A (en) Method and system for self-service software customization and remote automatic test
CN115203019A (en) Performance test method, device and equipment of GPU (graphics processing Unit) server and storage medium
CN112631915B (en) Method, system, device and medium for PCIE device software simulation
CN111459506B (en) Deep learning platform cluster deployment method and device, medium and electronic equipment
CN113220308A (en) Method, electronic device, and storage medium for deploying and providing service
CN113535567A (en) Software testing method, device, equipment and medium
WO2023230545A1 (en) Systems and methods for independent application design and deployment to platform host
CN114237760B (en) Method for packaging industrial mechanism model into container mirror image and publishing web service
CN115016968A (en) Exception handling method, device, equipment and medium
CN114791885A (en) Interface test method, device, equipment and medium
CN114881235A (en) Inference service calling method and device, electronic equipment and storage medium
CN114741294A (en) Page debugging method, device, equipment and storage medium
CN112988588A (en) Client software debugging method and device, storage medium and electronic equipment
CN116107665B (en) Project configuration method, device, terminal and storage medium
CN114860336B (en) System auditing mode implementation method based on PXE and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination