Detailed Description
So that the manner in which the features and elements of the disclosed embodiments can be understood in detail, a more particular description of the disclosed embodiments, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawing.
The terms "first," "second," and the like in the description and in the claims, and the above-described drawings of embodiments of the present disclosure, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the present disclosure described herein may be made. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions.
The term "plurality" means two or more unless otherwise specified.
In the embodiments of the present disclosure, the character "/" indicates that the preceding and following objects are in an "or" relationship. For example, A/B represents: A or B.
The term "and/or" describes an associative relationship between objects and indicates that three relationships may exist. For example, A and/or B represents: A, or B, or A and B.
The term "correspond" may refer to an association or binding relationship; "A corresponds to B" means that an association or binding relationship exists between A and B.
With reference to fig. 1, an embodiment of the present disclosure provides a method for testing performance of an accelerator card in a server, including:
Step S101, determining a plurality of performances to be tested of the accelerator card in the server and the weights corresponding to the performances to be tested, and acquiring data to be tested corresponding to each performance to be tested.
Step S102, inputting the data to be tested into a performance test model preset in the server under the condition that the accelerator card operates normally, and obtaining the inference accuracy corresponding to each performance to be tested.
Step S103, acquiring a performance test result of the accelerator card according to the weight corresponding to each performance to be tested and the inference accuracy corresponding to each performance to be tested.
By adopting the method for testing the performance of the accelerator card in the server provided by the embodiment of the disclosure, the plurality of performances to be tested of the accelerator card in the server and the corresponding weights thereof are determined, the data to be tested corresponding to each performance to be tested is obtained, under the condition that the accelerator card normally operates, the reasoning accuracy rate corresponding to each performance to be tested is obtained through each data to be tested by using the performance test model preset in the server, and then the performance test result of the accelerator card is obtained by using the weight corresponding to each performance to be tested and the reasoning accuracy rate corresponding to each performance to be tested. The performance test of the accelerator card in the server is realized, so that the performance of the accelerator card can be known conveniently.
In some embodiments, the server used for testing the performance of the accelerator card is powered normally, runs a stable commercial BIOS version, and has a Linux operating system installed whose drivers run normally; an accelerator card hardware interface driver, an accelerator card driver, a deep learning software library and the like are also installed and run normally.
In some embodiments, the accelerator card comprises an artificial intelligence accelerator card. In some embodiments, the to-be-tested performance of the accelerator card in the server includes one or more of an image classification performance, a target detection performance, a semantic segmentation performance, a language inference performance, a recommendation performance, and the like.
Optionally, obtaining to-be-tested data corresponding to each to-be-tested performance includes: and screening the data to be tested corresponding to each performance to be tested from a preset data set.
In some embodiments, the data set comprises: the ImageNet2012 data set, the COCO (Common Objects in Context) data set, the BraTS 2019 data set, the SQuAD v1.1 (Stanford Question Answering Dataset) data set, the Criteo Terabyte data set, and the like. The ImageNet2012 data set stores a training set of several labeled pictures, a validation set of several pictures, and a test set of several unlabeled pictures. The COCO data set stores a training set of several labeled pictures with label files, a validation set of several pictures with label files, and a test set of several pictures without labels or label files. The BraTS 2019 data set stores a training set of several labeled case data, a validation set of several case data, and a test set of several unlabeled case data. The SQuAD v1.1 data set stores a training set of several labeled question-answer pairs, a validation set of several question-answer pairs, and a test set of several unlabeled question-answer pairs. The Criteo Terabyte data set stores a training set with labeled numerical variables and category variables, a validation set with numerical variables and category variables, and a test set with unlabeled numerical variables and category variables.
Optionally, determining the weight corresponding to each performance to be measured includes: performing table look-up operation on each performance to be tested according to a preset weight matching table to obtain the weight corresponding to each performance to be tested; the weight matching table stores the corresponding relationship between the performance to be measured and the weight. Therefore, the weight corresponding to each performance to be tested is determined by looking up the table, and when the performance of the accelerator card in the server is tested, the weight of each performance to be tested of the accelerator card is taken into consideration, so that the obtained performance test result of the accelerator card is more suitable for the requirement condition of a user on the accelerator card.
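As an illustrative sketch only, the table look-up for weights might be written as follows in Python. The names `WEIGHT_TABLE` and `lookup_weights` and the key strings are assumptions introduced here; the 20% values simply reuse the example weights that the description gives for Table 1.

```python
# Hypothetical weight matching table: performance to be tested -> weight.
# Keys and values are illustrative; a real table would be preset by the user.
WEIGHT_TABLE = {
    "image_classification": 0.20,
    "target_detection": 0.20,
    "semantic_segmentation": 0.20,
    "language_inference": 0.20,
    "recommendation": 0.20,
}


def lookup_weights(performances):
    """Return the weight of each requested performance by table look-up."""
    missing = [p for p in performances if p not in WEIGHT_TABLE]
    if missing:
        raise KeyError(f"no weight configured for: {missing}")
    return {p: WEIGHT_TABLE[p] for p in performances}


weights = lookup_weights(["image_classification", "recommendation"])
print(weights)
```

Raising an error for an unknown performance, rather than defaulting to zero, is a design choice of this sketch so that a misconfigured table is noticed before any test is run.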
Optionally, inputting each data to be tested into a performance test model preset in the server, and obtaining inference accuracy corresponding to each performance to be tested, including: inputting each data to be tested into a performance test model preset in a server to obtain a model output result corresponding to each data to be tested; comparing the model output result corresponding to each data to be detected with the preset label corresponding to each data to be detected to obtain the comparison result corresponding to each data to be detected; and obtaining the reasoning accuracy rate corresponding to each performance to be tested according to each comparison result. Therefore, the inference accuracy rate corresponding to each performance to be tested is obtained according to the comparison result of the model output result corresponding to each data to be tested and the preset label corresponding to each data to be tested, so that the performance test result of the accelerator card can be obtained according to the inference accuracy rate corresponding to each performance to be tested and the weight corresponding to the inference accuracy rate, the performance test of the accelerator card is realized, and the performance of the accelerator card is convenient to know.
In some embodiments, in conjunction with fig. 2, a method for testing the performance of an accelerator card in a server includes:
step S201, determining a plurality of performances to be tested of an accelerator card in a server and weights corresponding to the performances to be tested, and acquiring data to be tested corresponding to the performances to be tested;
step S202, under the condition that the accelerator card normally runs, inputting each data to be tested into a performance test model preset in a server, and obtaining a model output result corresponding to each data to be tested;
step S203, comparing the model output result corresponding to each data to be tested with the preset label corresponding to each data to be tested to obtain the comparison result corresponding to each data to be tested;
step S204, obtaining the reasoning accuracy rate corresponding to each performance to be tested according to each comparison result;
and step S205, acquiring a performance test result of the accelerator card according to the weight corresponding to each performance to be tested and the reasoning accuracy corresponding to each performance to be tested.
Therefore, the performance test result of the accelerator card is obtained by using the weight corresponding to each performance to be tested of the accelerator card and the corresponding reasoning accuracy rate. The performance test of the accelerator card in the server is realized, so that the performance of the accelerator card can be known conveniently, and a user can design a cloud platform conveniently according to the performance of the accelerator card.
Optionally, the server is preset with performance test models corresponding to different types of data to be tested, and each data to be tested is input into the corresponding performance test model.
Optionally, the different types of data to be tested correspond to different performance test models respectively.
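One possible way to route each type of data to be tested to its own performance test model is a registry keyed by performance, sketched below. The registry name `MODEL_REGISTRY`, the function `run_models`, and the stand-in lambda "models" are hypothetical; real entries would wrap trained networks executed on the accelerator card.

```python
from typing import Callable, Dict, List

# Hypothetical registry: one stand-in "model" per performance to be tested.
MODEL_REGISTRY: Dict[str, Callable[[str], str]] = {
    "image_classification": lambda sample: "cat" if "cat" in sample else "dog",
    "language_inference": lambda sample: sample.upper(),
}


def run_models(data_by_performance: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Feed each sample to the model registered for its performance."""
    outputs = {}
    for performance, samples in data_by_performance.items():
        model = MODEL_REGISTRY[performance]  # KeyError if no model is preset
        outputs[performance] = [model(sample) for sample in samples]
    return outputs


results = run_models({"image_classification": ["cat_01.jpg", "dog_02.jpg"]})
print(results)
```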
Optionally, before inputting each piece of data to be tested into the performance test model preset in the server, the method further includes: and acquiring a performance test model preset in the server. Optionally, the performance test model is obtained by: acquiring training data corresponding to each performance to be tested; and respectively inputting the training data into a preset model to be trained for training to obtain a performance test model corresponding to each performance to be tested.
Optionally, obtaining training data corresponding to each performance to be measured includes: and screening out data to be trained corresponding to each performance to be tested from a preset data set, and determining the screened data to be trained as training data corresponding to the performance to be tested.
In some embodiments, the model to be trained comprises: one or more of Inception V3 model, SSD ResNet-34 model, 3D-UNet model, BERT-base model, DLRM model, and the like.
In some embodiments, the performance to be tested is image classification performance, data to be trained corresponding to the image classification performance is screened out from the ImageNet2012 data set, the screened data to be trained is determined as training data corresponding to the image classification performance, the training data corresponding to the image classification performance is input into an Inception V3 model for training, and a performance test model corresponding to the image classification performance is obtained; and screening the data to be tested corresponding to the image classification performance from the ImageNet2012 data set, inputting the data to be tested corresponding to the image classification performance into the performance test model corresponding to the image classification performance under the condition that the accelerator card normally operates, and obtaining a model output result corresponding to the data to be tested of the image classification performance.
In some embodiments, the performance to be tested is target detection performance, the data to be trained corresponding to the target detection performance is screened out from the COCO data set, the screened data to be trained is determined as training data corresponding to the target detection performance, the training data corresponding to the target detection performance is input into the SSD ResNet-34 model for training, a performance test model corresponding to the target detection performance is obtained, the data to be tested corresponding to the target detection performance is screened out from the COCO data set, and under the condition that the accelerator card normally operates, the data to be tested corresponding to the target detection performance is input into the performance test model corresponding to the target detection performance, and a model output result corresponding to the data to be tested of the target detection performance is obtained.
In some embodiments, the performance to be tested is semantic segmentation performance, the data to be trained corresponding to the semantic segmentation performance is screened out from the BraTS 2019 data set, the screened data to be trained is determined as training data corresponding to the semantic segmentation performance, the training data corresponding to the semantic segmentation performance is input into a 3D-UNet model to be trained, a performance test model corresponding to the semantic segmentation performance is obtained, the data to be tested corresponding to the semantic segmentation performance is screened out from the BraTS 2019 data set, the data to be tested corresponding to the semantic segmentation performance is input into the performance test model corresponding to the semantic segmentation performance under the condition that the accelerator card normally operates, and a model output result corresponding to the data to be tested of the semantic segmentation performance is obtained.
In some embodiments, the performance to be tested is language reasoning performance, the data to be trained corresponding to the language reasoning performance is screened out from the SQuAD v1.1 data set, the screened data to be trained is determined as training data corresponding to the language reasoning performance, the training data corresponding to the language reasoning performance is input into a BERT-base model for training, a performance test model corresponding to the language reasoning performance is obtained, the data to be tested corresponding to the language reasoning performance is screened out from the SQuAD v1.1 data set, the data to be tested corresponding to the language reasoning performance is input into the performance test model corresponding to the language reasoning performance under the condition that the accelerator card normally operates, and a model output result corresponding to the data to be tested corresponding to the language reasoning performance is obtained.
In some embodiments, the performance to be tested is recommendation performance, the data to be trained corresponding to the recommendation performance is screened out from the Criteo Terabyte data set, the screened data to be trained is determined as training data corresponding to the recommendation performance, the training data corresponding to the recommendation performance is input into a DLRM model for training, a performance test model corresponding to the recommendation performance is obtained, the data to be tested corresponding to the recommendation performance is screened out from the Criteo Terabyte data set, the data to be tested corresponding to the recommendation performance is input into the performance test model corresponding to the recommendation performance under the condition that the accelerator card normally operates, and a model output result corresponding to the data to be tested of the recommendation performance is obtained.
Optionally, obtaining training data corresponding to each performance to be measured includes: acquiring test items corresponding to various performances to be tested; determining a data set corresponding to each performance to be tested according to the test item corresponding to each performance to be tested, and screening out data to be trained corresponding to each performance to be tested from each data set; and determining the screened data to be trained as the training data corresponding to the performance to be tested.
Optionally, obtaining a test item corresponding to each performance to be tested includes: and performing table look-up operation on each performance to be tested according to a preset test item matching table to obtain a test item corresponding to each performance to be tested, wherein the test item matching table stores the corresponding relation between the performance to be tested and the test item.
Optionally, determining a data set corresponding to each performance to be tested according to the test item corresponding to each performance to be tested includes: and performing table look-up operation on the test items corresponding to the performances to be tested according to a preset data set matching table to obtain data sets corresponding to the test items of the performances to be tested, wherein the data set matching table stores the corresponding relation between the test items corresponding to the performances to be tested and the data sets.
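The two look-ups described above (performance to test item, then test item to data set) can be chained. The following is a minimal sketch assuming both matching tables are plain dictionaries; the table names, key strings, and the helper `dataset_for` are hypothetical, while the test items and data sets are the ones named in the description.

```python
# Hypothetical test item matching table: performance -> test item.
TEST_ITEM_TABLE = {
    "image_classification": "Inception V3 inference test on ImageNet2012",
    "target_detection": "SSD ResNet-34 inference test on COCO",
    "semantic_segmentation": "3D-UNet inference test on BraTS 2019",
    "language_inference": "BERT-base inference test on SQuAD v1.1",
    "recommendation": "DLRM inference test on Criteo Terabyte",
}

# Hypothetical data set matching table: test item -> data set.
DATASET_TABLE = {
    "Inception V3 inference test on ImageNet2012": "ImageNet2012",
    "SSD ResNet-34 inference test on COCO": "COCO",
    "3D-UNet inference test on BraTS 2019": "BraTS 2019",
    "BERT-base inference test on SQuAD v1.1": "SQuAD v1.1",
    "DLRM inference test on Criteo Terabyte": "Criteo Terabyte",
}


def dataset_for(performance):
    """Look up the test item first, then the data set for that test item."""
    test_item = TEST_ITEM_TABLE[performance]
    return DATASET_TABLE[test_item]


print(dataset_for("semantic_segmentation"))
```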
In some embodiments, the weight matching table and the test item matching table are different matching tables, and table lookup operations are performed on each to-be-tested performance in the preset weight matching table and the preset test item matching table, respectively, to obtain a weight and a test item corresponding to each to-be-tested performance.
In some embodiments, the weight matching table and the test item matching table are the same matching table, i.e. the weight and test item matching table. And performing table look-up operation on each performance to be tested in a preset weight and test item matching table to obtain the weight and the test item corresponding to each performance to be tested. The weight and test item matching table stores the corresponding relations among the performances to be tested, the weights and the test items.
In some embodiments, Table 1 is an example of the weight and test item matching table. As shown in Table 1, the performance to be tested is image classification performance, the weight corresponding to the image classification performance is 20%, and the test item corresponding to the image classification performance is an Inception V3 network inference performance test based on the ImageNet2012 data set; the performance to be tested is target detection performance, the weight corresponding to the target detection performance is 20%, and the test item corresponding to the target detection performance is an SSD ResNet-34 network inference performance test based on the COCO data set; the performance to be tested is semantic segmentation performance, the weight corresponding to the semantic segmentation performance is 20%, and the test item corresponding to the semantic segmentation performance is a 3D-UNet network inference performance test based on the BraTS 2019 data set; the performance to be tested is language inference performance, the weight corresponding to the language inference performance is 20%, and the test item corresponding to the language inference performance is a BERT-base network inference performance test based on the SQuAD v1.1 data set; the performance to be tested is recommendation performance, the weight corresponding to the recommendation performance is 20%, and the test item corresponding to the recommendation performance is a DLRM network inference performance test based on the Criteo Terabyte data set.
TABLE 1
Optionally, obtaining the inference accuracy corresponding to each performance to be tested according to each comparison result includes: dividing the first number corresponding to each performance to be tested by the second number corresponding to that performance to obtain the inference accuracy corresponding to each performance to be tested; the first number is the number of pieces of data to be tested whose comparison result indicates that the model output result is consistent with the corresponding preset label, and the second number is the total number of pieces of data to be tested. In this way, the ratio of the number of model output results that are consistent with their preset labels to the total number of pieces of data to be tested is determined as the inference accuracy of the corresponding performance to be tested, so that the performance test result of the accelerator card can be obtained according to the inference accuracy of each performance to be tested and its corresponding weight. The accelerator card is thereby tested from a plurality of performance angles, which makes its performance easier to understand.
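The ratio described above can be sketched in a few lines. The function name `inference_accuracy` and the sample values are assumptions made for illustration; the sketch also assumes that "consistent" means simple equality between a model output and its preset label, which may be too strict for tasks such as detection or segmentation.

```python
def inference_accuracy(model_outputs, preset_labels):
    """Number of outputs matching their labels divided by the total number."""
    if len(model_outputs) != len(preset_labels):
        raise ValueError("each output needs exactly one preset label")
    if not preset_labels:
        raise ValueError("no data to be tested")
    # First number: outputs that are consistent with their preset labels.
    first_number = sum(
        1 for output, label in zip(model_outputs, preset_labels) if output == label
    )
    # Second number: total amount of data to be tested.
    second_number = len(preset_labels)
    return first_number / second_number


accuracy = inference_accuracy(
    ["cat", "dog", "cat", "bird"],
    ["cat", "dog", "dog", "bird"],
)
print(accuracy)  # 3 of 4 outputs match, so this prints 0.75
```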
In some embodiments, the performance to be tested is image classification performance, and the test item corresponding to the image classification performance is determined to be an Inception V3 network inference performance test based on the ImageNet2012 data set. The ImageNet2012 data set is imported into the service system under test in the server, 70% of the data in the ImageNet2012 data set is screened out as the data to be trained corresponding to the image classification performance, the screened data to be trained is used as the training data for the image classification performance, and the training data is input into the Inception V3 model for training to obtain the performance test model corresponding to the image classification performance. 30% of the data is screened out from the ImageNet2012 data set as the data to be tested corresponding to the image classification performance; under the condition that the accelerator card normally operates, the data to be tested is input into the performance test model corresponding to the image classification performance to obtain the model output result corresponding to the data to be tested; the model output result is compared with the preset label corresponding to the data to be tested to obtain the comparison result; and the number of pieces of data to be tested whose model output result is consistent with the preset label is divided by the total number of pieces of data to be tested to obtain the inference accuracy corresponding to the image classification performance.
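A 70%/30% screening of this kind could look like the sketch below. It assumes, beyond what the description states, that the two subsets are disjoint and are drawn by a seeded random shuffle; the function name `split_dataset` and the file names are hypothetical.

```python
import random


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into training and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


samples = [f"img_{i:03d}.jpg" for i in range(10)]
training_data, data_to_be_tested = split_dataset(samples)
print(len(training_data), len(data_to_be_tested))  # prints: 7 3
```

Fixing the seed keeps the data to be tested identical between runs, which matters if results from different accelerator cards are to be compared.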
Optionally, obtaining a performance test result of the accelerator card according to the weight corresponding to each performance to be tested and the inference accuracy corresponding to each performance to be tested, includes: obtaining a score corresponding to each performance to be tested according to the reasoning accuracy corresponding to each performance to be tested; obtaining the comprehensive performance score of the accelerator card according to the score corresponding to each performance to be tested and the weight corresponding to each performance to be tested; and obtaining a performance test result of the accelerator card according to the comprehensive performance score. Therefore, the comprehensive performance score of the accelerator card is obtained according to the reasoning accuracy rate corresponding to each performance to be tested and the weight corresponding to the performance to be tested, and further the performance test result of the accelerator card is obtained, so that the performance test of the accelerator card is realized, and the performance of the accelerator card is known more conveniently.
Optionally, obtaining a score corresponding to each performance to be measured according to the inference accuracy corresponding to each performance to be measured includes: and performing table look-up operation on the reasoning accuracy corresponding to each performance to be tested according to a preset score matching table to obtain a score corresponding to each performance to be tested, wherein the score matching table stores the corresponding relation between the reasoning accuracy corresponding to each performance to be tested and the score.
In some embodiments, Table 2 is an example of the score matching table. As shown in Table 2, the performance to be tested is image classification performance: under the condition that the inference accuracy corresponding to the image classification is greater than or equal to 0 and less than or equal to 20%, the score corresponding to the image classification is 1 point; under the condition that the inference accuracy is greater than 20% and less than or equal to 40%, the score is 2 points; under the condition that the inference accuracy is greater than 40% and less than or equal to 60%, the score is 3 points; under the condition that the inference accuracy is greater than 60% and less than or equal to 80%, the score is 4 points; and under the condition that the inference accuracy is greater than 80% and less than or equal to 100%, the score is 5 points.
The performance to be tested is recommendation performance: under the condition that the inference accuracy corresponding to the recommendation is greater than or equal to 0 and less than or equal to 20%, the score corresponding to the recommendation is 1 point; under the condition that the inference accuracy is greater than 20% and less than or equal to 40%, the score is 2 points; under the condition that the inference accuracy is greater than 40% and less than or equal to 60%, the score is 3 points; under the condition that the inference accuracy is greater than 60% and less than or equal to 80%, the score is 4 points; and under the condition that the inference accuracy is greater than 80% and less than or equal to 100%, the score is 5 points.
TABLE 2
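The score look-up can be sketched as an interval search, assuming five bands of 20% each whose upper bounds are inclusive and accuracies expressed as fractions between 0 and 1. The names `UPPER_BOUNDS`, `SCORES`, and `score_for_accuracy` are hypothetical.

```python
import bisect

# Inclusive upper bound of each accuracy band and the score for that band.
UPPER_BOUNDS = [0.2, 0.4, 0.6, 0.8, 1.0]
SCORES = [1, 2, 3, 4, 5]


def score_for_accuracy(accuracy):
    """Map an inference accuracy in [0, 1] to a score from 1 to 5."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie between 0 and 1")
    # bisect_left returns the first band whose upper bound is >= accuracy,
    # which makes every upper bound inclusive (exactly 20% scores 1 point).
    return SCORES[bisect.bisect_left(UPPER_BOUNDS, accuracy)]


print(score_for_accuracy(0.20), score_for_accuracy(0.75), score_for_accuracy(1.0))
```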
Optionally, obtaining the comprehensive performance score of the accelerator card according to the score corresponding to each performance to be measured and the weight corresponding to each performance to be measured includes: and respectively dividing the score corresponding to each performance to be tested by the weight corresponding to each performance to be tested to obtain the single performance score corresponding to each performance to be tested, and adding the single performance scores corresponding to each performance to be tested to obtain the comprehensive performance score of the accelerator card.
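A small sketch of this calculation follows; the function name `composite_score` and the key strings are assumptions. With five performances weighted at 20% each and scores from 1 to 5, dividing each score by its weight and summing yields values from about 25 to about 125, which is the range covered by the score intervals given in the description.

```python
def composite_score(scores, weights):
    """Sum of score / weight over all performances to be tested."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same performances")
    # Single performance score = score / weight; the composite is their sum.
    return sum(scores[p] / weights[p] for p in scores)


weights = {
    "image_classification": 0.2,
    "target_detection": 0.2,
    "semantic_segmentation": 0.2,
    "language_inference": 0.2,
    "recommendation": 0.2,
}
lowest = composite_score({p: 1 for p in weights}, weights)
highest = composite_score({p: 5 for p in weights}, weights)
print(lowest, highest)
```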
Optionally, obtaining a performance test result of the accelerator card according to the comprehensive performance score includes: and determining a score interval where the comprehensive performance score is located, and acquiring a performance test result corresponding to the accelerator card according to the score interval where the comprehensive performance score is located. Therefore, the performance test result of the accelerator card is obtained through the score interval where the comprehensive performance score of the accelerator card is located, so that the normalized test of the performance of the accelerator card can be realized, and the normalized performance test result of the accelerator card is obtained.
Optionally, obtaining a performance test result corresponding to the accelerator card according to a score interval where the comprehensive performance score of the accelerator card is located includes: and performing table look-up operation on a score interval where the comprehensive performance score of the accelerator card is located according to a preset performance test result matching table to obtain a performance test result corresponding to the accelerator card, wherein the performance test result matching table stores a corresponding relation between the score interval and the performance test result.
In some embodiments, in the case that the comprehensive performance score is equal to 25 points, the performance test result corresponding to the accelerator card is: "performance level L1, the accelerator card is not yet mature, inference performance is low"; in the case that 25 points < comprehensive performance score < 50 points, the performance test result corresponding to the accelerator card is: "performance level L2, the accelerator card has basic capability, inference performance is slightly low or average"; in the case that 50 points ≤ comprehensive performance score < 75 points, the performance test result corresponding to the accelerator card is: "performance level L3, the accelerator card's inference capability is mature"; in the case that 75 points ≤ comprehensive performance score < 100 points, the performance test result corresponding to the accelerator card is: "performance level L4, the accelerator card's inference capability is good"; in the case that 100 points ≤ comprehensive performance score ≤ 125 points, the performance test result corresponding to the accelerator card is: "performance level L5, the accelerator card's inference capability is excellent".
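The interval look-up for the performance level can be sketched as follows, assuming composite scores in the range 25 to 125 as in the example intervals. The function name `performance_level` is hypothetical, and only the level identifier is returned rather than the full result text.

```python
def performance_level(composite):
    """Return the performance level for a composite score between 25 and 125."""
    if composite < 25 or composite > 125:
        raise ValueError("composite score outside the expected 25..125 range")
    if composite == 25:
        return "L1"  # exactly 25 points
    if composite < 50:
        return "L2"  # 25 < score < 50
    if composite < 75:
        return "L3"  # 50 <= score < 75
    if composite < 100:
        return "L4"  # 75 <= score < 100
    return "L5"      # 100 <= score <= 125


for value in (25, 49.9, 50, 99, 125):
    print(value, performance_level(value))
```

Because level L1 is reached only at exactly 25 points, a score computed with floating-point arithmetic may need rounding before this comparison; that handling is left out of the sketch.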
In some embodiments, as shown in fig. 3, a method for testing the performance of an accelerator card in a server includes:
step S301, determining a plurality of performances to be tested of an accelerator card in a server and weights corresponding to the performances to be tested, and acquiring data to be tested corresponding to the performances to be tested;
step S302, under the condition that the accelerator card normally runs, inputting each data to be tested into a performance test model preset in a server to obtain the reasoning accuracy rate corresponding to each performance to be tested;
step S303, obtaining a score corresponding to each performance to be measured according to the reasoning accuracy corresponding to each performance to be measured;
step S304, obtaining the comprehensive performance score of the accelerator card according to the score corresponding to each performance to be tested and the weight corresponding to each performance to be tested;
and step S305, obtaining a performance test result of the accelerator card according to the comprehensive performance score.
Therefore, the performance test of the accelerator card in the server is realized by determining a plurality of performances to be tested of the accelerator card in the server and the corresponding weights thereof, acquiring data to be tested corresponding to each performance to be tested, acquiring the reasoning accuracy corresponding to each performance to be tested by using the performance test model preset in the server through each data to be tested under the condition that the accelerator card normally operates, acquiring the score corresponding to each performance to be tested according to the reasoning accuracy corresponding to each performance to be tested, acquiring the comprehensive performance score of the accelerator card by using the weight corresponding to each performance to be tested and the corresponding score, and acquiring the performance test result of the accelerator card according to the comprehensive performance score, thereby facilitating the understanding of the performance of the accelerator card. Because a plurality of performance angles are considered when the performance test is carried out on the accelerator card, the performance test result of the accelerator card is more comprehensive and accurate.
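Steps S303 to S305 can be strung together as in the sketch below, which starts from inference accuracies assumed to have been measured already in step S302. The accuracy values are invented for illustration, the function name `evaluate` is hypothetical, and the 20% score bands and level thresholds are taken from the examples in the description.

```python
import bisect

# Assumed inputs: example weights and made-up accuracies from step S302.
weights = {"image_classification": 0.2, "target_detection": 0.2,
           "semantic_segmentation": 0.2, "language_inference": 0.2,
           "recommendation": 0.2}
accuracies = {"image_classification": 0.91, "target_detection": 0.74,
              "semantic_segmentation": 0.66, "language_inference": 0.83,
              "recommendation": 0.58}


def evaluate(weights, accuracies):
    # S303: accuracy -> score 1..5, bands of 20% with inclusive upper bounds.
    bounds = [0.2, 0.4, 0.6, 0.8, 1.0]
    scores = {p: bisect.bisect_left(bounds, a) + 1 for p, a in accuracies.items()}
    # S304: composite score = sum of (score / weight).
    composite = sum(scores[p] / weights[p] for p in scores)
    # S305: score interval -> level (exactly 25 is L1; 50, 75, 100 open L3..L5).
    if composite == 25:
        level = "L1"
    else:
        level = ["L2", "L3", "L4", "L5"][bisect.bisect_right([50, 75, 100], composite)]
    return scores, composite, level


scores, composite, level = evaluate(weights, accuracies)
print(scores, composite, level)
```

With these inputs the scores are 5, 4, 4, 5 and 3, the composite score is about 105, and the level is L5.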
Optionally, after acquiring the data to be tested corresponding to each performance to be tested, the method further includes: in the case that the accelerator card in the server does not operate normally, issuing an early warning of the abnormal operation of the accelerator card. The early warning makes it convenient for the user to handle the abnormal operation of the accelerator card and reduces errors in testing the accelerator card.
Optionally, the accelerator card not operating normally includes: the accelerator card is not in a preset position on the server, or the accelerator card is in the preset position on the server but cannot operate.
Optionally, issuing the early warning of the abnormal operation of the accelerator card includes: sending preset prompt information to a preset user terminal. This makes it convenient to prompt the user to handle the abnormal operation of the accelerator card. For example, the prompt information is "accelerator card is not operating normally".
Optionally, issuing the early warning of the abnormal operation of the accelerator card includes: displaying the preset prompt information through a display device. This makes it convenient to prompt the user to handle the abnormal operation of the accelerator card. For example, the prompt information is "accelerator card is not operating normally".
Optionally, the user terminal includes a smartphone, a tablet, a phone watch, or the like.
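The early-warning logic described above can be sketched as follows. The sketch assumes that the two abnormal conditions (card not in its preset position, or in position but unable to operate) have already been detected by some platform-specific means and are passed in as booleans. The names CardStatus, operates_normally and warn_if_abnormal, and the use of plain callables to stand in for a user terminal or a display device, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

# Preset prompt information, taken from the example in the text.
PROMPT = "accelerator card is not operating normally"


@dataclass
class CardStatus:
    """Hypothetical snapshot of the accelerator card state."""
    in_preset_position: bool  # card is seated in its preset position on the server
    can_operate: bool         # card is able to operate


def operates_normally(status: CardStatus) -> bool:
    """The card operates normally only if it is in position and can operate."""
    return status.in_preset_position and status.can_operate


def warn_if_abnormal(status: CardStatus,
                     notifiers: List[Callable[[str], None]]) -> bool:
    """Send the preset prompt to every notifier if the card is abnormal.

    Each notifier stands in for a preset user terminal or a display device.
    Returns True if a warning was issued, False otherwise.
    """
    if operates_normally(status):
        return False
    for notify in notifiers:
        notify(PROMPT)
    return True


# Usage: collect the prompts in a list instead of contacting a real terminal.
received: List[str] = []
warned = warn_if_abnormal(CardStatus(in_preset_position=True, can_operate=False),
                          [received.append])
```

Passing the notification channels as callables keeps the check independent of whether the prompt goes to a user terminal, a display device, or both.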
In some embodiments, as shown in fig. 4, a method for testing the performance of an accelerator card in a server includes:
Step S401, determining a plurality of performances to be tested of an accelerator card in a server and a weight corresponding to each performance to be tested, and acquiring data to be tested corresponding to each performance to be tested; wherein the server is powered normally, runs a stable commercial BIOS version, and has a Linux operating system installed, with its drivers running normally; an accelerator card hardware interface driver, an accelerator card driver, a deep learning software library and the like are installed on the server and run normally; and the performances to be tested of the accelerator card in the server include one or more of image classification performance, target detection performance, semantic segmentation performance, language inference performance and recommendation performance;
Step S402, in the case that the accelerator card operates normally, inputting each piece of data to be tested into a performance test model preset in the server to obtain a model output result corresponding to each piece of data to be tested;
Step S403, comparing the model output result corresponding to each piece of data to be tested with the preset label corresponding to that piece of data to be tested to obtain a comparison result corresponding to each piece of data to be tested;
Step S404, obtaining an inference accuracy corresponding to each performance to be tested according to the comparison results;
Step S405, obtaining a score corresponding to each performance to be tested according to the inference accuracy corresponding to each performance to be tested;
Step S406, obtaining a comprehensive performance score of the accelerator card according to the score corresponding to each performance to be tested and the weight corresponding to each performance to be tested;
Step S407, obtaining a performance test result of the accelerator card according to the comprehensive performance score.
In this way, a plurality of performances to be tested of the accelerator card in the server and their corresponding weights are determined, and the data to be tested corresponding to each performance to be tested is acquired. In the case that the accelerator card operates normally, each piece of data to be tested is input into the performance test model preset in the server to obtain a model output result, and the inference accuracy corresponding to each performance to be tested is obtained by comparing each model output result with the preset label of the corresponding piece of data to be tested. The comprehensive performance score of the accelerator card is then obtained from the weight and the inference accuracy corresponding to each performance to be tested, and the performance test result of the accelerator card is obtained from the comprehensive performance score. The performance test of the accelerator card is thereby realized, and a user can design a cloud platform according to the performance of the accelerator card. Moreover, because image classification performance, target detection performance, semantic segmentation performance, language inference performance and recommendation performance are all considered, the performance indexes used in the inference performance test are comprehensive, and the accelerator card can be tested in a standardized manner from multiple angles.
Meanwhile, because the weight corresponding to each performance to be tested is also taken into account, the performance test result of the accelerator card is closer to actual requirements, which makes it convenient to assess the quality of the accelerator card.
As shown in fig. 5, an apparatus for testing the performance of an accelerator card in a server according to an embodiment of the present disclosure includes a determining module 101, an obtaining module 102, and a testing module 103. The determining module 101 is configured to determine a plurality of performances to be tested of the accelerator card in the server and a weight corresponding to each performance to be tested, and to acquire data to be tested corresponding to each performance to be tested; the obtaining module 102 is configured to input each piece of data to be tested into a performance test model preset in the server in the case that the accelerator card operates normally, and to obtain an inference accuracy corresponding to each performance to be tested; the testing module 103 is configured to obtain a performance test result of the accelerator card according to the weight corresponding to each performance to be tested and the inference accuracy corresponding to each performance to be tested.
With the apparatus for testing the performance of an accelerator card in a server provided by the embodiments of the present disclosure, a plurality of performances to be tested of the accelerator card in the server and the weight corresponding to each performance to be tested are determined, and the data to be tested corresponding to each performance to be tested is acquired. In the case that the accelerator card operates normally, the inference accuracy corresponding to each performance to be tested is obtained by using the data to be tested and the performance test model preset in the server, and the performance test result of the accelerator card is then obtained by using the weight and the inference accuracy corresponding to each performance to be tested. The performance test of the accelerator card in the server is thereby realized, which makes the performance of the accelerator card easier to understand.
Optionally, the determining module is configured to determine the weight corresponding to each performance to be tested by: looking up each performance to be tested in a preset weight matching table to obtain the weight corresponding to each performance to be tested; the weight matching table stores the correspondence between performances to be tested and weights.
Optionally, the obtaining module is configured to input each piece of data to be tested into the performance test model preset in the server and obtain the inference accuracy corresponding to each performance to be tested by: inputting each piece of data to be tested into the performance test model preset in the server to obtain a model output result corresponding to each piece of data to be tested; comparing the model output result corresponding to each piece of data to be tested with the preset label corresponding to that piece of data to be tested to obtain a comparison result corresponding to each piece of data to be tested; and obtaining the inference accuracy corresponding to each performance to be tested according to the comparison results.
Optionally, the obtaining module is configured to obtain the inference accuracy corresponding to each performance to be tested according to the comparison results by: dividing a first number corresponding to each performance to be tested by a second number corresponding to that performance to be tested to obtain the inference accuracy corresponding to that performance to be tested; the first number is the number of pieces of data to be tested whose comparison result indicates that the model output result is consistent with the preset label, and the second number is the total number of pieces of data to be tested.
Optionally, the testing module is configured to obtain the performance test result of the accelerator card according to the weight corresponding to each performance to be tested and the inference accuracy corresponding to each performance to be tested by: obtaining a score corresponding to each performance to be tested according to the inference accuracy corresponding to each performance to be tested; obtaining a comprehensive performance score of the accelerator card according to the score corresponding to each performance to be tested and the weight corresponding to each performance to be tested; and obtaining the performance test result of the accelerator card according to the comprehensive performance score.
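A minimal sketch of the table look-up described above is given below. The weight matching table is modeled as a Python dictionary. The weight values, the key names and the function name look_up_weights are hypothetical, since the disclosure does not give concrete table contents.

```python
from typing import Dict, Iterable, Optional

# Hypothetical weight matching table: performance to be tested -> weight.
# The values are placeholders chosen only for illustration.
WEIGHT_MATCHING_TABLE: Dict[str, float] = {
    "image_classification": 0.3,
    "target_detection": 0.2,
    "semantic_segmentation": 0.2,
    "language_inference": 0.2,
    "recommendation": 0.1,
}


def look_up_weights(performances: Iterable[str],
                    table: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Return the weight of each requested performance from the table.

    Raises KeyError if a performance has no entry, so that a missing
    weight is reported instead of being silently treated as zero.
    """
    table = WEIGHT_MATCHING_TABLE if table is None else table
    requested = list(performances)
    missing = [name for name in requested if name not in table]
    if missing:
        raise KeyError(f"no weight configured for: {missing}")
    return {name: table[name] for name in requested}


# Usage: weights for the two performances selected for this test run.
selected = look_up_weights(["image_classification", "recommendation"])
```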
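The division of the first number by the second number can be sketched as follows for one performance to be tested. The sketch assumes that "consistent" means exact equality between a model output result and its preset label. Tasks such as target detection or semantic segmentation would likely need a task-specific consistency criterion, which the disclosure does not specify. The function name inference_accuracy is hypothetical.

```python
from typing import Sequence


def inference_accuracy(outputs: Sequence[object], labels: Sequence[object]) -> float:
    """Inference accuracy for one performance to be tested.

    first_number:  pieces of data whose model output matches the preset label
    second_number: total pieces of data to be tested
    """
    if len(outputs) != len(labels):
        raise ValueError("each model output needs exactly one preset label")
    if len(labels) == 0:
        raise ValueError("at least one piece of data to be tested is required")

    # Comparison results: True where the output is consistent with the label.
    # Assumption: consistency is tested by exact equality.
    comparison_results = [out == lab for out, lab in zip(outputs, labels)]

    first_number = sum(comparison_results)
    second_number = len(comparison_results)
    return first_number / second_number


# Usage with hypothetical image classification outputs and labels:
# three of the four outputs agree with their labels.
accuracy = inference_accuracy(["cat", "dog", "cat", "bird"],
                              ["cat", "dog", "dog", "bird"])
```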
Optionally, the apparatus for testing the performance of the accelerator card in the server further includes an early warning module. The early warning module is configured to perform early warning on abnormal operation of the accelerator card in the server under the condition that the accelerator card does not operate normally.
As shown in fig. 6, an apparatus for testing the performance of an accelerator card in a server according to an embodiment of the present disclosure includes a processor 200 and a memory 201. Optionally, the apparatus may also include a communication interface 202 and a bus 203. The processor 200, the communication interface 202 and the memory 201 can communicate with each other through the bus 203. The communication interface 202 may be used for information transfer. The processor 200 may call logic instructions in the memory 201 to perform the method for testing the performance of the accelerator card in the server of the above embodiment.
In addition, the logic instructions in the memory 201 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products.
The memory 201 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 200 executes functional applications and data processing by executing program instructions/modules stored in the memory 201, i.e. implements the method for testing the performance of the accelerator card in the server in the above-described embodiment.
The memory 201 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. Further, the memory 201 may include a high-speed random access memory, and may also include a nonvolatile memory.
With the apparatus for testing the performance of an accelerator card in a server provided by the embodiments of the present disclosure, a plurality of performances to be tested of the accelerator card in the server and the weight corresponding to each performance to be tested are determined, and the data to be tested corresponding to each performance to be tested is acquired. In the case that the accelerator card operates normally, the inference accuracy corresponding to each performance to be tested is obtained by using the data to be tested and the performance test model preset in the server, and the performance test result of the accelerator card is then obtained by using the weight and the inference accuracy corresponding to each performance to be tested. The performance test of the accelerator card in the server is thereby realized, which makes the performance of the accelerator card easier to understand.
The embodiments of the present disclosure provide an electronic device, which includes the above-mentioned apparatus for testing the performance of an accelerator card in a server. The electronic device determines a plurality of performances to be tested of the accelerator card in the server and the weight corresponding to each performance to be tested, and acquires the data to be tested corresponding to each performance to be tested. In the case that the accelerator card operates normally, the electronic device obtains the inference accuracy corresponding to each performance to be tested by using the data to be tested and the performance test model preset in the server, and then obtains the performance test result of the accelerator card by using the weight and the inference accuracy corresponding to each performance to be tested. The performance test of the accelerator card in the server is thereby realized, which makes the performance of the accelerator card easier to understand.
Optionally, the electronic device includes a computer, a server, or the like.
Optionally, in the case that the electronic device is a computer, the server determines whether the accelerator card is operating normally. Optionally, in the case that the electronic device is a server, the server sends the preset prompt information to the preset user terminal.
The embodiment of the disclosure provides a storage medium, which stores program instructions, and when the program instructions are executed, the method for testing the performance of the accelerator card in the server is executed.
The disclosed embodiments provide a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described method for testing the performance of an accelerator card in a server.
The computer-readable storage medium described above may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product, where the computer software product is stored in a storage medium and includes one or more instructions to enable a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method of the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code, or may be a transitory storage medium.
The above description and drawings sufficiently illustrate embodiments of the disclosure to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. The examples merely typify possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in or substituted for those of others. Furthermore, the words used in the specification are words of description only and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, the terms "comprises" and/or "comprising," when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising an …" does not exclude the presence of other like elements in a process, method or apparatus that comprises the element. In this document, each embodiment may be described with emphasis on its differences from other embodiments, and for the same or similar parts the embodiments may be referred to one another. For the methods, products, etc. disclosed in the embodiments, where they correspond to a method section disclosed in the embodiments, reference may be made to the description of that method section for the relevant details.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software may depend upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments. It can be clearly understood by the skilled person that, for convenience and brevity of description, the specific working processes of the system, the apparatus and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments disclosed herein, the disclosed methods, products (including but not limited to devices, apparatuses, etc.) may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units may be merely a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to implement the present embodiment. In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. In the description corresponding to the flowcharts and block diagrams in the figures, operations or steps corresponding to different blocks may also occur in different orders than disclosed in the description, and sometimes there is no specific order between the different operations or steps. For example, two sequential operations or steps may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.