CN117851208A - Chip evaluation method and device, electronic equipment and medium - Google Patents

Chip evaluation method and device, electronic equipment and medium

Info

Publication number
CN117851208A
CN117851208A (application CN202410033252.8A)
Authority
CN
China
Prior art keywords: operator, original, performance, chip, calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410033252.8A
Other languages
Chinese (zh)
Inventor
蒋丽娟
李秀红
金旻玺
裴芝林
张行程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai AI Innovation Center
Original Assignee
Shanghai AI Innovation Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai AI Innovation Center filed Critical Shanghai AI Innovation Center
Priority to CN202410033252.8A priority Critical patent/CN117851208A/en
Publication of CN117851208A publication Critical patent/CN117851208A/en
Pending legal-status Critical Current

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a chip evaluation method, a device, an electronic device and a medium. The method comprises: selecting at least one original operator from a neural network; for each original operator, calculating a performance index of the original operator on a computing chip, wherein the performance index is determined based on index parameters of the original operator when running a test data set; and evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result. By selecting original operators and evaluating the computing chip according to their performance indexes on that chip, the obtained evaluation result more directly reflects the specific performance of the computing chip, improving the accuracy of the evaluation result.

Description

Chip evaluation method and device, electronic equipment and medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a chip evaluation method, a device, an electronic apparatus, and a medium.
Background
Evaluating the performance of a chip in a real network has important reference significance for the design of the chip.
Existing chip evaluation approaches mainly focus on the overall performance of a computer system, and the reference benchmark programs operate at the application level, so the evaluation results cannot accurately reflect the specific performance of the computing chip.
Disclosure of Invention
The invention provides a chip evaluation method, a device, an electronic device and a medium, so that the evaluation result reflects the specific performance of the computing chip more directly and the accuracy of the evaluation result is improved.
According to an aspect of the present invention, there is provided a chip evaluation method, the method including:
selecting at least one original operator from the neural network;
calculating, for each original operator, a performance index of the original operator on a computing chip, wherein the performance index is determined based on index parameters of the original operator when running a test data set;
and evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result.
According to another aspect of the present invention, there is provided a chip evaluation apparatus including:
the selecting module is used for selecting at least one original operator from the neural network;
the computing module is used for calculating, for each original operator, the performance index of the original operator on a computing chip, wherein the performance index is determined based on the index parameters of the original operator when running the test data set;
and the evaluation module is used for evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the chip evaluation method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute the chip evaluation method according to any one of the embodiments of the present invention.
The embodiments of the invention provide a chip evaluation method, a device, an electronic device and a medium, wherein the method comprises: selecting at least one original operator from a neural network; for each original operator, calculating a performance index of the original operator on a computing chip, wherein the performance index is determined based on index parameters of the original operator when running a test data set; and evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result. With this technical scheme, the performance evaluation of the computing chip is carried out by selecting original operators and using their performance indexes on the computing chip, so the obtained evaluation result more directly reflects the specific performance of the computing chip and the accuracy of the evaluation result is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for describing the embodiments are briefly introduced below. It is apparent that the drawings described below cover only some embodiments of the present invention, and that other drawings may be derived from them by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of a chip evaluation method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a chip evaluation method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of another chip evaluation method according to a second embodiment of the present invention;
fig. 4 is a schematic structural diagram of a chip evaluation device according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a chip evaluation method according to an embodiment of the present invention. The method may be performed by a chip evaluation device, which may be implemented in hardware and/or software and configured in an electronic apparatus.
Training and inference for large-scale networks create an urgent demand for computing power. Many AI chips targeting neural networks are gradually appearing on the market, and general-purpose graphics processing unit (Graphics Processing Unit, GPU) chips are also equipped with specialized components such as tensor compute cores to accelerate neural network applications more effectively. Computing chips typically achieve high throughput by configuring more compute cores, higher clock frequencies and complex storage hierarchies, which lets applications reach high performance mainly when tensor shapes are large. In practice, however, the tensor shapes of real neural network applications vary widely, so evaluating the performance of a computing chip on real networks has important reference value for chip design.
Chip evaluation generally defines a set of evaluation indexes and a group of benchmark programs. A user runs the given benchmark programs and records results such as running time or accuracy, then computes the evaluation indexes of the benchmarks on the chip according to how each index is defined. The indexes are then used to judge the chip's strengths and weaknesses in a given respect, and chips participating in the evaluation can be ranked by them.
Existing evaluation approaches fall mainly into three categories. First, a typical neural network from some area of deep learning can be chosen as the benchmark, and the time the network needs to reach a given convergence value is defined as the evaluation index for the system running that network. Second, following the benchmark style of traditional applications, the running time of a selected application can be defined as the evaluation index for how well the system solves a certain problem; a typical traditional application is the highly parallel High Performance Linpack (HPL) benchmark. Third, performance, power consumption, cost and accuracy may be used as evaluation indexes for certain core components of the computing chip, such as multiply-add units or the network-on-chip.
However, the above schemes ignore an intuitive evaluation of the computing chip itself, such as its hardware utilization, and the selected benchmarks are at the model level, lacking an evaluation of how specific deep learning operators perform on the chip. Meanwhile, traditional benchmark programs usually focus on the overall evaluation of a computer system and cannot represent the computational characteristics of deep learning. Finally, evaluation programs for individual chip components cannot reflect the overall performance of the computing chip in real application scenarios.
On this basis, the embodiment of the invention provides a chip evaluation method: typical operators in deep learning networks are selected as operator-level benchmark programs, the proportion of operator instances falling in a high-performance interval is defined as the chip evaluation index, the performance of the computing chip in neural network scenarios is measured, and a quantitative method for assessing the hardware usage of the computing chip is given, so that the chip is evaluated effectively. As shown in fig. 1, the method includes:
s110, selecting at least one original operator from the neural network.
The type of the original operator is not limited; for example, the original operator may be a deep learning operator and may serve as a benchmark test program.
In some embodiments, deep learning operators may be selected from a large number of neural networks as original operators for the subsequent performance evaluation of the computing chip. The specific selection manner is not limited: for example, at least one original operator may be selected randomly, or according to a selection policy, and the selection policy may for example prefer more typical original operators or select original operators according to their operator parameters.
In some embodiments, after the selecting at least one original operator from the neural network, the method further includes:
for each original operator, collecting tensor shapes used by the original operator in a history operation process;
and constructing a test data set based on tensor shapes corresponding to the original operators.
In this embodiment, the tensor shapes used by each selected original operator when running in a large number of real networks may also be collected, and a test data set is then constructed from the tensor shapes collected for each original operator. The specific organization of the test data set is not limited; for example, it may use each operator as one dimension, with one piece of test data per operator containing all tensor shapes collected for that operator across the neural networks.
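As an illustrative sketch, the organization just described (one piece of test data per operator, holding every tensor shape collected for it) might look as follows in Python; the operator names, shapes, and helper are hypothetical, not part of the embodiments:

```python
from collections import defaultdict

# Hypothetical collection log: (operator name, tensor shape) pairs
# observed while the operators ran inside real networks.
observed = [
    ("conv2d", (32, 64, 56, 56)),
    ("conv2d", (16, 128, 28, 28)),
    ("matmul", (1024, 1024)),
    ("relu",   (32, 64, 56, 56)),
]

def build_test_dataset(records):
    """Group collected tensor shapes by operator: one piece of test
    data per operator, containing all shapes seen for it."""
    dataset = defaultdict(list)
    for op_name, shape in records:
        dataset[op_name].append(shape)
    return dict(dataset)

dataset = build_test_dataset(observed)
```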
S120, for each original operator, calculating a performance index of the original operator on a computing chip, wherein the performance index is determined based on index parameters of the original operator when running a test data set.
The performance index characterizes the performance of the original operator on the computing chip. For example, it may be determined based on the index parameters of the original operator when running the test data set; the specific content of the index parameters is not limited here, and the index parameters of different types of original operators may be the same or different, determined according to the actual situation.
After selecting at least one original operator, this embodiment may calculate, for each original operator, its performance index on the computing chip. The specific means of calculation are not limited: for example, the index parameters of the original operator may first be determined while running the test data set, either computed for each piece of test data separately or over all test data at once, and the corresponding performance index is then computed from the obtained index parameters.
S130, evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result.
After the performance indexes of the original operators are obtained through the above steps, the performance of the computing chip can be evaluated from them. For example, the chip may be judged directly by the magnitude of each performance index, larger indexes indicating better performance. Alternatively, a quantitative evaluation result may be computed from the indexes, for instance by summing all index values into a single score, where a larger score is better, or by assigning each operator a priority ratio, multiplying each performance index by its operator's priority ratio, and summing the products.
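The aggregation options just described (summing the indexes, or weighting each by a per-operator priority ratio) can be sketched as follows; the operator names, index values, and priority ratios are hypothetical:

```python
def evaluate_chip(indexes, priority=None):
    """Combine per-operator performance indexes into one score.
    Without priorities the indexes are summed directly; otherwise
    each index is multiplied by its operator's priority ratio and
    the products are summed. Larger scores indicate better chip
    performance."""
    if priority is None:
        return sum(indexes.values())
    return sum(indexes[op] * priority[op] for op in indexes)

indexes = {"conv2d": 0.033, "relu": 0.253}  # hypothetical index values
plain = evaluate_chip(indexes)              # simple sum of indexes
weighted = evaluate_chip(indexes, priority={"conv2d": 0.7, "relu": 0.3})
```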
The first embodiment of the invention provides a chip evaluation method comprising: selecting at least one original operator from a neural network; calculating, for each original operator, a performance index of the original operator on a computing chip, wherein the performance index is determined based on index parameters of the original operator when running a test data set; and evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result. With this method, the performance evaluation of the computing chip is performed by selecting original operators and using their performance indexes on the computing chip, so the obtained evaluation result more directly reflects the specific performance of the computing chip and the accuracy of the evaluation result is improved.
Example two
Fig. 2 is a flowchart of a chip evaluation method according to a second embodiment of the present invention; the second embodiment is optimized on the basis of the above embodiments. In this embodiment, the processing after selecting at least one original operator is further specified as: dividing the at least one original operator into a computation-type operator set and a memory-access-type operator set.
Meanwhile, the performance index comprises a first performance index and a second performance index, and calculating the performance index of each original operator on a computing chip is further specified as: calculating a first performance index on the computing chip for each original operator in the computation-type operator set; and calculating a second performance index on the computing chip for each original operator in the memory-access-type operator set.
For details not yet described in detail in this embodiment, refer to embodiment one.
As shown in fig. 2, the method includes:
s210, selecting at least one original operator from the neural network.
S220, dividing the at least one original operator into a computation-type operator set and a memory-access-type operator set.
This embodiment further divides the original operators so that the performance index can be calculated differently according to operator type. The specific division process is not limited, as long as a computation-type operator set and a memory-access-type operator set are obtained from the at least one original operator.
In some embodiments, dividing the at least one original operator into a computation-type operator set and a memory-access-type operator set includes:
for each original operator, calculating the computation density of the original operator based on its operator parameters;
and assigning the original operator to the computation-type operator set or the memory-access-type operator set according to the computation density and a reference threshold.
The operator parameters characterize the compute and memory behavior of the original operator; for example, they may include the floating-point operation count and the memory-access volume. The reference threshold is used to divide the original operators; its value may be set from experience, for example as the ratio of the machine's peak floating-point performance to its theoretical memory bandwidth.
This embodiment divides the original operators based on the reference threshold by calculating the computation density of each operator: for example, the ratio of the floating-point operation count to the memory-access volume may be computed and taken as the operator's computation density. If the computation density is greater than the reference threshold, the operator is called compute-intensive; otherwise it is called memory-intensive, and the original operator is placed in the corresponding set.
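The division rule can be sketched as follows; the peak-performance and bandwidth figures (roughly A100-class), operator names, and FLOP and byte counts are assumed for illustration, not values from the embodiments:

```python
# Assumed machine figures (roughly A100-class): peak FLOP/s and
# theoretical memory bandwidth in bytes/s.
PEAK_FLOPS = 19.5e12
PEAK_BANDWIDTH = 1.555e12

def classify(flops, bytes_accessed,
             threshold=PEAK_FLOPS / PEAK_BANDWIDTH):
    """Computation density = floating-point operations / bytes moved.
    Above the reference threshold (peak FLOP/s over theoretical
    bandwidth) the operator counts as compute-intensive, otherwise
    as memory-intensive."""
    density = flops / bytes_accessed
    return "compute" if density > threshold else "memory"

# Hypothetical operators with their FLOP and byte counts.
ops = {
    "matmul_4096": (2 * 4096**3, 3 * 4096 * 4096 * 4),  # dense matmul, fp32
    "relu_16M":    (16_000_000, 2 * 16_000_000 * 4),    # element-wise relu
}
compute_set, memory_set = [], []
for name, (flops, nbytes) in ops.items():
    (compute_set if classify(flops, nbytes) == "compute"
     else memory_set).append(name)
```

Here the dense matrix multiplication lands in the computation-type set and the element-wise relu in the memory-access-type set, which matches the categorization given later in the text.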
S230, calculating a first performance index on the computing chip for each original operator in the computation-type operator set.
The first performance index refers to the performance index, on the computing chip, of an original operator in the computation-type operator set.
In some embodiments, calculating the first performance index of the original operator on the computing chip includes:
calculating, for each piece of test data in the test data set, the computing performance parameter of the original operator when running that test data;
and determining the first performance index of the original operator on the computing chip based on each computing performance parameter and a computation threshold.
Specifically, in this embodiment the computing performance parameter of the original operator may be calculated for each piece of test data, yielding a set of computing performance parameters from which the first performance index on the computing chip is measured against the computation threshold. Taking a single piece of test data as an example, how its computing performance parameter is determined is not limited: a parameter output by the original operator while the computing chip runs the test data may be used directly, or the computing performance parameter may be derived by further calculation on the output parameters.
Further, the first performance index may be measured against the computation threshold in different ways: for example, each computing performance parameter may be compared with the computation threshold, or the proportion of instances whose computing performance parameter falls above 70% of the computation threshold may be taken as the first performance index. The computation threshold can be understood as the peak floating-point performance of the computing chip, a known, fixed value for each chip.
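A minimal sketch of the 70%-of-peak instance-ratio form of the first performance index, with hypothetical per-instance values:

```python
def first_perf_index(perf_params, compute_threshold, fraction=0.7):
    """First performance index: proportion of test instances whose
    computing performance parameter reaches at least `fraction` of
    the chip's peak floating-point performance (the computation
    threshold)."""
    hits = sum(p >= fraction * compute_threshold for p in perf_params)
    return hits / len(perf_params)

# Four hypothetical per-instance performance values, measured against
# an assumed peak of 10.0 (arbitrary units).
index = first_perf_index([9.0, 5.0, 8.0, 2.0], compute_threshold=10.0)
```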
In some embodiments, calculating the computing performance parameters of the original operator for each piece of test data in the test data set includes:
for each piece of test data, timing the running of the original operator on that test data;
and calculating the ratio of the floating-point operation count to the running time, and taking this ratio as the computing performance parameter of the original operator on that test data.
In a specific embodiment, the running time of the original operator on a piece of test data can be measured by calling a deep learning framework such as PyTorch, or a high-performance operator library interface provided by the hardware vendor, and the ratio of the operator's floating-point operation count to the running time is taken as its computing performance parameter. Further, this process may be repeated, for example for 10 runs, and the average of the ratios taken as the operator's final performance data (i.e. the computing performance parameter).
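The timing procedure of this embodiment can be sketched as follows; `measure_perf` is a hypothetical helper, and a plain Python callable stands in for a kernel that would really be launched through PyTorch or a vendor operator library:

```python
import time

def measure_perf(run_op, flops, repeats=10):
    """Run the operator `repeats` times, average the wall-clock
    runtime, and return FLOPs divided by the mean time, i.e. the
    computing performance parameter of this test instance."""
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_op()
        times.append(time.perf_counter() - t0)
    mean_time = sum(times) / len(times)
    return flops / mean_time

# A toy callable stands in for a real operator kernel; a sum over
# n numbers is counted here as roughly n floating-point operations.
perf = measure_perf(lambda: sum(float(i) for i in range(10_000)),
                    flops=10_000)
```

For GPU kernels the real measurement would additionally need device synchronization before reading the clock; that detail is omitted in this stand-in.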
S240, calculating a second performance index on the computing chip for each original operator in the memory-access-type operator set.
The second performance index refers to the performance index, on the computing chip, of an original operator in the memory-access-type operator set.
In some embodiments, calculating the second performance index of the original operator on the computing chip includes:
calculating, for each piece of test data in the test data set, the bandwidth parameter of the original operator when running that test data;
and determining the second performance index of the original operator on the computing chip based on each bandwidth parameter and a bandwidth threshold.
The bandwidth threshold can be understood as the peak bandwidth achievable by the computing chip, a known, fixed value for each chip.
Specifically, in this embodiment the bandwidth parameter of the original operator may be calculated for each piece of test data, yielding a set of bandwidth parameters from which the second performance index on the computing chip is measured against the bandwidth threshold. The calculation mirrors that of the first performance index and is not expanded further here.
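Although the text does not expand this calculation, a sketch mirroring the compute-side index might look as follows; the peak bandwidth, byte counts, and runtimes are assumed values:

```python
PEAK_BANDWIDTH = 1.555e12   # bytes/s; assumed A100-class peak

def achieved_bandwidth(bytes_moved, runtime_s):
    """Bandwidth parameter for one test instance: bytes read and
    written divided by runtime (an assumed definition mirroring the
    FLOPs-over-time compute metric)."""
    return bytes_moved / runtime_s

# Hypothetical (bytes moved, runtime) pairs for one memory-bound operator.
instances = [(1.2e9, 0.001), (8e8, 0.002)]
bandwidths = [achieved_bandwidth(nb, t) for nb, t in instances]

# Second performance index: share of instances at or above 70% of peak.
second_index = (sum(b >= 0.7 * PEAK_BANDWIDTH for b in bandwidths)
                / len(bandwidths))
```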
S250, evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result, wherein the performance index comprises a first performance index and a second performance index.
The chip evaluation method provided by the second embodiment of the invention selects at least one original operator from the neural network; divides the at least one original operator into a computation-type operator set and a memory-access-type operator set; calculates a first performance index on the computing chip for each original operator in the computation-type operator set; calculates a second performance index on the computing chip for each original operator in the memory-access-type operator set; and evaluates the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result, where the performance index includes the first performance index and the second performance index. By dividing the selected original operators, the performance index of each original operator on the computing chip can be calculated in a targeted way according to its type, further improving the accuracy of the evaluation result.
Fig. 3 is a flowchart of another chip evaluation method according to the second embodiment of the present invention. As shown in fig. 3, typical deep learning operators may first be selected from a large number of neural networks and divided into two categories according to their compute and memory characteristics, namely compute-intensive operators and memory-intensive operators (i.e., the at least one original operator is divided into a computation-type operator set and a memory-access-type operator set). Illustratively, the deep learning operators selected as benchmarks may mainly include two-dimensional convolution, dense matrix multiplication, adaptive two-dimensional average pooling, two-dimensional nearest-neighbor interpolation, the dropout operation, the activation operation relu, and the loss operation CrossEntropyLoss. According to their compute and memory characteristics, two-dimensional convolution and dense matrix multiplication are analyzed as compute-intensive operators, i.e., operators whose computation volume is large relative to their memory-access volume; the remaining operators, including adaptive two-dimensional average pooling, two-dimensional nearest-neighbor interpolation, dropout, relu and CrossEntropyLoss, are analyzed as memory-intensive operators, i.e., operators whose memory-access volume is large relative to their computation volume.
Then, for the selected deep learning operators, the tensor shapes they use when running in a large number of real networks can be collected and organized into a test data set for measuring the operators' performance data (i.e., for each original operator, collecting the tensor shapes used by the original operator in its history of operation, and constructing the test data set based on the tensor shapes corresponding to each original operator).
Finally, the statistical performance index can be tested: the selected typical deep learning operators are run on the test data set to obtain the corresponding performance data, and the performance index is calculated; the obtained performance index can then serve as the evaluation standard for the computing chip.
For a compute-intensive operator, its computing performance is tested on the test data set, i.e. under different tensor shapes (that is, the computing performance parameter of the original operator is calculated for each piece of test data), and the proportion of instances whose computing performance falls above 70% of the machine's peak computing performance is taken as the performance index (that is, the first performance index of the original operator on the computing chip is determined based on each computing performance parameter and the computation threshold);
for a memory-access-intensive operator, the bandwidth it achieves is tested on the test data set, i.e. under different tensor shapes (that is, the bandwidth parameter of the original operator is calculated for each piece of test data), and the proportion of instances whose bandwidth is above 70% of the machine's peak bandwidth is taken as the performance index (that is, the second performance index of the original operator on the computing chip is determined based on each bandwidth parameter and the bandwidth threshold).
The principle of the method is that the higher the performance an application achieves, the higher the hardware utilization of the computing chip. The hardware usage of the computing chip is therefore evaluated by using the instance proportion in the high-performance interval as the performance index: the higher the proportion, the better the chip's hardware is being used. Moreover, the factors limiting compute-intensive and memory-access-intensive applications lie in different hardware components of the chip, so the selected deep learning operators are divided into the two categories, with floating-point computing performance counted for compute-intensive operators and bandwidth counted for memory-access-intensive operators.
The table below reports, on an NVIDIA A100 test platform, the instance ratio with which each selected typical operator reaches the performance interval above 70%. The experimental results show that the instance ratios are generally low, indicating that the hardware utilization of computing chips in real application scenarios is generally low.
Operator class | Instance ratio in the performance interval above 70%
Two-dimensional convolution | 3.31%
Dense matrix multiplication | 3.55%
Adaptive two-dimensional average pooling | 9.87%
Two-dimensional nearest-neighbor interpolation | 0%
Dropout operation | 11.73%
Activation operation (ReLU) | 25.32%
Loss operation (CrossEntropyLoss) | 4.52%
According to the chip evaluation method provided by this embodiment of the invention, chip performance can be evaluated based on deep learning operators: typical operator-level applications in the deep learning field are selected as the benchmark programs, the tensor shapes those operators use when run in typical neural networks are collected as the benchmark's test data sets, and the instance ratio within a performance interval is defined as the chip evaluation index. The hardware usage of the computing chip in a real network scenario can thus be quantified intuitively, achieving a visual evaluation of the computing chip's performance in deep learning application scenarios.
Example III
Fig. 4 is a schematic structural diagram of a chip evaluation device according to a third embodiment of the present invention. As shown in fig. 4, the apparatus includes:
a selecting module 310, configured to select at least one original operator from the neural network;
a calculation module 320, configured to calculate, for each original operator, a performance index of the original operator under a calculation chip, where the performance index is determined based on an index parameter of the original operator in a running test data set;
and the evaluation module 330 is configured to evaluate the performance of the computing chip according to the performance index of each original operator, so as to obtain an evaluation result.
The third embodiment of the invention provides a chip evaluation device, wherein at least one original operator is selected from a neural network through a selection module; calculating performance indexes of the original operators under a calculation chip by a calculation module aiming at each original operator, wherein the performance indexes are determined based on index parameters of the original operators in an operation test data set; and evaluating the performance of the computing chip according to the performance indexes of the original operators through an evaluation module to obtain an evaluation result. By using the device, the performance evaluation of the computing chip is carried out by selecting the original operator and according to the performance index of the original operator under the computing chip, the obtained evaluation result can more closely reflect the specific performance of the computing chip, and the accuracy of the evaluation result is improved.
Optionally, the chip evaluation device provided in the third embodiment of the present invention further includes:
the division module is used for dividing at least one original operator after the at least one original operator is selected from the neural network to obtain a calculation type operator set and a memory access type operator set;
the performance index includes a first performance index and a second performance index, and the computing module includes:
the first calculation unit is used for calculating a first performance index of each original operator in the calculation type operator set under a calculation chip;
the second calculation unit is used for calculating a second performance index of each original operator in the memory access type operator set under a calculation chip.
Optionally, the first computing unit includes:
the calculation subunit is used for respectively calculating the calculation performance parameters of each test data of the original operator in the running test data set;
and the determining subunit is used for determining a first performance index of the original operator under the computing chip based on each computing performance parameter and the computing threshold value.
Optionally, the computing subunit is specifically configured to:
for each test data, counting the running time of the original operator in running the test data;
calculating the ratio of floating point operation times to the running time, and determining the ratio as the calculation performance parameter of the original operator in running the test data.
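This subunit's FLOPs-over-runtime computation can be sketched as follows. The 2·m·n·k operation count for dense matrix multiplication is a standard convention assumed here for illustration; the patent does not prescribe a particular FLOP-counting rule.

```python
def matmul_flops(m, n, k):
    # 2*m*n*k: one multiply plus one add per output element per k-step.
    return 2 * m * n * k

def compute_performance(flop_count, runtime_s):
    """Computing performance parameter: floating-point operation count
    divided by the measured runtime, i.e. achieved FLOP/s for one datum."""
    return flop_count / runtime_s

# Example: a 1024x1024x1024 matmul whose measured runtime was 2 ms.
perf = compute_performance(matmul_flops(1024, 1024, 1024), 2e-3)
```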
Optionally, the second computing unit is specifically configured to:
respectively calculating bandwidth parameters of each test data of the original operator in the running test data set;
and determining a second performance index of the original operator under the computing chip based on each bandwidth parameter and the bandwidth threshold.
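A minimal sketch of the bandwidth parameter and the second performance index; the 2039 GB/s peak-bandwidth figure and the sample measurements are hypothetical placeholders.

```python
def bandwidth_param(bytes_moved, runtime_s):
    """Bandwidth parameter for one test datum: total bytes read and
    written by the operator divided by its measured runtime (bytes/s)."""
    return bytes_moved / runtime_s

def second_performance_index(bandwidths, peak_bw, threshold=0.70):
    """Instance ratio of achieved bandwidths at or above `threshold`
    of the machine's peak bandwidth."""
    hits = sum(1 for b in bandwidths if b >= threshold * peak_bw)
    return hits / len(bandwidths)

# Hypothetical: two tensor shapes measured against an assumed
# 2039 GB/s peak; only the first clears 70% of peak.
idx = second_performance_index([1.6e12, 0.5e12], peak_bw=2.039e12)  # -> 0.5
```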
Optionally, the dividing module is specifically configured to:
for each original operator, calculating the calculation density of the original operator based on operator parameters;
and dividing the original operator into the calculation type operator set or the memory access type operator set according to the calculation density and a reference threshold.
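The division by calculation density can be sketched as a roofline-style comparison. Choosing the reference threshold as peak FLOP/s divided by peak bandwidth (the roofline ridge point) is a common convention assumed here; the patent leaves the threshold unspecified.

```python
def calculation_density(flop_count, bytes_moved):
    """Arithmetic intensity: floating-point operations per byte of
    memory traffic for one operator invocation."""
    return flop_count / bytes_moved

def classify(flop_count, bytes_moved, reference_threshold):
    """Density at or above the reference threshold puts the operator in
    the compute-intensive set, otherwise the memory-access set."""
    if calculation_density(flop_count, bytes_moved) >= reference_threshold:
        return "compute"
    return "memory"

# Illustrative threshold of 50 FLOPs/byte (hypothetical ridge point).
label = classify(flop_count=1000, bytes_moved=10, reference_threshold=50)
```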
Optionally, the chip evaluation device provided in the third embodiment of the present invention further includes:
the collection module is used for collecting tensor shapes used by the original operators in the history operation process for each original operator after at least one original operator is selected from the neural network;
and the construction module is used for constructing a test data set based on tensor shapes corresponding to the original operators after at least one original operator is selected from the neural network.
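The collection and construction modules can be sketched together as follows. The trace format, a sequence of (operator name, tensor shape) pairs recorded during historical runs, is an assumption made for illustration; the patent does not specify a recording format.

```python
from collections import defaultdict

def build_test_dataset(trace):
    """Deduplicate the tensor shapes each operator used in historical
    runs; the per-operator shape lists form its test data set."""
    shapes = defaultdict(set)
    for op_name, shape in trace:
        shapes[op_name].add(tuple(shape))
    return {op: sorted(s) for op, s in shapes.items()}

trace = [("conv2d", (1, 3, 224, 224)),
         ("conv2d", (1, 3, 224, 224)),   # duplicate shape, kept once
         ("relu",   (1, 64, 56, 56))]
dataset = build_test_dataset(trace)
```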
The chip evaluation device provided by the embodiment of the invention can execute the chip evaluation method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as a chip evaluation method.
In some embodiments, the chip evaluation method may be implemented as a computer program, which is tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the chip evaluation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the chip evaluation method by any other suitable means (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose and may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server (also called a cloud computing server or cloud host), a host product in the cloud computing service system that overcomes the defects of high management difficulty and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of chip evaluation, the method comprising:
selecting at least one original operator from the neural network;
calculating performance indexes of the original operators under a calculation chip aiming at each original operator, wherein the performance indexes are determined based on index parameters of the original operators in an operation test data set;
and evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result.
2. The method of claim 1, further comprising, after the selecting at least one original operator from the neural network:
dividing the at least one original operator to obtain a calculation type operator set and a memory access type operator set;
the performance indexes comprise a first performance index and a second performance index, and the calculating the performance index of each original operator under a calculation chip comprises the following steps of:
calculating a first performance index of each original operator in the calculation type operator set under a calculation chip;
and calculating a second performance index of the original operator under a calculation chip aiming at each original operator in the memory access type operator set.
3. The method of claim 2, wherein calculating the first performance index of the original operator under a calculation chip comprises:
calculating calculation performance parameters of each test data of the original operator in the running test data set respectively;
and determining a first performance index of the original operator under the computing chip based on each computing performance parameter and the computing threshold.
4. A method according to claim 3, wherein said separately calculating the calculation performance parameters of each test data in the running test data set for the original operator comprises:
for each test data, counting the running time of the original operator in running the test data;
calculating the ratio of floating point operation times to the running time, and determining the ratio as the calculation performance parameter of the original operator in running the test data.
5. The method of claim 2, wherein calculating a second performance index of the original operator under a calculation chip comprises:
respectively calculating bandwidth parameters of each test data of the original operator in the running test data set;
and determining a second performance index of the original operator under the computing chip based on each bandwidth parameter and the bandwidth threshold.
6. The method according to claim 2, wherein the dividing the at least one original operator to obtain a calculation type operator set and a memory access type operator set comprises:
for each original operator, calculating the calculation density of the original operator based on operator parameters;
and dividing the original operator into the calculation type operator set or the memory access type operator set according to the calculation density and a reference threshold.
7. The method of claim 1, further comprising, after the selecting at least one original operator from the neural network:
for each original operator, collecting tensor shapes used by the original operator in a history operation process;
and constructing a test data set based on tensor shapes corresponding to the original operators.
8. A chip evaluation apparatus, characterized in that the apparatus comprises:
the selecting module is used for selecting at least one original operator from the neural network;
the computing module is used for computing the performance index of each original operator under a computing chip, and the performance index is determined based on the index parameters of the original operator in the running test data set;
and the evaluation module is used for evaluating the performance of the computing chip according to the performance index of each original operator to obtain an evaluation result.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the chip evaluation method of any one of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores computer instructions for causing a processor to implement the chip evaluation method of any one of claims 1-7 when executed.
CN202410033252.8A 2024-01-09 2024-01-09 Chip evaluation method and device, electronic equipment and medium Pending CN117851208A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410033252.8A CN117851208A (en) 2024-01-09 2024-01-09 Chip evaluation method and device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN117851208A true CN117851208A (en) 2024-04-09

Family

ID=90528373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410033252.8A Pending CN117851208A (en) 2024-01-09 2024-01-09 Chip evaluation method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN117851208A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination