CN117574983B - Operator processing model training method and related device - Google Patents


Info

Publication number
CN117574983B
CN117574983B
Authority
CN
China
Prior art keywords
operator
input parameters
tested
result
processing model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410061762.6A
Other languages
Chinese (zh)
Other versions
CN117574983A (en)
Inventor
刘强
张超
杨晓峰
陈鹏
刘煜宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202410061762.6A
Publication of CN117574983A
Application granted
Publication of CN117574983B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application discloses a training method and a related device for an operator processing model, which can be applied to cloud technology, artificial intelligence and other scenarios. The method includes obtaining an operator, sample input parameters and a first test result, where the sample input parameters describe the data dimensions of the operator's input data, and the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameters. According to the operator and the sample input parameters, a first prediction result is obtained through prediction by an initial operator processing model, namely a theoretical result of testing the operator through the target accelerator card based on the sample input parameters. Based on a training direction of minimizing a target difference, that is, minimizing the difference between the first prediction result and the first test result, the parameters of the initial operator processing model are adjusted to obtain an operator processing model, so that theoretical results of testing the operator through the target accelerator card based on different input parameters can be accurately predicted.

Description

Operator processing model training method and related device
Technical Field
The application relates to the technical field of chips, in particular to a training method and a related device of an operator processing model.
Background
A business model (e.g., a deep learning model in the field of artificial intelligence) is generally composed of a plurality of computing units called operators (OPs). In a deep learning model, an operator corresponds to the computational logic of a layer. For example, a convolution layer (Convolution Layer) is an operator, and the weighted summation process in a fully connected layer (FC Layer) is an operator.
A business model is generally large and its training time is correspondingly long. To shorten the training time, the operators included in the business model may be optimized. For example, the input parameters of each operator can be adjusted so that the operators run with optimal input parameters, which improves the training efficiency of the business model and shortens its training time.
In the related art, an operator's hardware performance under several sets of input parameters is determined by running a specific test case, and one set of input parameters is then selected from those sets based on the measured hardware performance to train the business model. However, the input parameters determined in this way for training the business model have low accuracy, and the effect of shortening the training time of the business model is limited.
Disclosure of Invention
In order to solve the technical problems, the application provides a training method and a related device for an operator processing model, which are used for improving the accuracy of determining the input parameters of an operator.
The embodiment of the application discloses the following technical scheme:
in one aspect, an embodiment of the present application provides a training method for an operator processing model, where the method includes:
Acquiring an operator, a sample input parameter and a first test result, wherein the sample input parameter is used for describing the data dimension of input data of the operator, and the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameter;
According to the operator and the sample input parameters, predicting through an initial operator processing model to obtain a first prediction result, wherein the first prediction result is a theoretical result obtained by testing the operator through the target accelerator card based on the sample input parameters;
And adjusting parameters of the initial operator processing model based on a training direction of minimizing target differences, so as to obtain an operator processing model, wherein the target differences are differences between the first prediction result and the first test result, and the operator processing model is used for predicting theoretical results obtained by testing the operator through the target accelerator card based on different input parameters.
In another aspect, an embodiment of the present application provides a training apparatus for an operator processing model, where the apparatus includes: the device comprises an acquisition unit, a prediction unit and an adjustment unit;
The acquisition unit is used for acquiring an operator, a sample input parameter and a first test result, wherein the sample input parameter is used for describing the data dimension of the input data of the operator, and the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameter;
the prediction unit is used for predicting through an initial operator processing model according to the operator and the sample input parameters to obtain a first prediction result, wherein the first prediction result is a theoretical result obtained by testing the operator through the target accelerator card based on the sample input parameters;
The adjusting unit is used for adjusting parameters of the initial operator processing model based on a training direction of minimizing target differences, so as to obtain an operator processing model, wherein the target differences are differences between the first prediction result and the first test result, and the operator processing model is used for predicting theoretical results obtained by testing the operator through the target accelerator card based on different input parameters.
In another aspect, an embodiment of the present application provides a computer device including a processor and a memory:
the memory is used for storing a computer program and transmitting the computer program to the processor;
The processor is configured to perform the method of the above aspect according to instructions in the computer program.
In another aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program for executing the method described in the above aspect.
In another aspect, embodiments of the present application provide a computer program product which, when run on a computer device, causes the computer device to perform the method of the above aspect.
According to the technical scheme, an operator, sample input parameters and a first test result are obtained, wherein the sample input parameters describe the data dimensions of the operator's input data, and the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameters. According to the operator and the sample input parameters, a first prediction result, namely a theoretical result of testing the operator through the target accelerator card based on the sample input parameters, is obtained through prediction by an initial operator processing model. Based on a training direction of minimizing a target difference, that is, minimizing the difference between the first prediction result and the first test result, the parameters of the initial operator processing model are adjusted to obtain an operator processing model, so that the operator processing model can accurately predict theoretical results of testing the operator through the target accelerator card based on different input parameters.
Therefore, by training the initial operator processing model, the real results obtained by testing the operator through the target accelerator card based on different sample input parameters are continuously learned, so that the predicted theoretical results become closer and closer to the real results. The operator processing model obtained by training the initial operator processing model thus has a strong learning capacity and can predict theoretical results of testing the operator through the target accelerator card based on different input parameters. That is, performing prediction with the operator processing model increases the number of available test results, which improves the accuracy of determining the operator's input parameters, so that the business model is trained based on the operator and its input parameters and the training time is shortened.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application scenario of a training method of an operator processing model according to an embodiment of the present application;
FIG. 2 is a flow chart of a training method of an operator processing model according to an embodiment of the present application;
FIG. 3 is a training schematic diagram of an operator processing model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an operator processing model according to an embodiment of the present application;
FIG. 5 is a schematic diagram of determining a first input parameter according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an operator processing model according to an embodiment of the present application;
FIG. 7 is a schematic diagram of determining a second input parameter according to an embodiment of the present application;
FIG. 8 is a schematic diagram of determining a third input parameter according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an operator processing model according to an embodiment of the present application;
FIG. 10 is a schematic diagram of determining a fourth input parameter according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a training advertisement recommendation model according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a training advertisement recommendation model according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a training device for an operator processing model according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a server structure according to an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "includes" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
In the related art, an operator's hardware performance under several sets of input parameters is determined by running a specific test case, and one set of input parameters is then selected from those sets based on the measured hardware performance to train the business model. Because the number of times the operator can be tested with specific test cases is limited, the test results obtained may not cover all the conditions required by the subsequent business model. The operator input parameters determined from such a small number of test results may therefore not be the optimal input parameters for training the subsequent business model, so the accuracy of the input parameters determined in this way for training the business model is low, and the effect of shortening the training time of the business model is limited.
On this basis, the embodiment of the application provides a training method and a related device for an operator processing model. By training an initial operator processing model, the real results obtained by testing operators through the target accelerator card based on different sample input parameters are continuously learned, so that the predicted theoretical results become closer and closer to the real results. The operator processing model obtained by training the initial operator processing model thus has a strong learning capacity and can predict theoretical results of testing operators through the target accelerator card based on different input parameters. That is, performing prediction with the operator processing model increases the number of available test results, which improves the accuracy of determining an operator's input parameters, so that the business model is trained based on the operator and its input parameters and the training time is shortened.
The training method of the operator processing model provided by the application can be applied to computer devices with the capability of training the operator processing model, such as terminal devices and servers. The terminal device may be a desktop computer, a notebook computer, a mobile phone, a tablet computer, an Internet of Things device or a portable wearable device; the Internet of Things device may be a smart speaker, a smart television, a smart air conditioner, a smart vehicle-mounted device, and so on; the smart vehicle-mounted device may be a vehicle-mounted navigation terminal, a vehicle-mounted computer, and so on; and the portable wearable device may be a smart watch, a smart bracelet, a head-mounted device, and so on, but is not limited thereto. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server or server cluster providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (CDN), big data and artificial intelligence platforms. The terminal device and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the present application.
To facilitate understanding of the training method of the operator processing model provided by the embodiment of the present application, an application scenario of the method is described below, taking a server as the execution body as an example.
Referring to FIG. 1, which is a schematic diagram of an application scenario of the training method of the operator processing model according to an embodiment of the present application. As shown in FIG. 1, the application scenario includes a server 100, where the server 100 is configured to execute the training method of the operator processing model provided by the embodiment of the present application.
The server 100 obtains the operator, the sample input parameters and the first test result. As shown in FIG. 1, taking a matrix multiplication operator as an example, the sample input parameters may be different combinations (M, N, K), where M is the number of rows of the first matrix, N is the number of columns of the first matrix and the number of rows of the second matrix, and K is the number of columns of the second matrix; that is, the first matrix has M rows and N columns, and the second matrix has N rows and K columns. For example, (M, N, K) may be (32, 32, 32), (32, 64, 32), (65536, 65536, 65536), and so on. The first test result is a real result obtained by testing the operator through the target accelerator card based on the sample input parameters. Taking execution time as the first test result, the real execution time required to test the matrix multiplication operator through the target accelerator card based on (32, 32, 32) is, for example, 5 seconds, based on (32, 64, 32) it is 4 seconds, and based on (65536, 65536, 65536) it is 10 seconds.
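The samples in this scenario can be pictured concretely. The following is a minimal Python sketch of how such (operator, sample input parameter, first test result) triples might be represented; the class and field names are illustrative assumptions, not part of the patent, and the timing values are simply the ones quoted above.

```python
# A hedged sketch of the training samples described above; names and structure
# are illustrative assumptions.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class OperatorSample:
    operator: str                 # operator identifier, e.g. "matmul"
    shape: Tuple[int, int, int]   # sample input parameter (M, N, K)
    measured_time_s: float        # first test result: real execution time on the target accelerator card

samples = [
    OperatorSample("matmul", (32, 32, 32), 5.0),
    OperatorSample("matmul", (32, 64, 32), 4.0),
    OperatorSample("matmul", (65536, 65536, 65536), 10.0),
]
```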
The server 100 performs prediction through an initial operator processing model according to the operator and the sample input parameters to obtain a first prediction result, that is, a theoretical result of testing the operator through the target accelerator card based on the sample input parameters; for example, the predicted theoretical execution time of the matrix multiplication operator on the target accelerator card based on (32, 32, 32) is 6 seconds. The parameters of the initial operator processing model are then adjusted based on a training direction that minimizes the target difference, that is, minimizes the difference between the first prediction result and the first test result, so that the theoretical execution time predicted for (32, 32, 32) gradually approaches, or even equals, 5 seconds. After training is completed, an operator processing model is obtained that can accurately predict theoretical results of testing the operator through the target accelerator card based on different input parameters.
Therefore, by training the initial operator processing model, the real results obtained by testing the operator through the target accelerator card based on different sample input parameters are continuously learned, so that the predicted theoretical results become closer and closer to the real results. The operator processing model obtained by training the initial operator processing model thus has a strong learning capacity and can predict theoretical results of testing the operator through the target accelerator card based on different input parameters. That is, performing prediction with the operator processing model increases the number of available test results, which improves the accuracy of determining the operator's input parameters, so that the business model is trained based on the operator and its input parameters and the training time is shortened.
The training method of the operator processing model provided by the embodiment of the application can be executed by a server. However, in other embodiments of the present application, the terminal device may also have a similar function to the server, so as to perform the training method of the operator processing model provided in the embodiment of the present application, or the terminal device and the server jointly perform the training method of the operator processing model provided in the embodiment of the present application, which is not limited in this embodiment.
The following describes a training method of an operator processing model provided by the application in detail through a method embodiment.
Referring to FIG. 2, which is a flow chart of the training method of the operator processing model according to an embodiment of the present application. For convenience of description, the following embodiments take a server as the execution body of the training method. As shown in FIG. 2, the training method of the operator processing model includes S201-S203.
S201: and acquiring an operator, a sample input parameter and a first test result.
A business model, such as a deep learning model, generally consists of a plurality of computing units, i.e., operators (OPs), which correspond to the computational logic of the layers included in the business model. For example, a convolution layer is an operator, and the weighted summation process in a fully connected layer (FC layer) is an operator.
The sample input parameters are used to describe the data dimensions of the operator's input data; the input data of the operator is the data fed into the operator, and a data dimension refers to an attribute or feature count that characterizes the data. It will be appreciated that the input data required by different operators may differ.
Taking the matrix multiplication operator as an example, the required input data are matrices, and the sample input parameters for the matrix multiplication (matmul) operator describe the shapes of the matrices and can be represented as (M, N, K). Here M is the number of rows of the first matrix, N is the number of columns of the first matrix and the number of rows of the second matrix, and K is the number of columns of the second matrix. Different (M, N, K) combinations correspond to different sample input parameters; for example, (M, N, K) may be any of (32, 32, 32), (32, 64, 32), (65536, 65536, 65536), and so on.
As a possible implementation, hardware configuration data of the target accelerator card may be acquired, and the sample input parameters for the operator determined according to the hardware configuration data. The target accelerator card is the accelerator card used to test the operator; an accelerator card is a hardware device that accelerates specific tasks in a computer system and improves the computer's performance on particular types of workloads, such as a dedicated chip, a graphics processing unit (GPU), a tensor processing unit (TPU) or a neural network processing unit (NPU). The hardware configuration data describes the hardware capability of the target accelerator card, for example parameters such as its memory capacity. Determining the operator's sample input parameters according to the hardware configuration data of the target accelerator card makes it possible to generate as many sample input parameters as possible within the limited capacity of the target accelerator card, while avoiding sample input parameters that exceed the card's capacity and waste training time. For example, the sample input parameters may be determined starting from the smallest compute unit, 32, up to the largest dimension that the memory of the target accelerator card can load. Taking the matrix multiplication operator as an example, if the largest dimension that the memory of the target accelerator card can load is 65536, (M, N, K) may be any combination between (32, 32, 32) and (65536, 65536, 65536).
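As an illustration of the idea above, the following sketch enumerates candidate (M, N, K) sample input parameters under the assumption that dimensions grow in powers of two from the smallest compute unit (32) up to the largest dimension the target accelerator card's memory can load; the function name and the power-of-two stepping are assumptions made for illustration, not requirements of the method.

```python
# Hedged sketch: enumerate (M, N, K) combinations within the card's capacity.
from itertools import product

def generate_sample_inputs(min_dim: int = 32, max_dim: int = 65536):
    dims = []
    d = min_dim
    while d <= max_dim:
        dims.append(d)   # 32, 64, 128, ..., 65536
        d *= 2
    return list(product(dims, repeat=3))   # every (M, N, K) combination

sample_inputs = generate_sample_inputs()
print(len(sample_inputs))   # 12 dimension values -> 12**3 = 1728 combinations
```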
The first test result is a real result obtained by testing the operator through the target accelerator card based on the sample input parameters. That is, the first test result is obtained by testing the operator through the target accelerator card, where the test is performed based on the sample input parameters. The first test result describes the hardware performance of the operator on the target accelerator card, for example execution time, bandwidth, latency or memory transfer time, which is not specifically limited in the present application. For example, the latency of the operator under the sample input parameters may be measured by the chip.
As a possible implementation, the business model to which the operator belongs may be determined, and a test case applicable to both the business model and the operator determined according to the business model. The test case is then run through the target accelerator card according to the sample input parameters to obtain the first test result.
The business model is a model built from operators, such as an advertisement recommendation model. Different business models have different functions, and the input data of the same operator may differ between business models, so the operator's input parameters differ and the test results differ accordingly. Therefore, a test case specific to the operator and suitable for the business model in which the operator is located is determined based on that business model, and the test case is run through the target accelerator card according to the sample input parameters to obtain the first test result. The first test result is thus suited to both the operator and the business model in which it is located, and the operator processing model subsequently trained on this first test result achieves higher prediction accuracy.
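A minimal sketch of collecting one first test result is given below, assuming execution time as the measured hardware performance. Here run_test_case stands for whatever test case the business model defines for the operator; it is a hypothetical hook for illustration, not an API from the patent or from any particular library.

```python
# Hedged sketch: time a single run of the operator's test case on the target accelerator card.
import time
from typing import Callable, Tuple

def measure_execution_time(run_test_case: Callable[[str, Tuple[int, ...]], None],
                           operator: str,
                           sample_input: Tuple[int, ...]) -> float:
    start = time.perf_counter()
    run_test_case(operator, sample_input)   # executes the test case on the target accelerator card
    return time.perf_counter() - start      # real result, e.g. execution time in seconds
```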
S202: and according to the operator and the sample input parameters, predicting through an initial operator processing model to obtain a first prediction result.
The initial operator processing model is used to predict theoretical results of testing the operator through the target accelerator card based on different sample input parameters. Compared with the trained operator processing model, the prediction accuracy of the initial operator processing model is lower, because its parameters have not yet been adjusted.
The first prediction result is a theoretical result of testing the operator through the target accelerator card based on the sample input parameters; being obtained by model prediction, this theoretical result may differ from the real result obtained by actual testing.
As a possible implementation, an operator identifier identifying the operator and the sample input parameters may be input into the initial operator processing model, and prediction performed through the initial operator processing model to obtain the first prediction result.
S203: and adjusting parameters of the initial operator processing model based on the training direction of the minimized target difference to obtain an operator processing model.
The target difference is the difference between the first prediction result and the first test result. That is, based on the training direction that minimizes the difference between the first prediction result and the first test result, the parameters of the initial operator processing model are continuously adjusted, so that the initial operator processing model learns how to predict, from the operator and the sample input parameters, the theoretical result of testing the operator through the target accelerator card based on those sample input parameters, with the theoretical result gradually approaching, or even equalling, the real result.
Referring to FIG. 3, which is a training schematic diagram of an operator processing model according to an embodiment of the present application: in FIG. 3, prediction is performed through the initial operator processing model according to the operator and the sample input parameters to obtain the first prediction result, and the parameters of the initial operator processing model are then adjusted based on the training direction of minimizing the target difference to obtain the operator processing model.
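A minimal training-step sketch for S202-S203 follows, assuming the initial operator processing model is a differentiable regressor (such as the MLP sketched further on), that the target difference is measured with a mean-squared-error loss, and that the operator identifier and (M, N, K) are simply concatenated into a feature vector; all of these choices are illustrative assumptions, not requirements stated in the patent.

```python
# Hedged sketch of one parameter-adjustment step in the training direction that
# minimizes the target difference (here: MSE between prediction and measurement).
import torch
import torch.nn.functional as F

def train_step(model, optimizer, operator_id: int, mnk, measured_time: float) -> float:
    features = torch.tensor([float(operator_id), *mnk], dtype=torch.float32)
    prediction = model(features)                                  # first prediction result (theoretical)
    target = torch.tensor([measured_time], dtype=torch.float32)   # first test result (real)
    loss = F.mse_loss(prediction, target)                         # target difference
    optimizer.zero_grad()
    loss.backward()    # gradients point in the direction that reduces the difference
    optimizer.step()   # adjust the parameters of the initial operator processing model
    return loss.item()
```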
After training is completed, the operator processing model is obtained from the initial operator processing model and is used to predict theoretical results of testing the operator through the target accelerator card based on different input parameters. The prediction capability of the operator processing model can be understood as follows: during training, a hardware design space is established from the limited sample input parameters; within this hardware design space there are many more test results, corresponding to the operator being tested through the target accelerator card with different input parameters, so the operator processing model can predict the theoretical result of testing the operator through the target accelerator card under any of these input parameters.
Referring to FIG. 4, which is a schematic diagram of an operator processing model according to an embodiment of the present application: a first test result is obtained by running the test case corresponding to an operator through the target accelerator card, and, by analogy, different operators are tested through multiple test cases to obtain multiple first test results, so that a hardware design space is constructed from the different first test results corresponding to different sample input parameters. The operator processing model is then trained on the hardware design space so that it has a prediction capability, namely predicting the theoretical result of testing an operator through the target accelerator card based on different input parameters, and thereby determining, for example, a first input parameter.
Therefore, the initial operator processing model is trained with multiple operators, multiple sample input parameters for each operator, and the first test result corresponding to each sample input parameter, so that the resulting operator processing model can predict the theoretical results of testing different operators through the target accelerator card under different input parameters, and can further determine, for each operator, which input parameters give the best result when tested through the target accelerator card.
The training method of the operator processing model provided by the embodiment of the application may involve artificial intelligence technology. Artificial intelligence (AI) is a theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving and intelligent transportation.
In the embodiment of the application, the artificial intelligence technology mainly involves directions such as machine learning and deep learning. Machine learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and how it can reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning and inductive learning. The pre-training model is the latest development of deep learning and integrates the above techniques.
A pre-training model (PTM), also called a foundation model or a large model, is a deep neural network (DNN) with a large number of parameters trained on massive unlabeled data. The PTM extracts common features from the data by exploiting the function-approximation capability of the large-parameter DNN and is adapted to downstream tasks through techniques such as fine-tuning, parameter-efficient fine-tuning (PEFT) and prompt-tuning. A pre-training model can therefore achieve good results in few-shot or zero-shot scenarios. PTMs can be classified according to the data modality they process into language models (e.g., ELMo, BERT, GPT), visual models (e.g., Swin Transformer, ViT, V-MoE), speech models (e.g., VALL-E), multi-modal models (e.g., ViLBERT, CLIP, Flamingo, Gato) and so on, where a multi-modal model builds representations of two or more data modalities. The pre-training model is an important tool for producing AI-generated content (AIGC) and can also serve as a general interface connecting multiple specific task models.
In the training method of the operator processing model provided by the embodiment of the application, the operator processing model may be a model obtained from a pre-training model, or a multi-layer perceptron, among others; it is not specifically limited here and can be chosen by a person skilled in the art according to actual needs.
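Since a multi-layer perceptron is one of the options mentioned, the following is a minimal sketch of such a regressor; the input encoding (operator identifier plus three shape dimensions) and the layer sizes are arbitrary assumptions made for illustration.

```python
# Hedged sketch of an MLP-based operator processing model.
import torch.nn as nn

class OperatorProcessingModel(nn.Module):
    def __init__(self, in_features: int = 4, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),   # predicted theoretical result, e.g. execution time
        )

    def forward(self, x):
        return self.net(x)
```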
According to the technical scheme, an operator, sample input parameters and a first test result are obtained, wherein the sample input parameters describe the data dimensions of the operator's input data, and the first test result is a real result obtained by testing the operator through the target accelerator card based on the sample input parameters. According to the operator and the sample input parameters, a first prediction result, namely a theoretical result of testing the operator through the target accelerator card based on the sample input parameters, is obtained through prediction by an initial operator processing model. Based on a training direction of minimizing a target difference, that is, minimizing the difference between the first prediction result and the first test result, the parameters of the initial operator processing model are adjusted to obtain an operator processing model, so that the operator processing model can accurately predict theoretical results of testing the operator through the target accelerator card based on different input parameters.
Therefore, by training the initial operator processing model, the real results obtained by testing the operator through the target accelerator card based on different sample input parameters are continuously learned, so that the predicted theoretical results become closer and closer to the real results. The operator processing model obtained by training the initial operator processing model thus has a strong learning capacity and can predict theoretical results of testing the operator through the target accelerator card based on different input parameters. That is, performing prediction with the operator processing model increases the number of available test results, which improves the accuracy of determining the operator's input parameters, so that the business model is trained based on the operator and its input parameters and the training time is shortened.
As a possible implementation, after the operator processing model is obtained by training, applicable input parameters can be determined for different operators based on the operator processing model, as described below through S204 to S208 (S204 to S208 are not shown in FIG. 2).
S204: and obtaining a first operator to be tested.
The first operator to be tested is the operator awaiting prediction through the operator processing model.
S205: and determining a plurality of first input parameters to be tested according to the first operator to be tested.
The first to-be-determined input parameters describe the data dimensions of the input data of the first operator to be tested, and each first to-be-determined input parameter corresponds to at least one data dimension of the input data; that is, different first to-be-determined input parameters describe the input data of the first operator to be tested from different data dimensions. For example, taking the matrix multiplication operator as the first operator to be tested, the first to-be-determined input parameter (M, N, K) may be any combination between (32, 32, 32) and (65536, 65536, 65536).
S206: and according to the first operator to be tested and the first input parameter to be determined of the target, predicting through the operator processing model to obtain a second prediction result.
One of the plurality of first to-be-determined input parameters is taken as the target first to-be-determined input parameter, and prediction is performed through the operator processing model according to the first operator to be tested and the target first to-be-determined input parameter to obtain a second prediction result. Specifically, the second prediction result is a theoretical result of testing the first operator to be tested through the target accelerator card based on the target first to-be-determined input parameter.
S207: and taking the plurality of first input parameters to be determined as target first input parameters respectively, and obtaining second prediction results corresponding to the first input parameters respectively.
That is, S206 is repeated with each of the plurality of first to-be-determined input parameters as the target first to-be-determined input parameter, obtaining the second prediction result corresponding to each first to-be-determined input parameter.
S208: and determining the first input parameters corresponding to the first to-be-tested operator from the plurality of first to-be-tested input parameters according to the second prediction results respectively corresponding to the first to-be-tested input parameters.
The second prediction result corresponding to each first to-be-determined input parameter describes the hardware performance of the first operator to be tested on the target accelerator card under that input parameter, so the first input parameter required by the first operator to be tested when subsequently training the business model can be determined based on the plurality of second prediction results.
As a possible implementation, the plurality of second prediction results may be ranked, and the first to-be-determined input parameter corresponding to the top-ranked second prediction result taken as the first input parameter; the theoretical result of testing the first operator to be tested through the target accelerator card based on this first input parameter is then the best. For example, if the first test result is an execution time, the second prediction result is also an execution time, and the top-ranked second prediction result is the one with the shortest execution time. Accordingly, the real result of testing the first operator to be tested through the target accelerator card with this first input parameter is also the best, for example the real execution time is also the shortest, which improves the training efficiency of the business model in which the first operator to be tested is subsequently used and shortens the training time.
Referring to FIG. 5, which is a schematic diagram of determining a first input parameter according to an embodiment of the present application: a plurality of first to-be-determined input parameters are determined according to the first operator to be tested; each first to-be-determined input parameter is input into the operator processing model together with the first operator to be tested to obtain the second prediction result corresponding to each first to-be-determined input parameter; and the first input parameter corresponding to the first operator to be tested is determined based on the plurality of second prediction results.
Therefore, after the operator processing model is obtained through training, it is used to predict theoretical results of testing operators through the target accelerator card based on different input parameters. To make a prediction for a first operator to be tested, a plurality of first to-be-determined input parameters can be generated based on that operator, and the operator processing model predicts, for each first to-be-determined input parameter, the theoretical result of testing the first operator to be tested through the target accelerator card, that is, a plurality of second prediction results. Based on the plurality of second prediction results, the first input parameter corresponding to the first operator to be tested is determined from the plurality of first to-be-determined input parameters, ensuring that the first operator to be tested achieves the best effect with this first input parameter when tested through the target accelerator card, which improves the training efficiency of the business model in which the first operator to be tested is located and shortens the training time.
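A minimal sketch of S204-S208 follows: enumerate the first to-be-determined input parameters, predict a second prediction result for each with the trained operator processing model, and keep the best one. It assumes the predicted result is an execution time (so smaller is better) and reuses the feature encoding from the earlier training-step sketch; both are assumptions for illustration.

```python
# Hedged sketch of selecting the first input parameter for a first operator to be tested.
import torch

def select_first_input_parameter(model, operator_id: int, candidates):
    best_params, best_time = None, float("inf")
    with torch.no_grad():
        for mnk in candidates:   # each first to-be-determined input parameter
            features = torch.tensor([float(operator_id), *mnk], dtype=torch.float32)
            predicted_time = model(features).item()   # second prediction result
            if predicted_time < best_time:
                best_params, best_time = mnk, predicted_time
    return best_params, best_time   # chosen first input parameter and its predicted theoretical result
```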
As a possible implementation, the training process of the business model to be trained in which the first operator to be tested is located is described below.
A business model to be trained and the computation graph corresponding to it are obtained, where the computation graph describes the plurality of operators included in the business model to be trained and the computation order among them. Each of these operators is taken as a first operator to be tested, and S204-S208 are executed to obtain the first input parameter corresponding to each operator. The operators are then tested through the target accelerator card based on their respective first input parameters and the computation order, so as to obtain the trained business model corresponding to the business model to be trained.
The business model to be trained is a business model that has not yet been trained. For example, a business model to be trained can be built with various open-source deep learning frameworks (such as TensorFlow or Keras) and then compiled, for example through a domain-specific language (DSL), to obtain the computation graph corresponding to the business model to be trained.
A computation graph (Computational Graph) is a graphical representation describing the operators included in the business model to be trained and the computation order among them. As a possible implementation, the computation graph includes nodes and edges connecting the nodes, where the nodes are operators and the edges are data-flow directions between operators, representing the computation order among them.
Each operator is taken as a first operator to be tested, and S204-S208 are executed to obtain the first input parameter corresponding to each operator; the target accelerator card then tests each operator with its adapted first input parameter, following the computation order indicated by the computation graph. Since the operators together form the business model to be trained, once the testing of all operators is completed the testing of the business model to be trained is completed, and continued testing in this way yields the trained business model corresponding to the business model to be trained. The trained business model is the business model obtained from the business model to be trained after training is completed.
In practical applications, even the same business model may have different input parameters. Taking the advertisement recommendation model as an example, to recommend advertisements that a user may be interested in, the user's interests must be input, but different users have different interests, so the input parameters of the advertisement recommendation model may differ. Since the business model includes a plurality of operators, the input parameters of those operators also need to change flexibly as the input parameters of the business model change, in order to achieve a better effect when training the business model. On this basis, the first input parameter corresponding to each operator in the computation graph is determined based on the computation graph corresponding to the business model to be trained, and the operators included in the business model to be trained are tested in the computation order indicated by the computation graph to obtain the trained business model. In this way the first input parameters of the operators included in the business model are flexibly adjusted as the business model's input parameters change, which improves the training efficiency of the business model and further shortens the training time.
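As a rough illustration of the whole-model tuning just described, the sketch below represents the computation graph simply as a topologically ordered list of operators with their candidate input parameters and reuses the select_first_input_parameter sketch above; this flattening of the nodes-and-edges graph is an assumption made only for brevity.

```python
# Hedged sketch: determine a first input parameter for every operator in the
# computation graph, following the computation order.
def tune_business_model(model, computation_graph):
    # computation_graph: iterable of (operator_id, candidate_input_parameters), topologically ordered
    tuned = {}
    for operator_id, candidates in computation_graph:
        params, predicted_time = select_first_input_parameter(model, operator_id, candidates)
        tuned[operator_id] = params   # first input parameter chosen for this operator
    return tuned
```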
As a possible implementation, an operator processing model applicable to each accelerator card can be trained for different accelerator cards, making the operator processing models more targeted. The following description takes the case where the target accelerator card includes a first accelerator card and a second accelerator card as an example.
The first accelerator card and the second accelerator card are different accelerator cards. The first test result comprises a first test sub-result and a second test sub-result, the first test sub-result is a real result obtained by testing the operator through the first accelerator card based on the sample input parameters, and the second test sub-result is a real result obtained by testing the operator through the second accelerator card based on the sample input parameters.
Based on this, a first operator processing model applicable to the first accelerator card and a second operator processing model applicable to the second accelerator card can be obtained through training according to the operator, the sample input parameters, the first test sub-result and the second test sub-result, which are described below.
(1) According to the operator and the sample input parameters, prediction is performed through a first initial operator processing model to obtain a first prediction sub-result, that is, a theoretical result of testing the operator through the first accelerator card based on the sample input parameters. Then, based on a training direction of minimizing the first difference, that is, minimizing the difference between the first prediction sub-result and the first test sub-result, the parameters of the first initial operator processing model are adjusted to obtain a first operator processing model, so that the first operator processing model can be used to predict theoretical results of testing the operator through the first accelerator card based on different input parameters.
(2) According to the operator and the sample input parameters, prediction is performed through a second initial operator processing model to obtain a second prediction sub-result, that is, a theoretical result of testing the operator through the second accelerator card based on the sample input parameters. Then, based on a training direction of minimizing the second difference, that is, minimizing the difference between the second prediction sub-result and the second test sub-result, the parameters of the second initial operator processing model are adjusted to obtain a second operator processing model, so that the second operator processing model can be used to predict theoretical results of testing the operator through the second accelerator card based on different input parameters.
It can be understood that the first prediction result includes the first prediction sub-result and the second prediction sub-result, the target difference includes the first difference and the second difference, the operator processing model includes the first operator processing model and the second operator processing model, and the initial operator processing model includes the first initial operator processing model and the second initial operator processing model.
The first initial operator processing model is used to predict theoretical results of testing the operator through the first accelerator card based on different sample input parameters; compared with the trained first operator processing model, its prediction accuracy is lower because its parameters have not yet been adjusted. Likewise, the second initial operator processing model is used to predict theoretical results of testing the operator through the second accelerator card based on different sample input parameters; compared with the trained second operator processing model, its prediction accuracy is lower because its parameters have not yet been adjusted.
Referring to FIG. 6, which is a schematic diagram of an operator processing model according to an embodiment of the present application: in FIG. 6, the operator processing model includes the first operator processing model and the second operator processing model, the initial operator processing model includes the first initial operator processing model and the second initial operator processing model, the first prediction result includes the first prediction sub-result and the second prediction sub-result, and the first test result includes the first test sub-result and the second test sub-result.
In FIG. 6, the operator and the sample input parameters are input into the first initial operator processing model and the second initial operator processing model. The first prediction sub-result is obtained through prediction by the first initial operator processing model, and the parameters of the first initial operator processing model are adjusted based on minimizing the difference between the first prediction sub-result and the first test sub-result, that is, minimizing the first difference, to obtain the first operator processing model. The second prediction sub-result is obtained through prediction by the second initial operator processing model, and the parameters of the second initial operator processing model are adjusted based on minimizing the difference between the second prediction sub-result and the second test sub-result, that is, minimizing the second difference, to obtain the second operator processing model.
Therefore, because the hardware configuration data of different accelerator cards differ, the test results obtained for the same operator also differ. Training a different operator processing model for each accelerator card, for example a first operator processing model for the first accelerator card and a second operator processing model for the second accelerator card, makes each operator processing model more targeted and improves prediction accuracy.
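The per-card training procedure described above can be illustrated by the following minimal sketch. It assumes PyTorch, assumes that each training sample has already been encoded as a fixed-length feature vector combining the operator and the sample input parameters, and uses illustrative names; it is not the implementation of the embodiment itself.

```python
import torch
import torch.nn as nn

class OperatorProcessingModel(nn.Module):
    """Regresses a theoretical test result (e.g. execution time) from an
    encoded (operator, input-parameter) feature vector."""
    def __init__(self, feature_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)

def train_per_card_model(features: torch.Tensor, test_results: torch.Tensor,
                         epochs: int = 100, lr: float = 1e-3) -> OperatorProcessingModel:
    """features: (N, D) encoded operator + sample input parameters.
    test_results: (N, 1) real results measured on one accelerator card."""
    model = OperatorProcessingModel(features.shape[1])
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        predicted = model(features)               # prediction sub-result
        loss = loss_fn(predicted, test_results)   # the difference to be minimized
        loss.backward()
        optimizer.step()
    return model

# One model per accelerator card:
# first_model  = train_per_card_model(features, results_on_first_card)
# second_model = train_per_card_model(features, results_on_second_card)
```

Mean squared error is used here only as one concrete way of expressing "minimizing the difference"; any regression loss fits the training direction described above.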
As a possible implementation manner, after different operator processing models are obtained by training for different accelerator cards, they can be used not only to determine input parameters that make subsequent training of the service model effective for an operator, but also to determine the prediction results of an operator on different accelerator cards. The two modes are described separately below.
Mode one: an input parameter is determined based on the operator.
Firstly, the second operator to be tested and the first accelerator card to be applied are obtained. The second operator to be tested is an operator waiting for prediction through the operator processing model, and the second operator to be tested and the first operator to be tested may be the same operator or different operators. The first accelerator card to be applied is an accelerator card corresponding to a trained operator processing model, such as the first accelerator card or the second accelerator card.
Secondly, the operator processing model to be applied, which is applicable to the first accelerator card to be applied, is determined according to the first accelerator card to be applied. For example, if the first accelerator card to be applied is the first accelerator card, the operator processing model to be applied is the first operator processing model; if the first accelerator card to be applied is the second accelerator card, the operator processing model to be applied is the second operator processing model.
Then, a plurality of second undetermined input parameters are determined according to the second operator to be tested, wherein the second undetermined input parameters are used for describing the data dimension of the input data of the second operator to be tested, and different second undetermined input parameters describe different data dimensions of that input data. One of the plurality of second undetermined input parameters is taken as a target second undetermined input parameter, and a third prediction result is obtained by predicting through the operator processing model to be applied according to the second operator to be tested and the target second undetermined input parameter. The third prediction result is a theoretical result obtained by testing the second operator to be tested through the first accelerator card to be applied based on the target second undetermined input parameter.
Finally, each of the plurality of second undetermined input parameters is taken in turn as the target second undetermined input parameter, so that a third prediction result corresponding to each second undetermined input parameter is obtained. The second input parameters corresponding to the second operator to be tested are then determined from the plurality of second undetermined input parameters according to the third prediction results respectively corresponding to the second undetermined input parameters.
The third prediction result corresponding to each second undetermined input parameter describes the hardware performance achieved when the second operator to be tested is tested through the first accelerator card to be applied based on that second undetermined input parameter, so that the second input parameters required by the second operator to be tested when the service model is subsequently trained can be determined based on the plurality of third prediction results.
As a possible implementation manner, the plurality of third prediction results may be ranked, and the second undetermined input parameter corresponding to the first-ranked third prediction result is taken as the second input parameter. The theoretical result obtained by testing the second operator to be tested through the first accelerator card to be applied based on this second input parameter is the best, so the real result obtained in the actual test is expected to be the best as well, which improves the training efficiency of the service model to which the second operator to be tested belongs and shortens the training time.
Referring to fig. 7, a schematic diagram of determining the second input parameters according to an embodiment of the present application is shown. In fig. 7, the operator processing model to be applied, which is applicable to the first accelerator card to be applied, is determined according to the first accelerator card to be applied, and a plurality of second undetermined input parameters are determined according to the second operator to be tested. Each second undetermined input parameter and the second operator to be tested are input into the operator processing model to be applied to obtain the third prediction result corresponding to each second undetermined input parameter, and the second input parameters corresponding to the second operator to be tested are determined based on the plurality of third prediction results.
Therefore, after the corresponding operator processing models are obtained by training for different accelerator cards, each operator processing model is used to predict the theoretical results obtained by testing operators through its corresponding accelerator card based on different input parameters. To make a prediction for the second operator to be tested, a plurality of second undetermined input parameters can be generated based on that operator, the corresponding operator processing model is determined based on the accelerator card to be used, and a plurality of theoretical results, namely a plurality of third prediction results, obtained by testing the second operator to be tested through that accelerator card based on each second undetermined input parameter are predicted. The second input parameters corresponding to the second operator to be tested are then determined from the plurality of second undetermined input parameters based on the plurality of third prediction results, so that when the second operator to be tested is tested through the corresponding accelerator card, using these second input parameters gives the best effect, which improves the training efficiency of the service model to which the second operator to be tested belongs and shortens the training time.
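Mode one can be summarised by the following sketch. The helper `encode`, the model interface and the assumption that a smaller predicted value (for example execution time) is better are illustrative assumptions, not part of the embodiment.

```python
def select_input_parameters(operator, candidate_params, model, encode, smaller_is_better=True):
    """Pick the best input parameters for one operator on one accelerator card.

    candidate_params: the plurality of undetermined input parameters.
    model: the operator processing model for the accelerator card to be applied,
           assumed to be a callable returning a predicted scalar result.
    encode: assumed helper turning (operator, params) into the model's input."""
    predictions = []
    for params in candidate_params:
        theoretical = model(encode(operator, params))   # one third prediction result
        predictions.append((theoretical, params))
    # Rank the prediction results and keep the first-ranked candidate.
    predictions.sort(key=lambda item: item[0], reverse=not smaller_is_better)
    best_theoretical, best_params = predictions[0]
    return best_params, best_theoretical
```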
Mode two: an acceleration card and corresponding input parameters are determined based on the operator.
The following description will take different operator processing models as a first operator processing model and a second operator processing model as examples.
Obtaining a third operator to be tested, wherein the third operator to be tested is an operator waiting for prediction through an operator processing model, and the third operator to be tested and the first operator to be tested can be the same operator or different operators. And determining a plurality of third undetermined input parameters according to the third to-be-tested operator, wherein the third undetermined input parameters are used for describing the data dimension of the input data of the third to-be-tested operator, and different third undetermined input parameters are used for describing different data dimensions of the input data of the third to-be-tested operator.
One of the plurality of third undetermined input parameters is taken as a target third undetermined input parameter, and a fourth prediction result is obtained by predicting through the first operator processing model according to the third operator to be tested and the target third undetermined input parameter. A fifth prediction result is obtained by predicting through the second operator processing model according to the third operator to be tested and the target third undetermined input parameter. The fourth prediction result is a theoretical result obtained by testing the third operator to be tested through the first accelerator card based on the target third undetermined input parameter, and the fifth prediction result is a theoretical result obtained by testing the third operator to be tested through the second accelerator card based on the target third undetermined input parameter.
Each of the plurality of third undetermined input parameters is taken in turn as the target third undetermined input parameter, so that a fourth prediction result and a fifth prediction result corresponding to each third undetermined input parameter are obtained. The third input parameters corresponding to the third operator to be tested, and the accelerator card applicable to those third input parameters, are then determined from the plurality of third undetermined input parameters according to the fourth prediction results and the fifth prediction results respectively corresponding to the third undetermined input parameters.
The fourth prediction result corresponding to each third undetermined input parameter describes the hardware performance achieved when the third operator to be tested is tested through the first accelerator card based on that third undetermined input parameter, so that the third input parameters required for testing the third operator to be tested through the first accelerator card can be determined based on the plurality of fourth prediction results. Similarly, the fifth prediction result corresponding to each third undetermined input parameter describes the hardware performance achieved when the third operator to be tested is tested through the second accelerator card based on that third undetermined input parameter, so that the third input parameters required for testing the third operator to be tested through the second accelerator card can be determined based on the plurality of fifth prediction results.
As one possible implementation, the plurality of fourth prediction results may be ranked based on the type of prediction result; for example, if the fourth prediction result is execution time, the fourth prediction results may be sorted from small to large, and if the fourth prediction result is memory transfer time, they may likewise be sorted from small to large, and so on. The third undetermined input parameter corresponding to the first-ranked fourth prediction result is taken as the third input parameter required for testing through the first accelerator card; the theoretical result obtained by testing the third operator to be tested through the first accelerator card based on this third input parameter is the best, so the real result obtained in the actual test is also expected to be the best. Similarly, the plurality of fifth prediction results can be ranked, and the third undetermined input parameter corresponding to the first-ranked fifth prediction result is taken as the third input parameter required for testing through the second accelerator card.
Referring to fig. 8, a schematic diagram of determining the third input parameters according to an embodiment of the present application is shown. In fig. 8, a plurality of third undetermined input parameters are determined according to the third operator to be tested. Each third undetermined input parameter and the third operator to be tested are input into the first operator processing model and the second operator processing model respectively, a fourth prediction result is obtained through the first operator processing model and a fifth prediction result is obtained through the second operator processing model, so that the fourth and fifth prediction results corresponding to each third undetermined input parameter are obtained. The third input parameters corresponding to the third operator to be tested, and the accelerator card applicable to those third input parameters, are determined based on the fourth prediction results and the fifth prediction results.
Therefore, for the third operator to be tested, the user can be provided with the third input parameters with the best test effect on the first accelerator card as determined by the first operator processing model, and the third input parameters with the best test effect on the second accelerator card as determined by the second operator processing model, and can even be provided with one or more of whether the first accelerator card and the second accelerator card are currently available, the operation cost, the expected waiting time and the like. The user can then choose whether to use the first accelerator card or the second accelerator card to train the service model to which the third operator to be tested belongs, which improves the user experience. It may even help the user determine which input parameters are better for different operators on which accelerator card.
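Mode two differs from mode one only in that both per-card models are queried and the accelerator card is selected together with the input parameters. A minimal sketch, again with illustrative helper names and assuming that a smaller predicted result is better:

```python
def select_card_and_parameters(operator, candidate_params, first_model, second_model, encode):
    """Return (accelerator card, input parameters, predicted result) with the best
    predicted test result for the operator to be tested."""
    best = None  # (predicted_result, card_name, params)
    for params in candidate_params:
        fourth = first_model(encode(operator, params))    # prediction for the first card
        fifth = second_model(encode(operator, params))    # prediction for the second card
        for card, predicted in (("first_card", fourth), ("second_card", fifth)):
            if best is None or predicted < best[0]:
                best = (predicted, card, params)
    predicted_result, card, params = best
    return card, params, predicted_result
```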
As a possible implementation manner, the operator processing model may be trained not only to predict the theoretical result obtained by testing an operator through one target accelerator card based on sample input parameters, but also to predict the theoretical results obtained by testing an operator through different accelerator cards based on sample input parameters. The following description takes the case where the target accelerator card includes a first accelerator card and a second accelerator card as a specific example.
Specifically, a third prediction sub-result is obtained by predicting through the initial operator processing model according to the operator, the sample input parameters and the first accelerator card, and a fourth prediction sub-result is obtained by predicting through the initial operator processing model according to the operator, the sample input parameters and the second accelerator card. Parameters of the initial operator processing model are then adjusted based on the training direction of minimizing the third difference and the fourth difference, so as to obtain the operator processing model.
The first prediction result comprises a third prediction sub-result and a fourth prediction sub-result, the third prediction sub-result is a theoretical result obtained by testing the operator through the first accelerator card based on the sample input parameters, and the fourth prediction sub-result is a theoretical result obtained by testing the operator through the second accelerator card based on the sample input parameters. The first test result comprises a first test sub-result and a second test sub-result, the first test sub-result is a real result obtained by testing the operator through the first accelerator card based on the sample input parameters, and the second test sub-result is a real result obtained by testing the operator through the second accelerator card based on the sample input parameters. The target differences include a third difference between the third prediction sub-result and the first test sub-result and a fourth difference between the fourth prediction sub-result and the second test sub-result. The operator processing model is used for predicting theoretical results obtained by testing the operator through different accelerator cards based on different input parameters.
Referring to fig. 9, a schematic diagram of an operator processing model according to an embodiment of the present application is shown. In fig. 9, the first prediction result includes a third prediction sub-result and a fourth prediction sub-result, the first test result includes a first test sub-result and a second test sub-result, and the target difference includes a third difference and a fourth difference.
In fig. 9, the operator, the sample input parameters and the accelerator card (the first accelerator card or the second accelerator card) are input into the initial operator processing model. The third prediction sub-result is obtained by predicting through the initial operator processing model according to the operator, the sample input parameters and the first accelerator card, and the fourth prediction sub-result is obtained by predicting through the initial operator processing model according to the operator, the sample input parameters and the second accelerator card. The difference between the third prediction sub-result and the first test sub-result, namely the third difference, and the difference between the fourth prediction sub-result and the second test sub-result, namely the fourth difference, are minimized, and parameters of the initial operator processing model are adjusted accordingly to obtain the operator processing model.
Therefore, by training one operator processing model based on the operator, the sample input parameters and multiple accelerator cards, the operator processing model can predict theoretical results obtained by testing operators through different accelerator cards based on different input parameters. Only one operator processing model needs to be trained, which reduces the training difficulty.
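The single-model variant of fig. 9 can be sketched as follows, assuming PyTorch and a one-hot encoding of the accelerator card identity; the names and the network shape are illustrative only.

```python
import torch
import torch.nn as nn

class CardAwareOperatorModel(nn.Module):
    """One operator processing model for all accelerator cards: the card identity
    is appended to the operator/input-parameter features as a one-hot vector."""
    def __init__(self, feature_dim: int, num_cards: int = 2):
        super().__init__()
        self.num_cards = num_cards
        self.net = nn.Sequential(
            nn.Linear(feature_dim + num_cards, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, features: torch.Tensor, card_index: int) -> torch.Tensor:
        card = torch.zeros(features.shape[0], self.num_cards)
        card[:, card_index] = 1.0
        return self.net(torch.cat([features, card], dim=1))

def training_step(model, optimizer, features, results_first_card, results_second_card):
    """One step that minimizes the third and fourth differences jointly."""
    loss_fn = nn.MSELoss()
    optimizer.zero_grad()
    third = model(features, card_index=0)    # third prediction sub-result (first card)
    fourth = model(features, card_index=1)   # fourth prediction sub-result (second card)
    loss = loss_fn(third, results_first_card) + loss_fn(fourth, results_second_card)
    loss.backward()
    optimizer.step()
    return loss.item()
```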
As a possible implementation manner, a fourth operator to be tested and a second accelerator card to be applied may be obtained, where the fourth operator to be tested is an operator waiting for prediction by an operator processing model, and the fourth operator to be tested and the first operator to be tested may be the same operator or different operators. The second accelerator card to be applied is an accelerator card corresponding to the trained operator processing model, such as the first accelerator card or the second accelerator card.
Then, a plurality of fourth undetermined input parameters are determined according to the fourth operator to be tested, wherein the fourth undetermined input parameters are used for describing the data dimension of the input data of the fourth operator to be tested, and different fourth undetermined input parameters describe different data dimensions of that input data. One of the plurality of fourth undetermined input parameters is taken as a target fourth undetermined input parameter, and a sixth prediction result is obtained by predicting through the operator processing model according to the fourth operator to be tested, the target fourth undetermined input parameter and the second accelerator card to be applied. The sixth prediction result is a theoretical result obtained by testing the fourth operator to be tested through the second accelerator card to be applied based on the target fourth undetermined input parameter.
Finally, each of the plurality of fourth undetermined input parameters is taken in turn as the target fourth undetermined input parameter, so that a sixth prediction result corresponding to each fourth undetermined input parameter is obtained. The fourth input parameters corresponding to the fourth operator to be tested are then determined from the plurality of fourth undetermined input parameters according to the sixth prediction results respectively corresponding to the fourth undetermined input parameters.
The sixth prediction result corresponding to each fourth undetermined input parameter describes the hardware performance achieved when the fourth operator to be tested is tested through the second accelerator card to be applied based on that fourth undetermined input parameter, so that the fourth input parameters required by the fourth operator to be tested when the service model is subsequently trained can be determined based on the plurality of sixth prediction results.
As a possible implementation manner, the plurality of sixth prediction results may be ranked, and the fourth undetermined input parameter corresponding to the first-ranked sixth prediction result is taken as the fourth input parameter. The theoretical result obtained by testing the fourth operator to be tested through the second accelerator card to be applied based on this fourth input parameter is the best, so the real result obtained in the actual test is expected to be the best as well, which improves the training efficiency of the service model to which the fourth operator to be tested belongs and shortens the training time.
Referring to fig. 10, a schematic diagram of determining the fourth input parameters according to an embodiment of the present application is shown. In fig. 10, a plurality of fourth undetermined input parameters are determined according to the fourth operator to be tested. Each fourth undetermined input parameter, together with the fourth operator to be tested and the second accelerator card to be applied, is input into the operator processing model to obtain the sixth prediction result corresponding to each fourth undetermined input parameter, and the fourth input parameters corresponding to the fourth operator to be tested are determined based on the plurality of sixth prediction results.
Therefore, an operator processing model trained based on the operator, the sample input parameters and multiple accelerator cards can predict theoretical results obtained by testing operators through different accelerator cards based on different input parameters. After the fourth operator to be tested and the second accelerator card to be applied are obtained, the operator processing model predicts the sixth prediction results respectively corresponding to the plurality of fourth undetermined input parameters determined based on the fourth operator to be tested, and the fourth input parameters corresponding to the fourth operator to be tested are determined from them. When the fourth operator to be tested is tested through the second accelerator card to be applied, using these fourth input parameters gives the best effect, which improves the training efficiency of the service model to which the fourth operator to be tested belongs and shortens the training time.
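With the card-aware model, the parameter search works just like mode one, except that the accelerator card to be applied is also passed to the model. A short illustrative sketch (helper names assumed, and the model assumed to return a scalar such as execution time where smaller is better):

```python
def select_params_on_card(operator, candidate_params, model, card_index, encode):
    """Pick input parameters for one operator on the accelerator card to be applied."""
    scored = [(model(encode(operator, params), card_index), params)
              for params in candidate_params]   # sixth prediction results
    scored.sort(key=lambda item: item[0])        # rank them, best first
    return scored[0][1]                          # the fourth input parameters
```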
As one possible implementation manner, a service model to be trained and a calculation graph corresponding to the service model to be trained are obtained, wherein the calculation graph is used for describing a plurality of operators included in the service model to be trained and the calculation sequence among the operators. The plurality of operators are respectively taken as the fourth operator to be tested, and the fourth input parameters respectively corresponding to the plurality of operators are obtained. The plurality of operators are then tested through the second accelerator card to be applied based on the fourth input parameters respectively corresponding to the operators and the calculation sequence, so as to obtain a trained service model corresponding to the service model to be trained.
Therefore, since the same service model receives different input parameters in practical application and comprises a plurality of operators, the input parameters of these operators also need to change flexibly with the input parameters of the service model in order to achieve a better effect when training the service model. Based on this, the embodiment of the application determines the fourth input parameters corresponding to each operator in the calculation graph based on the calculation graph corresponding to the service model to be trained, and tests the plurality of operators included in the service model to be trained based on the calculation sequence indicated by the calculation graph to obtain the trained service model. In this way, the fourth input parameters corresponding to the plurality of operators included in the service model are flexibly adjusted according to changes in the input parameters of the service model, the training efficiency of the service model is improved, and the training time is further shortened.
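The calculation-graph step can be sketched as follows, with the graph represented simply as a list of operators already sorted in calculation order and the search helpers assumed rather than taken from the embodiment.

```python
def prepare_training_plan(calculation_graph, model, card_index, encode, candidates_for):
    """Return (operator, chosen input parameters) pairs in calculation order.

    calculation_graph: operators of the service model, sorted in calculation order.
    candidates_for: assumed helper yielding the undetermined input parameters per operator."""
    plan = []
    for op in calculation_graph:
        # keep the candidate whose predicted result (e.g. execution time) is best
        best_params = min(candidates_for(op),
                          key=lambda params: model(encode(op, params), card_index))
        plan.append((op, best_params))
    return plan

# The plan is then executed on the accelerator card to be applied, testing the
# operators in calculation order, to train the service model.
```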
In order to facilitate further understanding of the technical solution provided by the embodiments of the present application, the training method of the operator processing model is described below in an overall exemplary manner, taking a server as the execution body of the method.
A description will first be given of the computing force center constituted by a plurality of servers.
The computing force center includes a plurality of servers. Each server comprises a plurality of graphics cards, and each graphics card corresponds to one micro-architecture. A micro-architecture, also referred to as a microprocessor architecture, is the way a given instruction set architecture is implemented in a processor.
In the computing force center, one operator processing model can be deployed for each micro-architecture, and the operator processing models corresponding to different micro-architectures are different. Divided by input parameters, the input of some operator processing models is an operator to be tested and input parameters, while the input of others is an operator to be tested, input parameters and an accelerator card. Divided by model class, different operator processing models correspond to different accelerator cards, or one operator processing model can predict the theoretical results of different operators on different accelerator cards based on different input parameters, and so on.
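The per-micro-architecture deployment can be pictured with a small registry; this is an illustrative sketch, not a description of the computing force center's actual software.

```python
class OperatorModelRegistry:
    """Keeps one operator processing model per GPU micro-architecture."""
    def __init__(self):
        self._models = {}

    def register(self, microarchitecture: str, model):
        self._models[microarchitecture] = model

    def predict(self, microarchitecture: str, encoded_features):
        # route the request to the model deployed for this micro-architecture
        return self._models[microarchitecture](encoded_features)

# registry = OperatorModelRegistry()
# registry.register("arch_a", model_for_arch_a)
# registry.register("arch_b", model_for_arch_b)
```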
The computing force center can be built based on cloud technology. Cloud technology refers to a hosting technology that unifies hardware, software, network and other resources in a wide area network or a local area network to realize the computation, storage, processing and sharing of data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like applied in the cloud computing business mode; it can form a resource pool that is used on demand in a flexible and convenient way. Cloud computing technology will become an important support. The background services of technical network systems, such as video websites, picture websites and other portals, require a large amount of computing and storage resources. With the rapid development of the internet industry, each item may have its own identification mark in the future, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong backing from systems, which can only be realized through cloud computing.
In embodiments of the application, artificial intelligence cloud services may be provided through the computing force center. Artificial intelligence cloud services are also commonly referred to as AIaaS (AI as a Service). This is currently the mainstream service mode for artificial intelligence platforms; specifically, an AIaaS platform splits several common AI services and provides independent or packaged services in the cloud. This service mode is similar to an AI theme mall: all developers can access one or more artificial intelligence services provided by the platform through an API interface, and some experienced developers can also use the AI framework and AI infrastructure provided by the platform to deploy, operate and maintain their own proprietary cloud artificial intelligence services.
The training process of the operator processing model has been described above; the operator processing model may be obtained by training in the manner of fig. 3, fig. 6 or fig. 9. The deployment of each of these models is described separately below.
(1) Based on the operator processing model obtained through training in the mode of FIG. 3, the theoretical result obtained by testing different operators through the target accelerator card based on different input parameters can be predicted. Based on this, an operator processing model trained based on the manner of fig. 3 can be deployed on the target accelerator card.
(2) Based on the operator processing model obtained through training in the mode of fig. 6, a first operator processing model can be used for a first acceleration card, and the first operator processing model can predict theoretical results obtained by testing different operators through the first acceleration card based on different input parameters. And a second operator processing model is used for the second accelerator card, and the second operator processing model can predict theoretical results obtained by testing different operators through the second accelerator card based on different input parameters. Based on this, a first operator processing model may be deployed on a first accelerator card and a second operator processing model may be deployed on a second accelerator card.
(3) Based on the operator processing model obtained through training in the mode of FIG. 9, theoretical results obtained by testing different operators through different accelerator cards based on different input parameters can be predicted. Based on the above, the operator processing model obtained based on the training in the mode of fig. 9 can be deployed on the scheduling server, so that the training effect corresponding to each acceleration card is determined, and a plurality of servers and the like included in the computing center are better scheduled.
In order to facilitate further understanding of the technical solution provided by the embodiments of the present application, a training method of the operator processing model is described below by taking a business model as an advertisement recommendation model as an example.
Referring to fig. 11, a schematic diagram of training an advertisement recommendation model according to an embodiment of the present application is shown. In fig. 11, the input of the advertisement recommendation model is user items, such as the gender and age of the user. It is to be understood that, in the specific embodiments of the present application, data relating to gender, age and the like concerns the user; when the above embodiments of the present application are applied to specific products or technologies, the user's individual permission or consent needs to be obtained, and the collection, use and processing of the related data need to comply with the relevant laws, regulations and standards of the relevant countries and regions.
Dense features and sparse features can be extracted from the user items. The dense features are computed by the dense feature pooling operator, while the sparse features are looked up in an embedding table, converted into dense features, and computed by the sparse feature pooling operator. A prediction result is then obtained through the feature cross operator and the prediction calculation. However, the training speed obtained in this way is still slow.
Based on the above, in order to improve the training speed of the model, the operator processing model provided by the embodiment of the application can be used for predicting the input parameters of each operator, so that the accuracy of determining the input parameters of the operators is improved, the service model is trained based on the operators and the input parameters thereof, and the training time is shortened.
Referring to fig. 12, a schematic diagram of training the advertisement recommendation model according to an embodiment of the present application is shown. The input parameters of the dense feature pooling operator, the sparse feature pooling operator and the feature cross operator can be predicted by the operator processing model trained in the manner of fig. 3, and a final prediction result is then obtained through the prediction calculation. Once the accuracy of the operators' input parameters is improved, the training efficiency of the advertisement recommendation model can be improved, the training time can be shortened, and the user experience improved.
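For reference, the operator pipeline of figs. 11 and 12 roughly corresponds to the following sketch (PyTorch assumed; dimensions and layer choices are illustrative, not taken from the embodiment). Each of the named operators is a candidate whose input parameters can be predicted by the operator processing model.

```python
import torch
import torch.nn as nn

class AdRecommendationModel(nn.Module):
    def __init__(self, num_sparse_ids: int, embed_dim: int, dense_dim: int):
        super().__init__()
        # embedding-table lookup + pooling of sparse features
        self.embedding_table = nn.EmbeddingBag(num_sparse_ids, embed_dim, mode="sum")
        self.dense_pooling = nn.Linear(dense_dim, embed_dim)      # dense feature pooling operator
        self.feature_cross = nn.Linear(2 * embed_dim, embed_dim)  # feature cross operator
        self.predictor = nn.Linear(embed_dim, 1)                  # prediction calculation

    def forward(self, dense_features: torch.Tensor, sparse_ids: torch.Tensor) -> torch.Tensor:
        dense = torch.relu(self.dense_pooling(dense_features))
        sparse = self.embedding_table(sparse_ids)
        crossed = torch.relu(self.feature_cross(torch.cat([dense, sparse], dim=1)))
        return torch.sigmoid(self.predictor(crossed))
```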
Aiming at the operator processing model training method, the application also provides a corresponding operator processing model training device, so that the operator processing model training method is applied and realized in practice.
In the present embodiment, the term "module" or "unit" refers to a computer program or a part of a computer program having a predetermined function and working together with other relevant parts to achieve a predetermined object, and may be implemented in whole or in part by using software, hardware (such as a processing circuit or a memory), or a combination thereof. Also, a processor (or multiple processors or memories) may be used to implement one or more modules or units. Furthermore, each module or unit may be part of an overall module or unit that incorporates the functionality of the module or unit.
Referring to fig. 13, the structure of a training device for an operator processing model according to an embodiment of the present application is shown. As shown in fig. 13, the training apparatus 1300 of the operator processing model includes: an acquisition unit 1301, a prediction unit 1302, and an adjustment unit 1303;
The obtaining unit 1301 is configured to obtain an operator, a sample input parameter, and a first test result, where the sample input parameter is used to describe a data dimension of input data of the operator, and the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameter;
The prediction unit 1302 is configured to perform prediction through an initial operator processing model according to the operator and the sample input parameter, so as to obtain a first prediction result, where the first prediction result is a theoretical result obtained by testing the operator through the target accelerator card based on the sample input parameter;
the adjusting unit 1303 is configured to adjust parameters of the initial operator processing model based on a training direction that minimizes a target difference, so as to obtain an operator processing model, where the target difference is a difference between the first prediction result and the first test result, and the operator processing model is configured to predict a theoretical result obtained by testing the operator by the target accelerator card based on different input parameters.
According to the technical scheme, the operator, the sample input parameters and the first test result are obtained, wherein the sample input parameters are used for describing the data dimension of the input data of the operator, and the first test result is a real result obtained by testing the operator through the target accelerator card based on the sample input parameters. A first prediction result, namely a theoretical result obtained by testing the operator through the target accelerator card based on the sample input parameters, is obtained by predicting through the initial operator processing model according to the operator and the sample input parameters. Parameters of the initial operator processing model are adjusted based on the training direction of minimizing the target difference, namely minimizing the difference between the first prediction result and the first test result, so as to obtain the operator processing model, so that the operator processing model can accurately predict the theoretical results of testing the operator through the target accelerator card based on different input parameters.
Therefore, by training the initial operator processing model, the real results obtained by testing the operator through the target accelerator card based on different sample input parameters are continuously learned, so that the predicted theoretical results become closer and closer to the real results. The operator processing model trained from the initial operator processing model therefore has a strong learning capacity and can predict the theoretical results obtained by testing the operator through the target accelerator card based on different input parameters. In other words, predicting through the operator processing model effectively increases the number of available test results, which improves the accuracy of determining the input parameters of the operator, so that the service model can be trained based on the operator and its input parameters, and the training time is shortened.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes an input parameter determining unit configured to:
Acquiring a first operator to be tested;
Determining a plurality of first undetermined input parameters according to the first operator to be tested, wherein different first undetermined input parameters are used for describing different data dimensions of input data of the first operator to be tested;
According to the first operator to be tested and a target first undetermined input parameter, predicting through the operator processing model to obtain a second prediction result, wherein the second prediction result is a theoretical result obtained by testing the first operator to be tested through the target accelerator card based on the target first undetermined input parameter, and the target first undetermined input parameter is one first undetermined input parameter of the plurality of first undetermined input parameters;
Respectively taking the plurality of first undetermined input parameters as the target first undetermined input parameter to obtain second prediction results respectively corresponding to the first undetermined input parameters;
And determining the first input parameters corresponding to the first operator to be tested from the plurality of first undetermined input parameters according to the second prediction results respectively corresponding to the first undetermined input parameters.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes a model training unit configured to:
acquiring a service model to be trained and a calculation graph corresponding to the service model to be trained, wherein the calculation graph is used for describing a plurality of operators included in the service model to be trained and a calculation sequence among the operators;
Respectively taking the operators as the first operators to be tested, and obtaining first input parameters respectively corresponding to the operators;
And testing the operators through the target accelerator card based on the first input parameters and the calculation sequence corresponding to the operators respectively to obtain a trained service model corresponding to the service model to be trained.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes a first test result determining unit configured to:
Determining a business model to which the operator belongs;
determining test cases applicable to the service model and the operator according to the service model;
And operating the test case through the target accelerator card according to the sample input parameters to obtain the first test result.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes a sample input parameter determining unit configured to:
acquiring hardware configuration data of the target accelerator card;
And determining sample input parameters for the operator according to the hardware configuration data.
As a possible implementation manner, the target accelerator card includes a first accelerator card and a second accelerator card, the first test result includes a first test sub-result and a second test sub-result, the first test sub-result is a real result obtained by the operator performing the test through the first accelerator card based on the sample input parameter, and the second test sub-result is a real result obtained by the operator performing the test through the second accelerator card based on the sample input parameter;
the prediction unit 1302 is specifically configured to:
According to the operator and the sample input parameters, predicting through a first initial operator processing model to obtain a first predictor result, wherein the first predictor result is a theoretical result obtained by testing the operator through the first accelerator card based on the sample input parameters;
according to the operator and the sample input parameters, predicting through a second initial operator processing model to obtain a second predictor result, wherein the second predictor result is a theoretical result obtained by testing the operator through the second accelerator card based on the sample input parameters;
the adjusting unit 1303 is specifically configured to:
Adjusting parameters of the first initial operator processing model based on a training direction of minimizing a first difference, so as to obtain a first operator processing model, wherein the first difference is a difference between the first predictor result and the first test sub-result, and the first operator processing model is used for predicting a theoretical result obtained by testing the operator through the first accelerator card based on different input parameters;
And adjusting parameters of the second initial operator processing model based on a training direction of minimizing a second difference, so as to obtain a second operator processing model, wherein the second difference is a difference between the second predictor result and the second test sub-result, and the second operator processing model is used for predicting a theoretical result obtained by testing the operator through the second accelerator card based on different input parameters.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes an input parameter determining unit configured to:
acquiring a second operator to be tested and a first accelerator card to be applied, wherein the first accelerator card to be applied is the first accelerator card or the second accelerator card;
determining an operator processing model to be applied, which is applicable to the first accelerator card to be applied, according to the first accelerator card to be applied, wherein the operator processing model to be applied is the first operator processing model or the second operator processing model;
Determining a plurality of second undetermined input parameters according to the second to-be-tested operator, wherein different second undetermined input parameters are used for describing different data dimensions of input data of the second to-be-tested operator;
According to the second operator to be tested and a target second undetermined input parameter, predicting through the operator processing model to be applied to obtain a third prediction result, wherein the third prediction result is a theoretical result obtained by testing the second operator to be tested through the first accelerator card to be applied based on the target second undetermined input parameter, and the target second undetermined input parameter is one second undetermined input parameter of the plurality of second undetermined input parameters;
Respectively taking the plurality of second undetermined input parameters as the target second undetermined input parameters to obtain third prediction results respectively corresponding to the second undetermined input parameters;
and determining second input parameters corresponding to the second operator to be tested from the plurality of second undetermined input parameters according to the third prediction results respectively corresponding to the second undetermined input parameters.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes an input parameter determining unit configured to:
acquiring a third operator to be tested;
determining a plurality of third input parameters to be determined according to the third operator to be tested, wherein different third input parameters to be determined are used for describing different data dimensions of input data of the third operator to be tested;
According to the third operator to be tested and a target third input parameter to be determined, predicting through the first operator processing model to obtain a fourth prediction result, wherein the fourth prediction result is a theoretical result obtained by testing the third operator to be tested through the first accelerator card based on the target third input parameter to be determined, and the target third input parameter to be determined is one third input parameter to be determined in the plurality of third input parameters to be determined;
according to the third operator to be tested and the target third input parameter to be determined, predicting through the second operator processing model to obtain a fifth prediction result, wherein the fifth prediction result is a theoretical result obtained by testing the third operator to be tested through the second accelerator card based on the target third input parameter to be determined;
Respectively taking the plurality of third undetermined input parameters as the target third undetermined input parameters to obtain fourth prediction results respectively corresponding to the third undetermined input parameters and fifth prediction results respectively corresponding to the third undetermined input parameters;
And determining third input parameters corresponding to the third operator to be tested and an accelerator card applicable to the third input parameters from the plurality of third input parameters to be determined according to the fourth prediction result and the fifth prediction result respectively corresponding to each third input parameter to be determined, wherein the accelerator card applicable to the third input parameters is the first accelerator card or the second accelerator card.
As a possible implementation manner, the target accelerator card includes a first accelerator card and a second accelerator card, the first test result includes a first test sub-result and a second test sub-result, the first test sub-result is a real result obtained by the operator performing the test through the first accelerator card based on the sample input parameter, and the second test sub-result is a real result obtained by the operator performing the test through the second accelerator card based on the sample input parameter;
the prediction unit 1302 is specifically configured to:
According to the operator, the sample input parameters and the first acceleration card, predicting through the initial operator processing model to obtain a third predictor result, wherein the third predictor result is a theoretical result obtained by testing the operator through the first acceleration card based on the sample input parameters;
According to the operator, the sample input parameters and the second accelerator card, predicting through the initial operator processing model to obtain a fourth predictor result, wherein the fourth predictor result is a theoretical result obtained by testing the operator through the second accelerator card based on the sample input parameters;
the adjusting unit 1303 is specifically configured to:
And adjusting parameters of the initial operator processing model based on training directions of minimizing a third difference and minimizing a fourth difference to obtain the operator processing model, wherein the third difference is a difference between the third predictor result and the first test sub-result, the fourth difference is a difference between the fourth predictor result and the second test sub-result, and the operator processing model is used for predicting a theoretical result obtained by testing the operator through different accelerator cards based on different input parameters.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes an input parameter determining unit configured to:
acquiring a fourth operator to be tested and a second accelerator card to be applied, wherein the second accelerator card to be applied is the first accelerator card or the second accelerator card;
Determining a plurality of fourth input parameters to be determined according to the fourth to-be-tested operator, wherein different fourth input parameters to be determined are used for describing different data dimensions of input data of the fourth to-be-tested operator;
According to the fourth operator to be tested, a target fourth undetermined input parameter and the second accelerator card to be applied, predicting through the operator processing model to obtain a sixth prediction result, wherein the sixth prediction result is a theoretical result obtained by testing the fourth operator to be tested through the second accelerator card to be applied based on the target fourth undetermined input parameter, and the target fourth undetermined input parameter is one fourth undetermined input parameter of the plurality of fourth undetermined input parameters;
respectively taking the plurality of fourth undetermined input parameters as the target fourth undetermined input parameters to obtain sixth prediction results respectively corresponding to the fourth undetermined input parameters;
and determining fourth input parameters corresponding to the fourth operator to be tested from the plurality of fourth undetermined input parameters according to the sixth prediction results respectively corresponding to the fourth undetermined input parameters.
As a possible implementation manner, the training apparatus 1300 of the operator processing model further includes a model training unit configured to:
acquiring a service model to be trained and a calculation graph corresponding to the service model to be trained, wherein the calculation graph is used for describing a plurality of operators included in the service model to be trained and a calculation sequence among the operators;
Respectively taking the operators as the fourth operators to be tested to obtain fourth input parameters respectively corresponding to the operators;
and testing the operators through the second accelerator card to be applied based on fourth input parameters and the calculation sequence which are respectively corresponding to the operators, so as to obtain a trained service model corresponding to the service model to be trained.
The embodiment of the application also provides a computer device which can be a server or a terminal device, and the computer device provided by the embodiment of the application is introduced from the aspect of hardware materialization. Fig. 14 is a schematic structural diagram of a server, and fig. 15 is a schematic structural diagram of a terminal device.
Referring to fig. 14, which is a schematic diagram of a server structure according to an embodiment of the present application, the server 1400 may vary considerably in configuration or performance, and may include one or more processors 1422 (for example, central processing units, CPUs), a memory 1432, and one or more storage media 1430 (such as one or more mass storage devices) storing application programs 1442 or data 1444. The memory 1432 and the storage medium 1430 may be transitory or persistent storage. The program stored in the storage medium 1430 may include one or more modules (not shown), and each module may include a series of instruction operations on the server. Furthermore, the processor 1422 may be configured to communicate with the storage medium 1430 and execute the series of instruction operations in the storage medium 1430 on the server 1400.
The server 1400 can also include one or more power supplies 1426, one or more wired or wireless network interfaces 1450, one or more input/output interfaces 1458, and/or one or more operating systems 1441, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 14.
Wherein, the CPU 1422 is configured to perform the following steps:
Acquiring an operator, a sample input parameter and a first test result, wherein the sample input parameter is used for describing the data dimension of input data of the operator, and the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameter;
According to the operator and the sample input parameters, predicting through an initial operator processing model to obtain a first prediction result, wherein the first prediction result is a theoretical result obtained by testing the operator through the target accelerator card based on the sample input parameters;
Based on a training direction of minimizing a target difference, adjusting parameters of the initial operator processing model to obtain an operator processing model, wherein the target difference is a difference between the first prediction result and the first test result, and the operator processing model is used for predicting a theoretical result obtained by testing the operator through the target accelerator card based on different input parameters.
Optionally, the CPU 1422 may further execute method steps of any specific implementation of the training method of the operator processing model in the embodiment of the present application.
Referring to fig. 15, the structure of a terminal device according to an embodiment of the present application is shown. Taking the terminal device being a smart phone as an example, fig. 15 is a block diagram showing part of the structure of the smart phone, which includes: a radio frequency (RF) circuit 1510, a memory 1520, an input unit 1530, a display unit 1540, a sensor 1550, an audio circuit 1560, a wireless fidelity (WiFi) module 1570, a processor 1580, a power supply 1590 and the like. Those skilled in the art will appreciate that the smart phone structure shown in fig. 15 does not limit the smart phone, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
The following describes each component of the smart phone in detail with reference to fig. 15:
The RF circuit 1510 may be used for receiving and transmitting signals during a message or a call; in particular, downlink information from a base station is received and then delivered to the processor 1580 for processing, and uplink data is sent to the base station.
The memory 1520 may be used to store software programs and modules, and the processor 1580 implements various functional applications and data processing of the smartphone by running the software programs and modules stored in the memory 1520.
The input unit 1530 may be used to receive input numerical or character information and generate key signal inputs related to user settings and function control of the smart phone. In particular, the input unit 1530 may include a touch panel 1531 and other input devices 1532. The touch panel 1531, also referred to as a touch screen, may collect touch operations by the user on or near it and drive the corresponding connection device according to a preset program. The input unit 1530 may include other input devices 1532 in addition to the touch panel 1531. In particular, the other input devices 1532 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick and the like.
The display unit 1540 may be used to display information input by the user or information provided to the user, as well as various menus of the smart phone. The display unit 1540 may include a display panel 1541; optionally, the display panel 1541 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
The smartphone may also include at least one sensor 1550, such as a light sensor, a motion sensor, and other sensors. Other sensors that may also be configured on the smart phone, such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described in detail herein.
Audio circuitry 1560, speaker 1561, and microphone 1562 may provide an audio interface between the user and the smart phone. On one hand, the audio circuit 1560 may transmit an electrical signal converted from received audio data to the speaker 1561, which converts it into a sound signal for output; on the other hand, the microphone 1562 converts collected sound signals into electrical signals, which are received by the audio circuit 1560 and converted into audio data; the audio data is then output to the processor 1580 for processing and sent, for example, to another smart phone via the RF circuit 1510, or output to the memory 1520 for further processing.
Processor 1580 is the control center of the smartphone; it connects the various parts of the smartphone via various interfaces and lines, and performs the various functions of the smartphone and processes data by running or executing software programs and/or modules stored in memory 1520 and invoking data stored in memory 1520. Optionally, processor 1580 may include one or more processing units.
The smart phone also includes a power source 1590 (e.g., a battery) for powering the various components. The power source may be logically connected to the processor 1580 via a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown, the smart phone may further include a camera, a bluetooth module, etc., which will not be described herein.
In an embodiment of the present application, the memory 1520 included in the smart phone may store a computer program and transmit the computer program to the processor.
The processor 1580 included in the smart phone may execute the training method of the operator processing model provided in the foregoing embodiment according to instructions in the computer program.
The embodiment of the application also provides a computer readable storage medium for storing a computer program for executing the training method of the operator processing model provided by the above embodiment.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions to cause the computer device to perform the training method of the operator processing model provided in various alternative implementations of the above aspects.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be implemented by program instructions and related hardware. The program may be stored in a computer readable storage medium and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium may be at least one of the following media: Read-Only Memory (ROM), Random Access Memory (RAM), a magnetic disk, an optical disk, or the like.
It should be noted that, in the present specification, the embodiments are described in a progressive manner: identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units illustrated as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without undue burden.
The foregoing is only a specific embodiment of the present application, but the protection scope of the present application is not limited thereto; any changes or substitutions readily conceivable by those skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Further implementations may also be obtained by combining the implementations provided in the above aspects. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (23)

1. A method of training an operator processing model, the method comprising:
Acquiring an operator, a sample input parameter and a first test result, wherein the sample input parameter is used for describing the data dimension of input data of the operator, the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameter, the sample input parameter is determined based on hardware configuration data of the target accelerator card, and the hardware configuration data is used for describing the hardware capability of the target accelerator card;
according to the operator and the sample input parameters, predicting through an initial operator processing model to obtain a first prediction result, wherein the first prediction result is a theoretical result, predicted by the initial operator processing model, of testing the operator through the target accelerator card based on the sample input parameters;
Adjusting parameters of the initial operator processing model based on a training direction of minimizing target differences, so as to obtain an operator processing model, wherein the target differences are differences between the first prediction result and the first test result, and the operator processing model is used for predicting a theoretical result obtained by testing the operator through the target accelerator card based on different input parameters;
Acquiring a first operator to be tested;
Determining a plurality of first to-be-tested input parameters according to the first to-be-tested operator, wherein different first to-be-tested input parameters are used for describing different data dimensions of input data of the first to-be-tested operator;
According to the first to-be-tested operator and the target first to-be-tested input parameter, predicting through the operator processing model to obtain a second prediction result, wherein the second prediction result is a theoretical result obtained by testing the first to-be-tested operator through the target accelerator card based on the target first to-be-tested input parameter, and the target first to-be-tested input parameter is one first to-be-tested input parameter of the plurality of first to-be-tested input parameters;
Respectively taking each of the plurality of first to-be-tested input parameters as the target first to-be-tested input parameter, to obtain second prediction results respectively corresponding to the first to-be-tested input parameters;
And determining the first input parameters corresponding to the first to-be-tested operator from the plurality of first to-be-tested input parameters according to the second prediction results respectively corresponding to the first to-be-tested input parameters.
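A minimal sketch of the candidate-selection steps of claim 1, reusing the hypothetical model and encode function from the earlier training sketch; the candidate data dimensions and the assumption that a lower predicted result (latency per sample) is better are illustrative only.

```python
# Candidate "first to-be-tested input parameters": different data dimensions
# for the input data of the same first to-be-tested operator (hypothetical values).
candidates = [[1, 224, 224, 3], [4, 224, 224, 3], [8, 224, 224, 3], [16, 224, 224, 3]]
op_id = 0  # the first to-be-tested operator

with torch.no_grad():
    # Second prediction result for each candidate (predicted latency in ms).
    second_predictions = {
        tuple(dims): model(encode(op_id, dims)).item() for dims in candidates
    }

# Assumed selection criterion: lowest predicted latency per sample in the batch.
first_input_parameter = min(
    second_predictions, key=lambda dims: second_predictions[dims] / dims[0]
)
print(first_input_parameter, second_predictions[first_input_parameter])
```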
2. The method according to claim 1, wherein the method further comprises:
acquiring a service model to be trained and a calculation graph corresponding to the service model to be trained, wherein the calculation graph is used for describing a plurality of operators included in the service model to be trained and a calculation sequence among the operators;
Respectively taking the operators as the first operators to be tested, and obtaining first input parameters respectively corresponding to the operators;
And testing the operators through the target accelerator card based on the first input parameters and the calculation sequence corresponding to the operators respectively to obtain a trained service model corresponding to the service model to be trained.
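A sketch of how claim 2 might be applied, again reusing the hypothetical model and encode function: each operator of the to-be-trained service model's computation graph is treated as the first to-be-tested operator and its input parameter is chosen by prediction, after which the operators would be tested in computation order. The graph structure and the omitted hardware call are assumptions.

```python
# Hypothetical computation graph of the to-be-trained service model: operators
# listed in computation order, each with candidate input parameters.
computation_graph = [
    {"op_id": 0, "candidates": [[1, 224, 224, 3], [8, 224, 224, 3]]},
    {"op_id": 1, "candidates": [[8, 512, 512, 1], [16, 512, 512, 1]]},
]

chosen = {}
with torch.no_grad():
    for node in computation_graph:  # each operator as the first to-be-tested operator
        scores = {
            tuple(dims): model(encode(node["op_id"], dims)).item()
            for dims in node["candidates"]
        }
        chosen[node["op_id"]] = min(scores, key=scores.get)

# The operators would then be tested through the target accelerator card in
# computation order using the chosen first input parameters; the actual
# hardware call is outside the scope of this sketch.
for node in computation_graph:
    print(f"operator {node['op_id']} -> input dims {chosen[node['op_id']]}")
```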
3. The method according to claim 1, wherein the method further comprises:
Determining a business model to which the operator belongs;
determining test cases applicable to the service model and the operator according to the service model;
And operating the test case through the target accelerator card according to the sample input parameters to obtain the first test result.
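A sketch of how a first test result could be measured by running a test case, using a CPU matrix multiplication and wall-clock timing as a stand-in for the target accelerator card; the operator, the test case, and the latency criterion are assumptions.

```python
import time
import torch

def run_test_case(input_dims, repeats: int = 20) -> float:
    """Run a hypothetical matrix-multiplication test case with the given
    sample input parameters and return the average wall-clock time in ms,
    standing in for the real result measured on the target accelerator card."""
    m, k, n = input_dims
    a, b = torch.randn(m, k), torch.randn(k, n)
    start = time.perf_counter()
    for _ in range(repeats):
        _ = a @ b
    return (time.perf_counter() - start) / repeats * 1000.0

first_test_result = run_test_case([64, 512, 512])
print(f"first test result: {first_test_result:.3f} ms")
```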
4. The method according to claim 1, wherein the method further comprises:
acquiring hardware configuration data of the target accelerator card;
And determining sample input parameters for the operator according to the hardware configuration data.
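A sketch of deriving sample input parameters from hardware configuration data, under the assumption that the card's memory capacity and vector width bound the candidate data dimensions; the configuration fields and the generation rule are hypothetical.

```python
# Hypothetical hardware configuration data describing the target accelerator card.
hardware_config = {
    "memory_bytes": 16 * 2**30,  # 16 GiB of device memory
    "vector_width": 16,          # preferred alignment of the batch dimension
    "bytes_per_element": 4,      # float32
}

def sample_input_parameters(hw: dict, feature_dim: int = 1024) -> list:
    """Generate candidate sample input parameters (batch size, feature dim)
    that respect the card's vector width and fit in its memory."""
    params, batch = [], hw["vector_width"]
    while batch <= 4096 and batch * feature_dim * hw["bytes_per_element"] <= hw["memory_bytes"]:
        params.append([batch, feature_dim])
        batch *= 2  # step in multiples of the vector width
    return params

print(sample_input_parameters(hardware_config))
```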
5. The method of claim 1, wherein the target accelerator card comprises a first accelerator card and a second accelerator card, the first test result comprises a first test sub-result and a second test sub-result, the first test sub-result is a real result of the operator testing through the first accelerator card based on the sample input parameter, and the second test sub-result is a real result of the operator testing through the second accelerator card based on the sample input parameter;
And predicting through an initial operator processing model according to the operator and the sample input parameter to obtain a first prediction result, wherein the method comprises the following steps of:
According to the operator and the sample input parameters, predicting through a first initial operator processing model to obtain a first predictor result, wherein the first predictor result is a theoretical result obtained by testing the operator through the first accelerator card based on the sample input parameters;
according to the operator and the sample input parameters, predicting through a second initial operator processing model to obtain a second predictor result, wherein the second predictor result is a theoretical result obtained by testing the operator through the second accelerator card based on the sample input parameters;
The training direction based on the minimized target difference adjusts parameters of the initial operator processing model to obtain an operator processing model, and the training direction comprises the following steps:
Adjusting parameters of the first initial operator processing model based on a training direction of minimizing a first difference, so as to obtain a first operator processing model, wherein the first difference is a difference between the first predictor result and the first test result, and the first operator processing model is used for predicting a theoretical result obtained by testing the operator through the first accelerator card based on different input parameters;
And adjusting parameters of the second initial operator processing model based on a training direction of minimizing a second difference, so as to obtain a second operator processing model, wherein the second difference is a difference between the second predictor result and the second test result, and the second operator processing model is used for predicting a theoretical result obtained by testing the operator through the second accelerator card based on different input parameters.
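A sketch of the two-model variant of claim 5, reusing the hypothetical OperatorLatencyModel, encode, and loss_fn from the training sketch: one model per accelerator card, each trained against the test sub-result measured on that card.

```python
# One initial operator processing model per accelerator card.
card_models = {"card_a": OperatorLatencyModel(), "card_b": OperatorLatencyModel()}
card_optimizers = {
    name: torch.optim.Adam(m.parameters(), lr=1e-3) for name, m in card_models.items()
}

# Each sample: (operator id, sample input parameters,
#               {card: test sub-result measured on that card}).
card_samples = [(0, [8, 224, 224, 3], {"card_a": 1.7, "card_b": 2.4})]

for epoch in range(100):
    for op_id, dims, measured in card_samples:
        for card, card_model in card_models.items():
            pred = card_model(encode(op_id, dims))                # first / second predictor result
            loss = loss_fn(pred, torch.tensor([measured[card]]))  # first / second difference
            card_optimizers[card].zero_grad()
            loss.backward()
            card_optimizers[card].step()
```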
6. The method of claim 5, wherein the method further comprises:
acquiring a second operator to be tested and a first accelerating card to be applied, wherein the first accelerating card to be applied is the first accelerating card or the second accelerating card;
determining an operator processing model to be applied, which is applicable to the first accelerator card to be applied, according to the first accelerator card to be applied, wherein the operator processing model to be applied is the first operator processing model or the second operator processing model;
Determining a plurality of second undetermined input parameters according to the second to-be-tested operator, wherein different second undetermined input parameters are used for describing different data dimensions of input data of the second to-be-tested operator;
According to the second operator to be tested and the target second undetermined input parameter, predicting through the operator processing model to be applied to obtain a third prediction result, wherein the third prediction result is a theoretical result obtained by testing the second operator to be tested through the first accelerator card to be applied based on the target second undetermined input parameter, and the target second undetermined input parameter is one second undetermined input parameter in the plurality of second undetermined input parameters;
Respectively taking the plurality of second undetermined input parameters as the target second undetermined input parameters to obtain third prediction results respectively corresponding to the second undetermined input parameters;
and determining a second input parameter corresponding to the second to-be-tested operator from the plurality of second undetermined input parameters according to the third prediction results respectively corresponding to the second undetermined input parameters.
7. The method of claim 5, wherein the method further comprises:
acquiring a third operator to be tested;
determining a plurality of third input parameters to be determined according to the third operator to be tested, wherein different third input parameters to be determined are used for describing different data dimensions of input data of the third operator to be tested;
According to the third operator to be tested and the target third input parameter to be determined, predicting through the first operator processing model to obtain a fourth prediction result, wherein the fourth prediction result is a theoretical result obtained by testing the third operator to be tested through the first accelerator card based on the target third input parameter to be determined, and the target third input parameter to be determined is one third input parameter to be determined in the plurality of third input parameters to be determined;
Predicting through the second operator processing model according to the third operator to be tested and the target third input parameter to be determined to obtain a fifth prediction result, wherein the fifth prediction result is a theoretical result obtained by testing the third operator to be tested through the second accelerator card based on the target third input parameter to be determined;
Respectively taking the plurality of third undetermined input parameters as the target third undetermined input parameters to obtain fourth prediction results respectively corresponding to the third undetermined input parameters and fifth prediction results respectively corresponding to the third undetermined input parameters;
And determining a third input parameter corresponding to the third to-be-tested operator and an acceleration card applicable to the third input parameter from the plurality of third to-be-tested input parameters according to a fourth prediction result corresponding to each third to-be-tested input parameter and a fifth prediction result corresponding to each third to-be-tested input parameter, wherein the acceleration card applicable to the third input parameter is the first acceleration card or the second acceleration card.
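A sketch of the joint selection in claim 7, reusing the two hypothetical per-card models: for each candidate input parameter, both models are queried, and the input parameter and accelerator card with the best predicted result (here assumed to be the lowest latency) are selected.

```python
# Candidate "third to-be-determined input parameters" for the third to-be-tested operator.
third_candidates = [[1, 224, 224, 3], [8, 224, 224, 3], [32, 224, 224, 3]]
op_id = 0

best = None
with torch.no_grad():
    for dims in third_candidates:
        for card, card_model in card_models.items():
            predicted = card_model(encode(op_id, dims)).item()  # fourth / fifth prediction result
            if best is None or predicted < best[0]:             # assumed criterion: lowest latency
                best = (predicted, tuple(dims), card)

_, third_input_parameter, applicable_card = best
print(third_input_parameter, applicable_card)
```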
8. The method of claim 1, wherein the target accelerator card comprises a first accelerator card and a second accelerator card, the first test result comprises a first test sub-result and a second test sub-result, the first test sub-result is a real result of the operator testing through the first accelerator card based on the sample input parameter, and the second test sub-result is a real result of the operator testing through the second accelerator card based on the sample input parameter;
And predicting through an initial operator processing model according to the operator and the sample input parameter to obtain a first prediction result, wherein the method comprises the following steps of:
According to the operator, the sample input parameters and the first acceleration card, predicting through the initial operator processing model to obtain a third predictor result, wherein the third predictor result is a theoretical result obtained by testing the operator through the first acceleration card based on the sample input parameters;
According to the operator, the sample input parameters and the second accelerator card, predicting through the initial operator processing model to obtain a fourth predictor result, wherein the fourth predictor result is a theoretical result obtained by testing the operator through the second accelerator card based on the sample input parameters;
The training direction based on the minimized target difference adjusts parameters of the initial operator processing model to obtain an operator processing model, and the training direction comprises the following steps:
And adjusting parameters of the initial operator processing model based on training directions of minimizing a third difference and minimizing a fourth difference to obtain the operator processing model, wherein the third difference is a difference between the third predictor result and the first test result, the fourth difference is a difference between the fourth predictor result and the second test result, and the operator processing model is used for predicting a theoretical result obtained by testing the operator through different accelerator cards based on different input parameters.
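A sketch of the single-model variant of claim 8, reusing the hypothetical encode, loss_fn, and two-card samples from the earlier sketches: the accelerator card is encoded as an extra model input, and the third and fourth differences are minimized jointly.

```python
# A single model whose input additionally encodes which accelerator card is used.
joint_model = OperatorLatencyModel(in_features=6)
joint_optimizer = torch.optim.Adam(joint_model.parameters(), lr=1e-3)
card_ids = {"card_a": 0.0, "card_b": 1.0}

def encode_with_card(op_id: int, dims: list, card: str) -> torch.Tensor:
    """Append a card identifier to the operator / input-parameter encoding."""
    return torch.cat([encode(op_id, dims), torch.tensor([card_ids[card]])])

for epoch in range(100):
    for op_id, dims, measured in card_samples:
        # Third and fourth predictor results, one per accelerator card.
        losses = [
            loss_fn(joint_model(encode_with_card(op_id, dims, card)),
                    torch.tensor([measured[card]]))
            for card in card_ids
        ]
        loss = sum(losses)  # minimize the third and fourth differences jointly
        joint_optimizer.zero_grad()
        loss.backward()
        joint_optimizer.step()
```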
9. The method of claim 8, wherein the method further comprises:
acquiring a fourth operator to be tested and a second acceleration card to be applied, wherein the second acceleration card to be applied is the first acceleration card or the second acceleration card;
Determining a plurality of fourth input parameters to be determined according to the fourth to-be-tested operator, wherein different fourth input parameters to be determined are used for describing different data dimensions of input data of the fourth to-be-tested operator;
According to the fourth to-be-tested operator, the target fourth undetermined input parameter and the second to-be-applied accelerator card, predicting through the operator processing model to obtain a sixth prediction result, wherein the sixth prediction result is a theoretical result obtained by testing the fourth to-be-tested operator through the second to-be-applied accelerator card based on the target fourth undetermined input parameter, and the target fourth undetermined input parameter is one fourth undetermined input parameter of the plurality of fourth undetermined input parameters;
respectively taking the plurality of fourth undetermined input parameters as the target fourth undetermined input parameters to obtain sixth prediction results respectively corresponding to the fourth undetermined input parameters;
and determining a fourth input parameter corresponding to the fourth to-be-tested operator from the plurality of fourth undetermined input parameters according to the sixth prediction results respectively corresponding to the fourth undetermined input parameters.
10. The method according to claim 9, wherein the method further comprises:
acquiring a service model to be trained and a calculation graph corresponding to the service model to be trained, wherein the calculation graph is used for describing a plurality of operators included in the service model to be trained and a calculation sequence among the operators;
Respectively taking the operators as the fourth operators to be tested to obtain fourth input parameters respectively corresponding to the operators;
and testing the operators through the second accelerator card to be applied based on fourth input parameters and the calculation sequence which are respectively corresponding to the operators, so as to obtain a trained service model corresponding to the service model to be trained.
11. An apparatus for training an operator processing model, the apparatus comprising: the device comprises an acquisition unit, a prediction unit and an adjustment unit;
the acquisition unit is used for acquiring an operator, a sample input parameter and a first test result, wherein the sample input parameter is used for describing the data dimension of input data of the operator, the first test result is a real result obtained by testing the operator through a target accelerator card based on the sample input parameter, the sample input parameter is determined based on hardware configuration data of the target accelerator card, and the hardware configuration data is used for describing the hardware capability of the target accelerator card;
The prediction unit is used for predicting through an initial operator processing model according to the operator and the sample input parameters to obtain a first prediction result, wherein the first prediction result is a theoretical result, predicted by the initial operator processing model, of testing the operator through the target accelerator card based on the sample input parameters;
The adjusting unit is used for adjusting parameters of the initial operator processing model based on a training direction of minimizing target difference, so as to obtain an operator processing model, wherein the target difference is a difference between the first prediction result and the first test result, and the operator processing model is used for predicting a theoretical result obtained by testing the operator through the target accelerator card based on different input parameters;
The training device of the operator processing model further comprises an input parameter determining unit, which is used for:
Acquiring a first operator to be tested;
Determining a plurality of first to-be-tested input parameters according to the first to-be-tested operator, wherein different first to-be-tested input parameters are used for describing different data dimensions of input data of the first to-be-tested operator;
According to the first to-be-tested operator and the target first to-be-tested input parameter, predicting through the operator processing model to obtain a second prediction result, wherein the second prediction result is a theoretical result obtained by testing the first to-be-tested operator through the target accelerator card based on the target first to-be-tested input parameter, and the target first to-be-tested input parameter is one first to-be-tested input parameter of the plurality of first to-be-tested input parameters;
Respectively taking each of the plurality of first to-be-tested input parameters as the target first to-be-tested input parameter, to obtain second prediction results respectively corresponding to the first to-be-tested input parameters;
And determining the first input parameters corresponding to the first to-be-tested operator from the plurality of first to-be-tested input parameters according to the second prediction results respectively corresponding to the first to-be-tested input parameters.
12. The apparatus according to claim 11, wherein the training means of the operator processing model further comprises a model training unit for:
acquiring a service model to be trained and a calculation graph corresponding to the service model to be trained, wherein the calculation graph is used for describing a plurality of operators included in the service model to be trained and a calculation sequence among the operators;
Respectively taking the operators as the first operators to be tested, and obtaining first input parameters respectively corresponding to the operators;
And testing the operators through the target accelerator card based on the first input parameters and the calculation sequence corresponding to the operators respectively to obtain a trained service model corresponding to the service model to be trained.
13. The apparatus according to claim 11, wherein the training means of the operator processing model further comprises a first test result determining unit configured to:
Determining a business model to which the operator belongs;
determining test cases applicable to the service model and the operator according to the service model;
And operating the test case through the target accelerator card according to the sample input parameters to obtain the first test result.
14. The apparatus according to claim 11, wherein the training means of the operator processing model further comprises a sample input parameter determination unit for:
acquiring hardware configuration data of the target accelerator card;
And determining sample input parameters for the operator according to the hardware configuration data.
15. The apparatus of claim 11, wherein the target accelerator card comprises a first accelerator card and a second accelerator card, the first test result comprising a first test sub-result and a second test sub-result, the first test sub-result being a true result of the operator testing through the first accelerator card based on the sample input parameter, the second test sub-result being a true result of the operator testing through the second accelerator card based on the sample input parameter;
the prediction unit is specifically configured to:
According to the operator and the sample input parameters, predicting through a first initial operator processing model to obtain a first predictor result, wherein the first predictor result is a theoretical result obtained by testing the operator through the first accelerator card based on the sample input parameters;
according to the operator and the sample input parameters, predicting through a second initial operator processing model to obtain a second predictor result, wherein the second predictor result is a theoretical result obtained by testing the operator through the second accelerator card based on the sample input parameters;
The adjusting unit is specifically configured to:
Adjusting parameters of the first initial operator processing model based on a training direction of minimizing a first difference, so as to obtain a first operator processing model, wherein the first difference is a difference between the first predictor result and the first test result, and the first operator processing model is used for predicting a theoretical result obtained by testing the operator through the first accelerator card based on different input parameters;
And adjusting parameters of the second initial operator processing model based on a training direction of minimizing a second difference, so as to obtain a second operator processing model, wherein the second difference is a difference between the second predictor result and the second test result, and the second operator processing model is used for predicting a theoretical result obtained by testing the operator through the second accelerator card based on different input parameters.
16. The apparatus according to claim 15, wherein the training means of the operator processing model further comprises an input parameter determining unit for:
acquiring a second operator to be tested and a first accelerating card to be applied, wherein the first accelerating card to be applied is the first accelerating card or the second accelerating card;
determining an operator processing model to be applied, which is applicable to the first accelerator card to be applied, according to the first accelerator card to be applied, wherein the operator processing model to be applied is the first operator processing model or the second operator processing model;
Determining a plurality of second undetermined input parameters according to the second to-be-tested operator, wherein different second undetermined input parameters are used for describing different data dimensions of input data of the second to-be-tested operator;
According to the second operator to be tested and the target second undetermined input parameter, predicting through the operator processing model to be applied to obtain a third prediction result, wherein the third prediction result is a theoretical result obtained by testing the second operator to be tested through the first accelerator card to be applied based on the target second undetermined input parameter, and the target second undetermined input parameter is one second undetermined input parameter in the plurality of second undetermined input parameters;
Respectively taking the plurality of second undetermined input parameters as the target second undetermined input parameters to obtain third prediction results respectively corresponding to the second undetermined input parameters;
and determining a second input parameter corresponding to the second to-be-tested operator from the plurality of second undetermined input parameters according to the third prediction results respectively corresponding to the second undetermined input parameters.
17. The apparatus according to claim 15, wherein the training means of the operator processing model further comprises an input parameter determining unit for:
acquiring a third operator to be tested;
determining a plurality of third input parameters to be determined according to the third operator to be tested, wherein different third input parameters to be determined are used for describing different data dimensions of input data of the third operator to be tested;
According to the third operator to be tested and the target third input parameter to be determined, predicting through the first operator processing model to obtain a fourth prediction result, wherein the fourth prediction result is a theoretical result obtained by testing the third operator to be tested through the first accelerator card based on the target third input parameter to be determined, and the target third input parameter to be determined is one third input parameter to be determined in the plurality of third input parameters to be determined;
Predicting through the second operator processing model according to the third operator to be tested and the target third input parameter to be determined to obtain a fifth prediction result, wherein the fifth prediction result is a theoretical result obtained by testing the third operator to be tested through the second accelerator card based on the target third input parameter to be determined;
Respectively taking the plurality of third undetermined input parameters as the target third undetermined input parameters to obtain fourth prediction results respectively corresponding to the third undetermined input parameters and fifth prediction results respectively corresponding to the third undetermined input parameters;
And determining a third input parameter corresponding to the third to-be-tested operator and an acceleration card applicable to the third input parameter from the plurality of third to-be-tested input parameters according to a fourth prediction result corresponding to each third to-be-tested input parameter and a fifth prediction result corresponding to each third to-be-tested input parameter, wherein the acceleration card applicable to the third input parameter is the first acceleration card or the second acceleration card.
18. The apparatus of claim 11, wherein the target accelerator card comprises a first accelerator card and a second accelerator card, the first test result comprising a first test sub-result and a second test sub-result, the first test sub-result being a true result of the operator testing through the first accelerator card based on the sample input parameter, the second test sub-result being a true result of the operator testing through the second accelerator card based on the sample input parameter;
the prediction unit is specifically configured to:
According to the operator, the sample input parameters and the first acceleration card, predicting through the initial operator processing model to obtain a third predictor result, wherein the third predictor result is a theoretical result obtained by testing the operator through the first acceleration card based on the sample input parameters;
According to the operator, the sample input parameters and the second accelerator card, predicting through the initial operator processing model to obtain a fourth predictor result, wherein the fourth predictor result is a theoretical result obtained by testing the operator through the second accelerator card based on the sample input parameters;
The adjusting unit is specifically configured to:
And adjusting parameters of the initial operator processing model based on training directions of minimizing a third difference and minimizing a fourth difference to obtain the operator processing model, wherein the third difference is a difference between the third predictor result and the first test result, the fourth difference is a difference between the fourth predictor result and the second test result, and the operator processing model is used for predicting a theoretical result obtained by testing the operator through different accelerator cards based on different input parameters.
19. The apparatus according to claim 18, wherein the training means of the operator processing model further comprises an input parameter determining unit for:
acquiring a fourth operator to be tested and a second acceleration card to be applied, wherein the second acceleration card to be applied is the first acceleration card or the second acceleration card;
Determining a plurality of fourth input parameters to be determined according to the fourth to-be-tested operator, wherein different fourth input parameters to be determined are used for describing different data dimensions of input data of the fourth to-be-tested operator;
According to the fourth to-be-tested operator, the target fourth undetermined input parameter and the second to-be-applied accelerator card, predicting through the operator processing model to obtain a sixth prediction result, wherein the sixth prediction result is a theoretical result obtained by testing the fourth to-be-tested operator through the second to-be-applied accelerator card based on the target fourth undetermined input parameter, and the target fourth undetermined input parameter is one fourth undetermined input parameter of the plurality of fourth undetermined input parameters;
respectively taking the plurality of fourth undetermined input parameters as the target fourth undetermined input parameters to obtain sixth prediction results respectively corresponding to the fourth undetermined input parameters;
and determining a fourth input parameter corresponding to the fourth to-be-tested operator from the plurality of fourth undetermined input parameters according to the sixth prediction results respectively corresponding to the fourth undetermined input parameters.
20. The apparatus according to claim 19, wherein the training means of the operator processing model further comprises a model training unit for:
acquiring a service model to be trained and a calculation graph corresponding to the service model to be trained, wherein the calculation graph is used for describing a plurality of operators included in the service model to be trained and a calculation sequence among the operators;
Respectively taking the operators as the fourth operators to be tested to obtain fourth input parameters respectively corresponding to the operators;
and testing the operators through the second accelerator card to be applied based on fourth input parameters and the calculation sequence which are respectively corresponding to the operators, so as to obtain a trained service model corresponding to the service model to be trained.
21. A computer device, the computer device comprising a processor and a memory:
the memory is used for storing a computer program and transmitting the computer program to the processor;
The processor is configured to perform the method of any of claims 1-10 according to the computer program.
22. A computer readable storage medium, characterized in that the computer readable storage medium is for storing a computer program for executing the method of any one of claims 1-10.
23. A computer program product comprising a computer program which, when run on a computer device, causes the computer device to perform the method of any of claims 1-10.
CN202410061762.6A 2024-01-16 2024-01-16 Operator processing model training method and related device Active CN117574983B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410061762.6A CN117574983B (en) 2024-01-16 2024-01-16 Operator processing model training method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410061762.6A CN117574983B (en) 2024-01-16 2024-01-16 Operator processing model training method and related device

Publications (2)

Publication Number Publication Date
CN117574983A CN117574983A (en) 2024-02-20
CN117574983B true CN117574983B (en) 2024-04-30

Family

ID=89864828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410061762.6A Active CN117574983B (en) 2024-01-16 2024-01-16 Operator processing model training method and related device

Country Status (1)

Country Link
CN (1) CN117574983B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116012633A (en) * 2021-10-19 2023-04-25 华为技术有限公司 Operator optimization method and related equipment
CN114816954A (en) * 2022-04-15 2022-07-29 中国人民解放军国防科技大学 Performance prediction method for deep learning model training and related equipment
CN116090536A (en) * 2023-02-07 2023-05-09 Oppo广东移动通信有限公司 Neural network optimization method, device, computer equipment and storage medium
CN116578465A (en) * 2023-05-09 2023-08-11 阿里巴巴(中国)有限公司 Deep learning model performance detection system and method and prediction model generation method
CN116450486A (en) * 2023-06-16 2023-07-18 浪潮电子信息产业股份有限公司 Modeling method, device, equipment and medium for nodes in multi-element heterogeneous computing system
CN117170685A (en) * 2023-11-02 2023-12-05 腾讯科技(深圳)有限公司 Data processing method, device, equipment and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Modeling Adaptive Learning Agents for Domain Knowledge Transfer; Moritz Höser et al.; 2019 ACM/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C); 2019-11-21; pp. 660-665 *
TVMT: A High-Performance Neural Network Training Compiler Based on TVM; Zeng Jun et al.; Scientia Sinica Informationis; 2023-12-31; Vol. 53, No. 12; pp. 2458-2471 *

Also Published As

Publication number Publication date
CN117574983A (en) 2024-02-20

Similar Documents

Publication Publication Date Title
CN110599557A (en) Image description generation method, model training method, device and storage medium
CN108701149A (en) A kind of intelligent recommendation method and terminal
CN110163367A (en) A kind of model compression method and apparatus
CN109902296A (en) Natural language processing method, training method and data processing equipment
CN111813532A (en) Image management method and device based on multitask machine learning model
CN110069715A (en) A kind of method of information recommendation model training, the method and device of information recommendation
CN111432347B (en) Information processing method, information processing apparatus, storage medium, and electronic device
CN113723378B (en) Model training method and device, computer equipment and storage medium
CN111340220A (en) Method and apparatus for training a predictive model
CN113822460A (en) Traffic flow prediction method and device, electronic equipment and storage medium
CN113821720A (en) Behavior prediction method and device and related product
CN112862021B (en) Content labeling method and related device
CN117236805B (en) Power equipment control method, device, electronic equipment and computer readable medium
CN113111917B (en) Zero sample image classification method and device based on dual self-encoders
CN113766633A (en) Data processing method, data processing device, electronic equipment and storage medium
CN117574983B (en) Operator processing model training method and related device
CN116910357A (en) Data processing method and related device
CN116204709A (en) Data processing method and related device
KR102561799B1 (en) Method and system for predicting latency of deep learning model in device
CN116957678A (en) Data processing method and related device
CN116450384A (en) Information processing method and related device
CN113283115B (en) Image model generation method and device and electronic equipment
WO2023051678A1 (en) Recommendation method and related device
CN117056152B (en) Equipment detection method and related device
CN114706731B (en) Method for real-time dynamic monitoring of intelligent service

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant