CN116755862A - Training method, device, medium and equipment for operator optimized scheduling model - Google Patents


Info

Publication number
CN116755862A
Authority
CN
China
Prior art keywords
operator
optimized
optimization
determining
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311010092.7A
Other languages
Chinese (zh)
Other versions
CN116755862B (en)
Inventor
王鹏程
陈自强
吕波
胡陈枢
李勇
程稳
曾令仿
陈�光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202311010092.7A priority Critical patent/CN116755862B/en
Publication of CN116755862A publication Critical patent/CN116755862A/en
Application granted granted Critical
Publication of CN116755862B publication Critical patent/CN116755862B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The specification discloses a training method, apparatus, medium, and device for an operator optimization scheduling model, comprising the following steps: determining, at the current moment, information of each operator in a training sample — an image classification model trained in advance on image data — and inputting the information into an operator optimization scheduling model to be trained to determine the operator to be optimized at the current moment; determining the runtime reduction (the decrease in running time when the operator performs image classification on the image data) obtained after the operator to be optimized is optimized; and training the operator optimization scheduling model according to the information, the operator to be optimized, and the runtime reduction. The trained operator optimization scheduling model can then determine which operator should be scheduled for optimization at the current moment, which reduces the effort of manually designing a strategy for selecting the operators to be optimized and speeds up the subsequent deployment of an image classification model on hardware.

Description

Training method, device, medium and equipment for operator optimized scheduling model
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a training method, apparatus, medium, and device for an operator optimization scheduling model.
Background
With the continuous development of technology, deep learning models are increasingly widely applied. Typically, a deep learning model is first trained offline within a model framework and, once trained, is deployed on a hardware device to run and execute services.
At present, when a deep learning model is deployed on a hardware device, the operators in the model can first be optimized and the optimized model then deployed, which speeds up deployment and also speeds up service execution when the model runs. For example, when an image classification model is deployed on a hardware device, its operators can first be optimized and the optimized model then deployed, which accelerates deployment and also accelerates classification when the model later classifies images. How to train an operator optimization scheduling model that schedules the optimization of operators in a deep learning model is therefore an important problem.
Based on the above, the present specification provides a training method for an operator optimization scheduling model.
Disclosure of Invention
The present specification provides a training method, apparatus, medium, and device for an operator optimization scheduling model, so as to at least partially solve the above problems in the prior art.
The technical solution adopted in the present specification is as follows:
The present specification provides a training method for an operator optimization scheduling model, comprising:
acquiring, as a training sample, an image classification model trained in advance on image data in a training data set;
determining information of each operator in the training sample at the current moment, the information comprising at least the floating-point operation count of each operator, the historical optimization count of each operator, and the runtime reduction of each operator — the decrease in its running time when performing image classification on the image data — obtained from optimizations performed before the current moment;
inputting the information into the operator optimization scheduling model to be trained, and determining the operator that needs to be optimized at the current moment as the operator to be optimized;
determining the runtime reduction of the operator to be optimized when it performs image classification on the image data after being optimized;
training the operator optimization scheduling model to be trained with a policy gradient algorithm according to the information, the operator to be optimized, and the runtime reduction, wherein the trained operator optimization scheduling model is used to determine, from the information of each operator in an image classification model to be deployed, the operators that need to be optimized; the determined operators are optimized to obtain the optimized image classification model to be deployed, which is run on target hardware to perform image classification on a target image and determine the classification result.
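The claim above can be sketched as a REINFORCE-style update: a policy scores each operator from its features, one operator is sampled for optimization, and the observed runtime reduction serves as the reward. The linear policy and all names below are illustrative assumptions, not the patent's implementation.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def policy_scores(weights, operator_features):
    # Linear scoring: one score per operator from its feature vector
    # (FLOPs, optimization count, past runtime reduction, ...).
    return [sum(w * f for w, f in zip(weights, feats))
            for feats in operator_features]

def reinforce_step(weights, operator_features, lr, optimize_fn):
    probs = softmax(policy_scores(weights, operator_features))
    idx = random.choices(range(len(probs)), weights=probs, k=1)[0]
    reward = optimize_fn(idx)  # observed runtime reduction for this operator
    # Policy gradient of log pi(idx) for a linear softmax policy:
    # feats[idx][k] - sum_j probs[j] * feats[j][k], scaled by the reward.
    for k in range(len(weights)):
        expected = sum(p * feats[k] for p, feats in zip(probs, operator_features))
        weights[k] += lr * reward * (operator_features[idx][k] - expected)
    return idx, reward
```

Repeated over many time slices, the policy learns to prefer operators whose optimization yields large runtime reductions.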
Optionally, determining the runtime reduction of the operator to be optimized after it is optimized specifically comprises:
determining the running time of the operator to be optimized when performing image classification on the image data before it is optimized, as the historical running time, and its running time when performing image classification on the image data after it is optimized, as the optimized running time;
taking the historical running time minus the optimized running time as the runtime reduction of the operator to be optimized after it is optimized.
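A minimal sketch of the reward definition in this optional claim — historical running time minus optimized running time. Function names are illustrative assumptions.

```python
import time

def measure_runtime_s(classify_with_operator, *args):
    # Wall-clock running time of one image-classification pass.
    start = time.perf_counter()
    classify_with_operator(*args)
    return time.perf_counter() - start

def runtime_reduction_s(historical_runtime_s, optimized_runtime_s):
    # Positive when optimization helped; a negative value would mean
    # the optimization actually slowed the operator down.
    return historical_runtime_s - optimized_runtime_s
```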
Optionally, determining the information of each operator in the training sample at the current moment specifically comprises:
determining the information of each operator in the training sample before the current time slice starts, as the input information corresponding to the current time slice, wherein the current time slice is a time slice of specified length whose starting moment is the current moment;
correspondingly, inputting the information into the operator optimization scheduling model to be trained and determining the operator to be optimized specifically comprises:
inputting the input information corresponding to the current time slice into the operator optimization scheduling model to be trained, and determining the operator that needs to be optimized within the current time slice as the operator to be optimized.
Optionally, determining the runtime reduction of the operator to be optimized after it is optimized specifically comprises:
determining the running time of the operator to be optimized when performing image classification on the image data before the current time slice, as the historical running time, and its running time when performing image classification on the image data within the current time slice after it is optimized, as the optimized running time;
taking the historical running time minus the optimized running time as the runtime reduction of the operator to be optimized after it is optimized within the current time slice.
Optionally, training the operator optimization scheduling model to be trained with a policy gradient algorithm according to the information, the operator to be optimized, and the runtime reduction specifically comprises:
judging whether an optimization-end condition is met;
if so, training the operator optimization scheduling model to be trained with a policy gradient algorithm according to the input information, the operator to be optimized, and the runtime reduction corresponding to each time slice, wherein the time slices are obtained by dividing a preset optimization duration into slices of the specified length;
if not, taking the next time slice after the current one as the new current time slice, continuing to determine the information of each operator in the training sample before it starts as its input information, re-determining the operator to be optimized and its runtime reduction, and repeating until the optimization-end condition is met, wherein the next time slice is the time slice of specified length whose starting moment is the end of the current time slice.
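The time-slice loop above can be sketched as follows: a preset optimization budget is split into fixed-length slices; in each slice the scheduler picks one operator, optimizes it for that slice, and logs the (input information, operator, runtime reduction) triple; the policy-gradient update runs once the budget is exhausted. All names are assumptions.

```python
def run_episode(total_budget, slice_len, get_operator_info, pick_operator,
                optimize_for_slice):
    """Collect one trajectory of (info, operator, runtime reduction) triples."""
    trajectory = []
    elapsed = 0.0
    while elapsed < total_budget:        # optimization-end condition
        info = get_operator_info()       # operator info before this slice
        op = pick_operator(info)         # scheduler's choice for this slice
        reduction = optimize_for_slice(op, slice_len)
        trajectory.append((info, op, reduction))
        elapsed += slice_len             # advance to the next time slice
    return trajectory                    # fed to the policy-gradient update
```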
Optionally, the method further comprises:
determining information of each operator in an image classification model to be deployed at the current moment, wherein the image classification model to be deployed is a model trained in advance on image data, and the information comprises at least the floating-point operation count of each operator, the historical optimization count of each operator, and the runtime reduction of each operator when performing image classification on the image data, obtained from optimizations performed before the current moment;
inputting the information into the trained operator optimization scheduling model, and determining the operator that needs to be optimized at the current moment as the operator to be optimized;
optimizing the operator to be optimized until an optimization-end condition is met;
generating, from the operators after optimization is finished, an executable file of the image classification model to be deployed that runs on the target hardware;
parsing the executable file to obtain a parsing result;
determining the target image to be classified, running the executable file on the target hardware according to the parsing result to perform image classification on the target image, and determining the classification result.
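The deployment-side steps above can be sketched end to end: greedy operator selection with the trained scheduler, compilation of the optimized model into an executable artifact, then parsing and running it on the target hardware. The toolchain methods (`compile_to_executable`, `parse_executable`, `run_on_target`) are placeholders for whatever compiler stack is actually used, not a real API.

```python
def deploy_and_classify(model, scheduler, end_condition, toolchain, target_image):
    # Optimize operators one at a time until the end condition is met.
    while not end_condition():
        info = toolchain.operator_info(model)
        op = scheduler.pick(info)          # trained model: greedy choice
        toolchain.optimize(model, op)
    artifact = toolchain.compile_to_executable(model)  # executable file
    parsed = toolchain.parse_executable(artifact)      # parsing result
    return toolchain.run_on_target(parsed, target_image)  # classification
```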
Optionally, determining the information of each operator in the image classification model to be deployed at the current moment specifically comprises:
determining the information of each operator in the image classification model to be deployed before the current time slice starts, as the input information corresponding to the current time slice, wherein the current time slice is a time slice of specified length whose starting moment is the current moment;
correspondingly, inputting the information into the trained operator optimization scheduling model and determining the operator to be optimized specifically comprises:
inputting the input information corresponding to the current time slice into the trained operator optimization scheduling model, and determining the operator that needs to be optimized within the current time slice as the operator to be optimized.
Optionally, optimizing the operator to be optimized until an optimization-end condition is met specifically comprises:
optimizing the operator to be optimized until the current time slice ends;
judging whether the optimization-end condition is met;
if so, determining that the optimization of the operators is finished;
if not, taking the next time slice after the current one as the new current time slice, continuing to determine the information of each operator before it starts, re-determining the operator to be optimized and optimizing it, and repeating until the optimization-end condition is met, wherein the next time slice is the time slice of specified length whose starting moment is the end of the current time slice.
Optionally, determining the information of each operator in the image classification model to be deployed at the current moment specifically comprises:
determining a first computational graph of the image classification model to be deployed;
performing graph optimization on the first computational graph to obtain a second computational graph;
determining the information of each operator in the second computational graph at the current moment.
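As a toy illustration of the graph-optimization step above, the sketch below rewrites a first "graph" (a linear list of operator names, purely for illustration) into a second one by fusing adjacent conv+relu pairs — a common graph-level optimization. A real pass operates on a full dataflow graph, not a list.

```python
def fuse_conv_relu(first_graph):
    """Fuse each adjacent ('conv', 'relu') pair into one 'conv_relu' operator."""
    second_graph = []
    i = 0
    while i < len(first_graph):
        if (i + 1 < len(first_graph)
                and first_graph[i] == "conv" and first_graph[i + 1] == "relu"):
            second_graph.append("conv_relu")   # fused operator
            i += 2
        else:
            second_graph.append(first_graph[i])
            i += 1
    return second_graph
```

Operator information (FLOPs, optimization counts, runtime reductions) would then be gathered from the fused, second graph rather than the original one.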
The present specification provides a training apparatus for an operator optimization scheduling model, comprising:
an acquisition module, configured to acquire, as a training sample, an image classification model trained in advance on image data in a training data set;
a first determination module, configured to determine information of each operator in the training sample at the current moment, the information comprising at least the floating-point operation count of each operator, the historical optimization count of each operator, and the runtime reduction of each operator when performing image classification on the image data, obtained from optimizations performed before the current moment;
a scheduling module, configured to input the information into the operator optimization scheduling model to be trained and determine the operator that needs to be optimized at the current moment as the operator to be optimized;
a second determination module, configured to determine the runtime reduction of the operator to be optimized when it performs image classification on the image data after being optimized;
a training module, configured to train the operator optimization scheduling model to be trained according to the information, the operator to be optimized, and the runtime reduction, wherein the trained operator optimization scheduling model is used to determine, from the information of each operator in an image classification model to be deployed, the operators that need to be optimized; the determined operators are optimized, and the optimized image classification model to be deployed is run on target hardware to perform image classification on a target image and determine the classification result.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the training method of the operator optimization scheduling model described above.
The present specification provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the training method of the operator optimization scheduling model described above when executing the program.
At least one of the technical solutions adopted in the present specification can achieve the following beneficial effects:
In the training method of the operator optimization scheduling model provided in the present specification, an image classification model trained in advance on image data in a training data set is first acquired as a training sample; the information of each operator in the training sample at the current moment is determined and input into the operator optimization scheduling model to be trained, and the operator that needs to be optimized at the current moment is determined as the operator to be optimized. The runtime reduction of the operator to be optimized when it performs image classification on the image data after being optimized is then determined, and the operator optimization scheduling model to be trained is trained with a policy gradient algorithm according to the information, the operator to be optimized, and the runtime reduction.
As can be seen from the above, once training is complete, the trained operator optimization scheduling model can determine, for an image classification model to be deployed, which operator should be scheduled for optimization at the current moment and optimize it. This reduces the effort of manually designing a strategy for selecting the operators to be optimized; the image classification model with optimized operators is then deployed on target hardware to classify images and obtain classification results, which speeds up deploying the model on hardware and improves the classification speed of the deployed model.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, illustrate and explain its exemplary embodiments together with their description and are not intended to unduly limit the specification. In the drawings:
FIG. 1 is a flow chart of a training method of an operator optimized scheduling model provided in the present specification;
FIG. 2 is a schematic flow chart of an operator optimized schedule provided in the present specification;
FIG. 3 is a schematic diagram of an operator optimized schedule provided in the present specification;
FIG. 4 is a schematic illustration of a computational graph of an image classification model to be deployed provided in the present specification;
FIG. 5 is a schematic diagram of a training apparatus structure of an operator optimized scheduling model provided in the present specification;
fig. 6 is a schematic structural diagram of an electronic device corresponding to fig. 1 provided in the present specification.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a flow chart of the training method of an operator optimization scheduling model provided in the present specification, comprising the following steps:
S100: acquiring, as a training sample, an image classification model trained in advance on the image data in the training data set.
In the present specification, the device that trains the operator optimization scheduling model may acquire an image classification model trained in advance on images in a training data set and use it as a training sample. The device may be a server, or an electronic device such as a desktop or laptop computer. For convenience of description, the training method provided in this specification is described below with the server as the executing body. The training data set may be a pre-constructed data set containing deep learning models; the models it contains may be common architectures, such as a deep residual network (ResNet), or custom deep learning models, which this specification does not specifically limit. A deep learning model in the training data set may be an image classification model trained in advance on image data; the image data may be images captured in advance by an acquisition device such as a video camera or still camera, or images from any existing general-purpose image data set, which this specification likewise does not limit. Specifically, the server may train the image classification model using pre-collected images as training samples and the class corresponding to each image as its label — for example, using a number of animal images as training samples, with the class of the animal in each image as its label.
Of course, the training data set may also contain text processing models trained on text data. The text data may be pre-collected text such as movie reviews, human-machine dialogues, or poems, and the text processing model may be a model for sentiment classification, topic classification, text generation, or the like, trained on such data; this specification does not specifically limit it.
S102: determining information of each operator in the training sample at the current moment, wherein the information at least comprises floating point operation times of each operator, historical optimization times of each operator and operation descending time when each operator performs image classification on the image data after optimizing each operator before the current moment.
The server may determine information for each operator in the training sample at the current time. The information of each operator at least comprises floating point operation times of each operator, historical optimization times of each operator and operation descending time when each operator performs image classification on the image data after optimizing each operator before the current moment. The information of each operator can also comprise the type of each operator, and the type can comprise various types of operators such as data acquisition, statistics, format conversion and the like. The historical optimization times of each operator are the times that each operator is optimized before the current moment, and the server can count the times that each operator is optimized before the current moment.
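The per-operator information listed above can be sketched as a small record that tracks an operator's FLOP count, how many times it has been optimized before the current moment, and its most recent runtime reduction. The field and method names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class OperatorInfo:
    name: str
    flops: int                        # floating-point operation count
    optimize_count: int = 0           # times optimized before the current moment
    runtime_reduction_s: float = 0.0  # latest runtime reduction when classifying

    def record_optimization(self, reduction_s):
        self.optimize_count += 1
        self.runtime_reduction_s = reduction_s

    def features(self):
        # Feature vector fed to the operator optimization scheduling model.
        return [float(self.flops), float(self.optimize_count),
                self.runtime_reduction_s]
```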
In this specification, optimizing an operator reduces its running time when classifying the image data. For example, an optimization may adjust the operator's computation scheme so that it computes faster, which shortens the operator's running time when the optimized operator classifies the image data. Therefore, for each operator, the server can determine its running time when classifying the image data while unoptimized before the current moment and its running time after the optimizations performed before the current moment, and take the former minus the latter as the operator's runtime reduction. Here, the running time before optimization is the operator's running time when the image data is input into the image classification model with that operator unoptimized, and the running time after optimization is its running time when the image data is input into the model with that operator optimized.
These two running times may be preset, or may be determined by the server from historical optimization data generated when the image classification model serving as the training sample was optimized in the past. The historical optimization data may include the overall running time of the model when classifying images before optimization, the overall running time after optimization, the running time of each operator when classifying images before optimization, and the running time of each operator after optimization. From this data, the server can determine each operator's running time before the current moment when it was unoptimized and when it was optimized.
For example, if the running time of an operator when classifying image 1 without having been optimized before the current time is 0.5s, and its running time when classifying image 1 after having been optimized before the current time is 0.3s, then the run-down time of the operator when classifying image 1 after being optimized before the current time is 0.2s.
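The subtraction above can be sketched as a small helper; the function name is illustrative, not part of the specification:

```python
def run_down_time(unoptimized_runtime: float, optimized_runtime: float) -> float:
    """Run-down time = running time before optimization - running time after."""
    return unoptimized_runtime - optimized_runtime

# The example above: 0.5 s before optimization, 0.3 s after.
delta = run_down_time(0.5, 0.3)  # ≈ 0.2 s
```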
Of course, some operators may never have been optimized before the current time, so the run-down time of such an operator when classifying image data is 0. In addition, an operator may have been optimized several times before the current time. When determining the run-down time of such an operator, the optimization closest to the current time is identified, the running time of the operator when classifying the image data before that optimization and its running time after that optimization are determined, and the run-down time is computed from these two running times.
In this specification, there are various ways of optimizing an operator, for example matrix multiplication, the Winograd transform method, and the FFT transform method. The optimization modes of the operators may be the same or different, and the optimization mode corresponding to an operator may be determined according to its type; this specification does not limit this. In addition, for a given operator, different optimization modes may yield the same or different running times after optimization, and therefore the same or different run-down times. Of course, because operators are of different types, the same optimization mode may also yield different running times, and thus different run-down times, across operators.
Therefore, according to the optimization mode corresponding to each operator and the type of each operator, the server can preset the running time of each operator when classifying image data after being optimized with each available optimization mode. The server can then determine, for an operator, the running time when classifying image data before optimization and the running time after optimization, and compute the operator's run-down time from these two running times.
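A minimal sketch of such a preset lookup, keyed by operator type and optimization mode; the type names, mode names, and run-down values are purely illustrative assumptions:

```python
# Hypothetical preset table: run-down time (seconds) by (operator type, mode).
RUN_DOWN_TABLE = {
    ("conv2d", "winograd"): 0.15,
    ("conv2d", "fft"):      0.10,
    ("matmul", "matmul"):   0.08,
}

def preset_run_down(op_type: str, mode: str) -> float:
    # Combinations not in the table are assumed to yield no improvement.
    return RUN_DOWN_TABLE.get((op_type, mode), 0.0)
```

The same operator type can thus map to different run-down times under different optimization modes, as described above.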
S104: inputting the information into an operator optimization scheduling model to be trained, and determining the operator that needs to be optimized at the current time as the operator to be optimized.
S106: determining the run-down time of the operator to be optimized when classifying the image data after it is optimized.
S108: training the operator optimization scheduling model with a policy gradient algorithm according to the information, the operator to be optimized, and the run-down time of the operator to be optimized when classifying the image data. The trained operator optimization scheduling model is used to determine, from the information of each operator in an image classification model to be deployed, the operators that need to be optimized; the determined operators are optimized, the optimized image classification model to be deployed is obtained, and the optimized model is run on target hardware to classify a target image and determine the classification result.
The server can input the information into the operator optimization scheduling model to be trained and determine the operator that needs to be optimized at the current time as the operator to be optimized, then determine the run-down time of the operator to be optimized when classifying the image data after it is optimized, and train the model with a policy gradient algorithm according to the information, the operator to be optimized, and that run-down time. When determining this run-down time, the server can take the running time of the operator to be optimized when classifying the image data before optimization as the historical running time, take its running time after optimization as the optimized running time, and use the difference of the historical running time minus the optimized running time as the run-down time.
The optimized running time of the operator to be optimized may be a preset time. If the operator to be optimized has never been optimized before the current time, its historical running time is the initial running time, i.e., the running time of the operator, not yet optimized, when the image data is input into the image classification model and classified; the initial running time may be preset, or may be measured when the image data is input into the image classification model for classification. If the operator to be optimized has been optimized before the current time, its historical running time is the running time when classifying the image data after the optimization closest to, and before, the current time.
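The choice of historical running time described above can be sketched as follows; the function names and the list-based history are illustrative assumptions:

```python
def historical_runtime(initial_runtime, optimized_runtimes):
    """Running time of the operator before the current optimization.

    optimized_runtimes holds the running time after each past optimization,
    oldest first; it is empty if the operator has never been optimized.
    """
    if not optimized_runtimes:
        return initial_runtime       # never optimized: use the initial running time
    return optimized_runtimes[-1]    # otherwise: running time after the most recent optimization

def run_down(initial_runtime, optimized_runtimes, optimized_runtime):
    """Run-down time = historical running time - optimized running time."""
    return historical_runtime(initial_runtime, optimized_runtimes) - optimized_runtime
```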
In this specification, when the number of times the operator optimization scheduling model has been trained reaches a preset number, training is complete. The trained model can then determine, from the information of each operator in an image classification model to be deployed, the operators that need to be optimized; the server can optimize the determined operators, obtain the optimized image classification model to be deployed, run it on target hardware to classify a target image, and determine the classification result. The image classification model to be deployed may be a model trained in advance by the server on image data, or a model sent by a user; specifically, the server may determine the image classification model to be deployed in response to a user input operation, and the model may have been trained by the user in advance on image data; this specification does not limit this. The target image may be the image data used to train the image classification model, or any other image, such as an animal image or a plant image; of course, it may also be an image input by the user, which this specification does not limit. The classification result is the type to which the target image belongs; for example, when the target image is a cat image, the classification result may be "cat".
According to the above method, when training the operator optimization scheduling model, the server can obtain from the training data set an image classification model trained in advance on image data as a training sample, determine the information of each operator in the training sample at the current time, input the information into the operator optimization scheduling model to be trained, and determine the operator that needs to be optimized at the current time as the operator to be optimized. It then determines the run-down time of the operator to be optimized when classifying the image data after optimization, and trains the model with a policy gradient algorithm according to the information, the operator to be optimized, and that run-down time. In this way, the trained operator optimization scheduling model can determine the operator that should be scheduled for optimization at the current time, and the determined operator is optimized. This reduces the burden of manually designing strategies for selecting operators to optimize: the scheduling strategy is learned automatically through training, the operator optimization of the image classification model to be deployed is more effective, deployment of the optimized model on target hardware is faster, and the classification speed of the optimized model is improved.
In this specification, the time available for optimizing operators may be limited, and the optimization potential of each operator, i.e., the run-down time of the operator when classifying image data after it is optimized, differs. Even if the entire optimization time is spent on a single operator, its run-down time when classifying the image data may still be small; such an operator has a poor optimization effect and little optimization potential. For another operator, spending the optimization time on it alone may yield a large run-down time; such an operator has a good optimization effect and large optimization potential, and even if only part of the optimization time is spent on it, its run-down time may still be large. Therefore, to ensure a good overall operator optimization effect of the whole model within the optimization time, the optimization time can be divided in advance by a specified length to obtain time slices. For each time slice in order, the operator to be optimized in that slice is determined and optimized within the slice, until the optimization ending condition is met. The optimization ending condition may be reaching the optimization time.
Based on this, in step S102 above, the server may determine the information of each operator in the training sample before the current time slice starts, as the input information corresponding to the current time slice. The current time slice is the time slice of the specified length whose starting time is the current time. The input information includes at least the number of floating-point operations of each operator, the historical optimization count of each operator, and the run-down time of each operator when classifying the image data after being optimized before the current time. It may also include the type of each operator, which may cover data collection, statistics, format conversion, and other operator types, as well as the historical optimization ratio of each operator. Specifically, for each operator, the server may determine the number of times the operator was optimized in the time slices before the current one as its historical optimization count, and determine the historical optimization ratio from that count and the number of time slices before the current one. For example, if operator 1 was optimized 3 times before the current time slice and there are 10 time slices before the current one, the historical optimization ratio of operator 1 is 0.3.
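The ratio in the example can be sketched as follows; the function name and the zero-history convention are assumptions for illustration:

```python
def historical_optimization_ratio(times_optimized: int, past_time_slices: int) -> float:
    """Ratio of past time slices in which the operator was optimized."""
    if past_time_slices == 0:
        return 0.0  # first time slice: no history yet (assumed convention)
    return times_optimized / past_time_slices

# Example from the text: operator 1 optimized 3 times over 10 past slices.
ratio = historical_optimization_ratio(3, 10)  # 0.3
```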
In this specification, when determining the run-down time of each operator when classifying the image data after being optimized before the current time: for each operator, if the operator has never been optimized before the current time, its run-down time is 0. If the operator has been optimized before the current time, the time slices in which it was historically optimized are determined, and among them the time slice closest to the current time slice is taken as the historical time slice. The running time of the operator when classifying the image data at the start of the historical time slice and its running time after being optimized within the historical time slice are then determined, and the run-down time is computed from these two running times.
Based on this, in step S104 above, the server may input the input information corresponding to the current time slice into the operator optimization scheduling model to be trained and determine the operator to be optimized in the current time slice. In step S106 above, the server may determine the running time of the operator to be optimized when classifying the image data before the current time slice as the historical running time, determine its running time after being optimized within the current time slice as the optimized running time, and take the difference of the historical running time minus the optimized running time as the run-down time of the operator to be optimized when classifying the image data after being optimized in the current time slice.
Since the current time is the start of the current time slice, the historical running time is the running time of the operator to be optimized when classifying the image data at that starting time. If the operator to be optimized has never been optimized before the current time slice, the historical running time is its initial running time when classifying the image data. If it has been optimized in a time slice before the current one, the time slice before and closest to the current time slice in which it was optimized is determined, and the historical running time is its running time when classifying the image data after being optimized in that time slice.
Because the time slices are obtained by dividing the optimization time in advance, there may or may not be time slices after the current one; that is, the optimization time may or may not be reached when the current time slice ends. Therefore, in step S108 above, the server may judge whether the optimization ending condition is met. If so, it trains the operator optimization scheduling model with a policy gradient algorithm according to the input information, operator to be optimized, and run-down time corresponding to each time slice. If not, it takes the next time slice as the current time slice, again determines the information of each operator in the training sample before the current time slice starts as the input information of the current time slice, re-determines the operator to be optimized and its run-down time when classifying the image data, and repeats until the optimization ending condition is met. The optimization ending condition may be that no other time slice exists after the current one, i.e., the optimization time is reached when the current time slice ends. The next time slice of the current one is the time slice of the specified length whose starting time is the ending time of the current one; each time slice is obtained by dividing the preset optimization time by the specified length.
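The per-time-slice loop and the final policy-gradient update can be sketched in a minimal REINFORCE-style form. This is a heavily simplified assumption, not the actual model: a linear softmax policy stands in for the scheduling model, a hard-coded table simulates measured run-down times, and the feature layout and learning rate are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n_ops, n_feats = 3, 3              # operators; features: bias, opt. count, last run-down
theta = np.zeros(n_feats)          # weights of the (assumed linear) scheduling policy

def policy(features):
    """Softmax over per-operator scores; returns selection probabilities."""
    scores = features @ theta
    e = np.exp(scores - scores.max())
    return e / e.sum()

def simulated_run_down(op, count):
    """Stand-in for measuring the real run-down time (diminishing returns)."""
    return max(0.0, [0.30, 0.10, 0.02][op] - 0.05 * count)

episode, counts, lr = [], [0] * n_ops, 0.1
for time_slice in range(6):        # one scheduling decision per time slice
    feats = np.array([[1.0, counts[i], simulated_run_down(i, counts[i])]
                      for i in range(n_ops)])
    probs = policy(feats)
    op = rng.choice(n_ops, p=probs)            # operator to be optimized in this slice
    reward = simulated_run_down(op, counts[op])  # its run-down time is the reward
    counts[op] += 1
    episode.append((feats, op, reward))

# Policy-gradient update once the optimization time is reached.
for feats, op, reward in episode:
    probs = policy(feats)
    grad = feats[op] - probs @ feats  # grad of log pi(op | feats) for linear softmax
    theta += lr * reward * grad
```

The design mirrors the text: decisions are collected per time slice, and the update is applied only after the optimization ending condition is met.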
In this specification, if other time slices exist after the current one, the optimization time has not yet been reached and operators to be optimized can still be determined and optimized. The server may therefore take the next time slice as the current one and execute steps S102 to S108 for it, until the optimization ending condition is met. If no other time slice exists after the current one, the optimization time has been reached, and the server can train the operator optimization scheduling model with a policy gradient algorithm according to the input information, operator to be optimized, and run-down time corresponding to each time slice.
This specification also provides an operator optimization scheduling method, in which the trained operator optimization scheduling model is used at model deployment time to determine the operators that need to be optimized, and the determined operators are optimized. As shown in fig. 2, fig. 2 is a schematic flow chart of operator optimization scheduling provided in this specification, which specifically includes the following steps:
S200: determining information of each operator in an image classification model to be deployed at the current time, where the image classification model to be deployed is a model trained in advance on image data, and the information includes at least the number of floating-point operations of each operator, the historical optimization count of each operator, and the run-down time of each operator when classifying the image data after being optimized before the current time.
The server can determine the information of each operator in the image classification model to be deployed at the current time. The information includes at least the number of floating-point operations of each operator, the historical optimization count of each operator, and the run-down time of each operator when classifying the image data after being optimized before the current time. It may also include the type of each operator, which may cover data collection, statistics, format conversion, and other operator types. The historical optimization count of an operator is the number of times it has been optimized before the current time, which the server can count. The image classification model to be deployed may be a model trained in advance by the server on image data, or a model sent by a user; specifically, the server may determine it in response to a user input operation, and it may have been trained by the user in advance on image data; this specification does not limit this.
In this specification, since the running time of an operator when classifying image data decreases after it is optimized, the server can determine, for each operator, the running time when the operator classifies the image data without having been optimized before the current time and the running time when the operator classifies the image data after having been optimized before the current time, and take the difference of the former minus the latter as the run-down time of the operator when classifying the image data after being optimized before the current time.
Of course, some operators may never have been optimized before the current time, so the run-down time of such an operator when classifying image data is 0. In addition, an operator may have been optimized several times before the current time; when determining its run-down time, the optimization closest to the current time is identified, the running time of the operator when classifying the image data before that optimization and its running time after that optimization are determined, and the run-down time is computed from these two running times.
In this specification, there are various ways of optimizing an operator, for example matrix multiplication, the Winograd transform method, and the FFT transform method. The optimization modes of the operators may be the same or different, and the optimization mode corresponding to an operator may be determined according to its type; this specification does not limit this. In addition, for a given operator, different optimization modes may yield the same or different running times when classifying image data after optimization, and therefore the same or different run-down times. Of course, because operators are of different types, the same optimization mode may also yield different running times, and thus different run-down times, across operators.
S202: inputting the information into the trained operator optimization scheduling model, and determining the operator that needs to be optimized at the current time as the operator to be optimized.
S204: optimizing the operator to be optimized until the optimization ending condition is met.
The server can input the information into the trained operator optimization scheduling model, determine the operator that needs to be optimized at the current time as the operator to be optimized, and then optimize it until the optimization ending condition is met. The optimization ending condition may be that the operator to be optimized is fully optimized: the operator may be optimized several times, and when its running time when classifying the image data after an optimization no longer decreases compared with before that optimization, the operator is fully optimized. In addition, since the time for optimizing operators cannot be unlimited, an optimization time can be preset and operators optimized within it; the optimization ending condition may also be reaching the optimization time.
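Both ending conditions described above (the runtime no longer decreasing, or the optimization time being reached) can be sketched in one loop; the measurement and optimization hooks are assumed placeholders supplied by the caller:

```python
import time

def optimize_until_done(measure_runtime, optimize_once, time_budget_s):
    """Optimize one operator until its runtime stops decreasing or the
    optimization time budget runs out; returns the best runtime reached."""
    deadline = time.monotonic() + time_budget_s
    best = measure_runtime()
    while time.monotonic() < deadline:
        optimize_once()
        current = measure_runtime()
        if current >= best:      # no further decrease: operator is fully optimized
            return best
        best = current
    return best                  # optimization time reached
```

For example, with simulated measurements of 0.5s, 0.4s, then 0.4s, the loop stops as soon as the runtime stops decreasing and returns 0.4.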
Specifically, the server may input the information into the trained operator optimization scheduling model and determine the operator that needs to be optimized at the current time as the operator to be optimized, then determine the optimization mode corresponding to the operator to be optimized and optimize it with that mode until the operator is fully optimized or the optimization time is reached.
Based on this, when the optimization ending condition is reaching the optimization time, the operator to be optimized may become fully optimized after several optimizations while the optimization time has not yet been reached. The server may then again determine the information of each operator, input it into the operator optimization scheduling model, re-determine the operator to be optimized, and optimize the re-determined operator, until the optimization time is reached. The information determined here is the information of each operator at the moment the previous operator to be optimized became fully optimized, and these operators include the fully optimized one.
S206: generating, from the operators after optimization ends, an executable file of the image classification model to be deployed that runs on the target hardware.
The server can generate, from the operators after optimization ends, an executable file of the image classification model to be deployed that runs on the target hardware. After operator optimization of the image classification model to be deployed, some operators may not have been optimized while others have, so the operators after optimization ends may include unoptimized operators.
S208: parsing the executable file to obtain a parsing result.
S210: determining a target image, running the executable file on the target hardware according to the parsing result, classifying the target image, and determining the classification result.
The server can parse the executable file to obtain a parsing result, then determine a target image, run the executable file on the target hardware according to the parsing result, classify the target image, and determine the classification result. By optimizing the operators in the image classification model to be deployed, the server reduces their running time; it then generates the executable file from the operators after optimization ends and runs it on the target hardware to classify the target image and determine the classification result. This speeds up deploying the image classification model on hardware and improves the classification speed of the model on images.
When the operator optimization scheduling method shown in fig. 2 is used to deploy the image classification model to be deployed, the trained operator optimization scheduling model determines the operators to be optimized among the operators of the model, which reduces the burden of manually designing strategies for selecting operators to optimize. Because the scheduling strategy is learned automatically through training and the operators to be optimized are determined by the trained model, the overall operator optimization effect of the image classification model to be deployed is better, subsequent deployment of the model on hardware is faster, and the classification speed of the optimized model on images is improved.
In this specification, the time available for optimizing operators may be limited, and the optimization potential of each operator, i.e., the run-down time of the operator when classifying image data after it is optimized, differs. Even if the entire optimization time is spent on a single operator, its run-down time when classifying the image data may still be small; such an operator has a poor optimization effect and little optimization potential. For another operator, spending the optimization time on it alone may yield a large run-down time; such an operator has a good optimization effect and large optimization potential, and even if only part of the optimization time is spent on it, its run-down time may still be large. Therefore, to ensure a good overall operator optimization effect within the optimization time, the optimization time can be divided in advance by a specified length to obtain time slices.
Based on this, in step S200 above, the server may determine the information of each operator in the image classification model to be deployed before the current time slice starts, as the input information corresponding to the current time slice, where the current time slice is the time slice of the specified length whose starting time is the current time. The input information includes at least the number of floating-point operations of each operator, the historical optimization count of each operator, and the run-down time of each operator when classifying the image data after being optimized before the current time. It may also include the type of each operator, which may cover data collection, statistics, format conversion, and other operator types, as well as the historical optimization ratio of each operator. The specific determination of the input information is as in step S102 above and is not repeated here.
Based on this, in the above step S202, the server may input the input information corresponding to the current time slice into the trained operator-optimized scheduling model, and determine the operator that needs to be optimized in the current time slice, as the operator to be optimized.
Because the time slices are obtained by dividing the optimization time in advance, there may or may not be a time slice after the current one; that is, the optimization time may or may not be reached when the current time slice ends. So, in step S204, the server may optimize the operator to be optimized until the current time slice ends, and then judge whether the optimization-end condition is met. If so, the server determines that the optimization of each operator is finished. If not, the next time slice after the current time slice is taken as the new current time slice, the information of each operator before the current time slice starts is determined again, the operator to be optimized is redetermined, and the redetermined operator to be optimized is optimized, until the optimization-end condition is met. The next time slice after the current time slice is the time slice of the specified length whose starting moment is the ending moment of the current time slice. The optimization-end condition is that no further time slice exists after the current time slice, that is, the optimization time is reached when the current time slice ends.
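The time-slice loop described above (pick one operator, optimize it until the slice ends, then check the optimization-end condition) can be sketched as follows. Here `pick_operator` stands in for the trained operator-optimized scheduling model, and all names are illustrative assumptions:

```python
def schedule_optimization(operators, total_time, slice_len, pick_operator, optimize):
    """Divide total_time into slices of slice_len; in each slice, let the
    scheduling policy pick one operator and optimize it until the slice ends.

    pick_operator: callable taking the current per-operator info and returning
                   the index of the operator to optimize in this slice
                   (a stand-in for the trained operator-optimized scheduling model).
    optimize:      callable that optimizes one operator for one slice.
    Returns the sequence of chosen operator indices, one per time slice.
    """
    n_slices = total_time // slice_len       # time slices divided in advance
    chosen = []
    for _ in range(n_slices):
        idx = pick_operator(operators)       # operator to be optimized in this slice
        optimize(operators[idx])             # optimize until the current slice ends
        chosen.append(idx)
    # the loop exits when no time slice remains: the optimization-end condition
    return chosen
```

With a simple "least-optimized-first" policy and three operators, six unit-length slices yield a round-robin schedule, which mirrors the repeated redetermination of the operator to be optimized in the text.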
For example, as shown in fig. 3, which is a schematic diagram of operator-optimized scheduling provided in this specification, assume that dividing the optimization time yields 6 time slices, namely time slices 1-6, the current time slice is time slice 1, and the image classification model to be deployed contains 3 operators, namely operators 1-3. The operator to be optimized in the current time slice is operator 1, so the server optimizes operator 1 in time slice 1 until time slice 1 ends, and then judges whether the optimization-end condition is met. If it is not met, the next time slice after time slice 1, namely time slice 2, is taken as the new current time slice; the information of each operator before time slice 2 starts is determined again, the operator to be optimized is redetermined, say operator 2, and the redetermined operator is optimized; this continues until the optimization-end condition is met. The operators to be optimized in the remaining time slices are determined in the same way; in the example of fig. 3, the operators to be optimized corresponding to time slices 1-6 are operators 1, 2, 3, 2 and 1, respectively.
In order to accelerate model deployment, the computational graph of the image classification model to be deployed may be determined first. For example, as shown in fig. 4, which is a schematic diagram of a computational graph of an image classification model to be deployed provided in this specification, the image classification model to be deployed in fig. 4 contains 3 operators, namely operators 1-3. Edges between operators in fig. 4 represent dependency relationships; the edge between operator 1 and operator 2 indicates that the execution of operator 2 depends on operator 1, and the meanings of the other edges in fig. 4 are similar and are not repeated in this example. The image classification model to be deployed may then be optimized first at the graph level and afterwards at the operator level. So, in the above step S200, the server may determine a first computational graph of the image classification model to be deployed, perform graph optimization on the first computational graph to obtain a second computational graph, and then determine the information of each operator in the second computational graph at the current moment.
In this specification, some operators may appear multiple times in the computational graph; that is, the graph contains multiple instances of the same operator, as in the computational graph shown in fig. 4, where operator 2 appears twice. When determining the run-down time of such an operator when performing image classification on the image data, the run-down time originally determined from the operator's runtime before optimization (the historical runtime) and its runtime after optimization (the optimized runtime) is scaled up by the number of times the operator appears in the computational graph, and the scaled value is taken as the run-down time of the operator when performing image classification on the image data. For example, the run-down time of operator 2 determined from its runtime when performing image classification on the image data before optimization and its runtime after optimization is 0.2s, but there are two instances of operator 2 in the computational graph, so the actual run-down time of operator 2 is 0.4s.
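The run-down-time computation just described (historical runtime minus optimized runtime, scaled by the operator's number of occurrences in the computational graph) is a one-line formula; a minimal sketch, with illustrative parameter names:

```python
def run_down_time(historical_runtime, optimized_runtime, occurrences=1):
    """Runtime reduction achieved by optimizing one operator, scaled by the
    number of times the operator appears in the computational graph.

    historical_runtime: per-instance runtime before optimization (seconds)
    optimized_runtime:  per-instance runtime after optimization (seconds)
    occurrences:        instances of this operator in the computational graph
    """
    return (historical_runtime - optimized_runtime) * occurrences
```

Reproducing the example in the text: a per-instance reduction of 0.2s for an operator with two instances in the graph gives an actual run-down time of 0.4s.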
In this specification, the training data set may further include text processing models trained based on text data. Taking an emotion classification model as an example, when the operator-optimized scheduling model is trained according to the process of steps S100 to S108, the server may take an emotion classification model trained based on text data from the training data set as a training sample, and then train the operator-optimized scheduling model according to the process of steps S102 to S108, where the run-down time of an operator is the run-down time when the operator performs emotion classification on the text data. When deploying a model with the trained operator-optimized scheduling model, the model to be deployed may be an emotion classification model to be deployed; the specific process is as in the operator-optimized scheduling process shown in fig. 2, with the image classification model to be deployed in fig. 2 replaced by an emotion classification model and the target image replaced by a target text, and is not repeated here.
The above is the method implemented by one or more embodiments of this specification. Based on the same idea, this specification further provides a corresponding training device for an operator-optimized scheduling model, as shown in fig. 5.
Fig. 5 is a schematic diagram of a training apparatus for an operator-optimized scheduling model provided in the present specification, including:
The acquisition module 300 is configured to acquire an image classification model in the training dataset, which is trained in advance based on image data, and use the image classification model as a training sample;
a first determining module 302, configured to determine the information of each operator in the training sample at the current moment, where the information includes at least the floating point operation count of each operator, the historical optimization count of each operator, and the run-down time of each operator when performing image classification on the image data after each optimization performed before the current moment;
the scheduling module 304 is configured to input the information into an operator optimization scheduling model to be trained, and determine an operator to be optimized at the current moment, as an operator to be optimized;
a second determining module 306, configured to determine the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized;
a training module 308, configured to train the operator-optimized scheduling model to be trained by using a policy gradient algorithm according to the information, the operator to be optimized, and the run-down time when the operator to be optimized performs image classification on the image data, where the trained operator-optimized scheduling model is used to determine, according to the information of each operator in an image classification model to be deployed, the operators that need to be optimized in the image classification model to be deployed, optimize the determined operators, determine the image classification model to be deployed after operator optimization, and run the optimized image classification model to be deployed on target hardware to perform image classification on a target image and determine a classification result.
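The policy gradient training carried out by the training module can be illustrated with a REINFORCE-style update, using the run-down time as the reward. The specification does not disclose the network architecture, so the sketch below deliberately uses a simple linear softmax policy with one scalar feature per operator; all names and the learning rate are assumptions:

```python
import math


def reinforce_update(theta, features, action, reward, lr=0.01):
    """One REINFORCE-style policy-gradient step for the scheduling policy.

    theta:    per-operator weights of a linear softmax policy (a simplified
              stand-in for the operator-optimized scheduling model)
    features: one scalar feature per operator (e.g. normalized FLOPs)
    action:   index of the operator that was chosen for optimization
    reward:   observed run-down time (larger runtime reduction, larger reward)
    """
    # softmax policy over operators
    logits = [theta[i] * features[i] for i in range(len(theta))]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # grad of log pi(action) w.r.t. theta[i] is (1{i==action} - probs[i]) * features[i]
    for i in range(len(theta)):
        indicator = 1.0 if i == action else 0.0
        theta[i] += lr * reward * (indicator - probs[i]) * features[i]
    return theta
```

A positive reward raises the probability of the chosen operator and lowers the others, so slices that produced large run-down times reinforce the choices that led to them.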
Optionally, the second determining module 306 is specifically configured to determine, as a historical runtime, the runtime when the operator to be optimized performs image classification on the image data before the operator to be optimized is optimized, and determine, as an optimized runtime, the runtime when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized; and take the difference of the historical runtime minus the optimized runtime as the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized.
Optionally, the first determining module 302 is specifically configured to determine information of each operator in the training sample before a current time slice starts, and use the information as input information corresponding to the current time slice, where the current time slice is a time slice with a specified length with a current time as a starting time;
the scheduling module 304 is specifically configured to input the input information corresponding to the current time slice into an operator optimization scheduling model to be trained, and determine an operator to be optimized in the current time slice as the operator to be optimized.
Optionally, the second determining module 306 is specifically configured to determine, as a historical runtime, the runtime when the operator to be optimized performs image classification on the image data before the current time slice, and determine, as an optimized runtime, the runtime when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized in the current time slice; and take the difference of the historical runtime minus the optimized runtime as the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized in the current time slice.
Optionally, the training module 308 is specifically configured to judge whether an optimization-end condition is met; if so, train the operator-optimized scheduling model to be trained by using a policy gradient algorithm according to the input information corresponding to each time slice, the operator to be optimized, and the run-down time when the operator to be optimized performs image classification on the image data, where each time slice is obtained by dividing a preset optimization time according to the specified length; and if not, take the next time slice after the current time slice as the new current time slice, continue to determine the information of each operator in the training sample before the current time slice starts as the input information of the current time slice, redetermine the operator to be optimized, and redetermine the run-down time when the operator to be optimized performs image classification on the image data, until the optimization-end condition is met, where the next time slice after the current time slice is the time slice of the specified length whose starting moment is the ending moment of the current time slice.
Optionally, the apparatus further comprises:
an application module 310, configured to determine the information of each operator in an image classification model to be deployed at the current moment, where the image classification model to be deployed is a model trained in advance based on image data, and the information includes at least the floating point operation count of each operator, the historical optimization count of each operator, and the run-down time of each operator when performing image classification on the image data after each optimization performed before the current moment; input the information into the trained operator-optimized scheduling model, and determine the operator that needs to be optimized at the current moment, as the operator to be optimized; optimize the operator to be optimized until an optimization-end condition is met; generate, from the operators after optimization is finished, an executable file of the image classification model to be deployed that runs on target hardware; parse the executable file to obtain a parsing result; and determine a target image, run the executable file on the target hardware according to the parsing result to classify the target image, and determine a classification result.
Optionally, the application module 310 is specifically configured to determine information of each operator in the image classification model to be deployed before a current time slice starts, and use the information as input information corresponding to the current time slice, where the current time slice is a time slice with a specified length with a current time as a starting time; and inputting the input information corresponding to the current time slice into a trained operator optimization scheduling model, and determining an operator required to be optimized in the current time slice as an operator to be optimized.
Optionally, the application module 310 is specifically configured to optimize the operator to be optimized until the current time slice ends; judging whether an optimization ending condition is met; if yes, determining that the optimization of each operator is finished; if not, the next time slice of the current time slice is used as the current time slice again, information of each operator before the current time slice starts is continuously determined, an operator to be optimized is redetermined, and the redetermined operator to be optimized is optimized until the optimization ending condition is met, wherein the next time slice of the current time slice is the time slice with the appointed length taking the ending moment of the current time slice as the starting moment.
Optionally, the application module 310 is specifically configured to determine a first computational graph of the image classification model to be deployed; performing graph optimization on the first calculation graph to obtain a second calculation graph; and determining information of each operator in the second calculation graph at the current moment.
The present specification also provides a computer readable storage medium having stored thereon a computer program operable to perform a method of training an operator optimized scheduling model as provided in fig. 1 above.
The present specification also provides a schematic structural diagram, shown in fig. 6, of an electronic device corresponding to fig. 1. At the hardware level, as shown in fig. 6, the electronic device includes a processor, an internal bus, a network interface, a memory, and a nonvolatile storage, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the nonvolatile storage into the memory and then runs it to implement the training method for an operator-optimized scheduling model described in fig. 1 above.
Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present description, that is, the execution subject of the following processing flows is not limited to each logic unit, but may be hardware or logic devices.
In the 1990s, an improvement to a technology could clearly be distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows can now be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD, without requiring the chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, nowadays, instead of manually manufacturing integrated circuit chips, such programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code before compiling must also be written in a specific programming language, which is called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by merely slightly programming the method flow into an integrated circuit using several of the hardware description languages described above.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller in pure computer-readable program code, it is entirely possible to implement the same functionality by logically programming the method steps such that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a kind of hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (12)

1. A training method for an operator optimized scheduling model, comprising:
acquiring an image classification model which is trained in advance based on image data in a training data set and taking the image classification model as a training sample;
determining information of each operator in the training sample at the current moment, wherein the information at least comprises floating point operation times of each operator, historical optimization times of each operator and operation descending time when each operator performs image classification on the image data after optimizing each operator before the current moment;
inputting the information into an operator optimizing scheduling model to be trained, and determining an operator to be optimized at the current moment to be used as the operator to be optimized;
determining the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized;
and training the operator-optimized scheduling model to be trained by using a policy gradient algorithm according to the information, the operator to be optimized, and the run-down time when the operator to be optimized performs image classification on the image data, wherein the trained operator-optimized scheduling model is used to determine, according to the information of each operator in an image classification model to be deployed, the operators that need to be optimized in the image classification model to be deployed, optimize the determined operators, determine the image classification model to be deployed after operator optimization, and run the optimized image classification model to be deployed on target hardware to perform image classification on a target image and determine a classification result.
2. The method according to claim 1, wherein determining the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized specifically comprises:
determining, as a historical runtime, the runtime when the operator to be optimized performs image classification on the image data before the operator to be optimized is optimized, and determining, as an optimized runtime, the runtime when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized;
and taking the difference of the historical runtime minus the optimized runtime as the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized.
3. The method according to claim 1, wherein determining information of each operator in the training sample at the current time specifically comprises:
determining information of each operator in the training sample before the current time slice starts, and taking the information as input information corresponding to the current time slice, wherein the current time slice is a time slice with a specified length by taking the current time as the starting time;
and wherein inputting the information into the operator-optimized scheduling model to be trained and determining the operator that needs to be optimized at the current moment, as the operator to be optimized, specifically comprises:
and inputting the input information corresponding to the current time slice into an operator optimizing scheduling model to be trained, and determining an operator required to be optimized in the current time slice as an operator to be optimized.
4. The method according to claim 3, wherein determining the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized specifically comprises:
determining, as a historical runtime, the runtime when the operator to be optimized performs image classification on the image data before the current time slice, and determining, as an optimized runtime, the runtime when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized in the current time slice;
and taking the difference of the historical runtime minus the optimized runtime as the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized in the current time slice.
5. The method according to claim 4, wherein training the operator optimization scheduling model to be trained by a policy gradient algorithm according to the information, the operator to be optimized, and the run-down time when the operator to be optimized performs image classification on the image data specifically comprises:
judging whether an optimization end condition is satisfied;
if so, training the operator optimization scheduling model to be trained by the policy gradient algorithm according to the input information corresponding to each time slice, the operator to be optimized, and the run-down time when the operator to be optimized performs image classification on the image data, wherein each time slice is obtained by dividing a preset optimization duration into slices of the specified length;
if not, taking the next time slice of the current time slice as the current time slice again, continuing to determine the information of each operator in the training sample before the start of the current time slice as the input information of the current time slice, re-determining the operator to be optimized, and re-determining the run-down time when the operator to be optimized performs image classification on the image data, until the optimization end condition is satisfied, wherein the next time slice of the current time slice is a time slice of the specified length whose starting moment is the ending moment of the current time slice.
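Claims 3 to 5 describe one reinforcement-learning episode: in each time slice the scheduling model observes per-operator features (floating-point operations, historical optimization count, past run-down time), selects one operator to optimize, receives that operator's run-down time as the reward, and at the end of the episode the policy is updated with a policy-gradient (REINFORCE-style) step. A hedged NumPy sketch of that loop; the linear softmax policy, the one-hot feature layout, and the toy reward values are illustrative assumptions, not details fixed by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()                # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

class SchedulerPolicy:
    """Linear softmax policy over operators: one score per operator,
    computed from its feature row, trained with vanilla REINFORCE."""
    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)
        self.lr = lr

    def probs(self, features):
        # features: (n_operators, n_features) -> one probability per operator
        return softmax(features @ self.w)

    def pick(self, features):
        return int(rng.choice(len(features), p=self.probs(features)))

    def reinforce_update(self, episode):
        """Policy-gradient step over (features, action, reward) tuples:
        w += lr * reward * grad log pi(action | features)."""
        for features, action, reward in episode:
            p = self.probs(features)
            grad_log_pi = features[action] - p @ features
            self.w += self.lr * reward * grad_log_pi

# Toy run: two operators with one-hot features. Optimizing operator 0
# always yields a run-down time of 2.5; operator 1 yields only 0.3.
features = np.array([[1.0, 0.0], [0.0, 1.0]])
run_down = [2.5, 0.3]
policy = SchedulerPolicy(n_features=2)
for _ in range(500):                            # 500 one-slice episodes
    a = policy.pick(features)
    policy.reinforce_update([(features, a, run_down[a])])
```

Because operator 0 consistently returns the larger run-down time, the policy's probability mass shifts toward scheduling operator 0 over the course of training.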
6. The method according to claim 1, wherein the method further comprises:
determining information of each operator in an image classification model to be deployed at the current moment, wherein the image classification model to be deployed is a model trained in advance on image data, and the information at least comprises the number of floating-point operations of each operator, the historical optimization count of each operator, and the run-down time when each operator performs image classification on the image data after each operator is optimized before the current moment;
inputting the information into the trained operator optimization scheduling model, and determining the operator required to be optimized at the current moment as the operator to be optimized;
optimizing the operator to be optimized until an optimization end condition is satisfied;
generating, according to each operator after optimization ends, an executable file of the image classification model to be deployed for running on target hardware;
analyzing the executable file to obtain an analysis result;
and determining a target image, running the executable file on the target hardware according to the analysis result to perform image classification on the target image, and determining a classification result.
7. The method according to claim 6, wherein determining the information of each operator in the image classification model to be deployed at the current moment specifically comprises:
determining the information of each operator in the image classification model to be deployed before the start of a current time slice as the input information corresponding to the current time slice, wherein the current time slice is a time slice of a specified length whose starting moment is the current moment;
wherein inputting the information into the trained operator optimization scheduling model and determining the operator required to be optimized at the current moment as the operator to be optimized specifically comprises:
inputting the input information corresponding to the current time slice into the trained operator optimization scheduling model, and determining the operator required to be optimized within the current time slice as the operator to be optimized.
8. The method according to claim 7, wherein optimizing the operator to be optimized until the optimization end condition is satisfied specifically comprises:
optimizing the operator to be optimized until the current time slice ends;
judging whether the optimization end condition is satisfied;
if so, determining that the optimization of each operator has ended;
if not, taking the next time slice of the current time slice as the current time slice again, continuing to determine the information of each operator before the start of the current time slice, re-determining the operator to be optimized, and optimizing the re-determined operator to be optimized, until the optimization end condition is satisfied, wherein the next time slice of the current time slice is a time slice of the specified length whose starting moment is the ending moment of the current time slice.
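At deployment time (claims 7 and 8), the trained scheduler is run over a preset optimization duration divided into fixed-length time slices: in each slice it inspects the current operator information, picks one operator, and optimizes it until the slice ends. A simplified sketch of that control loop; the function names, the feature dictionary, and the slice-counting shortcut for the "optimization end condition" are all assumptions for illustration:

```python
def schedule_optimization(operators, pick_operator, optimize,
                          slice_len_s=1.0, budget_s=10.0):
    """Time-sliced optimization loop: the preset optimization duration
    `budget_s` is divided into slices of length `slice_len_s`; in each
    slice, `pick_operator` (standing in for the trained scheduling model)
    chooses one operator and `optimize` spends the slice optimizing it."""
    n_slices = int(budget_s // slice_len_s)
    for _ in range(n_slices):
        chosen = pick_operator(operators)        # operator to be optimized
        optimize(chosen, slice_len_s)            # optimize within this slice
        operators[chosen]["opt_count"] += 1      # update historical count
    return operators

# Demo with a trivial scheduler that always picks the least-optimized operator.
ops = {"conv2d": {"opt_count": 0}, "matmul": {"opt_count": 0}}
def pick_least_optimized(operators):
    return min(operators, key=lambda name: operators[name]["opt_count"])
schedule_optimization(ops, pick_least_optimized, lambda name, s: None,
                      slice_len_s=1.0, budget_s=4.0)
# After 4 slices, each of the two operators has been optimized twice.
```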
9. The method according to claim 6, wherein determining the information of each operator in the image classification model to be deployed at the current moment specifically comprises:
determining a first computation graph of the image classification model to be deployed;
performing graph optimization on the first computation graph to obtain a second computation graph;
and determining the information of each operator in the second computation graph at the current moment.
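Claim 9 derives the second computation graph from the first by graph optimization. The patent does not fix the rewrite rules, but a typical graph-level optimization is operator fusion; a toy sketch over a linearized graph (the op names and the conv2d+relu fusion rule are illustrative assumptions):

```python
def graph_optimize(first_graph):
    """Turn a first computation graph (here a flat list of op names)
    into a second one by fusing each conv2d immediately followed by
    relu into a single fused operator."""
    second_graph = []
    i = 0
    while i < len(first_graph):
        if (first_graph[i] == "conv2d" and i + 1 < len(first_graph)
                and first_graph[i + 1] == "relu"):
            second_graph.append("conv2d_relu")   # fused operator
            i += 2
        else:
            second_graph.append(first_graph[i])
            i += 1
    return second_graph

first = ["conv2d", "relu", "matmul", "conv2d", "relu", "softmax"]
second = graph_optimize(first)
# second == ["conv2d_relu", "matmul", "conv2d_relu", "softmax"]
```

The per-operator information of claim 9 (FLOPs, optimization count, run-down time) would then be collected over the operators of `second`, not `first`.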
10. A training device for an operator optimization scheduling model, comprising:
an acquisition module, configured to acquire an image classification model trained in advance on image data in a training data set as a training sample;
a first determining module, configured to determine information of each operator in the training sample at the current moment, wherein the information at least comprises the number of floating-point operations of each operator, the historical optimization count of each operator, and the run-down time when each operator performs image classification on the image data after each operator is optimized before the current moment;
a scheduling module, configured to input the information into an operator optimization scheduling model to be trained, and determine the operator required to be optimized at the current moment as the operator to be optimized;
a second determining module, configured to determine the run-down time when the operator to be optimized performs image classification on the image data after the operator to be optimized is optimized;
a training module, configured to train the operator optimization scheduling model to be trained according to the information, the operator to be optimized, and the run-down time when the operator to be optimized performs image classification on the image data, wherein the trained operator optimization scheduling model is used for determining, according to the information of each operator in an image classification model to be deployed, the operators required to be optimized in the image classification model to be deployed, optimizing the determined operators, determining the image classification model to be deployed after operator optimization, and running the optimized image classification model to be deployed on target hardware to perform image classification on a target image and determine a classification result.
11. A computer-readable storage medium, wherein the storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 9.
12. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 9.
CN202311010092.7A 2023-08-11 2023-08-11 Training method, device, medium and equipment for operator optimized scheduling model Active CN116755862B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311010092.7A CN116755862B (en) 2023-08-11 2023-08-11 Training method, device, medium and equipment for operator optimized scheduling model

Publications (2)

Publication Number Publication Date
CN116755862A true CN116755862A (en) 2023-09-15
CN116755862B CN116755862B (en) 2023-12-19

Family

ID=87959341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311010092.7A Active CN116755862B (en) 2023-08-11 2023-08-11 Training method, device, medium and equipment for operator optimized scheduling model

Country Status (1)

Country Link
CN (1) CN116755862B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104766092A (en) * 2015-03-26 2015-07-08 杭州电子科技大学 Hyperspectral image classification method combined with potential function
WO2019073312A1 (en) * 2017-10-13 2019-04-18 Sigtuple Technologies Private Limited Method and device for integrating image channels in a deep learning model for classification
CN111522245A (en) * 2020-06-23 2020-08-11 北京三快在线科技有限公司 Method and device for controlling unmanned equipment
CN112801229A (en) * 2021-04-07 2021-05-14 北京三快在线科技有限公司 Training method and device for recognition model
CN113822173A (en) * 2021-09-01 2021-12-21 杭州电子科技大学 Pedestrian attribute recognition training acceleration method based on node merging and path prediction
CN114021770A (en) * 2021-09-14 2022-02-08 北京邮电大学 Network resource optimization method and device, electronic equipment and storage medium
US20220122344A1 (en) * 2019-01-09 2022-04-21 Samsung Electronics Co., Ltd Image optimization method and system based on artificial intelligence
CN115408072A (en) * 2022-05-20 2022-11-29 北京航空航天大学杭州创新研究院 Rapid adaptation model construction method based on deep reinforcement learning and related device
CN115934344A (en) * 2022-12-23 2023-04-07 广东省智能科学与技术研究院 Heterogeneous distributed reinforcement learning calculation method, system and storage medium
CN116311893A (en) * 2022-12-29 2023-06-23 南京星环智能科技有限公司 Traffic jam time prediction method and device, electronic equipment and storage medium

Non-Patent Citations (2)

Title
KE SHANWU; JIN CONG: "Image classification based on an optimized spatial pyramid matching model", Electronic Measurement Technology, no. 07 *
WANG LI; GUO ZHENHUA; CAO FANG; GAO KAI; ZHAO YAQIAN; ZHAO KUN: "Automatic generation of model splitting strategies for model-parallel training", Computer Engineering and Science, no. 09 *

Also Published As

Publication number Publication date
CN116755862B (en) 2023-12-19

Similar Documents

Publication Publication Date Title
CN116126365B (en) Model deployment method, system, storage medium and electronic equipment
CN116304720B (en) Cost model training method and device, storage medium and electronic equipment
CN116880995B (en) Execution method and device of model task, storage medium and electronic equipment
CN116185532B (en) Task execution system, method, storage medium and electronic equipment
CN116450344A (en) Task execution method and device, storage medium and electronic equipment
CN116225669A (en) Task execution method and device, storage medium and electronic equipment
CN116521350B (en) ETL scheduling method and device based on deep learning algorithm
CN116341642B (en) Data processing method and device, storage medium and electronic equipment
CN116755862B (en) Training method, device, medium and equipment for operator optimized scheduling model
CN115543945B (en) Model compression method and device, storage medium and electronic equipment
CN117455015B (en) Model optimization method and device, storage medium and electronic equipment
CN117075918B (en) Model deployment method and device, storage medium and electronic equipment
CN116167431B (en) Service processing method and device based on hybrid precision model acceleration
CN117009729B (en) Data processing method and device based on softmax
CN117348999B (en) Service execution system and service execution method
CN115862675B (en) Emotion recognition method, device, equipment and storage medium
CN116434787B (en) Voice emotion recognition method and device, storage medium and electronic equipment
CN116881724B (en) Sample labeling method, device and equipment
CN117591217A (en) Information display method, device, equipment and storage medium
CN117933707A (en) Wind control model interpretation method and device, storage medium and electronic equipment
CN116028069A (en) Model deployment method and device, storage medium and electronic equipment
CN117593003A (en) Model training method and device, storage medium and electronic equipment
CN117668543A (en) Model training method and device, storage medium and electronic equipment
CN117828360A (en) Model training method, model training device, model code generating device, storage medium and storage medium
CN117591130A (en) Model deployment method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant