WO2023221371A1 - Task search method and apparatus, server and storage medium - Google Patents


Info

Publication number
WO2023221371A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
duration
training
inference
task scheduling
Prior art date
Application number
PCT/CN2022/123598
Other languages
French (fr)
Chinese (zh)
Inventor
郑辉煌
陈特峰
陈浩泽
王悦
王震
刘益群
孙黎
姜程
石晓伟
蓝翔
Original Assignee
北京百度网讯科技有限公司 (Beijing Baidu Netcom Science and Technology Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Publication of WO2023221371A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present disclosure relates to the field of computer technology, and specifically to a task search method and device, a server and a storage medium.
  • Deep learning is a research direction in the field of machine learning. It learns the inherent laws and representation levels of sample data so that a machine can imitate human activities such as seeing, hearing and thinking.
  • a deep learning compiler can be used to train and infer a deep learning model.
  • using a deep learning compiler to automatically optimize a deep learning model requires a long search time and is not suitable for training scenarios.
  • the present disclosure provides a task search method and device, a server and a storage medium.
  • the main purpose is to reduce the search time and improve the applicability of the task search solution.
  • a task search method including:
  • the task scheduling policy includes an inference duration task scheduling policy and a training task scheduling policy.
  • the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in the inference scenario.
  • the training task scheduling policy is used to adjust the search duration corresponding to the initial task in the training scenario;
  • the initial task is searched using the inference duration task scheduling policy, the training task scheduling policy and the training operation mode to obtain the target task.
  • a task search device including:
  • a policy acquisition unit is used to obtain a task scheduling policy corresponding to the initial task, wherein the task scheduling policy includes an inference duration task scheduling policy and a training task scheduling policy, the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in the inference scenario, and the training task scheduling policy is used to adjust the search duration corresponding to the initial task in the training scenario;
  • a mode acquisition unit is used to acquire the training operation mode corresponding to the initial task;
  • a task acquisition unit is used to search the initial task using the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to obtain the target task.
  • a server including:
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of the preceding aspects.
  • a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to perform the method according to any one of the preceding aspects.
  • a computer program product comprising a computer program that, when executed by a processor, implements the method of any one of the preceding aspects.
  • a computer program including computer program code which, when run on a computer, causes the computer to perform the method according to any one of the preceding aspects.
  • an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • when the processor executes the computer program, the above-mentioned steps are implemented.
  • the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in the inference scenario, and the training task scheduling policy is used to adjust the search duration corresponding to the initial task in the training scenario; the training operation mode corresponding to the initial task is obtained; and the inference duration task scheduling policy, the training task scheduling policy and the training operation mode are used to search the initial task to obtain the target task. Therefore, the search time can be reduced and the applicability of the task search solution can be improved.
  • Figure 1 is a schematic flowchart of a task search method according to the first embodiment of the present disclosure
  • Figure 2 is a schematic flowchart of a task search method according to a second embodiment of the present disclosure
  • Figure 3 is a flowchart of selecting a scheduling strategy for inference duration tasks provided according to an embodiment of the present disclosure
  • Figure 4a is a schematic structural diagram of a first task search device used to implement the task search method according to an embodiment of the present disclosure
  • Figure 4b is a schematic structural diagram of a second task search device used to implement the task search method according to the embodiment of the present disclosure
  • Figure 4c is a schematic structural diagram of a third task search device used to implement the task search method according to the embodiment of the present disclosure
  • Figure 4d is a schematic structural diagram of a fourth task search device used to implement the task search method according to the embodiment of the present disclosure
  • Figure 4e is a schematic structural diagram of a fifth task search device used to implement the task search method according to the embodiment of the present disclosure
  • Figure 4f is a schematic structural diagram of a sixth task search device used to implement the task search method according to the embodiment of the present disclosure
  • Figure 5 is a block diagram of a server used to implement the task search method of an embodiment of the present disclosure.
  • server technology has become increasingly mature, improving the convenience of users' work and daily life.
  • users can train and infer deep learning models through the server.
  • the deep learning compiler is a compiler software used to solve multiple hardware platforms and deep learning docking problems.
  • the deep learning compiler can be composed of multiple layers of intermediate representation (Intermediate Representation, IR) and corresponding operating methods.
  • the high-level intermediate representation is used to express the deep learning computational graph structure, including the representation of deep learning variables (Variable) and operators (Operator).
  • the low-level intermediate representation expresses the specific computation of an operator; for example, the matrix multiplication operator of the high-level intermediate representation becomes, at the low level, more specific loop, multiplication and summation operations, which are closer to the low-level instructions of the hardware.
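As an illustration of the two IR levels described above, the following minimal Python sketch contrasts a high-level "matmul" operator with the explicit loop/multiply/sum form a compiler might lower it to. The function names are illustrative assumptions, not taken from the disclosure.

```python
def matmul_highlevel(a, b):
    """High-level IR view: a single 'matmul' operator over two matrices."""
    n, k = len(a), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(k)] for i in range(n)]

def matmul_lowlevel(a, b):
    """Low-level IR view: explicit loops, multiplies and accumulations,
    closer to the hardware's low-level instructions."""
    n, m, k = len(a), len(b), len(b[0])
    c = [[0.0] * k for _ in range(n)]
    for i in range(n):          # loop over output rows
        for j in range(k):      # loop over output columns
            acc = 0.0
            for p in range(m):  # reduction loop
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c
```

Both views compute the same result; the low-level form is the one a tuning search would restructure with tiling, vectorization, and unrolling.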
  • a deep learning compiler may be provided in the server. Furthermore, when the server trains and infers the deep learning model, it can use the deep learning compiler to search for a configuration that matches the deep learning model.
  • when the server uses the deep learning compiler to search for a configuration that matches the deep learning model, it needs to input the optimized deep learning computational graph into the deep learning compiler for the compiler to search.
  • Figure 1 is a schematic flowchart of a task search method according to the first embodiment of the present disclosure.
  • This method can be implemented by a computer program and can run on a device that performs task search, such as a server with a task search function.
  • the computer program can be integrated into an application or run as a stand-alone utility application.
  • the task search method includes: S101-S103.
  • the task (Task) refers to a computational graph subgraph used when a deep learning neural network model performs training and inference.
  • the task does not refer to a fixed task; for example, when the computational graph changes, the task can change, and when the computational graph subgraph changes, the task can also change.
  • the initial task refers to a task that requires tuning search.
  • This initial task is not specific to a fixed task. For example, this initial task can change when the computational graph changes. When the computational graph subgraph changes, this initial task can also change.
  • the task scheduling policy refers to the policy adopted by the server when searching for initial tasks.
  • the task scheduling policy includes but is not limited to an inference duration task scheduling policy, a training task scheduling policy, and so on. The task scheduling policy does not refer to a fixed policy; for example, when the initial task changes, the task scheduling policy can change, and when the computational graph changes, the task scheduling policy can also change.
  • the inference duration task scheduling policy refers to a strategy for adjusting the inference duration corresponding to the initial task in the inference scenario.
  • the inference duration task scheduling policy does not refer to a fixed policy; for example, when the task scheduling policy changes, the inference duration task scheduling policy can change, and when the initial task changes, it can also change.
  • the training task scheduling policy refers to a policy used to adjust the search duration corresponding to the initial task in the training scenario.
  • the training task scheduling policy does not refer to a fixed policy; for example, when the task scheduling policy changes, the training task scheduling policy can change, and when the initial task changes, it can also change.
  • the server can obtain the task scheduling policy corresponding to the initial task.
  • the training operation mode refers to the training operation mode adopted by the server when training the deep learning neural network model.
  • the way the training is run is not specific to a fixed way. For example, when the deep learning neural network model changes, the way the training is run can change. When the initial task changes, the way the training is run can also change.
  • the server can obtain the training operation mode corresponding to the initial task.
  • the target task refers to the task obtained after performing a tuning search on the initial task.
  • the target task does not refer to a fixed task; for example, when the initial task changes, the target task can also change.
  • the server can use the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to search for the initial task and obtain the target task.
  • in the embodiments of the present disclosure, the task scheduling policy corresponding to the initial task is obtained, the training operation mode corresponding to the initial task is obtained, and the inference duration task scheduling policy, the training task scheduling policy and the training operation mode are used to search the initial task to obtain the target task. Therefore, by adopting the inference duration task scheduling policy, the task search time when performing inference with the deep learning model can be reduced, thereby reducing the inference time of the deep learning model.
  • by adopting the training task scheduling policy and the training operation mode, the task search time when training the deep learning model can be reduced, which in turn reduces the training time of the deep learning model. The task search time can thus be reduced while improving the applicability of the task search solution, making the task search method applicable to both inference scenarios and training scenarios.
  • Figure 2 is a schematic flowchart of a task search method according to a second embodiment of the present disclosure. Specifically, the method includes: S201-S208.
  • the target configuration information refers to tuning configuration information corresponding to the task.
  • the target configuration information does not refer to certain fixed information.
  • the target configuration information includes but is not limited to loop block size, vectorization, loop unrolling, calculation position adjustment, thread parallelism, graphics processing unit (GPU) parallelism, etc.
  • the second target configuration information refers to tuning configuration information corresponding to the initial task.
  • the second target configuration information does not specifically refer to certain fixed information. For example, when the initial task changes, the second target configuration information may change.
  • when the server obtains an information modification instruction for the second target configuration information, the second target configuration information may also change.
  • hardware code refers to code generated to run on hardware.
  • the hardware code does not refer to fixed code; for example, when the second target configuration information changes, the hardware code may change, and when the initial task changes, the hardware code can also change.
  • when the server obtains hardware code based on the second target configuration information, the server may convert the second target configuration information into the underlying IR, and then generate the hardware code through the underlying IR.
  • the server can obtain the hardware code based on the second target configuration information.
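The flow just described (tuning configuration, lowered to a low-level IR, then rendered as hardware code) could be sketched as below. This is a toy illustration under stated assumptions: the configuration keys, the IR as a list of ops, and the pseudo-assembly output are all hypothetical, not the disclosed format.

```python
def lower_to_ir(config):
    """Turn tuning configuration information into a toy low-level IR
    (a list of (op, argument) pairs)."""
    ir = [("tile_loops", config["loop_block_size"])]
    if config.get("vectorize"):
        ir.append(("vectorize", config["vector_width"]))
    if config.get("unroll"):
        ir.append(("unroll", config["unroll_factor"]))
    return ir

def generate_hardware_code(ir):
    """Render the toy IR as pseudo-assembly text, standing in for real codegen."""
    return "\n".join(f"{op} {arg}" for op, arg in ir)

config = {"loop_block_size": 32, "vectorize": True, "vector_width": 8}
code = generate_hardware_code(lower_to_ir(config))
```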
  • the server may partition the computational graph to obtain at least one task.
  • the running information refers to the running information of the initial task on the hardware, and includes the running speed.
  • the running information does not refer to fixed information; for example, when the initial task changes, the running information can also change.
  • when the server performs a task search, the server needs to permute and combine the target configuration information and search for the permutation and combination that yields the fastest running speed for the task.
  • the search algorithms used include but are not limited to genetic search algorithms, exhaustive search algorithms, grid search algorithms, and so on.
  • the server can also search the search space directly for the permutation and combination that yields the fastest running speed for the task.
  • the search space refers to the space including all executable permutations and combinations corresponding to the target configuration information.
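A minimal sketch of such a search space and an exhaustive search over it follows. The option values and the timing function are stand-in assumptions (real measurement would run generated code on hardware); only the structure, a cartesian product of schedule options searched for the fastest configuration, reflects the text.

```python
import itertools

def build_search_space():
    """All executable permutations and combinations of some schedule options."""
    block_sizes = [8, 16, 32]
    unroll_factors = [1, 2, 4]
    vectorize = [False, True]
    return list(itertools.product(block_sizes, unroll_factors, vectorize))

def measure_runtime(cfg):
    """Stand-in for running the generated code on real hardware."""
    block, unroll, vec = cfg
    t = 100.0 / block + 5.0 / unroll
    return t * (0.5 if vec else 1.0)

space = build_search_space()
best = min(space, key=measure_runtime)  # exhaustive search for the fastest config
```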
  • when the server controls the hardware to run the hardware code and obtains the running information corresponding to the initial task, if the server determines that the running information meets the running information condition, the server can end the search for the initial task and set the initial task as the target task.
  • the running information conditions refer to the conditions used by the server to determine whether the initial task needs to perform a tuning search.
  • This operating information condition does not specifically refer to a fixed condition. For example, when the server obtains a condition modification instruction for a running information condition, the running information condition may change.
  • the server can control the hardware to run the hardware code. Furthermore, the server can obtain the running information corresponding to the initial task.
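The measure-and-stop flow described above (run candidate hardware code, check the running-information condition, end the search once it is met) could be sketched as follows; the function names and the numeric condition are illustrative assumptions.

```python
def tune_task(candidates, measure, speed_target):
    """Try candidate configurations in order; stop early once one runs
    fast enough (the running-information condition)."""
    best_cfg, best_time = None, float("inf")
    for cfg in candidates:
        t = measure(cfg)                 # control hardware, get running info
        if t < best_time:
            best_cfg, best_time = cfg, t
        if t <= speed_target:            # running-information condition met
            break                        # end the search for this task early
    return best_cfg, best_time
```

If no candidate meets the condition, the loop simply runs to completion and the fastest configuration found is kept.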
  • the display interface refers to the display interface used when the server interacts with the user.
  • the display interface does not specifically refer to a fixed interface. For example, when the server changes, the presentation interface can change.
  • the inference duration task scheduling policy set refers to a set aggregated from at least one inference duration task scheduling policy.
  • the set of inference duration task scheduling policies does not refer to a fixed set. For example, when the inference duration corresponding to an inference duration task scheduling policy changes, the set may change; when the number of inference duration task scheduling policies changes, the set may also change.
  • different inference duration task scheduling policies correspond to different inference durations and to different speed improvement values for the task.
  • for example, one inference duration task scheduling policy may use 10% of the full search duration and correspond to a 90% speed improvement value; another may use 20% of the search duration and correspond to a 92% speed improvement value; and another may use 100% of the search duration and correspond to 100% of the speed improvement value.
  • the set of inference duration task scheduling policies includes at least one inference duration task scheduling policy, which includes but is not limited to long-duration task scheduling policies, short-duration task scheduling policies, and the like.
  • the inference time of the long-term task scheduling strategy is longer than the inference time of the short-term task scheduling strategy.
  • the inference performance of the long-duration task scheduling strategy is higher than that of the short-duration task scheduling strategy.
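Such a policy set could be represented as below. The search-fraction/speedup numbers are the example values from the text; the dictionary shape and policy names are assumptions for illustration.

```python
# Each policy pairs a fraction of the full search duration with the
# speed improvement value it is expected to reach.
POLICY_SET = {
    "short": {"search_fraction": 0.10, "speedup": 0.90},
    "medium": {"search_fraction": 0.20, "speedup": 0.92},
    "long": {"search_fraction": 1.00, "speedup": 1.00},
}

def pick_policy(name):
    """Return the policy chosen by the user's selection instruction."""
    return POLICY_SET[name]

chosen = pick_policy("short")  # e.g. the user clicks the short-duration policy
```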
  • the server can display a set of inference duration task scheduling strategies corresponding to the initial task on the display interface.
  • the selection instruction refers to the instruction obtained by the terminal and entered by the user when selecting the inference duration task scheduling strategy.
  • This selection instruction does not refer to a fixed instruction.
  • the selection instructions include but are not limited to voice selection instructions, click selection instructions, and so on.
  • when the server detects that the user speaks voice information corresponding to any inference duration task scheduling policy, the server can obtain the selection instruction corresponding to that policy.
  • when the server detects that the user clicks the selection button corresponding to any inference duration task scheduling policy, the server can also obtain the selection instruction corresponding to that policy.
  • the server can obtain the selection instruction input for the inference duration task scheduling policy set.
  • FIG. 3 is a flow chart of selecting a scheduling strategy for inference duration tasks provided according to an embodiment of the present disclosure.
  • the server displays the inference duration task scheduling policy set on the display interface.
  • the set of inference-duration task scheduling strategies includes long-duration task scheduling strategies and short-duration task scheduling strategies.
  • when the server detects that the user clicks on the short-duration task scheduling policy, the server can obtain the selection instruction entered for the short-duration task scheduling policy.
  • the server can set the inference duration task scheduling policy to a short-duration task scheduling policy.
  • the server can obtain the inference duration task scheduling policy corresponding to the selection instruction.
  • the training operation mode corresponding to the initial task includes but is not limited to the overall training operation mode, the cross-training operation mode, and so on.
  • when the server adopts the overall training operation mode, the server can first perform a tuning search on all tasks and then train the model.
  • when the server adopts the cross-training operation mode, the server can alternately train the model once and then perform task tuning once.
  • the server can obtain the training operation mode corresponding to the initial task.
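The two training operation modes could be sketched as below: "overall" tunes every task up front and then trains, while "cross" interleaves one training step with one tuning round. The function signatures are illustrative assumptions.

```python
def run_overall(tasks, tune, train_step, steps):
    """Overall mode: tune all tasks first, then run the whole training."""
    for task in tasks:
        tune(task)
    for _ in range(steps):
        train_step()

def run_cross(tasks, tune, train_step, steps):
    """Cross mode: alternate one training iteration with one tuning round."""
    pending = list(tasks)
    for _ in range(steps):
        train_step()              # one training iteration...
        if pending:
            tune(pending.pop(0))  # ...then tune one remaining task
```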
  • S207: search for the initial task using the inference duration task scheduling policy, the training task scheduling policy and the training operation mode to obtain the target task.
  • when searching for the initial task using the inference duration task scheduling policy, the training task scheduling policy and the training operation mode, the server can obtain the optimization potential value corresponding to the initial task in the training scenario, and can then perform the task search based on the optimization potential value.
  • the optimization potential value is used to indicate the optimization potential of the task.
  • the optimization potential value does not refer to a fixed value and can change.
  • the optimization potential value can be obtained based on derivatives or Bayesian models.
  • when the server obtains the optimization potential value corresponding to the initial task and determines that the optimization potential value is less than the potential threshold, the server can stop searching for the initial task; that is, the server applies a search-early, stop-early task scheduling strategy for training. Therefore, the search for tasks whose optimization potential value is smaller than the potential threshold can be stopped, thereby reducing the task search time.
  • the potential threshold refers to a threshold used by the server to evaluate whether a task has optimization potential.
  • the potential threshold is not specific to a fixed threshold. For example, when the terminal obtains a threshold modification instruction for the potential threshold, the potential threshold may change.
  • when the server obtains the optimization potential value corresponding to the initial task, the server may also obtain time resource information corresponding to the optimization potential value, and can then allocate a search duration corresponding to the time resource information to the initial task. Therefore, the search time can be allocated according to the optimization potential value corresponding to the task, thereby improving the efficiency of task search and reducing the total task search time.
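The potential-threshold and time-allocation logic above could be sketched as follows. The proportional split is one plausible reading of "a search duration corresponding to the time resource information", an assumption, as are all names and numbers.

```python
def allocate_search_time(potentials, threshold, total_seconds):
    """Skip tasks whose optimization potential is below the threshold,
    then split the total search budget in proportion to the remaining
    tasks' potential values."""
    active = {t: p for t, p in potentials.items() if p >= threshold}
    total = sum(active.values())
    return {t: total_seconds * p / total for t, p in active.items()}

budget = allocate_search_time({"conv": 0.6, "matmul": 0.3, "add": 0.05},
                              threshold=0.1, total_seconds=90.0)
```

Here "add" falls below the threshold and gets no search time at all, matching the early-stop behaviour described above.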
  • when the server uses the inference duration task scheduling policy, the training task scheduling policy and the training operation mode to search for the initial task and obtain the target task, the server can obtain a first running time for the initial task to iteratively run the training sample data, and a second running time for controlling the hardware to run the training sample data.
  • the first running time and the second running time overlap; that is, the time when the initial task iteratively runs the training sample data overlaps, completely or partially, with the time when the hardware runs the training sample data. Therefore, the task search time can be reduced.
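One plausible way to realize such an overlap is to run the hardware measurement in a background thread while training iterations continue, so the two time spans overlap instead of adding up. This is a hypothetical sketch, not the disclosed implementation; the sleep calls stand in for real work.

```python
import threading
import time

def measure_on_hardware(results):
    time.sleep(0.05)               # stand-in for running code on hardware
    results.append("measured")

def train_with_overlapped_tuning(iterations):
    results = []
    t = threading.Thread(target=measure_on_hardware, args=(results,))
    t.start()                      # start the hardware measurement...
    for _ in range(iterations):    # ...while training keeps iterating
        time.sleep(0.01)           # stand-in for one training iteration
    t.join()                       # measurement finished during training
    return results

done = train_with_overlapped_tuning(5)
```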
  • the server can use the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to search for the initial task and obtain the target task.
  • the server may determine a search algorithm for permuting and combining different running configurations, including but not limited to loop block size, vectorization, loop unrolling, calculation position adjustment, and so on.
  • the server can use a machine learning cost model to predict the running speed of optimized configurations, select the faster candidates to run on real hardware, and take the configuration measured to be genuinely faster as the optimization result.
  • search algorithms such as genetic search algorithm, exhaustive search algorithm, grid search algorithm, etc.
  • the cost database (Database) refers to the database of the hardware's actual running-speed data, which is used to train a more accurate cost model (Cost Model).
  • the Cost Model can be used to determine the search algorithm.
  • the server can use the machine learning Cost Model to predict the running speed of the optimized configuration, which can speed up the search speed of the search algorithm and reduce automatic tuning time. At the same time, the real speed of task running on the hardware is fed back to the Cost Model for machine learning training to optimize the Cost Model.
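The cost-model-guided loop described above (cheaply predict speeds for all candidates, measure only the most promising ones on hardware, and record the measurements for retraining the model) could be sketched as below; the toy predict/measure functions are assumptions for illustration.

```python
def cost_model_search(candidates, predict, measure, top_k, history):
    """Rank candidates with the cheap cost model, measure only the top-k
    on 'hardware', and record measurements to later retrain the model."""
    ranked = sorted(candidates, key=predict)   # cheap prediction pass
    best_cfg, best_time = None, float("inf")
    for cfg in ranked[:top_k]:                 # measure only the top-k
        t = measure(cfg)
        history.append((cfg, t))               # feedback data for the Cost Model
        if t < best_time:
            best_cfg, best_time = cfg, t
    return best_cfg

history = []
best = cost_model_search(range(10),
                         predict=lambda c: abs(c - 7),  # toy cost model
                         measure=lambda c: abs(c - 6),  # toy "real" runtime
                         top_k=3, history=history)
```

Note that the model's favourite (7) is not the true fastest; measuring the top few on real hardware is what corrects the model's error.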
  • the server may use optimization methods such as loop block size, vectorization, loop unrolling, calculation position adjustment, thread parallelism, GPU parallelism, and so on, in the matrix loop. These basic optimization methods are called schedule primitives in the automatic tuning system, and all runnable combinations composed of permutations and combinations of the basic optimization methods are called the search space.
  • the server can use a search algorithm to search for a fast task running method in the search space.
  • the first target configuration information refers to tuning configuration information corresponding to the target task.
  • the first target configuration information does not specifically refer to certain fixed information. For example, when the target task changes, the first target configuration information may change.
  • when the server performs model training and obtains the first target configuration information corresponding to the target task, the server may store the first target configuration information. Then, when the server performs inference on the deep learning model, the first target configuration information can be reused, thereby reducing the task search time during model inference.
  • the server can obtain and store the first target configuration information corresponding to the target task.
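The store-and-reuse behaviour above could be sketched as a simple configuration cache: configurations found during training are saved per task, and inference looks them up before falling back to a tuning search. Class and function names are illustrative assumptions.

```python
class ConfigCache:
    """Stores the first target configuration information per task."""
    def __init__(self):
        self._store = {}

    def save(self, task_id, config):
        self._store[task_id] = config

    def lookup(self, task_id):
        return self._store.get(task_id)

def get_config(cache, task_id, search):
    """Reuse a cached configuration if one exists; otherwise search and cache."""
    cached = cache.lookup(task_id)
    if cached is not None:        # reuse: no search needed at inference time
        return cached
    config = search(task_id)      # fall back to a tuning search
    cache.save(task_id, config)
    return config
```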
  • the hardware code is obtained based on the second target configuration information; the hardware is controlled to run the hardware code, and the running information corresponding to the initial task is obtained; therefore, if it is determined based on the cached configuration information that there is no need to continue searching for the initial task, the time required to search for the initial task can be reduced, thereby reducing the task search time.
  • the inference duration task scheduling policy set corresponding to the initial task is displayed on the display interface; the selection instruction entered for the inference duration task scheduling policy set is obtained; and the inference duration task scheduling policy corresponding to the selection instruction is obtained. Therefore, the required inference duration task scheduling policy can be chosen according to need, which can improve the flexibility of task search.
  • the training operation mode corresponding to the initial task is obtained, and the inference duration task scheduling policy, the training task scheduling policy and the training operation mode are used to search the initial task to obtain the target task. Therefore, by using the inference duration task scheduling policy, the task search time when the deep learning model performs inference can be reduced, thereby reducing the inference time of the deep learning model.
  • the task search time when training the deep learning model can also be reduced, which in turn can reduce the training time of the deep learning model.
  • the task search time can thus be reduced while improving the applicability of the task search solution, making this task search method applicable to both inference scenarios and training scenarios.
  • the first target configuration information corresponding to the target task is obtained and stored; therefore, when the server performs inference on the deep learning model, the first target configuration information can be reused, thereby reducing the task search time during model inference.
  • the collection, storage, use, processing, transmission, provision and disclosure of user personal information are in compliance with relevant laws and regulations and do not violate public order and good customs.
  • the task search device 400 includes a policy acquisition unit 401, a mode acquisition unit 402 and a task acquisition unit 403, wherein:
  • the policy acquisition unit 401 is used to obtain the task scheduling policy corresponding to the initial task.
  • the task scheduling policy includes the inference duration task scheduling policy and the training task scheduling policy.
  • the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in the inference scenario, and the training task scheduling policy is used to adjust the search duration corresponding to the initial task in the training scenario;
  • the mode acquisition unit 402 is used to acquire the training operation mode corresponding to the initial task
  • the task acquisition unit 403 is used to search for initial tasks using the inference duration task scheduling strategy, training task scheduling strategy and training operation mode to obtain the target task.
  • FIG. 4b is a schematic structural diagram of a second task search device used to implement the task search method according to an embodiment of the present disclosure.
  • the policy acquisition unit 401 includes a collection display sub-unit 411, an instruction acquisition sub-unit 421 and a policy acquisition sub-unit 431.
  • when the policy acquisition unit 401 is used to acquire the task scheduling policy corresponding to the initial task:
  • the set display subunit 411 is used to display the inference duration task scheduling policy set corresponding to the initial task on the display interface;
  • the instruction acquisition subunit 421 is used to obtain the selection instruction input for the inference duration task scheduling policy set;
  • the policy acquisition subunit 431 is used to acquire the inference duration task scheduling policy corresponding to the selection instruction.
  • FIG. 4c is a schematic structural diagram of a third task search device used to implement the task search method according to the embodiment of the present disclosure.
  • the task acquisition unit 403 includes a potential value acquisition subunit 413 and a search stop subunit 423.
  • when the task acquisition unit 403 searches the initial task using the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to obtain the target task:
  • the potential value acquisition subunit 413 is used to obtain the optimization potential value corresponding to the initial task in the training scenario when searching for the initial task using the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode;
  • the search stop subunit 423 is used to stop searching for the initial task when the optimization potential value is less than the potential threshold.
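As a hedged illustration of how the potential value acquisition subunit 413 and the search stop subunit 423 might cooperate, the sketch below estimates an optimization potential value from the relative improvement of the best measured cost and stops the search once it drops below a threshold. The function name, the threshold value, and the potential formula are illustrative assumptions, not identifiers or formulas taken from the disclosure.

```python
# Illustrative early-stopping search loop; names and the potential
# formula are assumptions, not taken from the disclosure.

POTENTIAL_THRESHOLD = 0.01  # assumed cutoff for the optimization potential value


def search_task(candidate_costs, threshold=POTENTIAL_THRESHOLD):
    """Scan candidate schedule costs in order; after each measurement,
    use the relative improvement of the best cost as a proxy for the
    remaining optimization potential, and stop once it falls below
    the threshold."""
    best = float("inf")
    steps = 0
    for cost in candidate_costs:
        prev_best = best
        best = min(best, cost)
        steps += 1
        # Optimization potential value: relative improvement at this step.
        potential = 1.0 if prev_best == float("inf") else (prev_best - best) / prev_best
        if potential < threshold:
            break  # search stop subunit 423: potential below the threshold
    return best, steps
```

With this scheme, a search that has stopped improving terminates early instead of exhausting the candidate list, which is the behavior the subunits above describe.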
  • FIG. 4d is a schematic structural diagram of a fourth task search device used to implement the task search method according to an embodiment of the present disclosure.
  • the task search device 400 also includes an information acquisition unit 404 and a duration allocation unit 405; when the optimization potential value corresponding to the initial task is obtained:
  • the information acquisition unit 404 is used to acquire the time resource information corresponding to the optimization potential value;
  • the duration allocation unit 405 is used to allocate a search duration corresponding to the time resource information to the initial task.
  • FIG. 4e is a schematic structural diagram of a fifth task search device used to implement the task search method according to an embodiment of the present disclosure.
  • the task search device 400 also includes an information storage unit 406; after the initial task is searched using the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to obtain the target task:
  • the information storage unit 406 is used to obtain and store the first target configuration information corresponding to the target task.
  • FIG. 4f is a schematic structural diagram of a sixth task search device used to implement the task search method according to the embodiment of the present disclosure.
  • the task search device 400 also includes a code acquisition unit 407 and a code running unit 408; when the task scheduling policy corresponding to the initial task is obtained:
  • the code acquisition unit 407 is configured to acquire the hardware code based on the second target configuration information when it is determined that the second target configuration information corresponding to the initial task exists in the cache;
  • the code running unit 408 is used to control the hardware to run the hardware code and obtain the running information corresponding to the initial task.
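The interaction between the code acquisition unit 407 and the code running unit 408 can be sketched as a cache lookup followed by code generation and execution. All names below are hypothetical, and `compile_fn` and `run_fn` stand in for hardware code generation and hardware execution, which the disclosure does not specify at this level of detail.

```python
# Hypothetical sketch of reusing cached configuration information to
# skip a repeated search; compile_fn/run_fn stand in for unspecified steps.

def run_with_cached_config(task_key, config_cache, compile_fn, run_fn):
    """If second target configuration information for the initial task
    exists in the cache, generate hardware code from it (code acquisition
    unit 407) and run that code to obtain the running information
    corresponding to the initial task (code running unit 408).
    Otherwise report a cache miss so a full search can be performed."""
    config = config_cache.get(task_key)
    if config is None:
        return None  # cache miss: fall back to a full task search (not shown)
    hardware_code = compile_fn(config)
    return run_fn(hardware_code)
```

The design point is that the (expensive) search only runs on a cache miss; on a hit, the stored configuration goes straight to code generation.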
  • the task acquisition unit 403 is used to search the initial task using the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode, and when obtaining the target task, it is specifically used to:
  • search the initial task using the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode, and obtain the target task;
  • obtain the first running time for controlling the iterative running of the training sample data of the initial task;
  • obtain the second running time for controlling the hardware to run the training sample data, wherein the first running time and the second running time overlap.
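One way to make the first running time (iteratively running the training sample data) and the second running time (the hardware runs used for measurement) overlap is to run the measurement in a background thread while the training iteration proceeds. The threading mechanism below is only a sketch of the overlap idea, assumed for illustration; the disclosure does not prescribe a particular concurrency implementation.

```python
# Sketch of overlapping the two running times with a background thread;
# the threading mechanism is an assumption, not the disclosed implementation.
import threading
import time


def overlapped_step(train_step, measure_step):
    """Run one training iteration (the first running time) while a
    candidate schedule is measured on hardware (the second running
    time), so the two durations overlap instead of adding up."""
    t = threading.Thread(target=measure_step)
    t.start()      # second running time begins in the background
    train_step()   # first running time runs concurrently
    t.join()       # wait until both running times have finished
```

Run sequentially, two 0.1 s phases would take about 0.2 s; overlapped this way, one step finishes in roughly the duration of the longer phase, which is the benefit the overlapping running times provide.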
  • it should be noted that when the task search device provided in the above embodiments performs the task search method, the division into the above functional modules is merely used as an example; in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the task search device provided by the above embodiments and the task search method embodiments belong to the same concept. For details of the implementation process, please refer to the method embodiments, which will not be described again here.
  • the task scheduling policy corresponding to the initial task is obtained through the policy acquisition unit.
  • the task scheduling policy includes the inference duration task scheduling policy and the training task scheduling policy.
  • the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in the inference scenario, and the training task scheduling strategy is used to adjust the search duration corresponding to the initial task in the training scenario;
  • the mode acquisition unit obtains the training operation mode corresponding to the initial task;
  • the task acquisition unit searches the initial task using the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to obtain the target task.
  • by using the inference duration task scheduling strategy, the task search time when performing inference on the deep learning model can be reduced, which in turn can reduce the inference time of the deep learning model.
  • by using the training task scheduling strategy and the training operation mode, the task search time when training the deep learning model can be reduced, which in turn can reduce the training time of the deep learning model.
  • the duration of task search can be reduced while improving the applicability of the task search solution, making the task search method applicable to inference scenarios and training scenarios.
  • the acquisition, storage and application of user personal information are in compliance with relevant laws and regulations and do not violate public order and good customs.
  • the present disclosure also provides a server, an electronic device, a readable storage medium, a computer program product, and a computer program.
  • FIG. 5 is a schematic block diagram of an example server 500 that may be used to implement embodiments of the present disclosure.
  • the server 500 includes a computing unit 501, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the server 500.
  • Computing unit 501, ROM 502 and RAM 503 are connected to each other via bus 504.
  • An input/output (I/O) interface 505 is also connected to bus 504.
  • multiple components of the server 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard or a mouse; an output unit 507, such as various types of displays or speakers; a storage unit 508, such as a magnetic disk or an optical disk; and a communication unit 509, such as a network card, a modem, or a wireless communication transceiver.
  • the communication unit 509 allows the server 500 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
  • the computing unit 501 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, etc.
  • the computing unit 501 performs various methods and processes described above, such as the task search method.
  • the task search method may be implemented as a computer software program that is tangibly embodied in a machine-readable medium, such as storage unit 508.
  • part or all of the computer program may be loaded and/or installed on the server 500 via the ROM 502 and/or the communication unit 509.
  • when the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the task search method described above may be performed.
  • the computing unit 501 may be configured to perform the task search method in any other suitable manner (eg, by means of firmware).
  • various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGA), application specific integrated circuits (ASIC), application specific standard products (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), computer hardware, firmware, software, and/or combinations thereof.
  • these various embodiments may include: being implemented in one or more computer programs, which are executable and/or interpretable on a programmable system including at least one programmable processor; the programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. Such program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing device, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • more specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • to provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer.
  • other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic, speech, or tactile input).
  • the systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include: local area network (LAN), wide area network (WAN), the Internet, and blockchain networks.
  • Computer systems may include clients and servers. Clients and servers are generally remote from each other and typically interact over a communications network. The relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other.
  • the server can be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that solves the defects of difficult management and weak business scalability existing in traditional physical host and VPS ("Virtual Private Server", or "VPS" for short) services.
  • the server can also be a distributed system server or a server combined with a blockchain.

Abstract

Disclosed are a task search method and apparatus, a server and a storage medium. A specific implementation scheme comprises: acquiring a task scheduling strategy corresponding to an initial task, wherein the task scheduling strategy comprises a reasoning duration task scheduling strategy and a training task scheduling strategy, the reasoning duration task scheduling strategy is used for adjusting a reasoning duration corresponding to the initial task in a reasoning scenario, and the training task scheduling strategy is used for adjusting a search duration corresponding to the initial task in a training scenario; acquiring a training operation mode corresponding to the initial task; and searching the initial task by using the reasoning duration task scheduling strategy, the training task scheduling strategy and the training operation mode to obtain a target task.

Description

Task search method and device, server and storage medium
Cross-reference to related applications
This application is filed based on the Chinese patent application with application number 2022105481337 and a filing date of May 19, 2022, and claims priority to that Chinese patent application, the entire content of which is hereby incorporated into this application by reference.
Technical field
The present disclosure relates to the field of computer technology, and specifically to a task search method and device, a server, and a storage medium.
Background
Deep learning is a research direction in the field of machine learning. It can learn the inherent laws and representation levels of sample data, so that machines can imitate human activities such as seeing, hearing, and thinking. In related technologies, a deep learning compiler can be used to train and perform inference on a deep learning model. However, in related technologies, automatically optimizing a deep learning model with a deep learning compiler requires a long search time, which is not suitable for training scenarios.
Summary
The present disclosure provides a task search method and device, a server, and a storage medium, with the main purpose of reducing the search duration and improving the applicability of the task search solution.
According to one aspect of the present disclosure, a task search method is provided, including:
obtaining a task scheduling policy corresponding to an initial task, wherein the task scheduling policy includes an inference duration task scheduling policy and a training task scheduling policy, the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in an inference scenario, and the training task scheduling policy is used to adjust the search duration corresponding to the initial task in a training scenario;
obtaining the training operation mode corresponding to the initial task; and
searching the initial task using the inference duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain a target task.
According to another aspect of the present disclosure, a task search device is provided, including:
a policy acquisition unit, configured to obtain a task scheduling policy corresponding to an initial task, wherein the task scheduling policy includes an inference duration task scheduling policy and a training task scheduling policy, the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in an inference scenario, and the training task scheduling policy is used to adjust the search duration corresponding to the initial task in a training scenario;
a mode acquisition unit, configured to obtain the training operation mode corresponding to the initial task; and
a task acquisition unit, configured to search the initial task using the inference duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain a target task.
According to another aspect of the present disclosure, a server is provided, including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of the preceding aspects.
According to another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to cause a computer to perform the method according to any one of the preceding aspects.
According to another aspect of the present disclosure, a computer program product is provided, including a computer program that, when executed by a processor, implements the method according to any one of the preceding aspects.
According to another aspect of the present disclosure, a computer program is provided, the computer program including computer program code that, when run on a computer, causes the computer to perform the method according to any one of the preceding aspects.
According to another aspect of the present disclosure, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method according to any one of the preceding aspects.
In one or more embodiments of the present disclosure, a task scheduling policy corresponding to an initial task is obtained, wherein the task scheduling policy includes an inference duration task scheduling policy and a training task scheduling policy, the inference duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in an inference scenario, and the training task scheduling policy is used to adjust the search duration corresponding to the initial task in a training scenario; the training operation mode corresponding to the initial task is obtained; and the initial task is searched using the inference duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain a target task. The search duration can therefore be reduced, and the applicability of the task search solution improved.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become easy to understand from the following description.
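The three steps summarized above (obtain the scheduling policies, obtain the training operation mode, then search) can be sketched as a minimal driver function. The dictionary shapes and the `search_fn` callback below are illustrative assumptions; the disclosure does not prescribe these data structures.

```python
# Minimal sketch of the claimed three-step flow; the data structures
# and the search_fn callback are illustrative assumptions.

def task_search(initial_task, scheduling_policies, training_run_modes, search_fn):
    """1) obtain the task scheduling policy for the initial task, which
       bundles an inference duration policy and a training policy;
       2) obtain the task's training operation mode;
       3) search the initial task under both to obtain the target task."""
    policy = scheduling_policies[initial_task]
    inference_duration_policy = policy["inference_duration"]
    training_policy = policy["training"]
    run_mode = training_run_modes[initial_task]
    return search_fn(initial_task, inference_duration_policy,
                     training_policy, run_mode)
```

The point of the split is that the same search entry point serves both scenarios: the inference duration policy governs search budgets in inference scenarios, while the training policy and run mode govern them in training scenarios.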
Description of the drawings
The accompanying drawings are used for a better understanding of the present solution and do not constitute a limitation of the present disclosure. In the drawings:
FIG. 1 is a schematic flowchart of a task search method according to a first embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a task search method according to a second embodiment of the present disclosure;
FIG. 3 is a flowchart of selecting an inference duration task scheduling policy according to an embodiment of the present disclosure;
FIG. 4a is a schematic structural diagram of a first task search device used to implement the task search method of an embodiment of the present disclosure;
FIG. 4b is a schematic structural diagram of a second task search device used to implement the task search method of an embodiment of the present disclosure;
FIG. 4c is a schematic structural diagram of a third task search device used to implement the task search method of an embodiment of the present disclosure;
FIG. 4d is a schematic structural diagram of a fourth task search device used to implement the task search method of an embodiment of the present disclosure;
FIG. 4e is a schematic structural diagram of a fifth task search device used to implement the task search method of an embodiment of the present disclosure;
FIG. 4f is a schematic structural diagram of a sixth task search device used to implement the task search method of an embodiment of the present disclosure;
FIG. 5 is a block diagram of a server used to implement the task search method of an embodiment of the present disclosure.
Detailed description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, including various details of the embodiments of the present disclosure to facilitate understanding, which should be regarded as merely exemplary. Therefore, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
With the development of science and technology, server technology has become increasingly mature, improving the convenience of users' production and life. In server application scenarios, users can train and perform inference on deep learning models through a server.
According to some embodiments, a deep learning compiler is compiler software used to solve the problem of interfacing deep learning with multiple hardware platforms. A deep learning compiler can be composed of multiple layers of intermediate representation (IR) and corresponding execution modes. The high-level intermediate representation is used to express the structure of the deep learning computation graph, and contains the representations of deep learning variables (Variable) and operators (Operator). The low-level intermediate representation describes the specific computation of an operator; for example, a matrix multiplication operator in the high-level intermediate representation becomes more concrete loop, multiplication, and summation operations in the low-level intermediate representation, and these operations are also closer to the hardware's low-level instructions.
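The two IR levels described above can be illustrated with matrix multiplication: the high-level IR sees a single matmul operator, while the low-level IR spells out the loop, multiply, and accumulate operations that a compiler schedules onto hardware. This is a generic Python illustration of the idea, not the actual IR of any particular deep learning compiler.

```python
# Generic illustration of the two IR levels for matrix multiplication;
# not the actual IR of any particular deep learning compiler.

def matmul_high_level(a, b):
    # High-level IR view: one opaque "matmul" operator node.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matmul_low_level(a, b):
    # Low-level IR view: explicit loops, multiplies, and accumulation --
    # the form a compiler schedules and maps to hardware instructions.
    m, kk, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i in range(m):           # loop over output rows
        for j in range(n):       # loop over output columns
            acc = 0
            for p in range(kk):  # reduction loop
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c
```

Both views compute the same result; what the lower level adds is the explicit loop structure whose tiling and ordering the compiler's search explores.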
In some embodiments, a deep learning compiler can be provided in the server. When training and performing inference on a deep learning model, the server can then use the deep learning compiler to search for a configuration that matches the deep learning model.
In some embodiments, when the server uses the deep learning compiler to search for a configuration that matches the deep learning model, the computation graph optimized for deep learning needs to be input into the deep learning compiler for the compiler to search.
However, when the deep learning compiler searches the computation graph, different optimization configurations need to be tried, the search volume is large, and the entire computation graph needs to be covered by search attempts. Therefore, the server takes a long time to search.
It is easy to understand that when performing inference on a deep learning model, time can be spent on an offline search before the deep learning model is brought online for inference. However, when training a deep learning model, the user needs to understand the training effect of the deep learning model through its training iterations; if the search time is too long, the training duration of the deep learning model will increase. Since automatic optimization technology requires a long optimization time, often dozens of hours, it is only suitable for deep learning inference scenarios, and its applicability to training scenarios is poor.
The present disclosure is described in detail below with reference to specific embodiments.
In a first embodiment, as shown in FIG. 1, which is a schematic flowchart of a task search method according to the first embodiment of the present disclosure, the method can be implemented by a computer program and can run on a device that performs task search, which may be a server with a task search function. The computer program can be integrated into an application or run as an independent tool application.
Specifically, the task search method includes steps S101 to S103.
S101: Obtain a task scheduling policy corresponding to an initial task.
According to some embodiments, a task refers to a computation graph subgraph used when a deep learning neural network model performs training and inference. The task does not refer to a specific fixed task. For example, when the computation graph changes, the task can change. When the computation graph subgraph changes, the task can also change.
In some embodiments, the initial task refers to a task that requires a tuning search. The initial task does not refer to a specific fixed task. For example, when the computation graph changes, the initial task can change. When the computation graph subgraph changes, the initial task can also change.
According to some embodiments, the task scheduling policy refers to the policy adopted by the server when searching the initial task. The task scheduling policy includes, but is not limited to, an inference duration task scheduling policy, a training task scheduling policy, and so on. The task scheduling policy does not refer to a specific fixed policy. For example, when the initial task changes, the task scheduling policy can change. When the computation graph changes, the task scheduling policy can also change.
在一些实施例中,推理时长任务调度策略指的是用于调节推理场景下初始任务对应的推理时长的策略。该推理时长任务调度策略并不特指某一固定策略。例如,当任务调度策略发生变化时,该推理时长任务调度策略可以发生变化。当初始任务发生变化时,该推理时长任务调度策略也可以发生变化。In some embodiments, the inference duration task scheduling policy refers to a strategy for adjusting the inference duration corresponding to the initial task in the inference scenario. The inference duration task scheduling strategy does not specifically refer to a fixed strategy. For example, when the task scheduling policy changes, the inference duration task scheduling policy can change. When the initial task changes, the inference duration task scheduling policy can also change.
在一些实施例中,训练任务调度策略指的是用于调节训练场景下初始任务对应的搜索时长的策略。该训练任务调度策略并不特指某一固定策略。例如,当任务调度策略发生变化时,该训练任务调度策略可以发生变化。当初始任务发生变化时,该训练任务调度策略也可以发生变化。In some embodiments, the training task scheduling policy refers to a policy used to adjust the search duration corresponding to the initial task in the training scenario. The training task scheduling strategy does not specifically refer to a fixed strategy. For example, when the task scheduling policy changes, the training task scheduling policy can change. When the initial task changes, the training task scheduling policy can also change.
易于理解的是,当服务器进行任务搜索时,服务器可以获取与初始任务对应的任务调度策略。It is easy to understand that when the server performs task search, the server can obtain the task scheduling policy corresponding to the initial task.
S102,获取初始任务对应的训练运行方式。S102: Obtain the training operation mode corresponding to the initial task.
根据一些实施例,训练运行方式指的是服务器对深度学习神经网络模型进行训练时,采用的训练运行方式。该训练运行方式并不特指某一固定方式。例如,当深度学习神经网络模型发生变化时,该训练运行方式可以发生变化。当初始任务发生变化时,该训练运行方式也可以发生变化。According to some embodiments, the training operation mode refers to the training operation mode adopted by the server when training the deep learning neural network model. The way the training is run is not specific to a fixed way. For example, when the deep learning neural network model changes, the way the training is run can change. When the initial task changes, the way the training is run can also change.
易于理解的是,当服务器获取到与初始任务对应的任务调度策略时,服务器可以获取初始任务对应的训练运行方式。It is easy to understand that when the server obtains the task scheduling policy corresponding to the initial task, the server can obtain the training operation mode corresponding to the initial task.
S103,采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索,得到目标任务。S103. Use the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to search for the initial task and obtain the target task.
根据一些实施例,目标任务指的是对初始任务进行调优搜索后得到的任务。该目标任务并不特指某一固定任务。例如,当初始任务发生变化时,该目标任务可以发生变化。当任务调度策略发生变化时,该目标任务也可以发生变化。According to some embodiments, the target task refers to a task obtained after performing a tuning search on the initial task. The target task does not refer to a fixed task. For example, the target task can change when the initial task changes. When the task scheduling policy changes, the target task can also change.
易于理解的是,当服务器获取到初始任务对应的训练运行方式时,服务器可以采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索,得到目标任务。It is easy to understand that when the server obtains the training operation mode corresponding to the initial task, the server can use the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to search for the initial task and obtain the target task.
在本公开实施例中,通过获取与初始任务对应的任务调度策略;获取初始任务对应的训练运行方式;采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索,得到目标任务。因此通过采用推理时长任务调度策略,可以减少对深度学习模型进行推理时的任务搜索时长,可以减少深度学习模型的推理时长,同时,通过采用训练任务调度策略和训练运行方式,可以减少对深度学习模型进行训练时的任务搜索时长,可以减少深度学习模型的训练时长,进而,可以减少任务搜索的时长的同时提高任务搜索方案的适用性,使得该任务搜索方法可以适用于推理场景和训练场景。In the embodiment of the present disclosure, by obtaining the task scheduling strategy corresponding to the initial task; obtaining the training operation mode corresponding to the initial task; and using the inference duration task scheduling strategy, training task scheduling strategy and training operation mode to search for the initial task to obtain the target Task. Therefore, by adopting the inference time task scheduling strategy, the task search time when inferring the deep learning model can be reduced, and the inference time of the deep learning model can be reduced. At the same time, by using the training task scheduling strategy and training operation mode, the task search time of the deep learning model can be reduced. The task search time when the model is trained can reduce the training time of the deep learning model. In turn, the task search time can be reduced while improving the applicability of the task search solution, making the task search method applicable to inference scenarios and training scenarios.
请参见图2,图2是根据本公开第二实施例的任务搜索方法的流程示意图。具体的,该方法包括:S201-S208。Please refer to Figure 2, which is a schematic flowchart of a task search method according to a second embodiment of the present disclosure. Specifically, the method includes: S201-S208.
S201,在确定缓存中存在与初始任务对应的第二目标配置信息的情况下,基于第二目标配置信息获取硬件代码。S201: When it is determined that the second target configuration information corresponding to the initial task exists in the cache, obtain the hardware code based on the second target configuration information.
根据一些实施例,目标配置信息指的是任务对应的调优配置信息。该目标配置信息并不特指某一固定信息。该目标配置信息包括但不限于循环分块大小、向量化、循环展开、计算位置调整、线程并行、图形处理器(graphics processing unit,GPU)并行等等。According to some embodiments, the target configuration information refers to tuning configuration information corresponding to the task. The target configuration information does not refer to certain fixed information. The target configuration information includes but is not limited to loop block size, vectorization, loop unrolling, calculation position adjustment, thread parallelism, graphics processing unit (GPU) parallelism, etc.
在一些实施例中,第二目标配置信息指的是初始任务对应的调优配置信息。该第二目标配置信息并不特指某一固定信息。例如,当初始任务发生变化时,该第二目标配置信息可以发生变化。当服务器获取到针对第二目标配置信息的信息修改指令时,该第二目标配置信息也可以发生变化。In some embodiments, the second target configuration information refers to tuning configuration information corresponding to the initial task. The second target configuration information does not specifically refer to certain fixed information. For example, when the initial task changes, the second target configuration information may change. When the server obtains the information modification instruction for the second target configuration information, the second target configuration information may also change.
在一些实施例中,硬件代码指的是在硬件上运行生成的代码。该硬件代码并不特指某一固定代码。例如,当第二目标配置信息发生变化时,该硬件代码可以发生变化。当初始任务发生变化时,该硬件代码也可以发生变化。In some embodiments, hardware code refers to code generated by running on hardware. The hardware code is not specific to a fixed code. For example, when the second target configuration information changes, the hardware code may change. When the initial task changes, this hardware code can also change.
在一些实施例中,当服务器基于第二目标配置信息获取硬件代码时,服务器可以将第二目标配置信息转化为底层IR。进而,服务器可以通过底层IR生成硬件代码。In some embodiments, when the server obtains hardware code based on the second target configuration information, the server may convert the second target configuration information into the underlying IR. In turn, the server can generate hardware code through the underlying IR.
易于理解的是,当服务器进行任务搜索时,在服务器确定缓存中存在与初始任务对应的第二目标配置信息的情况下,服务器可以基于第二目标配置信息获取硬件代码。It is easy to understand that when the server performs a task search, if the server determines that the second target configuration information corresponding to the initial task exists in the cache, the server can obtain the hardware code based on the second target configuration information.
根据一些实施例,服务器可以对计算图进行划分,得到至少一个任务。According to some embodiments, the server may partition the computational graph to obtain at least one task.
S202,控制硬件运行硬件代码,获取初始任务对应的运行信息。S202: Control the hardware to run the hardware code and obtain the operation information corresponding to the initial task.
根据一些实施例,运行信息指的是初始任务在硬件中的运行信息。该运行信息并不特指某一固定运行信息。该运行信息包括运行速度。例如,当初始任务发生变化时,该运行信息可以发生变化。当硬件代码发生变化时,该运行信息也可以发生变化。According to some embodiments, the running information refers to the running information of the initial task in hardware. This operating information does not specifically refer to a certain fixed operating information. The operating information includes operating speed. For example, this running information can change when the initial task changes. When the hardware code changes, this operating information can also change.
在一些实施例中，当服务器进行任务搜索时，服务器需要对目标配置信息进行排列组合，搜索出使任务运行时长最短的排列组合。In some embodiments, when the server performs a task search, the server needs to enumerate permutations of the target configuration information and search for the permutation that gives the task the shortest running time.
在一些实施例中，服务器对目标配置信息进行排列组合、搜索使任务运行时长最短的排列组合时，采用的搜索算法包括但不限于遗传搜索算法、穷举搜索算法、网格搜索算法等等。In some embodiments, when the server enumerates permutations of the target configuration information and searches for the permutation with the shortest running time, the search algorithms used include but are not limited to genetic search algorithms, exhaustive search algorithms, grid search algorithms, and so on.
在一些实施例中，服务器还可以直接从搜索空间中搜索出使任务运行时长最短的排列组合。其中，搜索空间指的是由目标配置信息对应的所有可运行排列组合构成的空间。In some embodiments, the server can also search the search space directly for the permutation that gives the task the shortest running time. Here, the search space refers to the space of all runnable permutations of the target configuration information.
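As an illustration of the permutation search described above, the following Python sketch enumerates a tiny search space with a grid search and keeps the configuration with the shortest measured running time. The tuning knobs and the timing function are invented for the example and are not taken from the disclosure.

```python
# Illustrative grid search over a hypothetical schedule-configuration space:
# enumerate all permutations of the tuning knobs and keep the one with the
# shortest measured running time.
import itertools

search_space = {
    "tile_size": [8, 16, 32],
    "unroll": [1, 4],
    "vectorize": [False, True],
}

def measure(config):
    # Stand-in for compiling the task with this config, running it on real
    # hardware, and timing it. A made-up analytic cost keeps this runnable.
    return 100 / config["tile_size"] + (0 if config["vectorize"] else 5) + config["unroll"]

def grid_search(space):
    keys = list(space)
    best_cfg, best_time = None, float("inf")
    for values in itertools.product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        t = measure(cfg)
        if t < best_time:
            best_cfg, best_time = cfg, t
    return best_cfg, best_time

best, best_time = grid_search(search_space)
assert best == {"tile_size": 32, "unroll": 1, "vectorize": True}
```

A genetic or cost-model-guided search would replace the exhaustive loop but keep the same measure-and-compare structure.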
根据一些实施例,当服务器控制硬件运行硬件代码,获取初始任务对应的运行信息时,在服务器判断该运行信息符合运行信息条件的情况下,服务器可以结束对初始任务的搜索,并将该初始任务设置为目标任务。According to some embodiments, when the server controls the hardware to run the hardware code and obtains the running information corresponding to the initial task, if the server determines that the running information meets the running information conditions, the server can end the search for the initial task and save the initial task. Set as target task.
在一些实施例中,运行信息条件指的是服务器用于判断初始任务是否需要进行调优搜索时采用的条件。该运行信息条件并不特指某一固定条件。例如,当服务器获取到针对运行信息条件的条件修改指令时,该运行信息条件可以发生变化。In some embodiments, the running information conditions refer to the conditions used by the server to determine whether the initial task needs to perform a tuning search. This operating information condition does not specifically refer to a fixed condition. For example, when the server obtains a condition modification instruction for a running information condition, the running information condition may change.
易于理解的是,当服务器基于第二目标配置信息获取硬件代码时,服务器可以控制硬件运行硬件代码。进而,服务器可以获取初始任务对应的运行信息。It is easy to understand that when the server obtains the hardware code based on the second target configuration information, the server can control the hardware to run the hardware code. Furthermore, the server can obtain the running information corresponding to the initial task.
S203,在展示界面上展示与初始任务对应的推理时长任务调度策略集合。S203. Display the inference duration task scheduling policy set corresponding to the initial task on the display interface.
根据一些实施例,展示界面指的是服务器与用户进行人机交互时采用的展示界面。该展示界面并不特指某一固定界面。例如,当服务器发生变化时,该展示界面可以发生变化。According to some embodiments, the display interface refers to the display interface used when the server interacts with the user. The display interface does not specifically refer to a fixed interface. For example, when the server changes, the presentation interface can change.
在一些实施例中，推理时长任务调度策略集合指的是由至少一个推理时长任务调度策略汇聚而成的集合。该推理时长任务调度策略集合并不特指某一固定集合。例如，当推理时长任务调度策略对应的推理时长发生变化时，该推理时长任务调度策略集合可以发生变化。当推理时长任务调度策略的数量发生变化时，该推理时长任务调度策略集合也可以发生变化。In some embodiments, the inference duration task scheduling policy set refers to a set aggregated from at least one inference duration task scheduling policy. The set does not refer to a fixed set. For example, when the inference duration corresponding to an inference duration task scheduling policy changes, the set may change. When the number of inference duration task scheduling policies changes, the set may also change.
在一些实施例中，不同的推理时长任务调度策略对应的推理时长和任务对应的速度提升值不同。例如，某一推理时长任务调度策略可以是10%的搜索时长对应90%的速度提升值；另一策略可以是20%的搜索时长对应92%的速度提升值；还有一策略可以是100%的搜索时长对应100%的速度提升值。In some embodiments, different inference duration task scheduling policies correspond to different inference durations and different task speed-up values. For example, one policy may pair 10% of the search duration with a 90% speed-up value; another may pair 20% of the search duration with a 92% speed-up value; and another may pair 100% of the search duration with a 100% speed-up value.
在一些实施例中,该推理时长任务调度策略集合包括至少一个推理时长任务调度策略,该推理时长任务调度策略包括但不限于长时任务调度策略、短时任务调度策略等等。其中,长时任务调度策略的推理时长大于短时任务调度策略的推理时长。长时任务调度策略的推理性能高于短时任务调度策略的推理性能。In some embodiments, the set of inference duration task scheduling policies includes at least one inference duration task scheduling policy, which includes but is not limited to long-duration task scheduling policies, short-duration task scheduling policies, and the like. Among them, the inference time of the long-term task scheduling strategy is longer than the inference time of the short-term task scheduling strategy. The inference performance of the long-duration task scheduling strategy is higher than that of the short-duration task scheduling strategy.
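The policy set can be pictured as a small table pairing each policy with a search-time budget and an expected speed-up. The sketch below is illustrative only; the policy names and numbers reuse the example figures above rather than any fixed values from the disclosure.

```python
# Hypothetical inference-duration task scheduling policy set: each policy
# pairs a fraction of the full search duration with an expected speed-up.
POLICIES = {
    "short": {"search_budget": 0.10, "expected_speedup": 0.90},
    "medium": {"search_budget": 0.20, "expected_speedup": 0.92},
    "long": {"search_budget": 1.00, "expected_speedup": 1.00},
}

def select_policy(name):
    # Mirrors S204/S205: the user's selection instruction picks one policy.
    return POLICIES[name]

assert select_policy("short")["search_budget"] == 0.10
```

A long policy trades more search time for higher inference performance, matching the long-duration/short-duration distinction above.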
易于理解的是,当服务器进行任务搜索时,服务器可以在展示界面上展示与初始任务对应的推理时长任务调度策略集合。It is easy to understand that when the server performs task search, the server can display a set of inference duration task scheduling strategies corresponding to the initial task on the display interface.
S204,获取针对推理时长任务调度策略集合所输入的选择指令。S204: Obtain the selection instructions input for the inference duration task scheduling policy set.
根据一些实施例,选择指令指的是终端获取到的用户选择推理时长任务调度策略时输入的指令。该选择指令并不特指某一固定指令。该选择指令包括但不限于语音选择指令、点击选择指令等等。例如,当服务器检测到用户说出任一推理时长任务调度策略对应的语音信息时,则服务器可以获取到该推理时长任务调度策略对应的选择指令。当服务器检测到用户点击任一推理时长任务调度策略对应的选择按键时,则服务器也可以获取到该推理时长任务调度策略对应的选择指令。According to some embodiments, the selection instruction refers to the instruction obtained by the terminal and entered by the user when selecting the inference duration task scheduling strategy. This selection instruction does not refer to a fixed instruction. The selection instructions include but are not limited to voice selection instructions, click selection instructions, and so on. For example, when the server detects that the user speaks voice information corresponding to any inference duration task scheduling policy, the server can obtain the selection instruction corresponding to the inference duration task scheduling policy. When the server detects that the user clicks the selection button corresponding to any inference duration task scheduling policy, the server can also obtain the selection instruction corresponding to the inference duration task scheduling policy.
易于理解的是,当服务器在展示界面上展示与初始任务对应的推理时长任务调度策略集合时,服务器可以获取针对推理时长任务调度策略集合所输入的选择指令。It is easy to understand that when the server displays the inference duration task scheduling policy set corresponding to the initial task on the display interface, the server can obtain the selection instruction input for the inference duration task scheduling policy set.
S205,获取选择指令对应的推理时长任务调度策略,获取训练任务调度策略。S205: Obtain the inference duration task scheduling policy corresponding to the selection instruction and obtain the training task scheduling policy.
根据一些实施例,图3是根据本公开实施例提供的推理时长任务调度策略的选择流程图。如图3所示。服务器在展示界面上展示推理时长任务调度策略集合。其中,推理时长任务调度策略集合包括长时任务调度策略和短时任务调度策略。当服务器检测到用户点击短时任务调度策略时,服务器可以获取到针对短时任务调度策略所输入的选择指令。进而,服务器可以设置推理时长任务调度策略为短时任务调度策略。According to some embodiments, FIG. 3 is a flow chart of selecting a scheduling strategy for inference duration tasks provided according to an embodiment of the present disclosure. As shown in Figure 3. The server displays the inference duration task scheduling policy set on the display interface. Among them, the set of inference-duration task scheduling strategies includes long-duration task scheduling strategies and short-duration task scheduling strategies. When the server detects that the user clicks on the short-term task scheduling policy, the server can obtain the selection instruction entered for the short-term task scheduling policy. Furthermore, the server can set the inference duration task scheduling policy to a short-duration task scheduling policy.
易于理解的是,当服务器获取到针对推理时长任务调度策略集合所输入的选择指令时,服务器可以获取选择指令对应的推理时长任务调度策略。It is easy to understand that when the server obtains the selection instruction entered for the inference duration task scheduling policy set, the server can obtain the inference duration task scheduling policy corresponding to the selection instruction.
S206,获取初始任务对应的训练运行方式。S206: Obtain the training operation mode corresponding to the initial task.
具体过程如上所述,此处不再赘述。The specific process is as mentioned above and will not be described again here.
根据一些实施例,初始任务对应的训练运行方式包括但不限于整体训练运行方式、交叉训练运行方式等等。According to some embodiments, the training operation mode corresponding to the initial task includes but is not limited to the overall training operation mode, the cross-training operation mode, and so on.
在一些实施例中,当服务器采用整体训练运行方式时,服务器可以先对所有任务进行调优搜索,之后再对模型进行训练。In some embodiments, when the server adopts the overall training operation mode, the server can first perform a tuning search on all tasks and then train the model.
在一些实施例中,当服务器采用交叉训练运行方式时,服务器可以对模型训练一次后进行一次任务调优。In some embodiments, when the server adopts cross-training operation mode, the server can train the model once and then perform task tuning once.
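A minimal sketch of the two run modes just described, under the assumption that tuning and training can be modeled as plain function calls (all names are illustrative):

```python
# "Overall" mode tunes every subgraph task first and then trains;
# "interleaved" (cross-training) mode alternates one training step with one
# round of task tuning. Both take the tuning and training actions as callbacks.

def run_overall(tasks, train_steps, tune, train_step):
    for task in tasks:
        tune(task)                # tune all tasks up front
    for _ in range(train_steps):
        train_step()              # then train the model

def run_interleaved(tasks, train_steps, tune, train_step):
    pending = list(tasks)
    for _ in range(train_steps):
        train_step()              # one training iteration...
        if pending:
            tune(pending.pop(0))  # ...followed by one round of tuning

log = []
run_overall(["t1", "t2"], 2, lambda t: log.append(("tune", t)), lambda: log.append("train"))
assert log == [("tune", "t1"), ("tune", "t2"), "train", "train"]
```

In interleaved mode the same inputs would instead produce `train, tune t1, train, tune t2`, spreading the tuning cost across training.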
易于理解的是,当服务器获取到与初始任务对应的任务调度策略时,服务器可以获取初始任务对应的训练运行方式。It is easy to understand that when the server obtains the task scheduling policy corresponding to the initial task, the server can obtain the training operation mode corresponding to the initial task.
S207,采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索,得到目标任务。S207: Use the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to search for the initial task and obtain the target task.
具体过程如上所述,此处不再赘述。The specific process is as mentioned above and will not be described again here.
根据一些实施例,采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索时,服务器可以获取训练场景下初始任务对应的优化潜力值。进而,服务器可以根据该优化潜力值进行任务搜索。According to some embodiments, when searching for initial tasks using the inference duration task scheduling strategy, training task scheduling strategy, and training operation mode, the server can obtain the optimization potential value corresponding to the initial task in the training scenario. Furthermore, the server can perform task search based on the optimization potential value.
在一些实施例中,优化潜力值用于指示任务的优化潜力。该优化潜力值并不特指某一固定值。例如,当任务发生变化时,该优化潜力值可以发生变化。该优化潜力值可以基于导数或者贝叶斯模型获取。In some embodiments, the optimization potential value is used to indicate the optimization potential of the task. The optimization potential value does not refer to a fixed value. For example, when the task changes, the optimization potential value can change. The optimization potential value can be obtained based on derivatives or Bayesian models.
在一些实施例中，当服务器获取到初始任务对应的优化潜力值时，在服务器判断优化潜力值小于潜力阈值的情况下，服务器可以停止对初始任务进行搜索，即为训练早搜早停任务调度策略。因此可以停止对优化潜力值小于潜力阈值的任务进行搜索，从而可以减少任务搜索时长。In some embodiments, when the server obtains the optimization potential value corresponding to the initial task and determines that it is less than the potential threshold, the server can stop searching the initial task; this is the "search early, stop early" task scheduling policy for training. Searching thus stops for tasks whose optimization potential value is below the threshold, which reduces the task search time.
在一些实施例中,潜力阈值指的是服务器用于评估任务是否具备优化潜力时采用的阈值。该潜力阈值并不特指某一固定阈值。例如,当终端获取到针对潜力阈值的阈值修改指令时,该潜力阈值可以发生变化。In some embodiments, the potential threshold refers to a threshold used by the server to evaluate whether a task has optimization potential. The potential threshold is not specific to a fixed threshold. For example, when the terminal obtains a threshold modification instruction for the potential threshold, the potential threshold may change.
在一些实施例中,当服务器获取到初始任务对应的优化潜力值时,服务器还可以获取与优化潜力值对应的时间资源信息。进而,服务器可以对初始任务分配与时间资源信息对应的搜索时长。因此可以针对任务对应的优化潜力值,对搜索时长进行分配,进而可以提高任务搜索的效率,可以减少总任务搜索的时长。In some embodiments, when the server obtains the optimization potential value corresponding to the initial task, the server may also obtain time resource information corresponding to the optimization potential value. Furthermore, the server can allocate a search duration corresponding to the time resource information to the initial task. Therefore, the search time can be allocated according to the optimization potential value corresponding to the task, thereby improving the efficiency of task search and reducing the total task search time.
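The early-stop and time-allocation behaviors described above can be sketched as follows; the potential values, the threshold, and the proportional-allocation rule are assumptions made for illustration:

```python
# "Search early, stop early": tasks whose optimization-potential value falls
# below the threshold are dropped, and the remaining search budget is split
# among the surviving tasks in proportion to their potential.

def allocate_search_time(potentials, threshold, total_budget):
    kept = {task: p for task, p in potentials.items() if p >= threshold}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {task: total_budget * p / total for task, p in kept.items()}

alloc = allocate_search_time({"a": 0.8, "b": 0.1, "c": 0.2}, 0.15, 100.0)
assert "b" not in alloc            # below threshold: search stops for "b"
assert abs(alloc["a"] - 80.0) < 1e-9
```

Tasks with more optimization headroom thus receive more search time, which is one way the total task search duration can be reduced.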
根据一些实施例，当服务器采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索，得到目标任务时，服务器可以获取控制初始任务迭代运行训练样本数据的第一运行时间，以及获取控制硬件运行训练样本数据的第二运行时间。According to some embodiments, when the server searches the initial task using the inference duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task, the server can obtain a first running time for controlling the initial task to iteratively run the training sample data, and a second running time for controlling the hardware to run the training sample data.
在一些实施例中,第一运行时间和第二运行时间存在重合运行时间。也就是说,初始任务迭代运行训练样本数据的时间与控制硬件运行训练样本数据的时间存在重合,可以是全部重合或者部分重合。因此,可以减少任务搜索时间。In some embodiments, the first run time and the second run time overlap. That is to say, the time when the initial task iteratively runs the training sample data overlaps with the time when the control hardware runs the training sample data, which can be a complete overlap or a partial overlap. Therefore, task search time can be reduced.
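One way to realize the overlap, sketched here with ordinary threads and sleeps standing in for the two phases (this is an illustration of overlapping run times, not the disclosed implementation):

```python
# Run the task's training-sample iterations and the hardware measurement runs
# concurrently so the first and second running times coincide instead of
# adding up.
import threading
import time

def training_iterations():
    time.sleep(0.05)   # stand-in for the task iterating over training samples

def hardware_measurement():
    time.sleep(0.05)   # stand-in for timing candidate configs on hardware

start = time.perf_counter()
t1 = threading.Thread(target=training_iterations)
t2 = threading.Thread(target=hardware_measurement)
t1.start()
t2.start()
t1.join()
t2.join()
elapsed = time.perf_counter() - start
# Overlapped, wall-clock time is close to one phase, not the sum of both.
assert elapsed < 0.095
```

Run sequentially, the same two phases would take roughly twice as long, which is the search time the overlap saves.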
易于理解的是,当服务器获取到初始任务对应的训练运行方式时,服务器可以采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索,得到目标任务。It is easy to understand that when the server obtains the training operation mode corresponding to the initial task, the server can use the inference duration task scheduling strategy, the training task scheduling strategy and the training operation mode to search for the initial task and obtain the target task.
根据一些实施例，服务器可以确定搜索算法，该搜索算法用于对不同的运行配置进行排列组合，包括但不限于循环分块大小、向量化、循环展开、计算位置调整等。服务器可以采用机器学习的代价模型（cost model）来预测候选配置的运行速度，选取其中预测较快者在真实硬件上运行，将实测速度最快的配置作为优化结果。根据系统不同，有不同的搜索算法，比如遗传搜索算法、穷举搜索算法、网格（Grid）搜索算法等。According to some embodiments, the server may determine a search algorithm used to enumerate permutations of different running configurations, including but not limited to loop tiling size, vectorization, loop unrolling, compute placement adjustment, and so on. The server can use a machine-learned cost model to predict the running speed of candidate configurations, run the faster predicted ones on real hardware, and take the configuration with the fastest measured speed as the optimization result. Depending on the system, different search algorithms are available, such as genetic search, exhaustive search, and grid search.
根据一些实施例，代价数据库（Cost Database）是指存储硬件真实运行速度数据、用来训练更精确的代价模型（Cost Model）的数据库。该Cost Model可以用于确定搜索算法。服务器可以用机器学习的Cost Model来预测优化配置的运行速度，从而加快搜索算法的搜索速度、减少自动调优时间。同时，将任务在硬件上运行的真实速度反馈给Cost Model进行机器学习训练，以对Cost Model进行优化。According to some embodiments, the Cost Database refers to a database of real hardware running-speed data used to train a more accurate Cost Model. The Cost Model can be used to determine the search algorithm. The server can use the machine-learned Cost Model to predict the running speed of candidate configurations, which speeds up the search and reduces auto-tuning time. Meanwhile, the real speed of tasks running on the hardware is fed back to the Cost Model for machine-learning training, so as to optimize the Cost Model.
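The cost-model loop can be sketched as: predict each candidate's speed, measure only the best predictions on real hardware, and append the measurements to the cost database for retraining. The predictor and measurement below are trivial stand-ins for a real learned cost model and real hardware.

```python
# Hedged sketch of cost-model-guided tuning with a measurement feedback loop.

measured_db = []  # the Cost Database: (config, real running time) pairs

def predict(config):
    # Stand-in for a machine-learned cost model's runtime prediction.
    return 1.0 / config["tile"]

def measure(config):
    # Pretend real hardware runs slightly slower than predicted.
    return 1.1 / config["tile"]

def tune_with_cost_model(candidates, top_k=2):
    ranked = sorted(candidates, key=predict)[:top_k]   # cheapest predicted first
    results = [(cfg, measure(cfg)) for cfg in ranked]  # measure only top-k
    measured_db.extend(results)                        # feedback for retraining
    return min(results, key=lambda r: r[1])[0]

best = tune_with_cost_model([{"tile": 8}, {"tile": 16}, {"tile": 32}])
assert best == {"tile": 32}
assert len(measured_db) == 2   # only the top predictions were measured
```

Measuring only the top predictions is what lets the cost model cut auto-tuning time relative to measuring every candidate.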
在一些实施例中，服务器可以在矩阵循环中使用循环分块大小、向量化、循环展开、计算位置调整、线程并行、GPU并行等优化方法。这些基础的优化方法在自动调优系统中被称为调度原语（schedule primitive），基础优化方法的排列组合构成的所有可运行组合叫搜索空间。服务器可以采用搜索算法在该搜索空间里搜索运行速度快的任务运行方法。In some embodiments, the server can apply optimization methods such as loop tiling size, vectorization, loop unrolling, compute placement adjustment, thread parallelism, and GPU parallelism to matrix loops. In an auto-tuning system these basic optimizations are called schedule primitives, and the set of all runnable permutations of them is called the search space. The server can use a search algorithm to search this space for a fast way to run the task.
S208,获取并存储目标任务对应的第一目标配置信息。S208: Obtain and store the first target configuration information corresponding to the target task.
根据一些实施例,第一目标配置信息指的是目标任务对应的调优配置信息。该第一目标配置信息并不特指某一固定信息。例如,当目标任务发生变化时,该第一目标配置信息可以发生变化。According to some embodiments, the first target configuration information refers to tuning configuration information corresponding to the target task. The first target configuration information does not specifically refer to certain fixed information. For example, when the target task changes, the first target configuration information may change.
在一些实施例中,当服务器进行模型训练时,在服务器获取到目标任务对应的第一目标配置信息的情况下,服务器可以存储该第一目标配置信息。进而,当服务器对深度学习模型进行推理时,可以复用该第一目标配置信息,进而可以减少模型推理时的任务搜索时长。In some embodiments, when the server performs model training and the server obtains the first target configuration information corresponding to the target task, the server may store the first target configuration information. Furthermore, when the server performs inference on the deep learning model, the first target configuration information can be reused, thereby reducing the task search time during model inference.
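The store-then-reuse behavior can be sketched as a simple cache keyed by task; the keys and configuration fields below are hypothetical:

```python
# Store the target task's tuning configuration ("first target configuration
# info") during training so inference can reuse it instead of searching again.

tuned_config_store = {}

def save_tuned_config(task_key, config):
    # Called after the training-time search finds the target task's config.
    tuned_config_store[task_key] = config

def config_for_inference(task_key, fallback_search):
    # At inference time, reuse the stored config when present; otherwise
    # fall back to running the (slow) tuning search.
    if task_key in tuned_config_store:
        return tuned_config_store[task_key]
    return fallback_search(task_key)

save_tuned_config("conv_subgraph", {"tile": 16, "vectorize": True})
cfg = config_for_inference("conv_subgraph", lambda k: {"tile": 1})
assert cfg == {"tile": 16, "vectorize": True}
```

On a cache hit the inference path skips the search entirely, which is the reuse that shortens the task search duration during model inference.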
易于理解的是,当服务器获取到目标任务时,服务器可以获取并存储目标任务对应的第一目标配置信息。It is easy to understand that when the server obtains the target task, the server can obtain and store the first target configuration information corresponding to the target task.
在本公开实施例中，首先，通过在确定缓存中存在与初始任务对应的第二目标配置信息的情况下，基于第二目标配置信息获取硬件代码；控制硬件运行硬件代码，获取初始任务对应的运行信息；因此若根据所缓存的配置信息判断无需继续对初始任务进行搜索，则可以减少对初始任务进行搜索所需要的时长，进而可以减少任务搜索时长。其次，通过在展示界面上展示与初始任务对应的推理时长任务调度策略集合；获取针对推理时长任务调度策略集合所输入的选择指令；获取选择指令对应的推理时长任务调度策略；因此可以根据需求选择需要的推理时长任务调度策略，可以提高任务搜索的灵活性。接着，通过获取初始任务对应的训练运行方式；采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索，得到目标任务；因此通过采用推理时长任务调度策略，可以减少对深度学习模型进行推理时的任务搜索时长，进而可以减少深度学习模型的推理时长。通过采用训练任务调度策略和训练运行方式，可以减少对深度学习模型进行训练时的任务搜索时长，进而可以减少深度学习模型的训练时长，进而，可以减少任务搜索的时长的同时提高任务搜索方案的适用性，使得该任务搜索方法可以适用于推理场景和训练场景。最后，通过获取并存储目标任务对应的第一目标配置信息；因此，当服务器对深度学习模型进行推理时，可以复用第一目标配置信息，进而可以减少模型推理时的任务搜索时长。In the embodiments of the present disclosure: first, when it is determined that the second target configuration information corresponding to the initial task exists in the cache, the hardware code is obtained based on that configuration information, the hardware is controlled to run it, and the running information corresponding to the initial task is obtained; if the cached configuration information shows that no further search of the initial task is needed, the time required to search the initial task is reduced, which reduces the overall task search time. Second, the inference duration task scheduling policy set corresponding to the initial task is displayed on the display interface, the selection instruction input for that set is obtained, and the inference duration task scheduling policy corresponding to the selection instruction is obtained; the required policy can therefore be chosen on demand, which improves the flexibility of task search. Next, the training operation mode corresponding to the initial task is obtained, and the initial task is searched using the inference duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task; the inference duration task scheduling policy reduces the task search time when the deep learning model performs inference, which reduces the model's inference time, while the training task scheduling policy and training operation mode reduce the task search time during training, which reduces the model's training time. The task search time is thus reduced while the applicability of the task search scheme is improved, making the method suitable for both inference and training scenarios. Finally, the first target configuration information corresponding to the target task is obtained and stored; when the server later performs inference on the deep learning model, this information can be reused, which reduces the task search time during model inference.
本公开的技术方案中,所涉及的用户个人信息的收集、存储、使用、加工、传输、提供和公开等处理,均符合相关法律法规的规定,且不违背公序良俗。In the technical solution of this disclosure, the collection, storage, use, processing, transmission, provision and disclosure of user personal information are in compliance with relevant laws and regulations and do not violate public order and good customs.
下述为本公开装置实施例,可以用于执行本公开方法实施例。对于本公开装置实施例中未披露的细节,请参照本公开方法实施例。The following are device embodiments of the present disclosure, which can be used to perform method embodiments of the present disclosure. For details not disclosed in the device embodiments of the disclosure, please refer to the method embodiments of the disclosure.
请参见图4a,其示出了本公开一个示例性实施例提供的第一种任务搜索装置的结构示意图。该任务搜索装置可以通过软件、硬件或者两者的结合实现成为装置的全部或一部分。该任务搜索装置400包括策略获取单元401、方式获取单元402和任务获取单元403,其中:Please refer to Figure 4a, which shows a schematic structural diagram of a first task search device provided by an exemplary embodiment of the present disclosure. The task search device can be implemented as all or part of the device through software, hardware, or a combination of both. The task search device 400 includes a strategy acquisition unit 401, a method acquisition unit 402 and a task acquisition unit 403, wherein:
策略获取单元401,用于获取与初始任务对应的任务调度策略,其中,任务调度策略包括推理时长任务调度策略和训练任务调度策略,推理时长任务调度策略用于调节推理场景下初始任务对应的推理时长,训练任务调度策略用于调节训练场景下初始任务对应的搜索时长;The policy acquisition unit 401 is used to obtain the task scheduling policy corresponding to the initial task. The task scheduling policy includes the inference duration task scheduling policy and the training task scheduling policy. The inference duration task scheduling policy is used to adjust the inference corresponding to the initial task in the inference scenario. Duration, the training task scheduling strategy is used to adjust the search duration corresponding to the initial task in the training scenario;
方式获取单元402,用于获取初始任务对应的训练运行方式;The mode acquisition unit 402 is used to acquire the training operation mode corresponding to the initial task;
任务获取单元403,用于采用推理时长任务调度策略、训练任务调度策略和训练运行方式对初始任务进行搜索,得到目标任务。The task acquisition unit 403 is used to search for initial tasks using the inference duration task scheduling strategy, training task scheduling strategy and training operation mode to obtain the target task.
根据一些实施例，图4b是用来实现本公开实施例的任务搜索方法的第二种任务搜索装置的结构示意图。如图4b所示，策略获取单元401包括集合展示子单元411、指令获取子单元421和策略获取子单元431，策略获取单元401用于获取与初始任务对应的任务调度策略时：According to some embodiments, Figure 4b is a schematic structural diagram of a second task search device for implementing the task search method of an embodiment of the present disclosure. As shown in Figure 4b, the policy acquisition unit 401 includes a set display subunit 411, an instruction acquisition subunit 421, and a policy acquisition subunit 431. When the policy acquisition unit 401 is used to obtain the task scheduling policy corresponding to the initial task:
集合展示子单元411,用于在展示界面上展示与初始任务对应的推理时长任务调度策略集合;The set display subunit 411 is used to display the inference duration task scheduling policy set corresponding to the initial task on the display interface;
指令获取子单元421,用于获取针对推理时长任务调度策略集合所输入的选择指令;The instruction acquisition subunit 421 is used to obtain the selection instruction input for the inference duration task scheduling policy set;
策略获取子单元431,用于获取选择指令对应的推理时长任务调度策略。The policy acquisition subunit 431 is used to acquire the inference duration task scheduling policy corresponding to the selection instruction.
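The selection flow of subunits 411, 421, and 431 reduces to mapping a selection instruction onto the displayed policy set. A minimal sketch, assuming the instruction is encoded as an index into the set (the disclosure leaves the instruction's encoding open):

```python
def select_policy(policy_set, selection_index):
    """Map a selection instruction to one policy from the displayed set.

    `policy_set` is the set of inference-duration task scheduling
    policies shown on the display interface; `selection_index` is one
    plausible encoding of the user's selection instruction.
    """
    if not 0 <= selection_index < len(policy_set):
        raise ValueError("selection instruction does not match the displayed set")
    return policy_set[selection_index]
```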
According to some embodiments, Figure 4c is a schematic structural diagram of a third task search apparatus for implementing the task search method of an embodiment of the present disclosure. As shown in Figure 4c, the task acquisition unit 403 includes a potential value acquisition subunit 413 and a search stop subunit 423. When the task acquisition unit 403 searches the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task:
The potential value acquisition subunit 413 is configured to acquire, while the initial task is searched using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode, an optimization potential value corresponding to the initial task in the training scenario.
The search stop subunit 423 is configured to stop searching the initial task when the optimization potential value is less than a potential threshold.
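The early-stopping behavior of subunits 413 and 423 can be sketched as follows. The candidate format and the way the optimization potential value is computed are assumptions for illustration, since the disclosure does not specify them:

```python
def search_with_early_stop(candidates, potential_fn, potential_threshold):
    """Iterate candidate schedules, tracking the best one seen, and stop
    once the estimated optimization potential drops below the threshold.

    `potential_fn(candidate)` stands in for subunit 413's potential-value
    acquisition; the break stands in for subunit 423's search stop.
    """
    best, best_score = None, float("-inf")
    for candidate in candidates:
        if candidate["score"] > best_score:
            best, best_score = candidate, candidate["score"]
        if potential_fn(candidate) < potential_threshold:
            break  # remaining optimization potential too small to continue
    return best
```

Stopping early this way is what shortens the search duration in the training scenario: candidates after the break are never evaluated.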
According to some embodiments, Figure 4d is a schematic structural diagram of a fourth task search apparatus for implementing the task search method of an embodiment of the present disclosure. As shown in Figure 4d, the task search apparatus 400 further includes an information acquisition unit 404 and a duration allocation unit 405, which operate after the optimization potential value corresponding to the initial task is acquired:
The information acquisition unit 404 is configured to acquire time resource information corresponding to the optimization potential value.
The duration allocation unit 405 is configured to allocate to the initial task a search duration corresponding to the time resource information.
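One plausible way for units 404 and 405 to turn a potential value into a search duration is a bounded linear mapping. The formula and parameter values below are illustrative assumptions, not prescribed by the disclosure:

```python
def allocate_search_duration(potential_value, base_seconds=60.0, max_seconds=600.0):
    """Allocate more search time to tasks with higher optimization potential.

    `potential_value` is assumed normalized to [0, 1]; the result is clamped
    between `base_seconds` and `max_seconds` so every task gets some budget
    and no task monopolizes the time resource.
    """
    seconds = base_seconds + potential_value * (max_seconds - base_seconds)
    return min(max(seconds, base_seconds), max_seconds)
```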
According to some embodiments, Figure 4e is a schematic structural diagram of a fifth task search apparatus for implementing the task search method of an embodiment of the present disclosure. As shown in Figure 4e, the task search apparatus 400 further includes an information storage unit 406, which operates after the initial task has been searched using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task:
The information storage unit 406 is configured to acquire and store first target configuration information corresponding to the target task.
According to some embodiments, Figure 4f is a schematic structural diagram of a sixth task search apparatus for implementing the task search method of an embodiment of the present disclosure. As shown in Figure 4f, the task search apparatus 400 further includes a code acquisition unit 407 and a code running unit 408, which operate before the task scheduling policy corresponding to the initial task is acquired:
The code acquisition unit 407 is configured to acquire hardware code based on second target configuration information when it is determined that the second target configuration information corresponding to the initial task exists in a cache.
The code running unit 408 is configured to control hardware to run the hardware code and acquire running information corresponding to the initial task.
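The cache-hit path of units 407 and 408 can be sketched as a configuration cache keyed by a task signature. The key scheme and the stubbed search/compile callables are assumptions for illustration; the disclosure only requires that a stored configuration, when found, bypasses a fresh search:

```python
import hashlib
import json

class ConfigCache:
    """Cache of target configuration information keyed by task signature."""
    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(task):
        # Hash a canonical JSON form of the task as an assumed signature.
        return hashlib.sha256(json.dumps(task, sort_keys=True).encode()).hexdigest()

    def put(self, task, config):
        self._store[self._key(task)] = config

    def lookup(self, task):
        return self._store.get(self._key(task))

def run_or_search(task, cache, search_fn, compile_fn):
    """On a cache hit, compile the stored configuration to hardware code
    directly; on a miss, fall back to a full search and populate the cache."""
    config = cache.lookup(task)
    if config is None:
        config = search_fn(task)  # cache miss: full search
        cache.put(task, config)
    return compile_fn(config)     # produce and run hardware code
```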
According to some embodiments, when the task acquisition unit 403 searches the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task, it is specifically configured to:
While searching the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task, acquire a first running time for controlling the initial task to iteratively run training sample data, and acquire a second running time for controlling hardware to run the training sample data, where the first running time and the second running time have an overlapping running time.
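The overlapping first and second running times describe a pipeline in which host-side iteration over training sample data overlaps with hardware execution. A minimal two-stage sketch using a bounded queue, where `prepare` and `execute` stand in for the disclosure's iteration and hardware-execution steps:

```python
import queue
import threading

def pipelined_run(batches, prepare, execute):
    """Overlap host-side preparation of batch i+1 with execution of batch i.

    The producer thread models the first running time (iterating training
    sample data); the consumer loop models the second running time
    (hardware execution). The bounded queue gives them a shared window.
    """
    q = queue.Queue(maxsize=1)
    results = []

    def producer():
        for batch in batches:
            q.put(prepare(batch))
        q.put(None)  # sentinel: no more work

    worker = threading.Thread(target=producer)
    worker.start()
    while (item := q.get()) is not None:
        results.append(execute(item))
    worker.join()
    return results
```

Because the two stages run concurrently, total wall time approaches the slower stage's time rather than the sum of both, which is the benefit of the overlapping running time described above.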
It should be noted that when the task search apparatus provided in the above embodiments performs the task search method, the division into the above functional modules is merely illustrative. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the task search apparatus provided in the above embodiments belongs to the same concept as the task search method embodiments; for details of the implementation process, refer to the method embodiments, which are not repeated here.
The serial numbers of the above embodiments of the present disclosure are for description only and do not indicate the relative merits of the embodiments.
In the embodiments of the present disclosure, the policy acquisition unit acquires the task scheduling policy corresponding to the initial task, where the task scheduling policy includes an inference-duration task scheduling policy and a training task scheduling policy; the inference-duration task scheduling policy is used to adjust the inference duration corresponding to the initial task in an inference scenario, and the training task scheduling policy is used to adjust the search duration corresponding to the initial task in a training scenario. The mode acquisition unit acquires the training operation mode corresponding to the initial task, and the task acquisition unit searches the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task. By adopting the inference-duration task scheduling policy, the task search duration when performing inference with a deep learning model can be reduced, which in turn reduces the model's inference duration. By adopting the training task scheduling policy and the training operation mode, the task search duration when training a deep learning model can be reduced, which in turn reduces the model's training duration. As a result, the task search duration is reduced while the applicability of the task search scheme is improved, making the task search method suitable for both inference scenarios and training scenarios.
In the technical solution of the present disclosure, the acquisition, storage, and application of users' personal information all comply with the provisions of relevant laws and regulations and do not violate public order or good morals.
According to embodiments of the present disclosure, the present disclosure further provides a server, an electronic device, a readable storage medium, a computer program product, and a computer program.
Figure 5 shows a schematic block diagram of an example server 500 that may be used to implement embodiments of the present disclosure.
As shown in Figure 5, the server 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 may also store various programs and data required for the operation of the server 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
Multiple components in the server 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard or mouse; an output unit 507, such as various types of displays and speakers; a storage unit 508, such as a magnetic disk or optical disc; and a communication unit 509, such as a network card, modem, or wireless communication transceiver. The communication unit 509 allows the server 500 to exchange information/data with other devices over computer networks such as the Internet and/or various telecommunication networks.
The computing unit 501 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, and the like. The computing unit 501 performs the various methods and processes described above, such as the task search method. For example, in some embodiments, the task search method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the server 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the task search method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the task search method in any other suitable manner (for example, by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. Such program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus so that, when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a standalone software package, or entirely on a remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (for example, a mouse or trackball) through which the user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user; for example, feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer with a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), the Internet, and blockchain networks.
A computer system may include a client and a server. The client and the server are generally remote from each other and typically interact over a communication network. The client-server relationship arises from computer programs running on the respective computers and having a client-server relationship with each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the shortcomings of traditional physical hosts and VPS ("Virtual Private Server") services, namely difficult management and weak business scalability. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be noted that the foregoing explanations of the task search method embodiments also apply to the apparatus, server, electronic device, computer-readable storage medium, computer program product, and computer program of the embodiments of the present disclosure, and are not repeated here.
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solution disclosed in the present disclosure can be achieved; no limitation is imposed herein.
The above specific implementations do not limit the protection scope of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions are possible depending on design requirements and other factors. Any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.

Claims (19)

  1. A task search method, comprising:
    acquiring a task scheduling policy corresponding to an initial task, wherein the task scheduling policy comprises an inference-duration task scheduling policy and a training task scheduling policy, the inference-duration task scheduling policy is used to adjust an inference duration corresponding to the initial task in an inference scenario, and the training task scheduling policy is used to adjust a search duration corresponding to the initial task in a training scenario;
    acquiring a training operation mode corresponding to the initial task; and
    searching the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain a target task.
  2. The method according to claim 1, wherein acquiring the task scheduling policy corresponding to the initial task comprises:
    displaying, on a display interface, a set of inference-duration task scheduling policies corresponding to the initial task;
    acquiring a selection instruction input for the set of inference-duration task scheduling policies; and
    acquiring the inference-duration task scheduling policy corresponding to the selection instruction.
  3. The method according to claim 1 or 2, wherein searching the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task comprises:
    acquiring, while the initial task is searched using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode, an optimization potential value corresponding to the initial task in the training scenario; and
    stopping searching the initial task when the optimization potential value is less than a potential threshold.
  4. The method according to claim 3, further comprising, after acquiring the optimization potential value corresponding to the initial task:
    acquiring time resource information corresponding to the optimization potential value; and
    allocating to the initial task a search duration corresponding to the time resource information.
  5. The method according to any one of claims 1 to 4, further comprising, after searching the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task:
    acquiring and storing first target configuration information corresponding to the target task.
  6. The method according to any one of claims 1 to 5, further comprising, before acquiring the task scheduling policy corresponding to the initial task:
    acquiring, when it is determined that second target configuration information corresponding to the initial task exists in a cache, hardware code based on the second target configuration information; and
    controlling hardware to run the hardware code, and acquiring running information corresponding to the initial task.
  7. The method according to any one of claims 1 to 6, wherein searching the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task comprises:
    while searching the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task, acquiring a first running time for controlling the initial task to iteratively run training sample data, and acquiring a second running time for controlling hardware to run the training sample data, wherein the first running time and the second running time have an overlapping running time.
  8. A task search apparatus, comprising:
    a policy acquisition unit configured to acquire a task scheduling policy corresponding to an initial task, wherein the task scheduling policy comprises an inference-duration task scheduling policy and a training task scheduling policy, the inference-duration task scheduling policy is used to adjust an inference duration corresponding to the initial task in an inference scenario, and the training task scheduling policy is used to adjust a search duration corresponding to the initial task in a training scenario;
    a mode acquisition unit configured to acquire a training operation mode corresponding to the initial task; and
    a task acquisition unit configured to search the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain a target task.
  9. The apparatus according to claim 8, wherein the policy acquisition unit comprises a set display subunit, an instruction acquisition subunit, and a policy acquisition subunit, and when the policy acquisition unit acquires the task scheduling policy corresponding to the initial task:
    the set display subunit is configured to display, on a display interface, a set of inference-duration task scheduling policies corresponding to the initial task;
    the instruction acquisition subunit is configured to acquire a selection instruction input for the set of inference-duration task scheduling policies; and
    the policy acquisition subunit is configured to acquire the inference-duration task scheduling policy corresponding to the selection instruction.
  10. The apparatus according to claim 8 or 9, wherein the task acquisition unit comprises a potential value acquisition subunit and a search stop subunit, and when the task acquisition unit searches the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task:
    the potential value acquisition subunit is configured to acquire, while the initial task is searched using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode, an optimization potential value corresponding to the initial task in the training scenario; and
    the search stop subunit is configured to stop searching the initial task when the optimization potential value is less than a potential threshold.
  11. The apparatus according to claim 10, further comprising an information acquisition unit and a duration allocation unit, which operate after the optimization potential value corresponding to the initial task is acquired, wherein:
    the information acquisition unit is configured to acquire time resource information corresponding to the optimization potential value; and
    the duration allocation unit is configured to allocate to the initial task a search duration corresponding to the time resource information.
  12. The apparatus according to any one of claims 8 to 11, further comprising an information storage unit, which operates after the initial task has been searched using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task, wherein:
    the information storage unit is configured to acquire and store first target configuration information corresponding to the target task.
  13. The apparatus according to any one of claims 8 to 12, further comprising a code acquisition unit and a code running unit, which operate before the task scheduling policy corresponding to the initial task is acquired, wherein:
    the code acquisition unit is configured to acquire hardware code based on second target configuration information when it is determined that the second target configuration information corresponding to the initial task exists in a cache; and
    the code running unit is configured to control hardware to run the hardware code and acquire running information corresponding to the initial task.
  14. The apparatus according to any one of claims 8 to 13, wherein, when the task acquisition unit searches the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task, the task acquisition unit is specifically configured to:
    while searching the initial task using the inference-duration task scheduling policy, the training task scheduling policy, and the training operation mode to obtain the target task, acquire a first running time for controlling the initial task to iteratively run training sample data, and acquire a second running time for controlling hardware to run the training sample data, wherein the first running time and the second running time have an overlapping running time.
  15. A server, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1 to 7.
  16. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1 to 7.
  17. A computer program product, comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 7.
  18. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method according to any one of claims 1 to 7.
  19. A computer program, comprising computer program code which, when run on a computer, causes the computer to perform the method according to any one of claims 1 to 7.
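Claim 14 above recites that the first running time (host-side control of the iterative run over training sample data) overlaps the second running time (the hardware running that data). A minimal, illustrative sketch of such an overlap as a producer/consumer pipeline follows; all names, timings, and the thread-based "device" stand-in are assumptions made for illustration, not the disclosed implementation:

```python
import queue
import threading
import time

def overlapped_run(batches, host_time=0.01, device_time=0.01):
    """Overlap host-side iteration control (the 'first running time')
    with hardware execution (the 'second running time') by handing each
    batch to a background 'device' thread instead of waiting for it."""
    work = queue.Queue(maxsize=2)     # small buffer between host and device
    results = []

    def device_worker():
        while True:
            item = work.get()
            if item is None:          # sentinel: no more batches
                break
            time.sleep(device_time)   # stand-in for hardware running the code
            results.append(item * 2)  # stand-in for a computed result

    device = threading.Thread(target=device_worker)
    device.start()

    start = time.perf_counter()
    for batch in batches:
        time.sleep(host_time)         # stand-in for host-side iteration control
        work.put(batch)               # dispatch without waiting for completion
    work.put(None)
    device.join()
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = overlapped_run(range(8))
serial_estimate = 8 * (0.01 + 0.01)  # time if host and device never overlapped
print(results)
print(elapsed < serial_estimate)
```

Because each host-side step runs while the previous batch is still executing on the "device", the pipelined loop finishes well under the fully serial estimate, which is the effect of the overlapping running times recited in the claim.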
PCT/CN2022/123598 2022-05-19 2022-09-30 Task search method and apparatus, server and storage medium WO2023221371A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210548133.7 2022-05-19
CN202210548133.7A CN114968520B (en) 2022-05-19 2022-05-19 Task searching method and device, server and storage medium

Publications (1)

Publication Number Publication Date
WO2023221371A1 (en)

Family

ID=82985089

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123598 WO2023221371A1 (en) 2022-05-19 2022-09-30 Task search method and apparatus, server and storage medium

Country Status (2)

Country Link
CN (1) CN114968520B (en)
WO (1) WO2023221371A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114968520B (en) * 2022-05-19 2023-11-24 北京百度网讯科技有限公司 Task searching method and device, server and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
US20200090048A1 (en) * 2017-05-19 2020-03-19 Deepmind Technologies Limited Multi-task neural network systems with task-specific policies and a shared policy
US20210342549A1 (en) * 2020-12-09 2021-11-04 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for training semantic analysis model, electronic device and storage medium
WO2022037039A1 (en) * 2020-08-18 2022-02-24 中国银联股份有限公司 Neural network architecture search method and apparatus
CN114968520A (en) * 2022-05-19 2022-08-30 北京百度网讯科技有限公司 Task searching method and device, server and storage medium

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN107783831A (en) * 2016-08-24 2018-03-09 深圳市中兴微电子技术有限公司 A kind of method for scheduling task and device
GB201809462D0 (en) * 2018-06-08 2018-07-25 Nplan Ltd A system and method for modelling a construction project
CN114186633B (en) * 2021-12-10 2023-04-07 北京百度网讯科技有限公司 Distributed training method, device, equipment and storage medium of model


Also Published As

Publication number Publication date
CN114968520A (en) 2022-08-30
CN114968520B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
US10204097B2 (en) Efficient dialogue policy learning
EP3913545A2 (en) Method and apparatus for updating parameter of multi-task model, and electronic device
US20110099135A1 (en) System, method and computer program product for evaluating a storage policy based on simulation
US11960837B2 (en) Fulfillment of actionable requests ahead of a user selecting a particular autocomplete suggestion for completing a current user input
US20220374776A1 (en) Method and system for federated learning, electronic device, and computer readable medium
US20230089268A1 (en) Semantic understanding method, electronic device, and storage medium
WO2023221371A1 (en) Task search method and apparatus, server and storage medium
US20230153337A1 (en) Question answering method, method of training a question answering model, electronic device, and medium
WO2023231350A1 (en) Task processing method implemented by using integer programming solver, device, and medium
CN114895773B (en) Energy consumption optimization method, system and device for heterogeneous multi-core processor and storage medium
JP7408741B2 (en) Multitasking deployment methods, equipment, electronic equipment and storage media
EP4287074A1 (en) Mixture-of-experts model implementation method and system, electronic device, and storage medium
US20220129753A1 (en) Pre-training method of neural network model, electronic device and medium
US20220374742A1 (en) Method, device and storage medium for running inference service platform
WO2021147620A1 (en) Communication method, device, and system based on model training
KR20220003444A (en) Optimizer learning method and apparatus, electronic device and readable storage medium
WO2023221370A1 (en) Batch task processing method and apparatus, and electronic device
US20220207427A1 (en) Method for training data processing model, electronic device and storage medium
WO2023236405A1 (en) End-to-end sensitive text recall model training method and sensitive text recall method
US20220391780A1 (en) Method of federated learning, electronic device, and storage medium
US20220269659A1 (en) Method, device and storage medium for deduplicating entity nodes in graph database
CN115334159B (en) Method, apparatus, device and medium for processing stream data
CN114860405B (en) Parameter updating method and device of multitask model and storage medium
US11836531B2 (en) Method, device, and program product for managing computing system
US20230012881A1 (en) Method and apparatus for reading data, electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22942398

Country of ref document: EP

Kind code of ref document: A1