CN113010290A - Task management method, device, equipment and storage medium - Google Patents

Task management method, device, equipment and storage medium

Info

Publication number
CN113010290A
Authority
CN
China
Prior art keywords
task
preset
target
tasks
threads
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110290924.XA
Other languages
Chinese (zh)
Inventor
荆荣讯
陈培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yingxin Computer Technology Co Ltd
Original Assignee
Shandong Yingxin Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yingxin Computer Technology Co Ltd filed Critical Shandong Yingxin Computer Technology Co Ltd
Priority to CN202110290924.XA priority Critical patent/CN113010290A/en
Publication of CN113010290A publication Critical patent/CN113010290A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a task management method, a device, equipment and a storage medium, comprising the following steps: starting a first preset number of task configuration analysis threads, analyzing configuration information of preset tasks by using the task configuration analysis threads, determining the preset tasks meeting preset conditions based on analysis results, and creating corresponding target tasks by using the preset tasks meeting the preset conditions; starting a second preset number of task distribution threads, and distributing the target task to corresponding task execution threads by using the task distribution threads; and utilizing the task execution thread to perform corresponding processing on the target task. According to the task management method and device, systematic management is performed on the tasks by starting a certain number of task configuration analysis threads, task distribution threads and task execution threads, so that low resource consumption is achieved, functional usability is guaranteed under the condition that system performance is certain, and task management efficiency is improved.

Description

Task management method, device, equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for task management.
Background
As informatization continues to advance, more and more types of tasks are being handled by information systems, and efficiently managing a large number and variety of tasks is key to a task management system delivering its full value. This is especially true for deep learning training platforms: with the rapid development of artificial intelligence, industries are quickly adopting intelligent technologies, and demand for deep learning training, a representative artificial intelligence technology, has grown dramatically across fields. Deep learning training is computation-intensive. To improve training results, a deep learning training platform needs to reserve as many platform resources as possible for training tasks, so the resource consumption of the platform's task management system must be kept to a minimum.
At present, training platform task management systems in the industry generally have the following problems: (1) The capacity of the task management system has an upper limit. The system must apply for a certain amount of resources for each training task on the platform, and the maximum resources it can use are bounded by the environment in which it runs (a physical machine or a virtual machine), which limits its maximum capacity. (2) Under high task concurrency and heavy load, the task management system cannot meet requirements and usually needs extensive customization to satisfy specific service requirements, so it is neither general-purpose nor truly free of capacity limits. How a deep learning training platform can manage training tasks efficiently with low resource consumption has therefore become a technical problem that urgently needs to be solved in the industry.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a task management method, device, apparatus, and storage medium, which can achieve low resource consumption, ensure functional availability under a certain system performance, and improve task management efficiency. The specific scheme is as follows:
a first aspect of the present application provides a task management method, including:
starting a first preset number of task configuration analysis threads, analyzing configuration information of preset tasks by using the task configuration analysis threads, determining the preset tasks meeting preset conditions based on analysis results, and creating corresponding target tasks by using the preset tasks meeting the preset conditions;
starting a second preset number of task distribution threads, and distributing the target task to corresponding task execution threads by using the task distribution threads;
and utilizing the task execution thread to perform corresponding processing on the target task.
Optionally, before starting the first preset number of task configuration analysis threads, the method further includes:
establishing a task configuration pool, a task running pool and a historical task pool in a mode of constructing a data table in a database;
the task configuration pool is used for storing configuration information of preset tasks, the task operation pool is used for storing target tasks before operation is finished, and the historical task pool is used for storing the target tasks after operation is finished.
Optionally, the analyzing the configuration information of the preset task by using the task configuration analysis thread, and determining the preset task meeting the preset condition based on the analysis result, so as to create the corresponding target task by using the preset task meeting the preset condition, includes:
and utilizing the task configuration analysis thread to periodically scan and analyze the configuration information of the preset task in the task configuration pool according to a preset period, judging whether the preset task meets a preset condition or not based on an analysis result, and if so, utilizing the currently scanned configuration information of the preset task to create the corresponding target task.
Optionally, the performing, by using the task configuration analysis thread, periodic scanning analysis on the configuration information of the preset task in the task configuration pool according to a preset period includes:
sequencing the configuration information of different preset tasks in the task configuration pool according to the priority to obtain task configuration sequences corresponding to the configuration information of the different preset tasks;
and utilizing the task configuration analysis thread to periodically and sequentially scan and analyze the task configuration sequence according to a preset period.
Optionally, the distributing the target task to the corresponding task execution thread by using the task distribution thread includes:
sequencing the target tasks according to the priority to obtain a task sequence of the target tasks;
and distributing the target tasks to corresponding task execution threads by using the task distribution threads according to the sequence of the target tasks in the task sequence.
Optionally, the performing, by using the task execution thread, corresponding processing on the target task includes:
and submitting the target task to a task running environment by using the task execution thread to run, acquiring the running state of the target task in real time, and updating the state of the target task in the task running pool according to the running state.
Optionally, the task management method further includes:
respectively locking the configuration information in the configuration information analysis process, the target task in the task distribution process and the target task in the task running process;
correspondingly, the configuration information after the configuration information analysis process is finished, the target task after the task distribution process is finished and the target task after the task operation process is finished are respectively unlocked.
A second aspect of the present application provides a task management apparatus including:
the task analysis module is used for starting a first preset number of task configuration analysis threads, analyzing configuration information of preset tasks by using the task configuration analysis threads, determining the preset tasks meeting preset conditions based on analysis results, and creating corresponding target tasks by using the preset tasks meeting the preset conditions;
the task distribution module is used for starting a second preset number of task distribution threads and distributing the target task to the corresponding task execution threads by using the task distribution threads;
and the task execution module is used for utilizing the task execution thread to perform corresponding processing on the target task.
A third aspect of the application provides an electronic device comprising a processor and a memory; wherein the memory is used for storing a computer program which is loaded and executed by the processor to implement the aforementioned task management method.
A fourth aspect of the present application provides a computer-readable storage medium, in which computer-executable instructions are stored, and when the computer-executable instructions are loaded and executed by a processor, the task management method is implemented.
According to the method, a first preset number of task configuration analysis threads are started, configuration information of preset tasks is analyzed through the task configuration analysis threads, the preset tasks meeting preset conditions are determined based on analysis results, corresponding target tasks are created through the preset tasks meeting the preset conditions, then a second preset number of task distribution threads are started, the target tasks are distributed to corresponding task execution threads through the task distribution threads, and finally the target tasks are processed correspondingly through the task execution threads. According to the task management method and the task management system, the tasks are systematically managed by starting a certain number of task configuration analysis threads, task distribution threads and task execution threads, so that low resource consumption is achieved, functional usability is guaranteed under the condition that system performance is certain, and task management efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flowchart of a task management method provided herein;
FIG. 2 is a flowchart of a specific task management method provided in the present application;
FIG. 3 is a schematic diagram of a task management method for a deep learning training platform according to the present application;
FIG. 4 is a schematic diagram illustrating a specific task configuration parsing thread execution logic according to the present application;
FIG. 5 is a diagram illustrating a specific task distribution thread execution logic according to the present application;
FIG. 6 is a diagram illustrating a specific task execution thread execution logic according to the present disclosure;
FIG. 7 is a flowchart of a specific task management method provided herein;
FIG. 8 is a schematic structural diagram of a task management device according to the present application;
FIG. 9 is a structural diagram of a task management electronic device according to the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the existing task management scheme, due to the problem of scheduling performance of a scheduling system, the situations of task submission failure, high task state updating delay and even incapability of updating are caused, and the task management efficiency is low. Particularly for a deep learning training platform which manages a large number of tasks with high concurrency, on one hand, the maximum capacity of the task management system is limited because the upper limit exists in the bearing capacity of the task management system, the task management system needs to apply for a certain resource aiming at the training task of each platform, and the maximum resource which can be used by the task management system is limited by the environment (a physical machine or a virtual machine) where the task management system is located, on the other hand, under the condition of high concurrency and high pressure of the tasks, the task management system cannot meet the requirements, often needs a large number of customized transformation to meet specific service requirements, and cannot achieve universality and real transformation without capacity limitation. In view of the technical defects, the present application provides a task management scheme, which performs systematic management on tasks by starting a certain number of task configuration analysis threads, task distribution threads, and task execution threads, thereby achieving low resource consumption, ensuring functional availability under the condition of a certain system performance, and improving task management efficiency.
Fig. 1 is a flowchart of a task management method according to an embodiment of the present application. Referring to fig. 1, the task management method includes:
S11: starting a first preset number of task configuration analysis threads, analyzing configuration information of preset tasks by using the task configuration analysis threads, determining the preset tasks meeting preset conditions based on analysis results, and creating corresponding target tasks by using the preset tasks meeting the preset conditions.
In this embodiment, a first preset number of task configuration analysis threads are started, then the configuration information of a preset task is analyzed by using the task configuration analysis threads, and the preset task meeting a preset condition is determined based on an analysis result, so that a corresponding target task is created by using the preset task meeting the preset condition. The task configuration analysis thread is an execution main body of a task configuration analysis module, and the task configuration analysis module is composed of the task configuration analysis thread. In order to ensure the maximum resource utilization, the first preset number needs to comprehensively consider the number, the type, the configuration information, the occupancy rate of the system resources at the current moment and other factors of the preset tasks, so that the first preset number of task configuration analysis threads meeting the factors can be controlled to be started.
In this embodiment, the preset task is obtained by a User with a task running requirement configuring configuration information of a task in advance through a User Interface (UI) of a task management system, that is, the preset task is stored in the form of the configuration information. The configuration information includes, but is not limited to, service priority, resource requirements, execution scripts, trigger conditions (such as execution cycles), and the like, and a user can select the configuration information according to actual service requirements to obtain the required preset task. The service priority is a basis for analyzing the sequence of the configuration information of the preset task, the resource requirement is a required occupation condition of the preset task on system resources, the execution script is a batch processing file of the preset task, and the trigger condition is a condition for creating a corresponding target task from the configuration information of the preset task, that is, the preset task meeting the preset condition can be determined according to the execution period, where the preset condition is that the execution period of the preset task reaches the operation period condition of the preset task. In addition, the configuration information of the task may be configured in advance through a website interface or the like.
It can be understood that, in this embodiment, the task configuration analysis thread is mainly configured to analyze the pre-configured configuration information of the preset task, determine whether the preset task reaches the operating condition through the configuration information in the preset task, and if the preset task reaches the operating condition, generate the target task corresponding to the preset task based on the configuration information of the preset task at the current moment. It is to be understood that the configuration information may be broadly understood as a configuration field, that is, a plurality of configuration information types of the preset task, and therefore, specific configuration information corresponding to the same configuration field at different times may be different, that is, the target task is a specific representation form of the preset task at a certain time, where the time is a time when the preset task meets the preset condition, that is, a time when the preset task reaches an operating condition.
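The trigger check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `TaskConfig` and `TargetTask` classes, their field names, and the rule "due once a full execution period has elapsed" are assumptions introduced here to make the preset condition concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskConfig:
    """Hypothetical configuration record for a preset task (illustrative fields)."""
    name: str
    priority: int
    period_s: float                      # execution period used as the trigger condition
    script: str
    last_created_at: Optional[float] = None


@dataclass
class TargetTask:
    """Hypothetical snapshot of a preset task at the moment it became due."""
    config_name: str
    priority: int
    script: str
    created_at: float


def is_due(cfg: TaskConfig, now: float) -> bool:
    # Assumed rule: a task that never ran is due immediately; otherwise it is
    # due once a full execution period has elapsed since the last creation.
    return cfg.last_created_at is None or now - cfg.last_created_at >= cfg.period_s


def parse_config(cfg: TaskConfig, now: float) -> Optional[TargetTask]:
    """Return a new target task if the preset condition is met, else None."""
    if not is_due(cfg, now):
        return None
    cfg.last_created_at = now
    return TargetTask(cfg.name, cfg.priority, cfg.script, created_at=now)
```

Because the target task copies the configuration values at creation time, later edits to the configuration do not affect tasks that were already created, which matches the description of the target task as the form of the preset task at a certain moment.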
S12: and starting a second preset number of task distribution threads, and distributing the target task to the corresponding task execution threads by using the task distribution threads.
S13: and utilizing the task execution thread to perform corresponding processing on the target task.
In this embodiment, a second preset number of task distribution threads are first started, and then the target task is distributed to a corresponding task execution thread by using the task distribution threads. Similarly, the task distribution threads are the execution main bodies of the task distribution modules, the task distribution modules are formed by the task distribution threads, and the second preset number needs to comprehensively consider the number and the type of the target tasks and the use conditions of the GPU and the CPU resources at the current moment, so that the second preset number of task distribution threads meeting the factors can be controlled to be started. The task distribution thread is used for distributing the target task to the task execution thread. The task execution threads are the execution main bodies of the task execution modules, the task execution modules are formed by the task execution threads, and the number of the task execution threads needs to comprehensively consider the number and the type of the target tasks and the occupancy rate of system resources at the current moment, so that the corresponding number of the task execution threads can be controlled to be started. The task execution threads can be started after receiving the corresponding target tasks or can be started in advance, when the task distribution threads distribute the target tasks, the appropriate target tasks are distributed to the task execution threads according to the working states of the task execution threads, and in addition, the task execution efficiency can be improved by distributing one target task to one task execution thread. 
Furthermore, there may be a fixed correspondence between the type of a task execution thread and the type of a target task, that is, a specific target task can only be executed by a specific task execution thread. In that case, the task distribution thread needs to distribute tasks in a targeted manner. When no such correspondence exists between thread types and task types, the correspondence between a target task and a task execution thread is established only once the target task has been distributed to that thread.
In this embodiment, after receiving its target task, the task execution thread performs the corresponding processing on it. Specifically, once a task execution thread has been assigned a target task, it handles submission, status updates, failure retries and the like for that task. The task management system consumes few system resources and manages tasks efficiently, which is particularly advantageous for highly concurrent tasks; its architecture can be scaled dynamically and transparently, meeting the needs of task management systems and platforms of various scales.
As can be seen, in the embodiment of the present application, a first preset number of task configuration analysis threads are started first, the task configuration analysis threads are used to analyze configuration information of a preset task, the preset task meeting preset conditions is determined based on an analysis result, so as to create a corresponding target task by using the preset task meeting the preset conditions, then a second preset number of task distribution threads are started, the target task is distributed to a corresponding task execution thread by using the task distribution threads, and finally the task execution thread is used to perform corresponding processing on the target task. According to the embodiment of the application, the tasks are systematically managed by starting a certain number of task configuration analysis threads, task distribution threads and task execution threads, so that low resource consumption is achieved, functional usability is guaranteed under the condition that system performance is certain, and task management efficiency is improved.
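The three-stage flow of steps S11 to S13 can be sketched with fixed-size groups of threads connected by queues. This is a hedged sketch only: the in-memory queues, the `run_pipeline` function, its parameters, and the use of `None` as a shutdown sentinel are assumptions made for illustration, and the "processing" step simply records the task name.

```python
import queue
import threading


def run_pipeline(configs, n_parsers=2, n_dispatchers=1, n_executors=2):
    """configs: list of (name, is_due) pairs. Returns the sorted names that were executed."""
    config_q, target_q, exec_q = queue.Queue(), queue.Queue(), queue.Queue()
    done, done_lock = [], threading.Lock()

    def parser():
        # Stage 1: analyze configurations and create target tasks for due ones.
        while (item := config_q.get()) is not None:
            name, due = item
            if due:
                target_q.put(name)

    def dispatcher():
        # Stage 2: hand each target task over to the execution stage.
        while (task := target_q.get()) is not None:
            exec_q.put(task)

    def executor():
        # Stage 3: process the target task (here: just record it).
        while (task := exec_q.get()) is not None:
            with done_lock:
                done.append(task)

    def start(fn, n):
        threads = [threading.Thread(target=fn) for _ in range(n)]
        for t in threads:
            t.start()
        return threads

    def stop(threads, q):
        # One sentinel per thread; FIFO order guarantees earlier items are handled first.
        for _ in threads:
            q.put(None)
        for t in threads:
            t.join()

    parsers = start(parser, n_parsers)
    dispatchers = start(dispatcher, n_dispatchers)
    executors = start(executor, n_executors)
    for c in configs:
        config_q.put(c)
    stop(parsers, config_q)
    stop(dispatchers, target_q)
    stop(executors, exec_q)
    return sorted(done)
```

The thread counts are fixed up front, which mirrors the idea of a first and a second preset number: resource use is bounded by the number of threads rather than by the number of tasks.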
Fig. 2 is a flowchart of a specific task management method according to an embodiment of the present application. Referring to fig. 2, the task management method includes:
S21: establishing a task configuration pool, a task running pool and a historical task pool in a mode of constructing a data table in a database; the task configuration pool is used for storing configuration information of preset tasks, the task running pool is used for storing target tasks before operation is finished, and the historical task pool is used for storing the target tasks after operation is finished.
In this embodiment, a deep learning training platform is taken as an example to describe the task management scheme of this application. A task management pool is established by constructing data tables in a database, and the training tasks are managed through this pool. The training tasks are those of the deep learning training platform, which submits them, schedules them, tracks their states, handles their exceptions and so on. For example, the artificial intelligence development platform AIStation is a typical deep learning service platform: oriented to deep learning development scenarios, it integrates computing resources, data resources and an AI development environment, provides unified allocation and scheduling of computing resources, centralized management and acceleration of training data, and a streamlined model development and training process, and thereby builds an agile and efficient integrated platform for AI research and development. The task management pool comprises a task configuration pool, a task running pool and a historical task pool. The task configuration pool stores the configuration information of preset training tasks, the task running pool stores target tasks that have not finished running, and the historical task pool stores target tasks that have finished running. Three task management pools are constructed in this embodiment for the deep learning training platform; in actual service, the method is not limited to these three pools, and pools of different types and numbers can be created according to service requirements, with the interaction between the pools and the threads adjusted accordingly.
In existing schemes, deep learning training tasks are generally held in memory, so the deep learning training platform scales poorly and its capacity has an upper limit. As shown in fig. 3, the deep learning training platform may instead create three task management pools (T_TASK_CONFIG, T_DL_TASK, T_TASK_HISTORY) as MariaDB data tables for managing deep learning training tasks. MariaDB is an open-source database management system forked from MySQL that can store, query, search and update data. Maintaining task state in the MariaDB database keeps the resource consumption of the task management module low and improves task training efficiency. It also makes the task management module stateless, so the management system can be scaled out and in transparently and can manage tasks efficiently at any platform scale. The task configuration pool T_TASK_CONFIG manages the configuration information of all deep learning tasks on the platform, from which the deep learning tasks that actually run are generated; the task running pool T_DL_TASK manages all deep learning tasks that are actually running; and the historical task pool T_TASK_HISTORY manages all deep learning tasks that have finished running, so that users can conveniently query historical jobs.
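A minimal sketch of the three pools as data tables is given below. To keep the example self-contained it uses Python's built-in `sqlite3` module as a stand-in for MariaDB, which is a substitution made here and not what the application describes. Only the three table names come from the text; every column name and type is a hypothetical choice.

```python
import sqlite3

# Column layout is assumed for illustration; only the table names are from the text.
POOL_SCHEMA = """
CREATE TABLE T_TASK_CONFIG (
    id       INTEGER PRIMARY KEY,
    name     TEXT    NOT NULL,
    priority INTEGER NOT NULL,
    period_s INTEGER NOT NULL,           -- trigger condition: execution period
    script   TEXT    NOT NULL,
    locked   INTEGER NOT NULL DEFAULT 0  -- "lock" state field
);
CREATE TABLE T_DL_TASK (
    id        INTEGER PRIMARY KEY,
    config_id INTEGER NOT NULL,
    priority  INTEGER NOT NULL,
    status    TEXT    NOT NULL DEFAULT 'to_be_submitted',
    locked    INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE T_TASK_HISTORY (
    id           INTEGER PRIMARY KEY,
    config_id    INTEGER NOT NULL,
    priority     INTEGER NOT NULL,
    final_status TEXT    NOT NULL
);
"""


def create_pools(conn: sqlite3.Connection) -> None:
    """Create the configuration, running and history pools as tables."""
    conn.executescript(POOL_SCHEMA)


pool_conn = sqlite3.connect(":memory:")
create_pools(pool_conn)
```

Keeping all task state in tables rather than in process memory is what lets any number of management processes attach to the same pools, which is the stateless property the paragraph above relies on.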
S22: starting a first preset number of task configuration analysis threads, utilizing the task configuration analysis threads, carrying out periodic scanning analysis on the configuration information of the preset tasks in the task configuration pool according to a preset period, judging whether the preset tasks meet preset conditions or not based on an analysis result, and if so, establishing the corresponding target tasks by utilizing the currently scanned configuration information of the preset tasks.
In this embodiment, for the step of starting the first preset number of task configuration analysis threads, reference may be made to the specific contents disclosed in the foregoing embodiments, and details are not described herein again. For a deep learning training platform, a Timer class may be defined in the deep learning task configuration analysis thread and then invoked to periodically scan and analyze, according to a preset period, the configuration information of the preset tasks in the task configuration pool T_TASK_CONFIG. If a preconfigured deep learning training task reaches its running condition, the corresponding deep learning task, that is, the target task, is generated and stored in the task running pool T_DL_TASK, as shown in fig. 4.
It should be noted that when the first preset number is greater than 1, that is, when several deep learning task configuration analysis threads run simultaneously, the configuration information must be locked during the configuration information analysis process in order to avoid data inconsistency or even data errors caused by several threads processing the same task. In other words, the configuration information of a preconfigured deep learning training task is locked while it is being analyzed. In this embodiment, lock metadata is stored in a third-party storage medium, the MariaDB data table: the configuration information of each deep learning training task in the task configuration pool T_TASK_CONFIG includes a field for the "lock" state. To lock a deep learning training task, this field only needs to be set to the locked state. Correspondingly, once the actually running target task has been created from the preconfigured deep learning training task and the analysis of the configuration information is finished, the configuration information is unlocked by setting the field to the unlocked state. When several task analysis threads access the same locked task, only one of them processes it at any given time, and the others can process it only after the current thread has finished; in the meantime, the other threads can still process tasks that are not locked.
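The periodic scan can be sketched in Python with `threading.Timer`, re-armed after each pass. This is an illustrative sketch under assumptions: the `PeriodicScanner` class, the `scan_once` function, and the dictionary-based pools are hypothetical stand-ins for the Timer class and the database tables mentioned in the text.

```python
import threading


class PeriodicScanner:
    """Runs `scan` once per period by re-arming a threading.Timer after each pass."""

    def __init__(self, period_s, scan):
        self.period_s = period_s
        self.scan = scan
        self._timer = None
        self._stopped = False
        self._lock = threading.Lock()

    def _arm(self):
        with self._lock:
            if self._stopped:
                return
            self._timer = threading.Timer(self.period_s, self._tick)
            self._timer.daemon = True
            self._timer.start()

    def _tick(self):
        self.scan()
        self._arm()

    def start(self):
        self._arm()

    def stop(self):
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()


def scan_once(config_pool, run_pool, now):
    """Create a target task in run_pool for every configuration that is due."""
    created = 0
    for cfg in config_pool:
        last = cfg.get("last_run")
        if last is None or now - last >= cfg["period_s"]:
            run_pool.append({"config": cfg["name"], "status": "to_be_submitted"})
            cfg["last_run"] = now
            created += 1
    return created
```

Separating `scan_once` from the timer keeps the scan logic testable without waiting for real time to pass; a scanner would typically be started with something like `PeriodicScanner(5.0, lambda: scan_once(config_pool, run_pool, time.time()))`.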
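One way to realize a "lock" state field so that only one thread wins is a conditional update that flips the field from unlocked to locked in a single statement. The sketch below assumes that technique; the text only says the field is set to the locked state and does not specify how contention is resolved. It uses the built-in `sqlite3` module as a stand-in for MariaDB, and the `locked` column, `try_lock`, and `unlock` names are hypothetical.

```python
import sqlite3

lock_conn = sqlite3.connect(":memory:")
lock_conn.execute(
    "CREATE TABLE T_TASK_CONFIG ("
    "id INTEGER PRIMARY KEY, name TEXT, locked INTEGER NOT NULL DEFAULT 0)"
)
lock_conn.executemany(
    "INSERT INTO T_TASK_CONFIG (id, name) VALUES (?, ?)",
    [(1, "train-a"), (2, "train-b")],
)
lock_conn.commit()


def try_lock(conn, config_id):
    """Flip locked 0 -> 1 in one statement; True only for the caller that wins."""
    cur = conn.execute(
        "UPDATE T_TASK_CONFIG SET locked = 1 WHERE id = ? AND locked = 0",
        (config_id,),
    )
    conn.commit()
    # rowcount is 1 if this call changed the row, 0 if it was already locked.
    return cur.rowcount == 1


def unlock(conn, config_id):
    """Set the lock field back to the unlocked state after analysis finishes."""
    conn.execute("UPDATE T_TASK_CONFIG SET locked = 0 WHERE id = ?", (config_id,))
    conn.commit()
```

A thread that gets `False` from `try_lock` can move on to another configuration that is not locked, which matches the behavior described above for threads that meet a locked task.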
S23: and starting a second preset number of task distribution threads, and distributing the target task to the corresponding task execution threads by using the task distribution threads.
In this embodiment, as to the specific process of step S23, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here. It should be added that, for the target tasks of multiple deep learning training tasks, a second preset number of deep learning task distribution threads (Dispatcher) are started. Once started, the Dispatcher threads obtain a list of target tasks to be executed from the task running pool T_DL_TASK according to a preset policy, which may be a priority policy, and then distribute each target task in the list in turn to one deep learning task execution thread (Executor) for execution, as shown in fig. 5. Similarly, in this embodiment, the target task is locked during the task distribution process and unlocked after the task distribution process is finished.
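The priority-ordered distribution can be sketched as below. Several points are assumptions made for illustration: a lower number means higher priority, the running pool is a plain list of dictionaries, and a `ThreadPoolExecutor` plays the role of the Executor threads. The `dispatch` function and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch(run_pool, execute, max_executors=2):
    """Distribute pending target tasks to executor threads in priority order.

    Returns (order, results): the task ids in the order they were handed out,
    and the value returned by `execute` for each task, in that same order.
    """
    # Assumed policy: lower number = higher priority; sorted() is stable, so
    # tasks with equal priority keep their original order in the pool.
    pending = sorted(
        (t for t in run_pool if t["status"] == "to_be_submitted"),
        key=lambda t: t["priority"],
    )
    order = [t["id"] for t in pending]
    with ThreadPoolExecutor(max_workers=max_executors) as pool:
        # Each target task is handed to exactly one executor thread.
        futures = [pool.submit(execute, t) for t in pending]
        results = [f.result() for f in futures]
    return order, results
```

Tasks are submitted in priority order, but with more than one executor thread they may finish in a different order; only the hand-out order is controlled by the policy.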
S24: Submit the target task to a task running environment by using the task execution thread, acquire the running state of the target task in real time, and update the state of the target task in the task running pool according to the running state.
In this embodiment, after the deep learning task execution thread Executor acquires a target task corresponding to a deep learning training task, it submits the target task to a task running environment to run and updates the state of the target task in the task running pool according to the running state. For a deep learning training platform, the running environment is generally a deep learning training cluster, that is, a server cluster that processes multiple target tasks. Accordingly, the running state of a target task can be queried through a cluster interface; the running state includes, but is not limited to, to-be-submitted, running failure, and resubmit. The running state is fed back to the task running pool so that the state in the task record of the corresponding target task is periodically synchronized. In addition, in this embodiment the target task may also be locked during the task running process and unlocked after the task running process is finished. That is, after the Executor obtains a target task corresponding to the deep learning training task, it may first lock the target task; after the target task finishes running, the record of the target task in the task running pool T_DL_TASK is transferred to the historical task pool T_TASK_HISTORY and unlocked, and the record in the task running pool T_DL_TASK is deleted, as shown in fig. 6.
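The state synchronization and the transfer of a finished task into the history pool can be sketched as below. The sketch makes assumptions beyond the text: sqlite3 stands in for MariaDB, the columns `id`, `status` and `lock_state` as well as the status strings are hypothetical, and the transfer and deletion are wrapped in one transaction so that a record is never lost or duplicated.

```python
import sqlite3

# sqlite3 stands in for MariaDB; column names and status values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T_DL_TASK (id INTEGER PRIMARY KEY, status TEXT, lock_state INTEGER);
    CREATE TABLE T_TASK_HISTORY (id INTEGER PRIMARY KEY, status TEXT, lock_state INTEGER);
    INSERT INTO T_DL_TASK VALUES (7, 'running', 1);
    """
)


def sync_status(conn, task_id, cluster_status):
    """Write the state reported by the cluster back into the task running pool."""
    conn.execute(
        "UPDATE T_DL_TASK SET status = ? WHERE id = ?", (cluster_status, task_id)
    )
    conn.commit()


def archive(conn, task_id):
    """Copy the record to the history pool in the unlocked state, then delete it."""
    with conn:  # commits both statements together, or rolls back on error
        conn.execute(
            "INSERT INTO T_TASK_HISTORY (id, status, lock_state) "
            "SELECT id, status, 0 FROM T_DL_TASK WHERE id = ?",
            (task_id,),
        )
        conn.execute("DELETE FROM T_DL_TASK WHERE id = ?", (task_id,))


sync_status(conn, 7, "finished")
archive(conn, 7)
remaining = conn.execute("SELECT COUNT(*) FROM T_DL_TASK").fetchone()[0]
history_row = conn.execute(
    "SELECT id, status, lock_state FROM T_TASK_HISTORY"
).fetchone()
```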
Further, a user of the deep learning training platform can modify the task configuration pool T_TASK_CONFIG, the task running pool T_DL_TASK, and the historical task pool T_TASK_HISTORY through a platform interface, so as to update the deep learning task configuration and manage ongoing or historical tasks.
Therefore, in the embodiment of the application, the task configuration pool, the task running pool and the historical task pool are created by constructing data tables in a database. Specifically, a MariaDB database is used to create three task management pools that store and manage the whole life cycle of a deep learning training task, thereby realizing the configuration, submission, state maintenance and the like of the deep learning training task. At the same time, a deep learning task configuration analysis module Timer, a deep learning task distribution module Dispatcher and a deep learning task execution module Executor are used to manage the deep learning training tasks efficiently. The three modules are architecturally stateless and rely on the task management pools to maintain task information, which solves the problem that the resource consumption of the platform task management system grows with the number of tasks when the training platform has a large number of deep learning training tasks. Transparent expansion of the management modules is thereby realized, and the task management system is suited to large-scale training task scenarios on a deep learning training platform. With fixed resources, the task management capability of the deep learning training platform can be greatly improved, and task management requirements of deep learning training platforms of various scales can be met through transparent expansion.
Fig. 7 is a flowchart of a specific task management method according to an embodiment of the present application. Referring to fig. 7, the task management method includes:
S31: Establish a task configuration pool, a task running pool and a historical task pool by constructing data tables in a database; the task configuration pool is used for storing configuration information of preset tasks, the task running pool is used for storing target tasks before their running is finished, and the historical task pool is used for storing target tasks after their running is finished.
In this embodiment, as to the specific process of the step S31, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated herein.
S32: Sort the configuration information of different preset tasks in the task configuration pool according to priority to obtain a task configuration sequence corresponding to the configuration information of the different preset tasks.
In this embodiment, to manage training tasks in a more orderly manner, the task configuration analysis threads need to process the preset tasks and their configuration information in a defined order. That is, before the first preset number of task configuration analysis threads are started, the configuration information of the different preset tasks in the task configuration pool is first sorted by priority to obtain a task configuration sequence. The task configuration analysis threads then periodically scan and analyze the task configuration sequence according to a preset period, judge based on the analysis result whether each preset task meets the preset conditions, and if so, create the corresponding target task by using the currently scanned configuration information of the preset task. Sorting the configuration information by priority amounts to determining the priority of each preset task, and this determination follows a priority policy. The priority policy may be set in advance according to service requirements, and this embodiment does not limit it.
For example, a single default priority policy may be adopted: a default priority is assigned to each preset task according to the order in which users configured the preset tasks through the user interface of the deep learning training platform. Three priority levels (high, medium and low) may be set, with earlier-configured preset tasks receiving higher priority, and so on; preset tasks within the same priority level are arranged randomly. Alternatively, a single configured priority policy may be adopted: a priority field is included in the configuration information of the preset task and is explicitly marked as high, medium or low priority when the preset task is configured; preset tasks within the same priority level are again arranged randomly. A mixed priority policy may also be adopted: when the priority field of a preset task exists and is not null, the priority level of the preset task is determined from the priority information in that field; when the field does not exist or is null, the preset task is uniformly assigned the medium priority level. Within the medium priority level, preset tasks whose priority field exists and is not null are ranked ahead of preset tasks whose priority field is missing or null.
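The mixed priority policy can be expressed as a sort key. The following is a minimal sketch in which the dictionary layout, the field name `priority` and the level names are assumptions made for illustration; the ordering rule itself follows the description above.

```python
# Hypothetical encoding of the three priority levels; lower sorts first.
LEVELS = {"high": 0, "medium": 1, "low": 2}


def sort_key(config):
    """Mixed policy: an explicit priority wins; a missing or null field falls
    back to medium and ranks after tasks explicitly marked medium."""
    explicit = config.get("priority")
    if explicit is not None:
        return (LEVELS[explicit], 0)
    return (LEVELS["medium"], 1)


configs = [
    {"name": "a"},                      # no priority field, defaults to medium
    {"name": "b", "priority": "low"},
    {"name": "c", "priority": "medium"},
    {"name": "d", "priority": None},    # null value, defaults to medium
    {"name": "e", "priority": "high"},
]

# sorted() is stable, so ties keep their input order here; the text allows
# tasks within one level to be arranged arbitrarily.
task_config_sequence = [c["name"] for c in sorted(configs, key=sort_key)]
```

The same key could be reused when sorting target tasks before distribution, since that step follows the same priority policy.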
S33: Start a first preset number of task configuration analysis threads, periodically scan and analyze the task configuration sequence according to a preset period by using the task configuration analysis threads, judge whether the preset tasks meet preset conditions based on the analysis result, and if so, create the corresponding target tasks by using the currently scanned configuration information of the preset tasks.
In this embodiment, as to the specific process of the step S33, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated herein.
S34: Sort the target tasks according to priority to obtain a task sequence of the target tasks.
In this embodiment, before the second preset number of task distribution threads are started, the target tasks need to be sorted by priority to obtain a task sequence of the target tasks, so that the task distribution threads distribute the target tasks to the corresponding task execution threads in the order in which they appear in the task sequence. This sorting follows the priority policy described for step S32 and is not described again here.
S35: Start a second preset number of task distribution threads, and distribute the target tasks to the corresponding task execution threads by using the task distribution threads according to the order of the target tasks in the task sequence.
S36: Perform corresponding processing on the target task by using the task execution thread.
In this embodiment, as to the specific processes of the step S35 and the step S36, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
As can be seen, in the embodiment of the present application, the configuration information of different preset tasks in the task configuration pool is sorted by priority to obtain a task configuration sequence, and the target tasks are sorted by priority to obtain a task sequence of the target tasks. On this basis, the task configuration analysis threads periodically scan and analyze the task configuration sequence, while the task distribution threads distribute the target tasks to the corresponding task execution threads in the order of the task sequence. On the one hand, by constructing a priority policy for the training tasks, the embodiment of the application manages the training tasks in an orderly manner and improves the efficiency of training task management. On the other hand, the task management method in this embodiment can submit training tasks, track their state, recover from faults and so on. In scenarios with a large number of highly concurrent tasks, it can effectively address task submission failures and task state updates that are heavily delayed or cannot be performed at all because of the limited scheduling performance of the scheduling system, and it improves the scheduling performance of the task management system when scheduling resources are limited.
Referring to fig. 8, an embodiment of the present application further discloses a task management device, which includes:
the task analysis module 11 is configured to start a first preset number of task configuration analysis threads, analyze configuration information of preset tasks by using the task configuration analysis threads, determine the preset tasks meeting preset conditions based on analysis results, and create corresponding target tasks by using the preset tasks meeting the preset conditions;
the task distributing module 12 is configured to start a second preset number of task distributing threads, and distribute the target task to a corresponding task execution thread by using the task distributing threads;
and the task execution module 13 is configured to perform corresponding processing on the target task by using the task execution thread.
As can be seen, in the embodiment of the present application, a first preset number of task configuration analysis threads are started first, the task configuration analysis threads are used to analyze configuration information of a preset task, the preset task meeting preset conditions is determined based on an analysis result, so as to create a corresponding target task by using the preset task meeting the preset conditions, then a second preset number of task distribution threads are started, the target task is distributed to a corresponding task execution thread by using the task distribution threads, and finally the task execution thread is used to perform corresponding processing on the target task. According to the embodiment of the application, the tasks are systematically managed by starting a certain number of task configuration analysis threads, task distribution threads and task execution threads, so that low resource consumption is achieved, and task management efficiency is improved.
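The three cooperating groups of threads can be sketched with standard-library queues. This is only an illustration under stated assumptions: in-memory queues replace the database-backed task pools of the embodiments, the `enabled` flag stands in for the unspecified preset condition, and the thread counts are arbitrary.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker thread to exit
config_q, target_q, exec_q = queue.Queue(), queue.Queue(), queue.Queue()
results, results_lock = [], threading.Lock()


def worker(inbox, handle):
    """Generic loop shared by the analysis, distribution and execution threads."""
    while True:
        item = inbox.get()
        if item is STOP:
            return
        handle(item)


def analyze(config):
    # Hypothetical preset condition: only enabled configurations become target tasks.
    if config["enabled"]:
        target_q.put(config["id"])


def distribute(task_id):
    exec_q.put(task_id)


def execute(task_id):
    with results_lock:
        results.append(task_id)


def start(count, inbox, handle):
    threads = [threading.Thread(target=worker, args=(inbox, handle)) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads


def stop(threads, inbox):
    for _ in threads:
        inbox.put(STOP)  # one sentinel per thread, queued after all pending work
    for thread in threads:
        thread.join()


analyzers = start(2, config_q, analyze)        # first preset number = 2
distributors = start(1, target_q, distribute)  # second preset number = 1
executors = start(2, exec_q, execute)

for config in [{"id": 1, "enabled": True}, {"id": 2, "enabled": False}, {"id": 3, "enabled": True}]:
    config_q.put(config)

# Shut the stages down in order so that no task is left behind in a queue.
stop(analyzers, config_q)
stop(distributors, target_q)
stop(executors, exec_q)
processed = sorted(results)
```

Because each stage only reads from one queue and writes to the next, the number of threads per stage can be changed without touching the other stages, which mirrors the stateless module design described in the embodiments.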
In some specific embodiments, the task parsing module 11 specifically includes:
the first sequencing unit is used for sequencing the configuration information of different preset tasks in the task configuration pool according to priority to obtain task configuration sequences corresponding to the configuration information of the different preset tasks;
the scanning unit is used for utilizing the task configuration analysis thread to periodically scan and analyze the configuration information of the preset task in the task configuration pool according to a preset period;
and the creating unit is used for judging, by using the task configuration analysis thread, whether the preset task meets a preset condition based on an analysis result, and if so, creating the corresponding target task by using the currently scanned configuration information of the preset task.
In some specific embodiments, the task distribution module 12 specifically includes:
the second sequencing unit is used for sequencing the target tasks according to the priority to obtain a task sequence of the target tasks;
and the distribution unit is used for distributing the target tasks to corresponding task execution threads by using the task distribution threads according to the sequence of the target tasks in the task sequence.
In some specific embodiments, the task execution module 13 specifically includes:
the submitting unit is used for submitting the target task to a task running environment by using the task execution thread to run;
and the updating unit is used for acquiring the running state of the target task in real time and updating the state of the target task in the task running pool according to the running state.
In some specific embodiments, the task management apparatus further includes:
the data table module is used for establishing a task configuration pool, a task operation pool and a historical task pool in a mode of constructing a data table in a database; the task configuration pool is used for storing configuration information of preset tasks, the task operation pool is used for storing target tasks before operation is finished, and the historical task pool is used for storing the target tasks after operation is finished.
The locking module is used for respectively locking the configuration information in the configuration information analysis process, the target task in the task distribution process and the target task in the task running process;
and the unlocking module is used for respectively unlocking the configuration information after the configuration information analysis process is finished, the target task after the task distribution process is finished and the target task after the task operation process is finished.
Further, the embodiment of the application also provides electronic equipment. FIG. 9 is a block diagram illustrating an electronic device 20 according to an exemplary embodiment, and nothing in the figure should be taken as a limitation on the scope of use of the present application.
Fig. 9 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present disclosure. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein, the memory 22 is used for storing a computer program, and the computer program is loaded and executed by the processor 21 to implement the relevant steps in the task management method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be specifically a server.
In this embodiment, the power supply 23 is configured to provide a working voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to obtain external input data or output data to the outside, and a specific interface type thereof may be selected according to specific application requirements, which is not specifically limited herein.
In addition, the memory 22, as a carrier for resource storage, may be a read-only memory, a random access memory, a magnetic disk, an optical disk or the like. The resources stored on it may include an operating system 221, a computer program 222, task data 223 and the like, and the storage may be transient or permanent.
The operating system 221 is used for managing and controlling each hardware device and the computer program 222 on the electronic device 20, so that the processor 21 can operate on and process the massive task data 223 in the memory 22; it may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that is executed by the electronic device 20 to perform the task management method disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs for performing other specific tasks. The task data 223 may include task data collected by the electronic device 20.
Further, an embodiment of the present application further discloses a storage medium, where a computer program is stored in the storage medium, and when the computer program is loaded and executed by a processor, the steps of the task management method disclosed in any of the foregoing embodiments are implemented.
The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is kept brief, and the relevant points can be found in the description of the method.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The task management method, apparatus, device and storage medium provided by the present invention are described in detail above, and a specific example is applied in the description to explain the principle and the implementation of the present invention, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A method for task management, comprising:
starting a first preset number of task configuration analysis threads, analyzing configuration information of preset tasks by using the task configuration analysis threads, determining the preset tasks meeting preset conditions based on analysis results, and creating corresponding target tasks by using the preset tasks meeting the preset conditions;
starting a second preset number of task distribution threads, and distributing the target task to corresponding task execution threads by using the task distribution threads;
and utilizing the task execution thread to perform corresponding processing on the target task.
2. The task management method according to claim 1, wherein before starting the first preset number of task configuration parsing threads, the method further comprises:
establishing a task configuration pool, a task running pool and a historical task pool in a mode of constructing a data table in a database;
the task configuration pool is used for storing configuration information of preset tasks, the task operation pool is used for storing target tasks before operation is finished, and the historical task pool is used for storing the target tasks after operation is finished.
3. The task management method according to claim 2, wherein the analyzing the configuration information of the preset task by using the task configuration analysis thread, and determining the preset task satisfying a preset condition based on an analysis result, so as to create a corresponding target task by using the preset task satisfying the preset condition, comprises:
and utilizing the task configuration analysis thread to periodically scan and analyze the configuration information of the preset task in the task configuration pool according to a preset period, judging whether the preset task meets a preset condition or not based on an analysis result, and if so, utilizing the currently scanned configuration information of the preset task to create the corresponding target task.
4. The task management method according to claim 3, wherein the performing, by using the task configuration analysis thread, periodic scanning analysis on the configuration information of the preset task in the task configuration pool according to a preset period includes:
sequencing the configuration information of different preset tasks in the task configuration pool according to the priority to obtain task configuration sequences corresponding to the configuration information of the different preset tasks;
and utilizing the task configuration analysis thread to periodically and sequentially scan and analyze the task configuration sequence according to a preset period.
5. The task management method according to claim 2, wherein the distributing the target task to the corresponding task execution thread by using the task distribution thread comprises:
sequencing the target tasks according to the priority to obtain a task sequence of the target tasks;
and distributing the target tasks to corresponding task execution threads by using the task distribution threads according to the sequence of the target tasks in the task sequence.
6. The task management method according to claim 2, wherein the performing, by the task execution thread, the corresponding processing on the target task includes:
and submitting the target task to a task running environment by using the task execution thread to run, acquiring the running state of the target task in real time, and updating the state of the target task in the task running pool according to the running state.
7. The task management method according to any one of claims 1 to 6, further comprising:
respectively locking the configuration information in the configuration information analysis process, the target task in the task distribution process and the target task in the task running process;
correspondingly, the configuration information after the configuration information analysis process is finished, the target task after the task distribution process is finished and the target task after the task operation process is finished are respectively unlocked.
8. A task management apparatus, comprising:
the task analysis module is used for starting a first preset number of task configuration analysis threads, analyzing configuration information of preset tasks by using the task configuration analysis threads, determining the preset tasks meeting preset conditions based on analysis results, and creating corresponding target tasks by using the preset tasks meeting the preset conditions;
the task distribution module is used for starting a second preset number of task distribution threads and distributing the target task to the corresponding task execution threads by using the task distribution threads;
and the task execution module is used for utilizing the task execution thread to perform corresponding processing on the target task.
9. An electronic device, comprising a processor and a memory; wherein the memory is for storing a computer program that is loaded and executed by the processor to implement the task management method of any of claims 1 to 7.
10. A computer-readable storage medium storing computer-executable instructions which, when loaded and executed by a processor, implement a task management method as claimed in any one of claims 1 to 7.
CN202110290924.XA 2021-03-18 2021-03-18 Task management method, device, equipment and storage medium Pending CN113010290A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110290924.XA CN113010290A (en) 2021-03-18 2021-03-18 Task management method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113010290A true CN113010290A (en) 2021-06-22

Family

ID=76409697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110290924.XA Pending CN113010290A (en) 2021-03-18 2021-03-18 Task management method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113010290A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113971091A (en) * 2021-10-25 2022-01-25 重庆大学 Persistent memory allocation method considering process difference

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8209701B1 (en) * 2007-09-27 2012-06-26 Emc Corporation Task management using multiple processing threads
CN105159768A (en) * 2015-09-09 2015-12-16 浪潮集团有限公司 Task management method and cloud data center management platform
CN109814998A (en) * 2019-01-22 2019-05-28 中国联合网络通信集团有限公司 A kind of method and device of multi-process task schedule
CN112181621A (en) * 2020-09-27 2021-01-05 中国建设银行股份有限公司 Task scheduling system, method, equipment and storage medium
CN112328399A (en) * 2020-11-17 2021-02-05 中国平安财产保险股份有限公司 Cluster resource scheduling method and device, computer equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113971091A (en) * 2021-10-25 2022-01-25 重庆大学 Persistent memory allocation method considering process difference
CN113971091B (en) * 2021-10-25 2024-05-14 重庆大学 Method for distributing persistent memory in consideration of process difference

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210622