CN117132871A - Model scheduling method, device and storage medium - Google Patents

Model scheduling method, device and storage medium

Info

Publication number
CN117132871A
Authority
CN
China
Prior art keywords
model
time
scheduling
initial
models
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210551926.4A
Other languages
Chinese (zh)
Inventor
张文杰 (Zhang Wenjie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Cloud Computing Ltd
Original Assignee
Alibaba Cloud Computing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Cloud Computing Ltd filed Critical Alibaba Cloud Computing Ltd
Priority to CN202210551926.4A priority Critical patent/CN117132871A/en
Priority to PCT/CN2023/094370 priority patent/WO2023221949A1/en
Publication of CN117132871A publication Critical patent/CN117132871A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/53 Recognition of crowd images, e.g. recognition of crowd congestion

Abstract

Embodiments of the present application provide a model scheduling method, a model scheduling device, and a storage medium. An initial model may be designated among a plurality of models, and the initial model processes source data according to a preset processing frequency, which guarantees the processing frequency of the source data. Because the initial model does not consume the entire trigger interval, the available periods it generates can be monitored, and the trigger occasions of the remaining models can be determined within those available periods based on a preset scheduling policy, so that the remaining models are controlled to cooperate with the initial model in completing the processing of each piece of source data. In this way, the remaining models are no longer bound by the preset processing frequency; instead, the fragmented idle time left over while the initial model executes is fully used to cover the time each remaining model requires, so the plurality of models can retain the required model accuracy without trading accuracy for speed. Model accuracy and processing frequency are therefore guaranteed simultaneously, which effectively improves the quality of the processing results.

Description

Model scheduling method, device and storage medium
Technical Field
The present application relates to the field of deep learning technologies, and in particular, to a method, an apparatus, and a storage medium for model scheduling.
Background
With the widespread application of visual AI, a single model can no longer meet application requirements, especially in detection and recognition scenarios that demand both a high frame rate and high accuracy. In a passenger flow recognition scenario, for example, a terminal needs to perform high-accuracy detection and tracking, data selection, and feature extraction, so multiple models usually have to be deployed to support increasingly complex application requirements.
At present, because the NPU (Neural-network Processing Unit) is exclusive, models must be invoked serially, that is, a plurality of models are executed one after another. As a result, when a high input frame rate is required, the models' processing time must be compressed to meet the frame rate, which lowers model accuracy; conversely, when high model accuracy is required, the input frame rate must be lowered to preserve accuracy. In either case, the quality of the final processing results suffers.
Disclosure of Invention
Aspects of the present application provide a method, apparatus, and storage medium for model scheduling, which are used to improve the quality of processing results in a multi-model scenario.
The embodiment of the application provides a model scheduling method, which comprises the following steps:
triggering a designated initial model according to a preset processing frequency to process the source data;
calculating a trigger interval corresponding to the initial model according to the preset processing frequency, so as to determine an available period within the trigger interval that the remaining models can occupy;
and determining trigger occasions of the remaining models within the available periods generated by the initial model based on a preset scheduling policy, so as to control the remaining models to cooperate with the initial model in completing the processing of each piece of source data.
The embodiment of the application also provides a computing device, which comprises a memory, a processor and a communication component;
the memory is used for storing one or more computer instructions;
the processor is coupled with the memory and the communication component for executing the one or more computer instructions for:
triggering a designated initial model according to a preset processing frequency to process the source data;
calculating a trigger interval corresponding to the initial model according to the preset processing frequency, so as to determine an available period within the trigger interval that the remaining models can occupy;
and determining trigger occasions of the remaining models within the available periods generated by the initial model based on a preset scheduling policy, so as to control the remaining models to cooperate with the initial model in completing the processing of each piece of source data.
Embodiments of the present application also provide a computer-readable storage medium storing computer instructions that, when executed by one or more processors, cause the one or more processors to perform the foregoing model scheduling method.
In the embodiments of the present application, an initial model may be designated among a plurality of models, and the initial model processes the source data according to a preset processing frequency, which guarantees the processing frequency of the source data. Because the initial model does not consume the entire trigger interval, the available periods it generates can be monitored, and the trigger occasions of the remaining models can be determined within those available periods based on a preset scheduling policy, so that the remaining models are controlled to cooperate with the initial model in completing the processing of each piece of source data. The remaining models are thus no longer bound by the preset processing frequency; instead, the fragmented idle time left over while the initial model executes is fully used to cover the time each remaining model requires, so the plurality of models can retain the required model accuracy without trading accuracy for speed. Model accuracy and processing frequency are therefore guaranteed simultaneously, which effectively improves the quality of the processing results.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart of a model scheduling method according to an exemplary embodiment of the present application;
FIG. 2 is a logical schematic diagram of a model scheduling scheme according to an exemplary embodiment of the present application;
FIG. 3 is a logic diagram of an application scenario according to an exemplary embodiment of the present application;
fig. 4 is a schematic structural diagram of a computing device according to another exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
At present, in multi-model scenarios, model accuracy and the input frame rate cannot both be guaranteed, so the quality of the processing results is poor. To this end, in some embodiments of the application, an initial model may be designated among a plurality of models, and the initial model processes the source data according to a preset processing frequency, which guarantees the processing frequency of the source data. Because the initial model does not consume the entire trigger interval, the available periods it generates can be monitored, and the trigger occasions of the remaining models can be determined within those available periods based on a preset scheduling policy, so that the remaining models are controlled to cooperate with the initial model in completing the processing of each piece of source data. The remaining models are thus no longer bound by the preset processing frequency; instead, the fragmented idle time left over while the initial model executes is fully used to cover the time each remaining model requires, so the plurality of models can retain the required model accuracy without trading accuracy for speed. Model accuracy and processing frequency are therefore guaranteed simultaneously, which effectively improves the quality of the processing results.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a flow chart of a model scheduling method according to an exemplary embodiment of the present application, and fig. 2 is a logic diagram of a model scheduling scheme according to an exemplary embodiment of the present application. The method may be performed by a model scheduling apparatus, which may be implemented as software, hardware, or a combination of the two, and which may be integrated in a computing device. Referring to fig. 1, the method may include:
step 100, triggering a designated initial model according to a preset processing frequency to process source data;
step 101, calculating a trigger interval corresponding to the initial model according to the preset processing frequency, so as to determine an available period within the trigger interval that the remaining models can occupy;
step 102, determining trigger occasions of the remaining models within the available periods generated by the initial model based on a preset scheduling policy, so as to control the remaining models to cooperate with the initial model in completing the processing of each piece of source data.
The model scheduling method provided by this embodiment can be applied to multi-model scenarios, that is, scenarios in which a plurality of models are adopted to meet application requirements. For example, in a passenger flow recognition scenario, models such as a human body detection model, a face detection model, a human body feature extraction model, and a facial feature extraction model may be used to meet the passenger flow recognition requirement. Of course, this is merely exemplary, and this embodiment is also applicable to other multi-model scenarios. A model in this embodiment may be a large model such as a neural network model or a machine learning model, or of course another type of model; the model type is not limited in this embodiment. In addition, the model scheduling scheme provided in this embodiment is mainly aimed at the case where the processor on which the models depend is exclusive, i.e., it can execute only one model at a time.
Referring to fig. 1 and 2, in step 100, the designated initial model may be triggered to process the source data according to the preset processing frequency. In this embodiment, an initial model may be designated among the plurality of models to be scheduled; generally, any model that takes the source data as an input parameter may be designated as the initial model. Of course, other factors such as time consumption and priority may also be considered when determining the initial model. The selection scheme of the initial model is not limited in this embodiment; in practical applications, a suitable model may be designated as the initial model as required.
The preset processing frequency refers to the data processing frequency required by the application; for example, in the image processing field it may be a frame rate, and in the simulation field it may be a simulation frequency. The preset processing frequency defines the acquisition frequency of the source data; for example, the frame rate defines the frequency at which image frames are extracted from the input image stream. The initial model strictly adheres to the preset processing frequency, which ensures that source data extracted at that frequency can be processed smoothly. It should be noted that, to ensure that the plurality of models can execute normally, in this embodiment the maximum time consumption of any single model among them meets the requirement of the preset processing frequency.
In this embodiment, in step 100, the source data may be extracted from the input data according to the preset processing frequency; for example, if the input data is an image stream, the source data may be image frames in the image stream. The extracted source data may then be input into the initial model at the preset processing frequency, so as to trigger the initial model to process the received source data. The frequency at which the initial model is triggered is therefore consistent with the preset processing frequency, and the time of each trigger is known. In step 101, the trigger interval corresponding to the initial model may be calculated accordingly, so as to determine the available period within the trigger interval that the remaining models can occupy: the interval between two adjacent triggers of the initial model can be calculated from the preset processing frequency and used as the trigger interval; and since the processing time of the initial model is also known, the available period can be determined from the trigger interval corresponding to the initial model and the initial model's processing time. For example, at a frame rate of 10 frames/s the trigger interval is 100 ms; if the initial model takes 20 ms and a given trigger interval spans 101 ms-200 ms, then after removing the initial model's 20 ms, the available period within that interval is 121 ms-200 ms, i.e., 80 ms in total.
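As a rough illustration of the arithmetic above, the following sketch (not part of the patent; names are illustrative assumptions) derives the trigger interval and the available period from the preset processing frequency and the initial model's processing time:

```python
# A minimal sketch, assuming time is measured in milliseconds and the
# preset processing frequency in Hz; function names are illustrative.
def trigger_interval_ms(frequency_hz: float) -> float:
    """Interval between two adjacent triggers of the initial model."""
    return 1000.0 / frequency_hz

def available_period_ms(frequency_hz: float, initial_cost_ms: float) -> float:
    """Idle time left for the remaining models within one trigger interval."""
    return trigger_interval_ms(frequency_hz) - initial_cost_ms

# Example from the description: at 10 frames/s, a 20 ms initial model
# leaves 100 - 20 = 80 ms per interval for the remaining models.
assert available_period_ms(10, 20) == 80.0
```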
Because the initial model is triggered at a fixed frequency, it will continuously generate available periods, during which the processor on which the plurality of models depend is idle. This embodiment proposes to make full use of this fragmented processor time by slotting the remaining models of the plurality of models into it.
It should be understood that, for each piece of source data, this embodiment controls the remaining models to cooperate with the initial model in completing the processing in full; that is, a complete model link (pipeline) is executed for each piece of source data, although the trigger occasions of the remaining models may be discrete or discontinuous across the model links of different source data.
For this reason, referring to fig. 1 and 2, in step 102, the trigger occasions of the remaining models may be determined within the available periods generated by the initial model based on a preset scheduling policy, so as to control the remaining models to cooperate with the initial model in completing the processing of each piece of source data. The scheduling policy may include, but is not limited to: preferentially triggering the model whose executed count lags behind; preferentially triggering the model with the higher priority; and/or preferentially triggering the model with the longer time consumption. These are only a few exemplary policy dimensions, and the embodiment is not limited thereto. In addition, priorities may be configured between different policy dimensions, and preconditions may be added within each dimension. For example, in the dimension "preferentially trigger the model whose executed count lags behind", the precondition "the selected model's time consumption must not exceed the remaining portion of the current available period" may be added, so that under this dimension a triggerable model is selected only from the models whose time consumption fits within the remaining period. The scheduling policy in this embodiment may be set according to actual requirements, and maintenance operations such as adding, modifying, and deleting may be performed as needed.
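To make these policy dimensions concrete, the following sketch (an illustration of this description, not the patent's implementation; the ModelInfo fields and pick_next logic are assumptions) selects the next model to trigger, applying the executed-count, priority, and time-consumption dimensions in that order while honoring the fit-in-remaining-period precondition:

```python
# A minimal sketch, assuming each model is described by a name, a numeric
# priority, a measured per-run cost and an executed count (all illustrative).
from dataclasses import dataclass

@dataclass
class ModelInfo:
    name: str
    priority: int       # higher value means higher priority (assumption)
    cost_ms: float      # measured per-run processing time
    executed: int = 0   # how many source data this model has processed

def pick_next(models: list[ModelInfo], remaining_ms: float) -> ModelInfo | None:
    """Pick the next model to trigger within the remaining available period.

    Dimensions in order: lagging executed count first, then higher priority,
    then longer cost; only models whose cost fits in the period are eligible.
    """
    eligible = [m for m in models if m.cost_ms <= remaining_ms]
    if not eligible:
        return None
    return min(eligible, key=lambda m: (m.executed, -m.priority, -m.cost_ms))
```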
In this way, the fragmented idle time of the processor on which the plurality of models depend can be fully utilized: the remaining models are slotted in while the initial model executes at the preset processing frequency, and the processing of each piece of source data is guaranteed to be completed in full.
Referring to fig. 1 and 2, considering that there may be dependencies among the plurality of models, that is, a later model may need the processing result of an earlier model as an input parameter, a database or the like may be used in this embodiment to hold the processing results of the models for later models to use as needed. The type of database is not limited; for example, a SQLite database or a MySQL database may be used. The stored processing results may be indexed by model identifier, although the embodiment is not limited thereto.
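A minimal persistence sketch along these lines, using SQLite and assuming results are stored as blobs keyed by model identifier and source-data identifier (the schema and names are illustrative assumptions, not from the patent):

```python
import sqlite3

# Illustrative schema: one row per (model, source datum) processing result.
conn = sqlite3.connect("model_results.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS results ("
    "  model_id TEXT, source_id INTEGER, payload BLOB,"
    "  PRIMARY KEY (model_id, source_id))"
)

def save_result(model_id: str, source_id: int, payload: bytes) -> None:
    conn.execute("INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
                 (model_id, source_id, payload))
    conn.commit()

def load_result(model_id: str, source_id: int) -> bytes | None:
    row = conn.execute(
        "SELECT payload FROM results WHERE model_id=? AND source_id=?",
        (model_id, source_id)).fetchone()
    return row[0] if row else None
```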
In this embodiment, associated auxiliary operations may also be configured for a model as needed. A special model associated with an auxiliary operation is executed together with that operation, and the operation's result is taken as part of the special model's processing result. For example, a matting operation may be associated with the human body detection model, so that the matting data obtained by the matting operation and the detection result of the human body detection model are stored together as the processing result corresponding to the human body detection model. The time consumed by an auxiliary operation is counted toward the time consumption of the associated model.
As mentioned above, in this embodiment the plurality of models are controlled to execute a complete model link for each piece of source data. To this end, the processing progress of each piece of source data may be monitored separately, and the model processing results corresponding to a target source datum may be output once all models have finished processing it. Optionally, processing progress information may be maintained for each piece of source data: whenever a model finishes processing a source datum, an identification bit representing that model is added to the datum's progress information, characterizing that the model has completed its processing. It can then be checked whether the progress information of a source datum contains the identification bits of all models; if so, the processing of that source datum is determined to be complete. Of course, this is merely exemplary, and other ways of monitoring the processing progress of the source data may be used; the embodiment is not limited thereto.
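The identification-bit bookkeeping could look like the following sketch (model identifiers and function names are illustrative assumptions):

```python
# A minimal sketch of per-source-datum progress tracking.
ALL_MODELS = {"human_det", "face_det", "human_feat", "face_feat", "face_attr"}
progress: dict[int, set[str]] = {}   # source_id -> models that finished it

def emit_results(source_id: int) -> None:
    print(f"source {source_id}: all model results ready")

def mark_done(source_id: int, model_id: str) -> None:
    progress.setdefault(source_id, set()).add(model_id)
    if progress[source_id] == ALL_MODELS:   # every identification bit present
        emit_results(source_id)
```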
Accordingly, in the present embodiment, an initial model may be designated among a plurality of models, and the initial model processes the source data according to the preset processing frequency, which guarantees the processing frequency of the source data. Because the initial model does not consume the entire trigger interval, the available periods it generates can be monitored, and the trigger occasions of the remaining models can be determined within those available periods based on a preset scheduling policy, so that the remaining models are controlled to cooperate with the initial model in completing the processing of each piece of source data. The remaining models are thus no longer bound by the preset processing frequency; instead, the fragmented idle time left over while the initial model executes is fully used to cover the time each remaining model requires, so the plurality of models can retain the required model accuracy without trading accuracy for speed. Model accuracy and processing frequency are therefore guaranteed simultaneously, which effectively improves the quality of the processing results.
In the above or following embodiments, various implementations may be employed to schedule the remaining models within the available periods generated by the initial model.
In an alternative implementation, a plurality of scheduling tasks with a precedence relationship may be created, where the plurality of models to be scheduled are distributed in groups under the scheduling tasks and the initial model is distributed under the initial scheduling task. On this basis, the scheduling tasks can be used to determine the trigger occasions of the remaining models within the available periods generated by the initial model, based on the preset scheduling policy.
In this implementation, the models can be distributed under the scheduling tasks according to dimensions such as ordering requirements, dependency relationships, and priority. The distribution scheme is not limited; in practical applications the models can be distributed as required, and the scheme can be adjusted at any time. It should be understood, however, that in this implementation, for simplicity of logic, there is by default a precedence relationship between scheduling tasks, so all models distributed under an earlier scheduling task are triggered before all models distributed under a later one; this underlying triggering order should be fully considered when distributing the models.
In this implementation, the model group under a single scheduling task may contain one or more models, and scheduling basis information, such as the priority, the required processing time, and the executed count, may be configured for the models in the group to give the scheduling task a basis for executing its scheduling policy. In this embodiment, each model by default processes the extracted source data in the order in which the source data were extracted, so the executed count characterizes the number of source data a model has processed. The policy dimension "preferentially trigger the model whose executed count lags behind" can thus be read as preferentially completing the processing of earlier-extracted source data, and the executed-count information supports this scheduling dimension. The explanations of how the other scheduling basis information supports the other scheduling dimensions are analogous and are not repeated here.
In addition, in this implementation, different scheduling tasks may be associated with different scheduling policies; that is, the scheduling tasks may use a uniform scheduling policy or different ones, so that a more suitable policy can be configured for the model group under each scheduling task according to its scheduling requirements, yielding a better scheduling effect.
Based on this, when using the plurality of scheduling tasks to determine the trigger occasions of the remaining models, it is possible to:
after the model group under the initial scheduling task finishes executing, calculate the remaining portion of the available period;
send a trigger signal to the scheduling task following the initial scheduling task, to trigger it to determine, based on its scheduling policy, the trigger occasions of the models in the model group distributed under it;
and, if the remaining period is not exhausted after the model group under that scheduling task finishes executing, continue to send the trigger signal to the subsequent scheduling task, until the remaining period is exhausted.
That is, within a single available period, after the model group under an earlier scheduling task finishes executing, a trigger signal is sent to the next scheduling task; if the remaining portion of the available period can still support the execution of the model group under the next scheduling task, the trigger signal is passed on again, until the model group under some later scheduling task cannot finish executing because the available period is exhausted, at which point the trigger signal is no longer passed on. After the available period is exhausted, the processor on which the plurality of models depend triggers the initial model again by default; a new available period is generated once the initial model finishes executing, and within it each scheduling task again determines the execution occasions of the remaining models according to the above scheduling logic.
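Putting the pieces together, the following sketch (illustrative, reusing ModelInfo and pick_next from the earlier sketch) walks the scheduling-task chain within one available period, passing the "trigger signal" to the next task only when the current model group has finished:

```python
def run_available_period(task_chain: list[list[ModelInfo]],
                         remaining_ms: float) -> float:
    """Consume one available period across tasks in precedence order.

    Each task's policy picks models until its group is done or the
    period is exhausted; only a finished group signals the next task.
    """
    for group in task_chain:              # earlier scheduling tasks first
        pending = list(group)
        while pending:
            model = pick_next(pending, remaining_ms)
            if model is None:             # period exhausted: stop signalling
                return remaining_ms
            remaining_ms -= model.cost_ms
            model.executed += 1
            pending.remove(model)
        # group finished: the "trigger signal" goes to the next task
    return remaining_ms
```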
Among the scheduling tasks, the initial scheduling task is somewhat special. In this implementation, if the model group under the initial scheduling task contains only the initial model, the entire available period is taken as the remaining period; if the model group contains a plurality of models, the remaining period is the available period minus the time consumed by the models other than the initial model.
In this implementation, the scheduling tasks with a precedence relationship consume the available period in turn, and within a single scheduling task the trigger occasions of the models in its model group are determined according to the scheduling policy. Globally, therefore, the trigger occasions of the remaining models are determined within the available periods generated by the initial model, and the remaining models are guaranteed to cooperate with the initial model in completing the processing of each data source.
In addition, to avoid the problem that the model group under a later scheduling task cannot obtain enough execution time because earlier scheduling tasks frequently exhaust the available period, a relief strategy may be introduced in this implementation. For example, when the executed counts of the models under a scheduling task fall below a specified standard, the position of that scheduling task in the chain may be moved forward so that it obtains more execution time, which allows the plurality of models to be scheduled more flexibly.
It should be appreciated that, besides the implementation described above, other implementations may be employed in this embodiment to schedule the remaining models within the available periods generated by the initial model; for example, the remaining models may be treated as a whole and scheduled globally according to the scheduling policy. The embodiment is not limited in this regard.
In this embodiment, by creating a plurality of scheduling tasks with a precedence relationship, the models can be scheduled in groups, which makes the scheduling logic finer-grained, effectively improves the flexibility of model scheduling, and reduces the logical complexity of model scheduling.
Fig. 3 is a logic schematic diagram of an application scenario according to an exemplary embodiment of the present application. In fig. 3, taking a passenger flow recognition scenario as an example, an image stream may be acquired by a camera, and a human body detection model, a face detection model, a human body feature extraction model, a facial feature extraction model, and a face attribute detection model are used to meet the passenger flow recognition requirement. Referring to fig. 3, two scheduling tasks may be created: the human body detection model is distributed under the initial scheduling task, and the remaining four models are distributed under the second scheduling task.
On this basis, the technical solution in fig. 3 can be divided into four parts: data preprocessing, the initial scheduling task, the second scheduling task, and sending processing.
In the data preprocessing part, the image stream acquired by the camera is unpacked and decoded, and whether the current image frame needs to be extracted is judged according to the preset frame rate. If so, the extracted image frame data is input into the initial scheduling task; if not, the current image frame is released (i.e., ignored).
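A frame-gating step along these lines could be sketched as follows (an illustrative assumption; the patent does not prescribe an implementation):

```python
# A minimal sketch, assuming frames arrive with millisecond timestamps and
# that the gate keeps the earliest frame due under the preset frame rate.
def frame_filter(timestamps_ms, frame_rate_hz: float):
    interval_ms = 1000.0 / frame_rate_hz
    next_due = 0.0
    for ts in timestamps_ms:
        if ts >= next_due:        # frame is due: extract it
            next_due = ts + interval_ms
            yield ts
        # otherwise the frame is released (ignored)

# Gating a 30 fps stream down to 10 fps keeps roughly every third frame:
kept = list(frame_filter([i * 1000 / 30 for i in range(9)], 10))
# kept == [0.0, 100.0, 200.0]
```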
In the initial scheduling task, frame rate control is performed: the extracted image frames are received at the preset frame rate and the human body detection model is triggered. If the human body detection model detects a pedestrian, the matting operation associated with the human body detection model is performed, and the detection result of the human body detection model and the matting result are inserted into the database; if no pedestrian is detected, the task continues to wait for the next image frame. After the model under the initial scheduling task finishes executing, a trigger signal is sent to the second scheduling task. It should be noted that, even if the human body detection model detects no pedestrian, the initial scheduling task may still consider the model under it to have been executed.
In the second scheduling task, the processing results produced under the initial scheduling task can be fetched in advance from the database for later use. The second scheduling task also calculates the remaining portion of the available period. When the trigger signal is received, the trigger occasions of the four models under it are determined based on the preset scheduling policy. In the exemplary scheduling policy shown in fig. 3, the second scheduling task first checks, according to the executed counts of the four models, whether any model has not yet processed the most recently stored matting data (the processing result produced under the initial scheduling task); if so, that model is preferentially triggered to process the matting data; if not, reasonable trigger occasions are determined for the four models according to the remaining period, for example, by preferentially triggering the model with the higher priority or the model with the longer time consumption.
For example, suppose the priorities of the four models are: face detection model > human body feature extraction model = facial feature extraction model = face attribute detection model, and that their time consumptions are: face detection model (60 ms) > human body feature extraction model (50 ms) > facial feature extraction model (20 ms) > face attribute detection model (10 ms). With a remaining time of 80 ms, the face detection model is triggered preferentially, then the facial feature extraction model, after which the remaining time is exhausted. In that case, the next time the second scheduling task is started, the human body feature extraction model is triggered preferentially, followed by the face attribute detection model. If instead the remaining time is 50 ms, the human body feature extraction model is triggered preferentially, and the remaining time is exhausted.
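Running the earlier run_available_period sketch over this example reproduces the trace (the numeric priorities below are assumptions consistent with the ordering above, not values from the patent):

```python
models = [
    ModelInfo("face_det",   priority=2, cost_ms=60),
    ModelInfo("human_feat", priority=1, cost_ms=50),
    ModelInfo("face_feat",  priority=1, cost_ms=20),
    ModelInfo("face_attr",  priority=1, cost_ms=10),
]
left = run_available_period([models], 80.0)
# First period: face_det (60 ms), then face_feat (20 ms); left == 0.0.
# In the next period, human_feat and face_attr have the lagging executed
# counts, so they are picked first, matching the description.
```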
In the sending processing part, the processing progress of each image frame is queried. When an image frame has been completely processed by all five models, a sending thread is started: the sending thread JPEG-encodes the processing results queried from the database and stores the encoded pictures in OSS; OSS returns the storage addresses assigned to the pictures to the sending thread; and the sending thread packages the storage addresses together with the processing results obtained from the database into an HTTP message and sends it to the cloud for further processing.
It can be seen that, in this exemplary passenger flow recognition scenario, the five models time-share the NPU resources, so the NPU resources are fully utilized; moreover, a high frame rate and high accuracy are maintained for each model at the same time, so the passenger flow recognition quality is high.
It should be noted that, the execution subjects of each step of the method provided in the above embodiment may be the same device, or the method may also be executed by different devices. In addition, in some of the flows described in the above embodiments and the drawings, a plurality of operations appearing in a specific order are included, but it should be clearly understood that the operations may be performed out of the order in which they appear herein or performed in parallel, the sequence numbers of the operations such as 101, 102, etc. are merely used to distinguish between the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel.
Fig. 4 is a schematic structural diagram of a computing device according to another exemplary embodiment of the present application. As shown in fig. 4, the computing device includes: memory 40, processor 41 and communication component 42.
The processor 41 is coupled to the memory 40 and configured to execute the computer program in the memory 40 to:
triggering a designated initial model according to a preset processing frequency to process the source data;
calculating a trigger interval corresponding to the initial model according to the preset processing frequency, so as to determine an available period within the trigger interval that the remaining models can occupy;
determining trigger occasions of the remaining models within the available periods generated by the initial model based on a preset scheduling policy, so as to control the remaining models to cooperate with the initial model in completing the processing of each piece of source data;
wherein the processing result of each model is saved for later models to use as needed.
In an alternative embodiment, processor 41 may be further configured to:
creating a plurality of scheduling tasks with a precedence relationship, wherein the plurality of models to be scheduled are distributed in groups under the scheduling tasks, and the initial model is distributed under the initial scheduling task;
wherein the determining of the trigger occasions of the remaining models within the available periods generated by the initial model based on the preset scheduling policy includes:
determining, by using the plurality of scheduling tasks, the trigger occasions of the remaining models within the available periods generated by the initial model based on the preset scheduling policy.
In an alternative embodiment, when using the plurality of scheduling tasks to determine the trigger occasions of the remaining models within the available periods generated by the initial model based on the preset scheduling policy, the processor 41 is configured to:
after the model group under the initial scheduling task finishes executing, calculate the remaining portion of the available period;
send a trigger signal to the scheduling task following the initial scheduling task, to trigger it to determine, based on the scheduling policy, the trigger occasions of the models in the model group distributed under it;
and, if the remaining period is not exhausted after the model group under that scheduling task finishes executing, continue to send the trigger signal to the subsequent scheduling task until the remaining period is exhausted.
In an alternative embodiment, processor 41 may be further configured to:
if the model group under the initial scheduling task contains only the initial model, take the available period as the remaining period;
and, if the model group under the initial scheduling task contains a plurality of models, take the available period minus the time consumed by the models other than the initial model as the remaining period.
In an alternative embodiment, the scheduling policy includes:
preferentially triggering the model whose executed count lags behind;
preferentially triggering the model with the higher priority; and/or
preferentially triggering the model with the longer time consumption.
In an alternative embodiment, when triggering the designated initial model to process the source data according to the preset processing frequency, the processor 41 is configured to:
extract the source data from input data according to the preset processing frequency;
and input the extracted source data into the initial model according to the preset processing frequency, so as to trigger the initial model to process the received source data.
In an alternative embodiment, when calculating the trigger interval corresponding to the initial model according to the preset processing frequency, so as to determine the available period within the trigger interval that the remaining models can occupy, the processor 41 is configured to:
calculate, according to the preset processing frequency, the interval between two adjacent triggers of the initial model as the trigger interval;
and determine, according to the trigger interval corresponding to the initial model and the processing time consumed by the initial model, the available period within the trigger interval that the remaining models can occupy.
In an alternative embodiment, processor 41 may be further configured to:
separately monitor the processing progress of each piece of source data received by the initial model;
and output the model processing results corresponding to a target source datum after all models have finished processing it.
In an alternative embodiment, the source data is an image frame in an image stream, and the preset processing frequency is a frame rate.
In an alternative embodiment, the plurality of models includes a plurality of models among: a human body detection model, a face detection model, a human body feature extraction model, and a facial feature extraction model.
In an alternative embodiment, processor 41 may be further configured to:
for a special model associated with an auxiliary operation, execute the special model and the corresponding auxiliary operation in association;
and take the operation result of the auxiliary operation as part of the processing result of the special model.
In an alternative embodiment, the maximum time consumption of a single model of the plurality of models meets the requirement of the preset processing frequency.
Further, as shown in fig. 4, the computing device further includes a power supply assembly 43 and other components. Only some components are schematically shown in fig. 4, which does not mean that the computing device includes only the components shown in fig. 4.
It should be noted that, for the technical details of the computing device embodiments, reference may be made to the related descriptions of the method embodiments; they are not repeated here for the sake of brevity, but this should not cause any loss of the protection scope of the present application.
Accordingly, embodiments of the present application also provide a computer-readable storage medium storing a computer program that, when executed, is capable of implementing the steps of the method embodiments described above that are executable by a computing device.
The memory of FIG. 4 described above is used to store a computer program and may be configured to store various other data to support operations on a computing platform. Examples of such data include instructions for any application or method operating on a computing platform, contact data, phonebook data, messages, pictures, videos, and the like. The memory may be implemented by any type of volatile or nonvolatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The communication assembly of fig. 4 is configured to facilitate wired or wireless communication between the device in which it is located and other devices. That device can access a wireless network based on a communication standard, such as WiFi, a mobile communication network such as 2G, 3G, 4G/LTE or 5G, or a combination thereof. In one exemplary embodiment, the communication assembly receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication assembly further comprises a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The power supply assembly shown in fig. 4 provides power for various components of the device in which the power supply assembly is located. The power components may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the devices in which the power components are located.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A model scheduling method, comprising:
triggering a designated initial model according to a preset processing frequency to process the source data;
calculating a trigger interval corresponding to the initial model according to the preset processing frequency, so as to determine an available period within the trigger interval that the remaining models can occupy;
and determining trigger occasions of the remaining models within the available periods generated by the initial model based on a preset scheduling policy, so as to control the remaining models to cooperate with the initial model in completing the processing of each piece of source data.
2. The method of claim 1, further comprising:
creating a plurality of scheduling tasks with a precedence relationship, wherein the plurality of models to be scheduled are distributed in groups under the plurality of scheduling tasks, and the initial model is distributed under the initial scheduling task;
wherein the determining of the trigger occasions of the remaining models within the available periods generated by the initial model based on the preset scheduling policy comprises:
determining, by using the plurality of scheduling tasks, the trigger occasions of the remaining models within the available periods generated by the initial model based on the preset scheduling policy.
3. The method of claim 2, wherein the determining, by using the plurality of scheduling tasks, of the trigger occasions of the remaining models within the available periods generated by the initial model based on the preset scheduling policy comprises:
after the model group under the initial scheduling task finishes executing, calculating the remaining portion of the available period;
sending a trigger signal to the scheduling task following the initial scheduling task, to trigger that scheduling task to determine, based on the scheduling policy, the trigger occasions of the models in the model group distributed under it;
and, if the remaining period is not exhausted after the model group under that scheduling task finishes executing, continuing to send the trigger signal to the subsequent scheduling task until the remaining period is exhausted.
4. A method according to claim 3, further comprising:
if the model group under the initial scheduling task contains only the initial model, taking the available period as the remaining period;
and, if the model group under the initial scheduling task contains a plurality of models, taking the available period minus the time consumed by the models other than the initial model as the remaining period.
5. The method of claim 1, the scheduling policy comprising:
preferentially triggering the model whose executed count lags behind;
preferentially triggering the model with the higher priority; and/or
preferentially triggering the model with the longer time consumption.
6. The method of claim 1, wherein the triggering the specified start model to process the source data according to the preset processing frequency comprises:
extracting the source data from input data according to the preset processing frequency;
and inputting the extracted source data into the initial model according to the preset processing frequency, so as to trigger the initial model to process the received source data.
7. The method of claim 1, wherein the calculating a trigger interval time corresponding to the initial model according to the preset processing frequency so as to determine the available period that the remaining models can occupy within the trigger interval time comprises:
calculating, according to the preset processing frequency, the interval between two adjacent triggerings of the initial model, and taking that interval as the trigger interval time;
and determining the available period that the remaining models can occupy within the trigger interval time according to the trigger interval time corresponding to the initial model and the processing time consumed by the initial model.
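As an illustrative calculation: at a frame rate of 25 frames per second, the initial model is triggered every 1/25 s = 40 ms; if the initial model itself consumes 15 ms per frame, the available period left for the remaining models within each trigger interval is 40 ms - 15 ms = 25 ms.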
8. The method of claim 1, further comprising:
separately monitoring the processing progress of each piece of source data received by the initial model;
and outputting the model processing result corresponding to the target source data after all model processing work on it is completed.
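A minimal sketch of such per-source progress monitoring, with hypothetical names; the stored per-model results also illustrate claim 13:

    class ProgressTracker:
        # Tracks, for one piece of source data, which models have finished;
        # finished results are kept so later models can reuse them as needed.
        def __init__(self, model_names):
            self.pending = set(model_names)
            self.results = {}

        def mark_done(self, name, result):
            self.results[name] = result
            self.pending.discard(name)
            # Once every model has reported, the combined result can be output.
            return self.results if not self.pending else None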
9. The method of claim 1, wherein the source data are image frames in an image stream, and the preset processing frequency is a frame rate.
10. The method of claim 9, wherein the plurality of models comprise a plurality of models among: human body detection models, face detection models, human body feature extraction models, and facial feature extraction models.
11. The method of claim 1, further comprising:
for a special model associated with an auxiliary operation, executing the special model and the corresponding auxiliary operation;
and taking the result of the auxiliary operation as part of the processing result of the special model.
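An illustrative sketch of this pairing, in which both callables are hypothetical:

    def run_special_model(special_model_fn, auxiliary_fn, source):
        # special_model_fn is assumed to return the model's own processing
        # result as a dict; auxiliary_fn is the associated auxiliary operation.
        result = special_model_fn(source)
        result["auxiliary"] = auxiliary_fn(source)   # folded into the result
        return result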
12. The method of claim 1, wherein the maximum time consumed by any single model among the plurality of models meets the requirement imposed by the preset processing frequency.
13. The method of claim 1, further comprising:
storing the processing result of each model so that subsequent models can use it as needed.
14. A computing device comprising a memory, a processor, and a communication component;
the memory being configured to store one or more computer instructions;
and the processor, coupled to the memory and the communication component, being configured to execute the one or more computer instructions to:
trigger a designated initial model to process source data according to a preset processing frequency;
calculate a trigger interval time corresponding to the initial model according to the preset processing frequency, so as to determine an available period that the remaining models can occupy within the trigger interval time;
and determine a trigger timing for each remaining model within the available period generated by the initial model, based on a preset scheduling policy, so as to control the remaining models to cooperate with the initial model in completing the processing of each piece of source data.
15. A computer-readable storage medium storing computer instructions that, when executed by one or more processors, cause the one or more processors to perform the model scheduling method of any one of claims 1-13.
CN202210551926.4A 2022-05-18 2022-05-18 Model scheduling method, device and storage medium Pending CN117132871A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210551926.4A CN117132871A (en) 2022-05-18 2022-05-18 Model scheduling method, device and storage medium
PCT/CN2023/094370 WO2023221949A1 (en) 2022-05-18 2023-05-15 Model scheduling method and device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210551926.4A CN117132871A (en) 2022-05-18 2022-05-18 Model scheduling method, device and storage medium

Publications (1)

Publication Number Publication Date
CN117132871A (en) 2023-11-28

Family

ID=88834641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210551926.4A Pending CN117132871A (en) 2022-05-18 2022-05-18 Model scheduling method, device and storage medium

Country Status (2)

Country Link
CN (1) CN117132871A (en)
WO (1) WO2023221949A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126323B (en) * 2016-06-17 2019-11-22 广州商品清算中心股份有限公司 Real-time task scheduling method based on cloud platform
CN106773711B * 2017-01-13 2019-09-17 清华大学 Hybrid task scheduling method and model for a railway locomotive operation control system
CN110706263B (en) * 2019-09-30 2023-06-06 武汉工程大学 Image processing method, device, equipment and computer readable storage medium
CN113935472A (en) * 2021-11-04 2022-01-14 科大讯飞股份有限公司 Model scheduling processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
WO2023221949A1 (en) 2023-11-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination