CN113505759A

CN113505759A - Multitasking method, multitasking device and storage medium

Info

Publication number: CN113505759A
Application number: CN202111048132.8A
Authority: CN
Inventors: 李江昀; 皇甫玉彬; 刘博伟; 魏冬
Original assignee: University of Science and Technology Beijing USTB
Current assignee: University of Science and Technology Beijing USTB
Priority date: 2021-09-08
Filing date: 2021-09-08
Publication date: 2021-10-15
Anticipated expiration: 2041-09-08
Also published as: CN113505759B

Abstract

The present disclosure relates to a method, an apparatus, and a storage medium for multitasking, the method including: acquiring a historical monitoring image, making the historical monitoring image into a data set, and labeling the data set; connecting a plurality of target head networks through a base network to form a multi-task model; training the multitask model by using the data set subjected to the labeling processing; and detecting the acquired real-time monitoring image through the trained multi-task model so as to realize multi-task processing. By adopting the technical means, the problems that in the prior art, a single neural network model cannot meet the service requirement of a complex scene, cannot process a plurality of tasks at the same time and the like are solved.

Description

Multitasking method, multitasking device and storage medium

Technical Field

The present disclosure relates to the field of artificial intelligence, and in particular, to a multitasking method, apparatus, and storage medium.

Background

With the development of science and technology, artificial intelligence is widely applied in various industries. Artificial intelligence relies on neural network models, and an existing neural network model typically targets a single scene or a single task; a target detection model, for example, can only be used to detect target objects. However, applications of neural network models often involve complex scenes or multiple tasks. For multiple tasks in a complex scene, the prior art needs multiple network models, each completing the task of its own scene. Real-time performance therefore cannot be guaranteed, each image acquired in real time has to pass through every model separately, the demand on computing resources is high, and the approach is impractical to apply and deploy on site.

In the course of implementing the disclosed concept, the inventors found that there are at least the following technical problems in the related art: the single neural network model cannot meet the service requirement of a complex scene, cannot simultaneously process a plurality of tasks and the like.

Disclosure of Invention

In order to solve the above technical problem or at least partially solve the above technical problem, embodiments of the present disclosure provide a method, an apparatus, and a storage medium for multitasking, so as to solve at least the problems in the prior art that safety monitoring in the coal industry needs to rely on hardware sensors and the alarm content is single.

The purpose of the present disclosure is realized by the following technical scheme:

in a first aspect, an embodiment of the present disclosure provides a method for multitasking, including: acquiring a historical monitoring image, making the historical monitoring image into a data set, and labeling the data set; connecting a plurality of target head networks through a base network to form a multi-task model; training the multitask model by using the data set subjected to the labeling processing; and detecting the acquired real-time monitoring image through the trained multi-task model so as to realize multi-task processing.

In one exemplary embodiment, the base network is the part of a lightweight neural network that remains after its fully connected layer and output layer are removed; each target head network includes multiple convolutional layers and an output layer, the output layer comprising a fully connected layer and a softmax (normalized exponential) layer.

In one exemplary embodiment, the plurality of target head networks includes: a segmentation head network, a detection head network and a classification head network; the base network connected with the segmentation head network is used for processing a track adjusting task of a conveyor belt; the base network connected with the detection head network is used for processing a detection task of abnormal events; and the base network connected with the classification head network is used for processing a control task of the conveyor belt power.

In an exemplary embodiment, training the multitask model using the labeled data set comprises: treating each target head network connected with the base network as one model to be trained, and training the plurality of models to be trained separately; wherein, over the whole course of training the plurality of models to be trained: in the first training, the current model to be trained is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the current target head network are updated according to the training result; in each training other than the first, the current model to be trained is trained with the base network frozen, so that only the parameters of the current target head network are updated according to the training result; and after the plurality of models to be trained have each been trained, the multitask model is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the plurality of target head networks are updated according to the training result.

In an exemplary embodiment, labeling the data set comprises: constructing a segmentation label by marking edge points of a conveyor belt in the historical monitoring image; constructing a detection label by marking a first target object in the historical monitoring image, wherein the first target object is related to an abnormal event; and constructing a classification label by marking the historical monitoring image according to whether a second target object exists on the conveyor belt.

In an exemplary embodiment, updating the parameters according to the training result includes: updating the parameters according to the calculation result of the loss function; and/or updating the parameters according to a reversed value of the gradient of the parameters, including: calculating the gradient of each parameter to obtain a list of two-tuples, each pairing a gradient with its parameter; mapping each parameter to a gradient transformation factor for its gradient according to the list of two-tuples; and obtaining the reversed value of the gradient according to the mapping, and updating the parameter according to the reversed value.

In an exemplary embodiment, detecting the acquired real-time monitoring image through the trained multitask model to implement multitasking includes: starting a track adjusting task of the conveyor belt when the multitask model detects that the conveyor belt in the real-time monitoring image deviates from a preset track; and/or starting a control task of the conveyor belt power according to whether the multitask model detects a second target object on the conveyor belt; and/or starting a detection task of abnormal events, and issuing an alarm when the multitask model detects the first target object in the real-time monitoring image.

In a second aspect, an embodiment of the present disclosure provides a method for multitasking, including: acquiring a task instruction to be processed, and determining a task head network from a plurality of target head networks according to the task instruction to be processed; connecting the task head networks through a base network to form a task model to be processed, wherein a plurality of task models formed by connecting the base network with the plurality of target head networks are trained, learn and store corresponding relations between input images and output detection results of the task models, and the plurality of task models comprise the task model to be processed; and detecting the acquired real-time monitoring image through the to-be-processed task model so as to complete the to-be-processed task.

In a third aspect, an embodiment of the present disclosure provides an apparatus for multitasking, including: a labeling module for acquiring a historical monitoring image, making the historical monitoring image into a data set and labeling the data set; a model module for connecting a plurality of target head networks through a base network to form a multitask model; a training module for training the multitask model using the labeled data set; and a detection module for detecting the acquired real-time monitoring image through the trained multitask model so as to realize multitask processing.

In a fourth aspect, embodiments of the present disclosure provide an electronic device. The electronic equipment comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus; a memory for storing a computer program; the processor is configured to implement the multitasking method or the image processing method as described above when executing the program stored in the memory.

In a fifth aspect, embodiments of the present disclosure provide a computer-readable storage medium. The above-mentioned computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the method of multitasking or the method of image processing as described above.

Compared with the prior art, the technical scheme provided by the embodiment of the disclosure at least has part or all of the following advantages: acquiring a historical monitoring image, making the historical monitoring image into a data set, and labeling the data set; connecting a plurality of target head networks through a base network to form a multi-task model; training the multitask model by using the data set subjected to the labeling processing; and detecting the acquired real-time monitoring image through the trained multi-task model so as to realize multi-task processing. Because the embodiment of the disclosure can connect a plurality of target head networks through the base network to form the multitask model, and the obtained real-time monitoring image is detected through the trained multitask model to realize multitask processing, by adopting the technical means, the problems that a single neural network model cannot meet the service requirement of a complex scene, cannot process a plurality of tasks at the same time and the like in the prior art can be solved, and then the efficiency of processing the tasks in artificial intelligence is improved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the related art are briefly described below. Those skilled in the art can obtain other drawings from these drawings without inventive effort.

FIG. 1 schematically illustrates a hardware block diagram of a computer terminal for a method of multitasking according to an embodiment of the present disclosure;

FIG. 2 schematically illustrates a flow chart of a method of multitasking according to an embodiment of the present disclosure;

FIG. 3 schematically illustrates training a segmentation head network according to an embodiment of the present disclosure;

FIG. 4 schematically illustrates training a detection head network according to an embodiment of the present disclosure;

FIG. 5 schematically illustrates training a classification head network according to an embodiment of the present disclosure;

FIG. 6 schematically illustrates training the overall multitask model according to an embodiment of the present disclosure;

FIG. 7 schematically illustrates the multitask model detecting real-time monitoring images according to an embodiment of the present disclosure;

FIG. 8 schematically illustrates a block diagram of an apparatus for multitasking according to an embodiment of the present disclosure;

FIG. 9 schematically illustrates a block diagram of an electronic device provided in an embodiment of the present disclosure.

Detailed Description

The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments. It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

The method embodiments provided by the embodiments of the present disclosure may be executed in a computer terminal or a similar computing device. Taking execution on a computer terminal as an example, FIG. 1 schematically shows a hardware block diagram of a computer terminal for a method of multitasking according to an embodiment of the present disclosure. As shown in FIG. 1, the computer terminal may include one or more processors 102 (only one is shown in FIG. 1) and a memory 104 for storing data, where the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MPU) or a programmable logic device (PLD). Optionally, the computer terminal may further include a transmission device 106 for communication functions and an input/output device 108. Those skilled in the art will understand that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the computer terminal. For example, the computer terminal may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1 with equivalent or additional functions.

The memory 104 can be used for storing computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the method of multitasking in the embodiment of the present disclosure, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to a computer terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.

In an embodiment of the present disclosure, a method of multitasking is provided, and fig. 2 schematically illustrates a flowchart of a method of multitasking in an embodiment of the present disclosure, where as shown in fig. 2, the flowchart includes the following steps:

step S202, acquiring a historical monitoring image, making the historical monitoring image into a data set, and labeling the data set;

step S204, connecting a plurality of target head networks through a base network to form a multitask model;

step S206, training the multitask model by using the data set subjected to the labeling processing;

step S208, detecting the acquired real-time monitoring image through the trained multitask model so as to realize multitask processing.

The embodiment of the disclosure can be applied in the coal mine industry: historical monitoring images from a coal mine are acquired, the multitask model is trained on the historical monitoring images, and the trained multitask model then detects real-time monitoring images from the coal mine, thereby realizing multitask processing for the coal mine industry.

Through the method, a historical monitoring image is obtained, the historical monitoring image is made into a data set, and the data set is subjected to labeling processing; connecting a plurality of target head networks through a base network to form a multi-task model; training the multitask model by using the data set subjected to the labeling processing; and detecting the acquired real-time monitoring image through the trained multi-task model so as to realize multi-task processing. Because the embodiment of the disclosure can connect a plurality of target head networks through the base network to form the multitask model, and the obtained real-time monitoring image is detected through the trained multitask model to realize multitask processing, by adopting the technical means, the problems that a single neural network model cannot meet the service requirement of a complex scene, cannot process a plurality of tasks at the same time and the like in the prior art can be solved, and then the efficiency of processing the tasks in artificial intelligence is improved.

The base network is the part of a lightweight neural network that remains after its fully connected layer and output layer are removed; each target head network includes multiple convolutional layers and an output layer, the output layer comprising a fully connected layer and a softmax (normalized exponential) layer.

The lightweight neural network may be MobileNetV3. The lightweight neural network comprises multiple stages, and the base network is the part that remains after the fully connected layer and the output layer are removed, that is, the lightweight neural network with its last stage removed. It should be noted that the output layer may further include a convolutional layer.
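The arrangement of one shared base network feeding several head networks can be sketched in plain Python. This is a minimal illustration, not the disclosed implementation: the class `MultiTaskModel`, the stand-in functions `toy_base` and `toy_heads`, and the head names are all hypothetical placeholders for trained networks. The point it shows is that the base is evaluated once per image and every head reuses the same features.

```python
from typing import Callable, Dict, List

Features = List[float]


class MultiTaskModel:
    """One shared base network feeding several task-specific head networks."""

    def __init__(self, base: Callable[[List[float]], Features],
                 heads: Dict[str, Callable[[Features], object]]) -> None:
        self.base = base
        self.heads = heads
        self.base_calls = 0  # counts how often the shared base is evaluated

    def forward(self, image: List[float]) -> Dict[str, object]:
        features = self.base(image)  # the base runs once per image
        self.base_calls += 1
        # every head reuses the same shared features
        return {name: head(features) for name, head in self.heads.items()}


# Hypothetical stand-ins for trained networks (assumptions, not real models).
def toy_base(image: List[float]) -> Features:
    return [sum(image) / len(image), max(image)]


toy_heads = {
    "segmentation": lambda f: [1 if f[1] > 0.5 else 0],
    "detection": lambda f: ["open_fire"] if f[0] > 0.8 else [],
    "classification": lambda f: "coal_present" if f[0] > 0.2 else "empty",
}

model = MultiTaskModel(toy_base, toy_heads)
outputs = model.forward([0.1, 0.9, 0.5])
```

With stacked single-task models, the feature extraction would run once per task; here it runs once regardless of the number of heads.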

The plurality of target head networks includes: a segmentation head network, a detection head network and a classification head network; the base network connected with the segmentation head network is used for processing a track adjusting task of a conveyor belt; the base network connected with the detection head network is used for processing a detection task of abnormal events; and the base network connected with the classification head network is used for processing a control task of the conveyor belt power.

The segmentation head network is the head network of a segmentation network, the detection head network is the head network of a detection network, and the classification head network is the head network of a classification network. In the embodiment of the disclosure, multitasking requires only one base network and a plurality of target head networks. The segmentation head network can detect the positions of the edge points of the conveyor belt, and comparing those positions with a preset track makes it possible to adjust the track of the conveyor belt; the base network connected with the segmentation head network can therefore process the track adjusting task of the conveyor belt. The preset track is the track of the conveyor belt under normal conditions. The base network connected with the detection head network can detect abnormal events. In the disclosed embodiment, the classification head network is trained to detect whether a second target object exists on the conveyor belt. In the coal mine industry, for example, the second target object may be a person, coal, or the like. If a person is detected on the conveyor belt, the conveyor belt should be switched off for personnel safety. If no coal is detected on a coal conveyor belt and the belt has been without coal for longer than a preset duration, the conveyor belt should be switched off to save energy. Conversely, if coal is detected on the coal conveyor belt and the volume or weight of the coal is greater than a preset value, the switched-off conveyor belt is switched on again for normal operation of the equipment. Switching the conveyor belt on and off can be understood as controlling the power of the conveyor belt, so the base network connected with the classification head network is used for processing the control task of the conveyor belt power.
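The switching rules described above can be written as a small decision function. The sketch below is illustrative only: the names `BeltObservation` and `decide_belt_power`, the field layout, and the default thresholds `max_empty_seconds` and `min_coal_amount` are assumptions, since the disclosure does not give concrete values or interfaces.

```python
from dataclasses import dataclass


@dataclass
class BeltObservation:
    person_on_belt: bool         # classification result: a person is on the belt
    coal_amount: float           # estimated volume or weight of coal, 0.0 if none
    seconds_without_coal: float  # how long the belt has carried no coal


def decide_belt_power(obs: BeltObservation, running: bool,
                      max_empty_seconds: float = 60.0,
                      min_coal_amount: float = 1.0) -> bool:
    """Return True if the belt should run, False if it should be switched off."""
    if obs.person_on_belt:
        return False  # personnel safety takes priority
    if obs.coal_amount == 0.0 and obs.seconds_without_coal > max_empty_seconds:
        return False  # belt has been empty too long: switch off to save energy
    if obs.coal_amount > min_coal_amount:
        return True   # enough coal present: (re)start for normal operation
    return running    # otherwise keep the current state
```

Checking the person rule first reflects the priority of personnel safety over energy saving; that ordering is a design choice of this sketch.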

In step S206, training the multitask model using the labeled data set includes: treating each target head network connected with the base network as one model to be trained, and training the plurality of models to be trained separately; wherein, over the whole course of training the plurality of models to be trained: in the first training, the current model to be trained is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the current target head network are updated according to the training result; in each training other than the first, the current model to be trained is trained with the base network frozen, so that only the parameters of the current target head network are updated according to the training result; and after the plurality of models to be trained have each been trained, the multitask model is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the plurality of target head networks are updated according to the training result.

A multitask model to be trained comprises a plurality of models to be trained, which are trained in a certain order. The training result may be information such as the calculation result of a loss function. In the embodiment of the present disclosure, one multitask model is regarded as a plurality of models to be trained, and the repeated training of the multitask model is regarded as training the plurality of models to be trained one by one. Over the whole course of training the plurality of models to be trained, the current model to be trained is trained with the parameters of the base network not frozen in the first training, and with the parameters of the base network frozen in training other than the first. After the plurality of models to be trained have each been trained, the multitask model is trained with the parameters of the base network not frozen; this step amounts to training the whole network and serves to fine-tune the parameters of the multitask model.
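One reading of this training order can be expressed as a schedule of stages. The sketch below assumes that the first head trained updates the base network, that later heads are trained against a frozen base, and that a final joint stage unfreezes everything; `Stage` and `build_schedule` are hypothetical names, and the order of the heads is an assumption.

```python
from typing import List, NamedTuple, Optional


class Stage(NamedTuple):
    head: Optional[str]  # head trained in this stage; None means all heads jointly
    base_frozen: bool    # whether the base network parameters are frozen


def build_schedule(heads: List[str]) -> List[Stage]:
    """Per-head stages (base frozen after the first) followed by joint fine-tuning."""
    schedule = [Stage(head=name, base_frozen=(index > 0))
                for index, name in enumerate(heads)]
    # final stage: whole multitask model, base not frozen, for fine-tuning
    schedule.append(Stage(head=None, base_frozen=False))
    return schedule


schedule = build_schedule(["segmentation", "detection", "classification"])
```

In a deep learning framework, a frozen stage would typically be realized by excluding the base parameters from gradient updates; that mechanism is outside this sketch.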

In the prior art, the usual approach to multiple tasks is to stack several different single-task models. The embodiment of the disclosure instead handles the multiple tasks with one multitask model, which improves the real-time performance of multitask processing.

In step S202, labeling the data set includes: constructing a segmentation label by marking edge points of a conveyor belt in the historical monitoring image; constructing a detection label by marking a first target object in the historical monitoring image, wherein the first target object is related to an abnormal event; and constructing a classification label by marking the historical monitoring image according to whether a second target object exists on the conveyor belt.

Marking the edge points of the conveyor belt in the historical monitoring image may mean setting the pixel value at each edge point of the conveyor belt to 1 and the pixel values at other positions of the conveyor belt to 0. In the coal mine industry, for example, the first target object may be a person, smoke, an open flame, a fire extinguisher or another target, and the abnormal event may be a person entering a prohibited area, a fire, smoke, a shortage of fire-fighting equipment such as fire extinguishers, and the like. The second target object may be a person, coal, or the like.
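The three kinds of label can be illustrated for a single image as follows. This is a sketch under assumptions: the function names, the `(row, col)` convention for edge points, and the `(class, x1, y1, x2, y2)` box format for detection labels are hypothetical, as the disclosure does not specify a storage format.

```python
from typing import Dict, List, Tuple

Box = Tuple[str, int, int, int, int]  # assumed format: (class, x1, y1, x2, y2)


def build_segmentation_label(height: int, width: int,
                             edge_points: List[Tuple[int, int]]) -> List[List[int]]:
    """Binary mask: 1 at conveyor belt edge points, 0 everywhere else."""
    mask = [[0] * width for _ in range(height)]
    for row, col in edge_points:
        mask[row][col] = 1
    return mask


def build_sample_labels(height: int, width: int,
                        edge_points: List[Tuple[int, int]],
                        boxes: List[Box],
                        second_target_present: bool) -> Dict[str, object]:
    """Bundle the segmentation, detection and classification labels of one image."""
    return {
        "segmentation": build_segmentation_label(height, width, edge_points),
        "detection": list(boxes),  # first target objects tied to abnormal events
        "classification": 1 if second_target_present else 0,
    }


labels = build_sample_labels(3, 4, [(0, 1), (2, 2)],
                             [("open_fire", 0, 0, 1, 1)], True)
```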

In step S206, updating the parameters according to the training result includes: updating the parameters according to the calculation result of the loss function; and/or updating the parameters according to a reversed value of the gradient of the parameters, including: calculating the gradient of each parameter to obtain a list of two-tuples, each pairing a gradient with its parameter; mapping each parameter to a gradient transformation factor for its gradient according to the list of two-tuples; and obtaining the reversed value of the gradient according to the mapping, and updating the parameter according to the reversed value.

In the embodiment of the disclosure, recognition accuracy is improved by reversing the gradient. The mapping from a parameter to the gradient transformation factor of its gradient is built from the list of two-tuples, and the mapping can be adjusted as the situation requires; for example, the gradient transformation factor may be -1 in some cases.
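The gradient reversal step can be sketched with scalar parameters. The sketch assumes that the list of two-tuples pairs each gradient with a parameter name, and that a parameter without a mapped factor keeps a factor of 1.0; the helper names, the parameter names and the plain gradient-descent update with learning rate `lr` are illustrative assumptions, not details from the disclosure.

```python
from typing import Dict, List, Tuple

GradParam = Tuple[float, str]  # (gradient, parameter name)


def apply_gradient_factors(grads_and_params: List[GradParam],
                           factors: Dict[str, float]) -> List[GradParam]:
    """Scale each gradient by the factor mapped to its parameter (default 1.0)."""
    return [(grad * factors.get(name, 1.0), name)
            for grad, name in grads_and_params]


def sgd_update(params: Dict[str, float], grads_and_params: List[GradParam],
               lr: float) -> Dict[str, float]:
    """Plain gradient-descent step using the (possibly reversed) gradients."""
    updated = dict(params)
    for grad, name in grads_and_params:
        updated[name] -= lr * grad
    return updated


params = {"base.w": 1.0, "head.w": 2.0}
grads_and_params = [(0.5, "base.w"), (-0.25, "head.w")]
# a factor of -1 reverses the gradient of the chosen parameter
reversed_pairs = apply_gradient_factors(grads_and_params, {"base.w": -1.0})
new_params = sgd_update(params, reversed_pairs, lr=0.5)
```

Here only `base.w` receives the factor -1, so its update moves in the direction opposite to ordinary gradient descent while `head.w` is updated normally.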

In an alternative embodiment, the parameters may also be updated by gradient back-propagation: the calculation result of the loss function is obtained, and the parameters are updated by a gradient descent method according to that result, where the gradient descent method includes back-propagating the gradient through the network.
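For reference, the standard gradient descent update can be written as below. The symbols are not taken from the disclosure and are introduced here as assumptions: \theta_t denotes the parameters at step t, \eta the learning rate, and L the loss function.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)
```

Multiplying the gradient term by a gradient transformation factor of 1 leaves this update unchanged, while a factor of -1 gives the reversed update.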

In step S208, detecting the acquired real-time monitoring image through the trained multitask model to implement multitask processing includes: starting a track adjusting task of the conveyor belt when the multitask model detects that the conveyor belt in the real-time monitoring image deviates from a preset track; and/or starting a control task of the conveyor belt power according to whether the multitask model detects a second target object on the conveyor belt; and/or starting a detection task of abnormal events, and issuing an alarm when the multitask model detects the first target object in the real-time monitoring image.

The real-time monitoring image is input into the multitask model. When the part of the multitask model in which the base network is connected with the segmentation head network detects that the conveyor belt in the real-time monitoring image deviates from the preset track, that is, the positions of the edge points of the conveyor belt deviate from the preset track, the track adjusting task of the conveyor belt is started, and the track of the conveyor belt is adjusted according to the positions of the edge points and the preset track so that the edge points return to the preset track.
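A deviation check of this kind might look like the following sketch. It assumes that detected edge points and preset track points are given as `(x, y)` pairs matched by index, and that deviation is measured as horizontal offset in pixels against a hypothetical `tolerance`; none of these conventions are stated in the disclosure.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates


def max_deviation(edge_points: List[Point], preset_track: List[Point]) -> float:
    """Largest horizontal offset between detected edge points and the preset track."""
    return max((abs(ex - px)
                for (ex, _), (px, _) in zip(edge_points, preset_track)),
               default=0.0)


def needs_track_adjustment(edge_points: List[Point], preset_track: List[Point],
                           tolerance: float = 5.0) -> bool:
    """True when the belt edge has drifted from the preset track beyond the tolerance."""
    return max_deviation(edge_points, preset_track) > tolerance


preset = [(100.0, 0.0), (100.0, 50.0)]
on_track = [(102.0, 0.0), (101.0, 50.0)]
drifted = [(110.0, 0.0), (104.0, 50.0)]
```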

Taking the coal mine industry as an example, the real-time monitoring image is input into the multitask model, and the part of the multitask model in which the base network is connected with the detection head network detects first target objects such as people, smoke, open flames and fire extinguishers in the real-time monitoring image and judges whether an abnormal event has occurred. For example, detecting an open flame indicates that a fire has occurred; detecting a person in a prohibited area indicates that personnel have entered the prohibited area; and detecting fewer fire extinguishers or other items of fire-fighting equipment than a preset number indicates a shortage of fire-fighting equipment. Likewise, the real-time monitoring image is input into the multitask model, and the part of the multitask model in which the base network is connected with the classification head network detects whether a second target object exists on the conveyor belt in the real-time monitoring image and judges whether to start the control task of the conveyor belt power. For example, if a person is detected on the conveyor belt, the conveyor belt is switched off; if no coal is detected on the conveyor belt and the belt has been without coal for longer than a preset duration, the conveyor belt should be switched off to save energy. Conversely, if coal is detected on the coal conveyor belt and the volume or weight of the coal is greater than a preset value, the switched-off conveyor belt is switched on again for normal operation of the equipment. Switching the conveyor belt on and off can be understood as controlling the power of the conveyor belt, so the base network connected with the classification head network is used for processing the control task of the conveyor belt power.
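The mapping from detected first target objects to abnormal events can be sketched as a set of rules. The class labels, the `Detection` structure, the event names and the default `min_extinguishers` are hypothetical; the disclosure only names the kinds of objects and events involved.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    label: str                        # class predicted by the detection head
    in_prohibited_area: bool = False  # whether the object lies in a prohibited area


def find_abnormal_events(detections: List[Detection],
                         min_extinguishers: int = 2) -> List[str]:
    """Derive abnormal events from the objects detected in one image."""
    events = []
    labels = [d.label for d in detections]
    if "open_fire" in labels:
        events.append("fire")
    if any(d.label == "person" and d.in_prohibited_area for d in detections):
        events.append("person_in_prohibited_area")
    if labels.count("fire_extinguisher") < min_extinguishers:
        events.append("fire_equipment_shortage")
    return events


alarm_case = [Detection("open_fire"), Detection("person", True),
              Detection("fire_extinguisher")]
normal_case = [Detection("person"), Detection("fire_extinguisher"),
               Detection("fire_extinguisher")]
```

An alarm would be issued whenever the returned list is non-empty.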

The embodiment of the disclosure provides a multitasking method, which comprises the following steps: acquiring a task instruction to be processed, and determining a task head network from a plurality of target head networks according to the task instruction to be processed; connecting the task head networks through a base network to form a task model to be processed, wherein a plurality of task models formed by connecting the base network with the plurality of target head networks are trained, learn and store corresponding relations between input images and output detection results of the task models, and the plurality of task models comprise the task model to be processed; and detecting the acquired real-time monitoring image through the to-be-processed task model so as to complete the to-be-processed task.

According to the method, a task instruction to be processed is obtained, and a task head network is determined from a plurality of target head networks according to the task instruction to be processed; connecting the task head networks through a base network to form a task model to be processed, wherein a plurality of task models formed by connecting the base network with the plurality of target head networks are trained, learn and store corresponding relations between input images and output detection results of the task models, and the plurality of task models comprise the task model to be processed; and detecting the acquired real-time monitoring image through the to-be-processed task model so as to complete the to-be-processed task. Because the task head network is determined from the multiple target head networks according to the task instruction to be processed, the task head network is connected through the base network to form the task model to be processed, and the acquired real-time monitoring image is detected through the task model to be processed to complete the task to be processed, the technical means can solve the problems that in the prior art, a single neural network model cannot meet the service requirement of a complex scene, cannot process multiple tasks simultaneously and the like, and further improve the efficiency of processing tasks in artificial intelligence.
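Selecting a head network from a task instruction and connecting it to the shared base can be sketched as a lookup followed by function composition. The function `build_task_model`, the instruction strings and the toy base and heads are hypothetical stand-ins for trained networks.

```python
from typing import Callable, Dict, List


def build_task_model(instruction: str,
                     base: Callable[[List[float]], float],
                     heads: Dict[str, Callable[[float], object]]) -> Callable:
    """Pick the head named by the instruction and connect it to the shared base."""
    if instruction not in heads:
        raise KeyError(f"no head network registered for task {instruction!r}")
    head = heads[instruction]

    def task_model(image: List[float]) -> object:
        return head(base(image))  # base features flow into the selected head

    return task_model


# Hypothetical stand-ins for the trained base and head networks.
def toy_base(image: List[float]) -> float:
    return sum(image)


toy_heads = {
    "classification": lambda f: "coal" if f > 1 else "empty",
    "detection": lambda f: [],
}

task_model = build_task_model("classification", toy_base, toy_heads)
result = task_model([1, 1])
```

Only the selected head is evaluated, so a single pending task does not pay for the other heads.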

In an optional embodiment, a task instruction to be processed is acquired, and a task model to be processed is determined from a model database according to the task instruction, wherein the model database includes a plurality of task models, the plurality of task models include the task model to be processed, each task model corresponds to one application scenario, and each task model has been trained in its corresponding application scenario so that it has learned and stored the correspondence between its input images and its output detection results; and the acquired real-time monitoring image is detected through the task model to be processed, so as to complete the task to be processed.

It should be noted that the present disclosure merely takes the coal mine industry as an example to illustrate the multitasking method; in fact, the multitasking method based on a neural network model provided by the present disclosure is applicable to any multitasking scenario, such as the oil industry, the gas industry, machinery manufacturing, and agriculture. In agriculture, for example, tasks such as pest control, irrigation, and fertilization can be processed together through one multitask model.

In order to better understand the technical solutions, the embodiments of the present disclosure also provide an alternative embodiment for explaining the technical solutions.

Fig. 3 schematically illustrates the training of a segmentation head network according to an embodiment of the present disclosure. As shown in fig. 3:

Historical monitoring images are acquired and made into a data set, and the data set is labeled. The labeled data set, which comprises training set images and training set labels, is processed using data enhancement techniques to obtain a training data set, which comprises the transformed images and the transformed labels. The data enhancement techniques include horizontal flipping, vertical flipping, multi-scale transformation, random-angle rotation, and the like. The whole network formed by connecting the base network to the segmentation head network is trained using the training data set, and its parameters are updated according to the loss function calculation result obtained in training and by gradient inversion or gradient backpropagation. This part uses a cross-entropy loss function.
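Since the training data set contains both the transformed image and the transformed label, a geometric transformation has to be applied to the two together. The following minimal Python sketch shows this for flipping only; the function names, the nested-list image representation, and the 0.5 flip probability are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Grid = List[List[int]]  # a tiny image or label mask as rows of values


def hflip(grid: Grid) -> Grid:
    """Horizontal flip: reverse every row."""
    return [row[::-1] for row in grid]


def vflip(grid: Grid) -> Grid:
    """Vertical flip: reverse the order of the rows."""
    return grid[::-1]


def augment(image: Grid, label: Grid, rng: random.Random) -> Tuple[Grid, Grid]:
    """Apply the same randomly chosen flips to an image and its label mask."""
    if rng.random() < 0.5:
        image, label = hflip(image), hflip(label)
    if rng.random() < 0.5:
        image, label = vflip(image), vflip(label)
    return image, label


image = [[1, 2], [3, 4]]
label = [[0, 1], [1, 0]]
aug_image, aug_label = augment(image, label, random.Random(0))
```

Drawing the random decision once and applying it to both inputs keeps each label pixel aligned with the image pixel it annotates.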

The whole network formed by connecting the base network to the segmentation head network is a part of the multitask model, and training this whole network counts as one training of the multitask model. In the overall training process of the plurality of models to be trained: in the first training, the current model to be trained is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the current target head network are updated according to the training result; in every training other than the first, the current model to be trained is trained with the base network frozen, so that only the parameters of the current target head network are updated according to the training result; after the plurality of models to be trained have each been trained, the multitask model is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the plurality of target head networks are updated according to the training result. Fig. 3 illustrates the first training of the task model.
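The order of training stages described above can be written down as a small schedule. This is only a sketch: the function name, the stage labels, and the head order are assumptions, and the actual freezing of parameters would be done in whichever deep learning framework is used.

```python
from typing import List, Tuple


def training_schedule(heads: List[str]) -> List[Tuple[str, bool]]:
    """Return the stages in order as (model to train, base network frozen?)."""
    stages: List[Tuple[str, bool]] = []
    for index, head in enumerate(heads):
        # Only the first per-head training updates the base network;
        # every later per-head training keeps the base network frozen.
        stages.append((f"base+{head}", index > 0))
    # Final stage: the whole multitask model, base network not frozen.
    stages.append(("multitask", False))
    return stages


schedule = training_schedule(["segmentation", "detection", "classification"])
for model_name, base_frozen in schedule:
    print(f"{model_name}: base frozen = {base_frozen}")
```

With three heads the schedule has four stages, matching Figs. 3 to 6: one unfrozen training, two frozen trainings, and a final joint training.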

Fig. 4 schematically illustrates the training of a detection head network according to an embodiment of the present disclosure. As shown in fig. 4:

Historical monitoring images are acquired and made into a data set, and the data set is labeled. The labeled data set, which comprises training set images and training set labels, is processed using data enhancement techniques to obtain a training data set, which comprises the transformed images and the transformed labels. The data enhancement techniques include horizontal flipping, vertical flipping, multi-scale transformation, random-angle rotation, CutMix transformation, and the like; CutMix is a data enhancement method. The whole network formed by connecting the base network to the detection head network is trained using the training data set, and its parameters are updated according to the loss function calculation result obtained in training and by gradient inversion or gradient backpropagation. Fig. 4 illustrates a training of the task model other than the first. The loss function for this part includes: a cross-entropy function for the category information of the detected target object, a binary cross-entropy function for the confidence loss, and a GIoU (generalized intersection over union) loss function for the localization loss. The whole network formed by connecting the base network to the detection head network is a part of the multitask model.
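For reference, the GIoU localization loss for two axis-aligned boxes can be computed as in the sketch below. It uses the commonly used definition GIoU = IoU - (C - U) / C, where U is the union area and C is the area of the smallest box enclosing both boxes, with loss 1 - GIoU. The box format `(x1, y1, x2, y2)` and the assumption that both boxes have positive area are choices made for this example, not details given in the disclosure.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def giou_loss(a: Box, b: Box) -> float:
    """Return 1 - GIoU for two axis-aligned boxes with positive area."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])

    # Intersection rectangle (zero area when the boxes do not overlap).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest box enclosing both a and b.
    enclose_w = max(a[2], b[2]) - min(a[0], b[0])
    enclose_h = max(a[3], b[3]) - min(a[1], b[1])
    enclose = enclose_w * enclose_h

    iou = inter / union
    giou = iou - (enclose - union) / enclose
    return 1.0 - giou


loss_overlap = giou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
loss_disjoint = giou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
```

Unlike a plain IoU loss, this loss still changes with the distance between the boxes when they do not overlap, which is why it is a common choice for localization.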

Fig. 5 schematically illustrates a schematic diagram of training a classification head network according to an embodiment of the present disclosure, as shown in fig. 5:

Historical monitoring images are acquired and made into a data set, and the data set is labeled. The labeled data set, which comprises training set images and training set labels, is processed using data enhancement techniques to obtain a training data set, which comprises the transformed images and the transformed labels. The data enhancement techniques include horizontal flipping, vertical flipping, multi-scale transformation, random-angle rotation, and the like. The whole network formed by connecting the base network to the classification head network is trained using the training data set, and its parameters are updated according to the loss function calculation result obtained in training and by gradient inversion or gradient backpropagation. Fig. 5 illustrates a training of the task model other than the first. This part uses a binary cross-entropy loss function. The whole network formed by connecting the base network to the classification head network is a part of the multitask model.

FIG. 6 schematically illustrates a diagram of training a multitask model according to an embodiment of the present disclosure, as shown in FIG. 6:

Historical monitoring images are acquired and made into a data set, and the data set is labeled. The labeled data set, which comprises training set images and training set labels, is processed using data enhancement techniques to obtain a training data set, which comprises the transformed images and the transformed labels. The data enhancement techniques include horizontal flipping, vertical flipping, multi-scale transformation, random-angle rotation, and the like. The multitask model is trained using the training data set, and its parameters are updated according to the loss function calculation result obtained in training and by gradient inversion or gradient backpropagation. Fig. 6 illustrates the final training of the task model. The loss functions used in this part include a two-class (binary) cross-entropy loss function and a GIoU (generalized intersection over union) loss function.

Fig. 7 schematically illustrates a schematic diagram of detecting a real-time monitoring image according to an embodiment of the present disclosure, as shown in fig. 7:

Taking the coal industry as an example, in the multitask model: the base network connected to the segmentation head network can detect the edge points of the conveyor belt, and whether to start the track adjusting task of the conveyor belt is judged according to these edge points; the base network connected to the detection head network can detect whether targets such as a person, smoke, an open fire, or a fire extinguisher are present, and it is further judged whether an abnormal event has occurred, such as a person entering a forbidden area, a fire, smoke, or a shortage of fire-fighting equipment such as fire extinguishers; the base network connected to the classification head network can detect whether a person or coal is on the conveyor belt, and it is further judged whether to switch the conveyor belt on or off and how to control the power of the conveyor belt.
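How the three kinds of output might be turned into actions can be sketched as follows. Everything specific here is a hypothetical illustration: the `Outputs` fields, the 20-pixel offset threshold, the class names, and the rule that a person on the belt stops it are assumptions, since the disclosure does not fix these details.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Outputs:
    belt_offset_px: float   # belt deviation derived from segmented edge points
    detected: Set[str]      # object classes reported by the detection head
    belt_has_coal: bool     # classification head result
    belt_has_person: bool   # classification head result


def decide(out: Outputs, max_offset_px: float = 20.0) -> List[str]:
    """Map the multitask model outputs to tasks (hypothetical policy)."""
    actions: List[str] = []
    if abs(out.belt_offset_px) > max_offset_px:
        actions.append("adjust_belt_track")
    if out.detected & {"smoke", "open_fire"}:
        actions.append("alarm_fire")
    if "fire_extinguisher" not in out.detected:
        actions.append("alarm_missing_extinguisher")
    if out.belt_has_person:
        actions.append("stop_belt")
    elif not out.belt_has_coal:
        actions.append("reduce_belt_power")
    return actions


actions = decide(Outputs(belt_offset_px=35.0, detected={"smoke"},
                         belt_has_coal=True, belt_has_person=False))
```

All three checks run on the outputs of a single forward pass through the shared base network, so one image does not have to enter several separate models.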

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present disclosure or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, and an optical disk), and includes several instructions for enabling a terminal device (which may be a mobile phone, a computer, a component server, or a network device) to execute the methods of the embodiments of the present disclosure.

In this embodiment, a multitasking device is further provided. The device is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the device described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

Fig. 8 is a block diagram schematically illustrating a multitasking device according to an alternative embodiment of the present disclosure, where, as shown in fig. 8, the device includes:

the labeling module 802 is configured to acquire a historical monitoring image, create the historical monitoring image into a data set, and label the data set;

a model module 804, configured to connect multiple target head networks through a base network to form a multitask model;

a training module 806, configured to train the multi-tasking model using the dataset after the labeling process;

and the detection module 808 is configured to detect the acquired real-time monitoring image through the trained multitask model, so as to implement multitask processing.

Optionally, the training module 806 is further configured to take each target head network connected to the base network as a model to be trained, and to train the plurality of models to be trained respectively; wherein, in the overall training process of the plurality of models to be trained: in the first training, the current model to be trained is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the current target head network are updated according to the training result; in every training other than the first, the current model to be trained is trained with the base network frozen, so that only the parameters of the current target head network are updated according to the training result; after the plurality of models to be trained have each been trained, the multitask model is trained with the parameters of the base network not frozen, so that the parameters of the base network and of the plurality of target head networks are updated according to the training result.

Optionally, the labeling module 802 is further configured to: construct a segmentation label by marking edge points of a conveyor belt in the historical monitoring image; construct a detection label by marking a first target object in the historical monitoring image, wherein the first target object is related to an abnormal event; and construct a classification label by marking the historical monitoring image according to whether a second target object exists on the conveyor belt.

Optionally, the training module 806 is further configured to update the parameter according to the calculation result of the loss function; and/or update the parameter according to an inverted value of the gradient of the parameter, including: calculating the gradient of the parameter to obtain a list of two-tuples, each pairing a gradient with its parameter; mapping the parameter to a gradient transformation factor of the gradient according to the list of two-tuples; and obtaining the inverted value of the gradient according to the mapping, and updating the parameter according to the inverted value.
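One possible reading of this update step is sketched below in plain Python. The interpretation is an assumption: gradients are held as a list of (gradient, parameter name) two-tuples, each parameter is mapped to a hypothetical transformation factor, and the transformed gradient is then used in an ordinary gradient-descent update. The names `update`, `factors`, and `lr` are illustrative and do not come from the disclosure.

```python
from typing import Dict, List, Tuple


def update(params: Dict[str, float],
           grads_and_params: List[Tuple[float, str]],
           factors: Dict[str, float],
           lr: float = 0.1) -> Dict[str, float]:
    """Update parameters from a list of (gradient, parameter name) two-tuples."""
    updated = dict(params)
    for grad, name in grads_and_params:
        # Map the parameter to its gradient transformation factor
        # (assumed to default to 1.0, i.e. the gradient is left unchanged).
        factor = factors.get(name, 1.0)
        transformed_grad = factor * grad
        updated[name] = params[name] - lr * transformed_grad
    return updated


params = {"w": 1.0, "b": 0.5}
grads_and_params = [(2.0, "w"), (-1.0, "b")]
new_params = update(params, grads_and_params, factors={"w": 0.5})
```

A factor of -1.0 for a parameter would reverse the sign of its gradient in this sketch, which is one way such a factor could express a gradient inversion.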

Optionally, the detection module 808 is further configured to start a track adjusting task of the conveyor belt when the multitask model detects that the conveyor belt in the real-time monitoring image deviates from a predetermined track; and/or start a control task of the conveyor belt power according to whether the multitask model detects that a second target object exists on the conveyor belt; and/or start a detection task of an abnormal event, and send an alarm when the multitask model detects that the first target object exists in the real-time monitoring image.

In an embodiment of the present disclosure, an apparatus for multitasking is provided, including:

a first obtaining module, configured to acquire a task instruction to be processed;

a determining module, configured to determine a task head network from a plurality of target head networks according to the task instruction to be processed;

a connection module, configured to connect the task head network to a base network to form a task model to be processed, wherein the plurality of task models, each formed by connecting the base network to one of the plurality of target head networks, have been trained so that each has learned and stored the correspondence between its input images and its output detection results, and the plurality of task models include the task model to be processed;

and a first processing task module, configured to detect the acquired real-time monitoring image through the task model to be processed, so as to complete the task to be processed.

In an alternative embodiment, there is provided an apparatus for multitasking, comprising:

a second obtaining module, configured to obtain a task instruction to be processed, and determine a task model to be processed from a model database according to the task instruction to be processed, where the model database includes a plurality of task models, the plurality of task models include the task model to be processed, each task model corresponds to one application scenario, and each task model has been trained in the application scenario corresponding to each task model, and learns and stores a correspondence between an input image and an output detection result of the task model;

and the second processing task module is used for detecting the acquired real-time monitoring image through the to-be-processed task model so as to complete the to-be-processed task.

It should be noted that the above modules may be implemented by software or hardware; in the latter case, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.

Embodiments of the present disclosure provide an electronic device.

Referring to fig. 9, an electronic device 900 provided in the embodiment of the present disclosure includes a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 communicate with each other through the communication bus 904; the memory 903 is configured to store a computer program; and the processor 901 is configured to implement the steps in any one of the above method embodiments when executing the program stored in the memory 903.

Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the input/output device is connected to the processor.

Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:

s1, acquiring a historical monitoring image, making the historical monitoring image into a data set, and labeling the data set;

s2, connecting a plurality of target head networks through a base network to form a multitask model;

s3, training the multitask model by using the data set after the labeling processing;

and S4, detecting the acquired real-time monitoring image through the trained multitask model to realize multitask processing.

Embodiments of the present disclosure also provide a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of any of the method embodiments described above.

Optionally, in the present embodiment, the storage medium may be configured to store a computer program for executing steps S1 to S4 described above.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.

It will be apparent to those skilled in the art that the modules or steps of the present disclosure described above may be implemented by a general-purpose computing device. They may be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that described herein. Alternatively, they may each be fabricated into an individual integrated circuit module, or several of them may be fabricated into a single integrated circuit module. As such, the present disclosure is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims

1. A method of multitasking, comprising:

acquiring a historical monitoring image, making the historical monitoring image into a data set, and labeling the data set;

connecting a plurality of target head networks through a base network to form a multi-task model;

training the multitask model by using the data set subjected to the labeling processing;

and detecting the acquired real-time monitoring image through the trained multi-task model so as to realize multi-task processing.

2. The method of claim 1, comprising:

the base network is the part of a lightweight neural network that remains after its fully connected layer and output layer are removed;

the target head network includes: multiple convolutional layers and an output layer, the output layer comprising: a fully connected layer and a normalized exponential (softmax) layer.

3. The method of claim 1, wherein the plurality of target head networks comprises: a segmentation head network, a detection head network and a classification head network;

the base network is connected with the segmentation head network and is used for processing a track adjusting task of a conveyor belt;

the base network is connected with the detection head network and is used for processing detection tasks of abnormal events;

and the base network is connected with the classification head network and is used for processing the control task of the power of the conveyor belt.

4. The method of claim 1, wherein the training of the multitask model using the dataset after the labeling process comprises:

connecting the whole of each target head network with the base network to serve as a model to be trained, and respectively training a plurality of models to be trained;

wherein, in the whole training process of the plurality of models to be trained: training for the first time, wherein under the condition that the parameters of the base network are not frozen, a current model to be trained is trained so as to update the parameters of the base network and the current target head network according to a training result;

in every training other than the first, training the current model to be trained under the condition that the base network is frozen, so as to update the parameters of the current target head network according to the training result;

after a plurality of models to be trained are respectively trained, the multitask model is trained under the condition that the parameters of the base network are not frozen, so that the parameters of the base network and the target head networks are updated according to the training results.

5. The method of claim 1, wherein the annotating the dataset comprises:

constructing a segmentation label: marking edge points of a conveyor belt in the historical monitoring image;

constructing a detection label: marking a first target object in the historical monitoring image, wherein the first target object is related to an abnormal event;

and constructing a classification label: marking the historical monitoring image according to whether a second target object exists on the conveyor belt.

6. The method of claim 4, wherein updating the parameters according to the training results comprises:

updating the parameters according to the calculation result of the loss function; and/or

updating the parameter according to an inverted value of the gradient of the parameter, including:

calculating the gradient of the parameter to obtain a list of two-tuples, each pairing a gradient with its parameter;

mapping the parameter to a gradient transformation factor of the gradient according to the list of two-tuples;

and obtaining the inverted value of the gradient according to the mapping, and updating the parameter according to the inverted value.

7. The method according to claim 1, wherein the detecting the acquired real-time monitoring image through the trained multitask model to realize multitask processing comprises:

when the multitask model detects that a conveyor belt in a real-time monitoring image deviates from a preset track, starting a track adjusting task of the conveyor belt; and/or

starting a control task of the conveyor belt power according to whether the multitask model detects that a second target object exists on the conveyor belt; and/or

starting a detection task of an abnormal event, and sending an alarm when the multitask model detects that the first target object exists in the real-time monitoring image.

8. A method of multitasking, comprising:

acquiring a task instruction to be processed;

determining a task head network from a plurality of target head networks according to the task instruction to be processed;

connecting the task head network to a base network to form a task model to be processed, wherein the plurality of task models, each formed by connecting the base network to one of the plurality of target head networks, have been trained so that each has learned and stored the correspondence between its input images and its output detection results, and the plurality of task models comprise the task model to be processed;

and detecting the acquired real-time monitoring image through the to-be-processed task model so as to complete the to-be-processed task.

9. An apparatus for multitasking, comprising:

a labeling module, configured to acquire a historical monitoring image, make the historical monitoring image into a data set, and label the data set;

the model module is used for connecting a plurality of target head networks through a base network to form a multi-task model;

a training module for training the multi-tasking model using the dataset after the labeling process;

and the detection module is used for detecting the acquired real-time monitoring image through the trained multi-task model so as to realize multi-task processing.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1 to 8.