CN113378855A - Method for processing multitask, related device and computer program product - Google Patents

Method for processing multitask, related device and computer program product

Info

Publication number
CN113378855A
CN113378855A (Application No. CN202110690496.XA)
Authority
CN
China
Prior art keywords
feature information
tasks
processed
attention
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110690496.XA
Other languages
Chinese (zh)
Inventor
李莹莹
叶晓青
谭啸
孙昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110690496.XA priority Critical patent/CN113378855A/en
Publication of CN113378855A publication Critical patent/CN113378855A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/50 - Depth or shape recovery
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10004 - Still image; Photographic image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a method, an apparatus, an electronic device, a storage medium, and a computer program product for processing multiple tasks, relating to the field of artificial intelligence, in particular to computer vision and deep learning technologies, and applicable to smart city and smart traffic scenarios. The specific implementation scheme is as follows: feature extraction is performed on data to be processed through a plurality of branch networks in a multitask model to obtain feature information corresponding to each of a plurality of tasks to be executed; for each piece of obtained feature information, other feature information is fused onto that feature information according to the correlation among the plurality of tasks to obtain fused feature information corresponding to that feature information; and the fused feature information corresponding to each branch network is processed to obtain a multitask processing result. The present disclosure improves the accuracy of the multitask processing result obtained from the fused feature information.

Description

Method for processing multitask, related device and computer program product
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to computer vision and deep learning techniques, and specifically to methods, apparatuses, electronic devices, storage media, and computer program products for processing multiple tasks, which may be used in smart city and smart traffic scenarios.
Background
In multitask scenarios, the corresponding tasks are generally processed by a plurality of branch networks in a network model to obtain a final multitask processing result. In this process, operations such as feature extraction and feature processing are performed independently in each branch network, and the respective task results are obtained.
Disclosure of Invention
The present disclosure provides a method, an apparatus, an electronic device, a storage medium, and a computer program product for processing multitasking.
According to a first aspect, there is provided a method for processing multiple tasks, comprising: performing feature extraction on data to be processed through a plurality of branch networks in a multitask model, respectively, to obtain feature information corresponding to each of a plurality of tasks to be executed; for each piece of obtained feature information, fusing other feature information onto the feature information according to the correlation among the plurality of tasks to obtain fused feature information corresponding to the feature information; and processing the fused feature information corresponding to each branch network to obtain a multitask processing result.
According to a second aspect, there is provided an apparatus for processing multiple tasks, comprising: an extraction unit configured to perform feature extraction on data to be processed through a plurality of branch networks in a multitask model, respectively, to obtain feature information corresponding to each of a plurality of tasks to be executed; a fusion unit configured to fuse, for each piece of the obtained feature information, other feature information onto the feature information according to the correlation among the plurality of tasks to obtain fused feature information corresponding to the feature information; and a processing unit configured to process the fused feature information corresponding to each branch network to obtain a multitask processing result.
According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in any one of the implementations of the first aspect.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method as described in any one of the implementations of the first aspect.
According to a fifth aspect, there is provided a computer program product comprising: computer program which, when being executed by a processor, carries out the method as described in any of the implementations of the first aspect.
According to the technology of the present disclosure, based on the correlation among the plurality of tasks to be executed, the feature information extracted by each branch network is fused with the feature information extracted by the other branch networks, so that fused feature information containing the correlation information among the tasks is obtained, which improves the accuracy of the multitask processing result obtained from the fused feature information.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is an exemplary system architecture diagram in which one embodiment according to the present disclosure may be applied;
FIG. 2 is a flow diagram for one embodiment of a method for processing multitasking according to the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the method for processing multitasking according to the present embodiment;
FIG. 4 is a flow diagram of yet another embodiment of a method for processing multitasking according to the present disclosure;
FIG. 5 is a schematic block diagram illustrating yet another embodiment of a method for processing multitasking according to the present disclosure;
FIG. 6 is a block diagram of one embodiment of an apparatus for processing multitasking according to the present disclosure;
FIG. 7 is a schematic block diagram of a computer system suitable for use in implementing embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the technical solution of the present disclosure, the acquisition, storage, and application of users' personal information comply with the relevant laws and regulations and do not violate public order and good morals.
Fig. 1 illustrates an exemplary architecture 100 to which the disclosed method and apparatus for processing multitasking may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The communication connections between the terminal devices 101, 102, 103 form a topological network, and the network 104 serves to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The terminal devices 101, 102, 103 may be hardware devices or software that support network connection for data interaction and data processing. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices supporting network connection, information acquisition, interaction, display, processing, and other functions, including but not limited to vehicle-mounted smart devices, monitoring devices, smart phones, tablet computers, e-book readers, laptop portable computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above and may be implemented, for example, as multiple pieces of software or software modules providing distributed services, or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server providing various services, for example, a background server that receives the image to be processed sent by the terminal devices and performs multiple tasks on it. Based on the correlation among the multiple tasks to be executed, the server fuses the feature information extracted by each branch network with the feature information extracted by the other branch networks to obtain fused feature information containing the correlation information among the tasks, and thereby obtains a multitask processing result. As an example, the server 105 may be a cloud server.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be further noted that the method for processing multiple tasks provided by the embodiments of the present disclosure may be executed by the server, by the terminal device, or by the server and the terminal device in cooperation with each other. Accordingly, the parts (for example, the units) included in the apparatus for processing multiple tasks may all be provided in the server, may all be provided in the terminal device, or may be distributed between the server and the terminal device.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. The system architecture may only include the electronic device (e.g., server or terminal device) on which the method for handling multitasking is running, when the electronic device on which the method for handling multitasking is running does not need to perform data transfer with other electronic devices.
Referring to fig. 2, fig. 2 is a flowchart of a method for processing multitasks according to an embodiment of the present disclosure, wherein the process 200 includes the following steps:
step 201, respectively performing feature extraction on data to be processed through a plurality of branch networks in the multitask model to obtain feature information corresponding to a plurality of tasks to be executed.
In this embodiment, an execution body of the method for processing multiple tasks (for example, a terminal device or a server in fig. 1) may perform feature extraction on data to be processed through the multiple branch networks in a multitask model, so as to obtain feature information corresponding to each of the multiple tasks to be executed.
The data to be processed may be data in text, image, video, or audio form containing various contents. The plurality of tasks to be executed may be tasks that perform information processing based on the data to be processed to obtain a target result.
As an example, when the data to be processed is data in a text form, the plurality of tasks may be semantic recognition, intention recognition, emotion analysis, text classification, and other tasks for the text to be processed; when the data to be processed is data in an image form, the tasks can be semantic segmentation, target recognition, face authentication and the like aiming at the image to be processed; when the data to be processed is data in a video form, the plurality of tasks may be tasks such as target object tracking, target object recognition, target object segmentation and the like for the video to be processed; when the data to be processed is data in the form of audio, the plurality of tasks may be audio noise reduction, speech synthesis, speech recognition, and the like for the audio to be processed.
The multitask model comprises a plurality of branch networks, and each branch network is used for processing its corresponding task based on the data to be processed. In the process of executing a task, each branch network first needs to perform feature extraction on the data to be processed to obtain the feature information corresponding to that branch network. Examples of multitask models include PAD-Net and MTI-Net (Multi-Scale Task Interaction Networks for Multi-Task Learning).
In addition, each branch network may adopt an existing network model, such as a convolutional neural network, a residual neural network, or a long short-term memory network, or may be an improved model obtained by improving an existing network model. The plurality of branch networks are integrated to obtain a multitask model for executing the plurality of tasks.
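Purely as an illustration of this step, the following minimal sketch (written in PyTorch; the backbones, channel sizes, and task names are hypothetical and not taken from the disclosure) shows several branch networks extracting separate feature information from the same data to be processed:

```python
import torch
import torch.nn as nn

# Minimal sketch of step 201: each branch network extracts its own feature
# information from the same input data. The backbones and channel sizes here
# are illustrative assumptions, not the networks used in the disclosure.
branches = nn.ModuleList([
    nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU()),  # e.g. semantic segmentation branch
    nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU()),  # e.g. target recognition branch
    nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU()),  # e.g. face authentication branch
])

image = torch.randn(1, 3, 128, 128)                      # data to be processed (image form)
feature_infos = [branch(image) for branch in branches]   # one feature map per task to be executed
```

In the actual multitask model, each branch would of course use its own task-specific backbone rather than the identical toy convolutions above.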
Step 202, for each of the obtained plurality of feature information, fusing other feature information based on the feature information according to the correlation between the plurality of tasks to obtain fused feature information corresponding to the feature information.
In this embodiment, the execution agent may obtain, for each of the plurality of pieces of feature information obtained, fused feature information corresponding to the feature information by fusing other feature information to the feature information according to a correlation between the plurality of tasks.
The correlation among the plurality of tasks means that the processing results of the tasks are related to each other. As an example, for an image to be processed, there is a correlation between the segmentation result of a semantic segmentation task and the recognition result of a target recognition task, so there is correlation between the semantic segmentation task and the target recognition task. As yet another example, for a text to be processed, there is an association between the semantic result of a semantic recognition task and the intent result of an intention recognition task, so there is correlation between the semantic recognition task and the intention recognition task.
In this embodiment, according to the correlation between the multiple tasks to be executed, on the basis of each piece of feature information, other pieces of feature information in the multiple pieces of feature information corresponding to the multiple tasks are fused based on a preset manner, so that fused feature information corresponding to each piece of feature information can be obtained.
As an example, for each of a plurality of feature information, the execution body sets a fusion weight in advance for feature information other than the feature information, thereby fusing other feature information to the feature information based on the fusion weight. The fusion weight may be specifically set according to an actual situation, and is not limited herein.
For example, suppose the plurality of feature information includes first feature information X1, second feature information X2, and third feature information X3. If, for the first feature information X1, the fusion weights set for the second feature information X2 and the third feature information X3 are 0.4 and 0.3 respectively, the fused feature information corresponding to X1 can be obtained as X1 + 0.4×X2 + 0.3×X3. If, for the second feature information X2, the fusion weights set for the first feature information X1 and the third feature information X3 are 0.5 and 0.2 respectively, the fused feature information corresponding to X2 can be obtained as X2 + 0.5×X1 + 0.2×X3.
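Purely for illustration, the weighted fusion described above could be sketched as follows (PyTorch; the tensor shapes and the weights assigned for the third piece of feature information are made-up placeholders, not values from the disclosure):

```python
import torch

def weighted_fuse(features, weights):
    """Fuse each task's feature information with the others using preset weights.

    features: list of tensors of identical shape, one per task branch.
    weights:  weights[i][j] is the fusion weight applied to features[j]
              when building the fused feature information for task i.
    """
    fused = []
    for i, base in enumerate(features):
        out = base.clone()
        for j, other in enumerate(features):
            if j != i:
                out = out + weights[i].get(j, 0.0) * other   # e.g. X1 + 0.4*X2 + 0.3*X3
        fused.append(out)
    return fused

# Weights for X1 and X2 follow the example above; the weights for X3 are a made-up placeholder.
X1, X2, X3 = (torch.randn(64, 32, 32) for _ in range(3))
weights = {0: {1: 0.4, 2: 0.3}, 1: {0: 0.5, 2: 0.2}, 2: {0: 0.1, 1: 0.1}}
fused = weighted_fuse([X1, X2, X3], weights)
```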
Step 203, the fused feature information corresponding to each branch network is processed through the plurality of branch networks to obtain a multitask processing result.
In this embodiment, the execution body may process the fused feature information corresponding to each of the plurality of branch networks to obtain the multitask processing result.
The multitask processing result includes a task processing result for each task to be executed. For each branch network in the multitask model, the branch network processes its corresponding fused feature information to obtain the task processing result of the task corresponding to that branch network. In this embodiment, according to the correlation among the plurality of tasks, the execution body may further adjust the task processing result of each task in combination with the task processing results of the other tasks, so as to obtain the multitask processing result.
With continued reference to fig. 3, fig. 3 is a schematic diagram 300 of an application scenario of the method for processing multiple tasks according to the present embodiment. In the application scenario of fig. 3, after capturing the data to be processed 302, the terminal device 301 sends the data to be processed 302 to the server 303 and requests the server 303 to perform a semantic segmentation task, a target recognition task, and a face authentication task on the data to be processed. First, the server 303 performs feature extraction on the data to be processed 302 through the three branch networks 3041, 3042, 3043 in the multitask model 304, which respectively execute the semantic segmentation task, the target recognition task, and the face authentication task, to obtain feature information 3044, 3045, 3046 corresponding to each of the three tasks to be executed. Then, for each piece of the obtained feature information, based on the correlation among the three tasks, other feature information is fused onto the feature information to obtain the fused feature information corresponding to the feature information, yielding fused feature information 3047, 3048, 3049 corresponding to the feature information 3044, 3045, 3046 in this order. Finally, the fused feature information corresponding to each of the three branch networks 3041, 3042, 3043 is processed to obtain a multitask processing result.
In this embodiment, based on the correlation among the multiple tasks to be executed, the feature information extracted by each branch network is fused with the feature information extracted by the other branch networks to obtain fused feature information containing the correlation information among the tasks, thereby improving the accuracy of the multitask processing result obtained from the fused feature information.
In some optional implementations of this embodiment, the executing main body may execute the step 202 by:
Firstly, the plurality of pieces of feature information are spliced (concatenated) to obtain spliced feature information.
As an example, suppose two pieces of feature information have sizes H×W×C1 and H×W×C2, where H and W respectively represent the length and width of the feature information and C1 and C2 respectively represent the numbers of channels of the two pieces of feature information. The spliced feature information then has size H×W×(C1+C2).
Secondly, attention feature information of the spliced feature information is obtained through a self-attention mechanism.
In this implementation, the execution body may input the spliced feature information into a self-attention model to obtain the attention feature information, where the self-attention model represents the correspondence between the spliced feature information and the attention feature information.
Thirdly, for each piece of feature information in the plurality of pieces of feature information, the feature information and the attention feature information are combined to obtain the fused feature information corresponding to the feature information.
In this implementation, the execution body may multiply each piece of feature information by the attention map represented by the attention feature information, using it as a weight, to obtain the fused feature information corresponding to that piece of feature information.
This implementation provides a specific way of obtaining the fused feature information based on the self-attention mechanism, and fully considers the correlation among the plurality of tasks, so that other feature information is fully fused into the fused feature information, which can further improve the accuracy of the multitask processing result.
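A minimal sketch of this splice-then-self-attention fusion is given below (PyTorch). The 1×1 query/key/value projections, the sigmoid squashing of the attention features, and all dimensions are illustrative assumptions rather than the disclosed implementation:

```python
import torch
import torch.nn as nn


class ConcatSelfAttentionFusion(nn.Module):
    """Splice per-task feature maps, run self-attention over the spliced
    features, and reweight each task's features with the resulting attention
    feature information (a sketch, not the patented implementation)."""

    def __init__(self, channels_per_task, num_tasks, dim=64):
        super().__init__()
        total_c = channels_per_task * num_tasks
        self.q = nn.Conv2d(total_c, dim, 1)                # query projection
        self.k = nn.Conv2d(total_c, dim, 1)                # key projection
        self.v = nn.Conv2d(total_c, channels_per_task, 1)  # value projection

    def forward(self, feats):                       # feats: list of (B, C, H, W) tensors
        x = torch.cat(feats, dim=1)                 # spliced feature information
        b, _, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)    # (B, H*W, dim)
        k = self.k(x).flatten(2)                    # (B, dim, H*W)
        v = self.v(x).flatten(2).transpose(1, 2)    # (B, H*W, C)
        attn = torch.softmax(q @ k / q.shape[-1] ** 0.5, dim=-1)
        attn_feat = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)  # attention feature information
        # Use the attention feature information as weights for each task's features
        # (the sigmoid is an added assumption to keep the weights in [0, 1]).
        return [f * torch.sigmoid(attn_feat) for f in feats]


fusion = ConcatSelfAttentionFusion(channels_per_task=64, num_tasks=3)
feats = [torch.randn(1, 64, 32, 32) for _ in range(3)]
fused_feats = fusion(feats)   # one fused feature map per task
```

The attention is computed once over the spliced features, and the same attention map then reweights every task's feature information, which is how the fused feature information comes to carry the correlation information among the tasks.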
In some optional implementations of this embodiment, the executing body may execute the second step as follows: the attention feature information of the spliced feature information is obtained through a multi-head self-attention mechanism.
In this implementation, the accuracy of the obtained attention feature information can be further improved by the multi-head self-attention mechanism, so that the accuracy of the multitask processing result can be further improved.
In some optional implementations of this embodiment, the data to be processed is an image to be processed, and the plurality of tasks include a segmentation task for a target object in the image to be processed and a depth estimation task for the image to be processed. For example, the image to be processed represents traffic environment information, and the target object is a road marking line, a vehicle and the like.
Specifically, the multitask model includes a first branch network for performing the segmentation task and a second branch network for performing the depth estimation task. Firstly, feature extraction is respectively carried out on the image to be processed through a first branch network and a second branch network, and first feature information and second feature information are obtained. And then, splicing the first characteristic information and the second characteristic information to obtain spliced characteristic information. And then, obtaining attention feature information of the spliced feature information through a multi-head self-attention mechanism. Then, the first feature information and the second feature information are multiplied by the attention feature information to obtain first fusion feature information and second fusion feature information. And finally, sequentially processing the first fusion characteristic information and the second fusion characteristic information through the first branch network and the second branch network to obtain a multitask processing result.
This implementation provides a specific application scenario, improving both the accuracy of the segmentation result of the segmentation task for the target object and the accuracy of the depth result of the depth estimation task.
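The following sketch strings this segmentation-plus-depth scenario together end to end (PyTorch; the single-convolution "branches", the head layers, the channel counts, and the 1×1 projection of the attention output are all illustrative assumptions):

```python
import torch
import torch.nn as nn


class SegDepthMultiTaskNet(nn.Module):
    """Sketch of the segmentation + depth-estimation scenario described above."""

    def __init__(self, c=64, num_classes=19, num_heads=4):
        super().__init__()
        self.seg_branch = nn.Sequential(nn.Conv2d(3, c, 3, padding=1), nn.ReLU())    # first branch network
        self.depth_branch = nn.Sequential(nn.Conv2d(3, c, 3, padding=1), nn.ReLU())  # second branch network
        self.attn = nn.MultiheadAttention(embed_dim=2 * c, num_heads=num_heads, batch_first=True)
        self.proj = nn.Conv2d(2 * c, c, 1)             # map attention features back to C channels
        self.seg_head = nn.Conv2d(c, num_classes, 1)   # produces the segmentation result
        self.depth_head = nn.Conv2d(c, 1, 1)           # produces the depth result

    def forward(self, img):
        f_seg = self.seg_branch(img)                   # first feature information
        f_dep = self.depth_branch(img)                 # second feature information
        stitched = torch.cat([f_seg, f_dep], dim=1)    # spliced feature information
        b, c2, h, w = stitched.shape
        tokens = stitched.flatten(2).transpose(1, 2)   # (B, H*W, 2C): one token per spatial position
        attn_out, _ = self.attn(tokens, tokens, tokens)          # multi-head self-attention
        attn_map = self.proj(attn_out.transpose(1, 2).reshape(b, c2, h, w))
        fused_seg = f_seg * attn_map                   # first fused feature information
        fused_dep = f_dep * attn_map                   # second fused feature information
        return self.seg_head(fused_seg), self.depth_head(fused_dep)


model = SegDepthMultiTaskNet()
seg_logits, depth = model(torch.randn(1, 3, 64, 64))
```

Flattening the spliced feature map into one token per spatial position is one common way to feed 2D features to a multi-head attention layer; the disclosure itself does not prescribe a particular tokenization.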
In some optional implementations of this embodiment, the executing main body may execute the step 202 by:
first, dependencies between a plurality of tasks are determined.
As an example, a correlation information table in which correlations between tasks are described is set in advance. The correlation information table may be set manually or obtained based on statistical analysis of data. Through the relevance information table, the execution subject can determine the relevance among a plurality of tasks.
As yet another example, the execution subject may determine the correlation between the plurality of tasks through a correlation determination model. Wherein the relevance determination model is used for determining whether the plurality of tasks are relevant or not.
Secondly, for each feature information in the plurality of feature information, on the basis of the feature information, fusing relevant feature information in the plurality of feature information to obtain fused feature information corresponding to the feature information.
And the task corresponding to the relevant characteristic information has correlation with the task corresponding to the characteristic information.
As an example, suppose there are three tasks to be executed and, accordingly, three pieces of feature information Y1, Y2, and Y3 are obtained. If the task corresponding to Y1 and the task corresponding to Y2 are correlated, the feature information Y1 fuses only the feature information Y2 and does not fuse the feature information Y3.
In this implementation manner, the execution main body may determine the correlation between the tasks, and then may determine the feature information to be fused according to the correlation between the tasks, thereby improving the flexibility and the practicability of performing feature information fusion.
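For illustration, assuming a hand-written correlation table (the task names and the single fusion weight below are hypothetical), the selective fusion could look like this:

```python
import torch

# Hypothetical correlation table: each task is fused only with the tasks listed for it.
CORRELATED_TASKS = {
    "semantic_segmentation": ["target_recognition"],
    "target_recognition": ["semantic_segmentation"],
    "face_authentication": [],   # no correlated task: its features are left unchanged
}


def fuse_by_correlation(features, correlation, weight=0.5):
    """Fuse each task's feature information only with the features of correlated tasks.

    features:    dict mapping task name -> feature tensor (all the same shape).
    correlation: dict mapping task name -> list of correlated task names.
    weight:      illustrative fusion weight applied to correlated features.
    """
    fused = {}
    for task, feat in features.items():
        out = feat.clone()
        for other in correlation.get(task, []):
            out = out + weight * features[other]
        fused[task] = out
    return fused


features = {name: torch.randn(64, 32, 32) for name in CORRELATED_TASKS}
fused = fuse_by_correlation(features, CORRELATED_TASKS)
```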
With continued reference to FIG. 4, an exemplary flow 400 of another embodiment of the method for processing multiple tasks according to the present disclosure is shown, including the following steps:
step 401, respectively performing feature extraction on data to be processed through a plurality of branch networks in the multitask model to obtain feature information corresponding to a plurality of tasks to be executed.
Step 402, the correlations among the plurality of tasks are determined.
Step 403, the plurality of correlated pieces of feature information corresponding to a task group are spliced to obtain spliced feature information corresponding to the task group.
In this embodiment, among the plurality of tasks to be executed there may be both tasks that are correlated with one another and tasks that are not. Based on the determined correlations among the plurality of tasks, the correlated tasks can be grouped into task groups. It will be appreciated that the plurality of tasks to be executed may include a plurality of task groups.
Step 404, the attention feature information of each piece of spliced feature information is obtained through a multi-head self-attention mechanism.
Step 405, for each feature information in the plurality of feature information, combining the feature information and the attention feature information corresponding to the feature information to obtain fused feature information corresponding to the feature information.
Here, the attention feature information corresponding to a piece of feature information is the attention feature information obtained from the spliced feature information that includes that piece of feature information.
Step 406, the fused feature information corresponding to each branch network is processed through the plurality of branch networks to obtain a multitask processing result.
With continued reference to FIG. 5, a block diagram 500 of one embodiment of the method for processing multiple tasks is shown. In this embodiment, four tasks in total are included and are executed through the branch networks 501, 502, 503, and 504 in the multitask model. The task corresponding to the branch network 501 and the task corresponding to the branch network 502 are correlated, and the task corresponding to the branch network 503 and the task corresponding to the branch network 504 are correlated. First, each branch network performs feature extraction on the data to be processed, yielding feature information 505, 506, 507, and 508 in sequence. Then, the feature information 505 and 506 is spliced to obtain spliced feature information 509, and the feature information 507 and 508 is spliced to obtain spliced feature information 510. Next, the spliced feature information 509 is input into the multi-head self-attention network 511 to obtain attention feature information 512, and the spliced feature information 510 is input into the multi-head self-attention network 511 to obtain attention feature information 513. Then, the attention feature information 512 is multiplied by the feature information 505 and 506 respectively to obtain fused feature information 514 and 515, and the attention feature information 513 is multiplied by the feature information 507 and 508 respectively to obtain fused feature information 516 and 517. Finally, the fused feature information 514, 515, 516, and 517 is processed through the branch networks 501, 502, 503, and 504 respectively to obtain a multitask processing result.
As can be seen, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for processing multiple tasks in this embodiment specifically illustrates the process of determining the correlations among the plurality of tasks and the process of determining the feature information to be fused, thereby further improving the flexibility and accuracy of the multitask processing.
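Putting the pieces of flow 400 together, a compact sketch of the FIG. 5 arrangement might look as follows (PyTorch; the shared attention module, channel sizes, and the pairing of branches are assumptions made for illustration):

```python
import torch
import torch.nn as nn


class GroupedAttentionFusion(nn.Module):
    """Sketch of the FIG. 5 flow: correlated tasks are grouped, each group's
    features are spliced and passed through a shared multi-head self-attention
    network, and every member's features are reweighted by the group's
    attention feature information."""

    def __init__(self, c=64, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=2 * c, num_heads=num_heads, batch_first=True)
        self.proj = nn.Conv2d(2 * c, c, 1)

    def fuse_group(self, f_a, f_b):
        stitched = torch.cat([f_a, f_b], dim=1)          # e.g. 505 + 506 -> spliced info 509
        b, c2, h, w = stitched.shape
        tokens = stitched.flatten(2).transpose(1, 2)
        attn_out, _ = self.attn(tokens, tokens, tokens)  # e.g. 509 -> attention info 512
        attn_map = self.proj(attn_out.transpose(1, 2).reshape(b, c2, h, w))
        return f_a * attn_map, f_b * attn_map            # e.g. fused info 514 and 515

    def forward(self, f1, f2, f3, f4):
        fused_1, fused_2 = self.fuse_group(f1, f2)       # correlated branches 501 and 502
        fused_3, fused_4 = self.fuse_group(f3, f4)       # correlated branches 503 and 504
        return fused_1, fused_2, fused_3, fused_4


fusion = GroupedAttentionFusion()
f1, f2, f3, f4 = (torch.randn(1, 64, 32, 32) for _ in range(4))
fused = fusion(f1, f2, f3, f4)   # each fused map is then processed by its own branch
```

As in FIG. 5, both task groups share the same multi-head self-attention network, and each group's attention feature information only reweights the features of its own members.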
With continuing reference to FIG. 6, as an implementation of the methods illustrated in the above-described figures, the present disclosure provides one embodiment of an apparatus for handling multitasking, which corresponds to the method embodiment illustrated in FIG. 2, and which may be applied in various electronic devices in particular.
As shown in fig. 6, the apparatus for processing multitasking includes: an extracting unit 601, configured to perform feature extraction on data to be processed through a plurality of branch networks in the multitask model, respectively, to obtain feature information corresponding to each of a plurality of tasks to be executed; a fusion unit 602 configured to fuse, for each of the obtained plurality of feature information, other feature information based on the feature information according to a correlation between the plurality of tasks to obtain fused feature information corresponding to the feature information; the processing unit 603 is configured to process the corresponding fusion feature information through a plurality of branch networks, and obtain a multitasking result.
In some optional implementations of this embodiment, the fusion unit 602 is further configured to: splicing the plurality of characteristic information to obtain spliced characteristic information; obtaining attention feature information of the spliced feature information through a self-attention mechanism; and for each piece of feature information in the plurality of pieces of feature information, combining the feature information and the attention feature information to obtain fused feature information corresponding to the feature information.
In some optional implementations of this embodiment, the fusion unit 602 is further configured to: and obtaining attention characteristic information of the spliced characteristic information through a multi-head self-attention mechanism.
In some optional implementations of this embodiment, the data to be processed is an image to be processed, and the plurality of tasks include a segmentation task for a target object in the image to be processed and a depth estimation task for the image to be processed.
In some optional implementations of this embodiment, the fusion unit 602 is further configured to: determining a correlation between a plurality of tasks; for each piece of feature information in the plurality of pieces of feature information, on the basis of the feature information, relevant feature information in the plurality of pieces of feature information is fused to obtain fused feature information corresponding to the feature information, wherein a task corresponding to the relevant feature information has correlation with a task corresponding to the feature information.
In this embodiment, based on the correlation among the plurality of tasks to be executed, the feature information extracted by each branch network is fused with the feature information extracted by the other branch networks to obtain fused feature information containing the correlation information among the tasks, thereby improving the accuracy of the multitask processing result obtained from the fused feature information.
According to an embodiment of the present disclosure, the present disclosure also provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method for processing multitasking described in any of the embodiments.
According to an embodiment of the present disclosure, there is also provided a readable storage medium storing computer instructions for enabling a computer to implement the method for processing multitasking described in any of the embodiments.
The embodiments of the present disclosure provide a computer program product, which when executed by a processor is capable of implementing the method for processing multitasking described in any of the embodiments above.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Computing unit 701 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the respective methods and processes described above, such as a method for processing multitasking. For example, in some embodiments, the method for handling multitasking may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 708. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto device 700 via ROM 702 and/or communications unit 709. When the computer program is loaded into the RAM703 and executed by the computing unit 701, one or more steps of the method for handling multitasking described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured by any other suitable means (e.g., by means of firmware) to perform the method for handling multitasking.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system and addresses the drawbacks of difficult management and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services; it may also be a server of a distributed system, or a server combined with a blockchain.
According to the technical solution of the embodiments of the present disclosure, based on the correlation among the plurality of tasks to be executed, the feature information extracted by each branch network is fused with the feature information extracted by the other branch networks to obtain fused feature information containing the correlation information among the tasks, thereby improving the accuracy of the multitask processing result obtained from the fused feature information.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in this disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions provided by this disclosure can be achieved, and are not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (13)

1. A method for processing multitasking, comprising:
performing feature extraction on data to be processed through a plurality of branch networks in a multitask model, respectively, to obtain feature information corresponding to each of a plurality of tasks to be executed;
for each piece of obtained feature information, fusing other feature information on the basis of the feature information according to the correlation among the tasks to obtain fused feature information corresponding to the feature information;
and processing the fused feature information corresponding to each branch network to obtain a multitask processing result.
2. The method according to claim 1, wherein the obtaining, for each of the obtained feature information, fused feature information corresponding to the feature information by fusing other feature information based on the feature information according to the correlation between the tasks includes:
splicing the plurality of pieces of feature information to obtain spliced feature information;
obtaining attention feature information of the spliced feature information through a self-attention mechanism;
and for each piece of feature information in the plurality of pieces of feature information, combining the feature information and the attention feature information to obtain fused feature information corresponding to the feature information.
3. The method of claim 2, wherein the obtaining attention feature information of the stitched feature information through an attention-self mechanism comprises:
and obtaining the attention feature information of the spliced feature information through a multi-head self-attention mechanism.
4. The method of claim 1, wherein the data to be processed is an image to be processed, and the plurality of tasks include a segmentation task for a target object in the image to be processed and a depth estimation task for the image to be processed.
5. The method according to claim 1, wherein the obtaining, for each of the obtained feature information, fused feature information corresponding to the feature information by fusing other feature information based on the feature information according to the correlation between the tasks includes:
determining a correlation between the plurality of tasks;
and for each piece of feature information, fusing relevant feature information in the plurality of pieces of feature information on the basis of the feature information to obtain fused feature information corresponding to the feature information, wherein a task corresponding to the relevant feature information has correlation with a task corresponding to the feature information.
6. An apparatus for processing multitasking, comprising:
the extraction unit is configured to respectively extract the features of the data to be processed through a plurality of branch networks in the multitask model to obtain feature information corresponding to a plurality of tasks to be executed;
the fusion unit is configured to fuse other feature information on the basis of the feature information according to the correlation among the tasks for each of the obtained feature information to obtain fusion feature information corresponding to the feature information;
and the processing unit is configured to process the fused feature information corresponding to each branch network to obtain a multitask processing result.
7. The apparatus of claim 6, wherein the fusion unit is further configured to:
splicing the plurality of pieces of feature information to obtain spliced feature information; obtaining attention feature information of the spliced feature information through a self-attention mechanism; and for each piece of feature information in the plurality of pieces of feature information, combining the feature information and the attention feature information to obtain fused feature information corresponding to the feature information.
8. The apparatus of claim 7, wherein the fusion unit is further configured to:
and obtaining the attention feature information of the spliced feature information through a multi-head self-attention mechanism.
9. The apparatus of claim 6, wherein the data to be processed is an image to be processed, and the plurality of tasks comprise a segmentation task for a target object in the image to be processed and a depth estimation task for the image to be processed.
10. The apparatus of claim 6, wherein the fusion unit is further configured to:
determining a correlation between the plurality of tasks; and for each piece of feature information, fusing relevant feature information in the plurality of pieces of feature information on the basis of the feature information to obtain fused feature information corresponding to the feature information, wherein a task corresponding to the relevant feature information has correlation with a task corresponding to the feature information.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the content of the first and second substances,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
13. A computer program product, comprising: computer program which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN202110690496.XA 2021-06-22 2021-06-22 Method for processing multitask, related device and computer program product Pending CN113378855A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110690496.XA CN113378855A (en) 2021-06-22 2021-06-22 Method for processing multitask, related device and computer program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110690496.XA CN113378855A (en) 2021-06-22 2021-06-22 Method for processing multitask, related device and computer program product

Publications (1)

Publication Number Publication Date
CN113378855A true CN113378855A (en) 2021-09-10

Family

ID=77578202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110690496.XA Pending CN113378855A (en) 2021-06-22 2021-06-22 Method for processing multitask, related device and computer program product

Country Status (1)

Country Link
CN (1) CN113378855A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113885956A (en) * 2021-09-29 2022-01-04 北京百度网讯科技有限公司 Service deployment method and device, electronic equipment and storage medium
CN114446474A (en) * 2021-12-25 2022-05-06 新瑞鹏宠物医疗集团有限公司 Pet disease early warning device, method, electronic equipment and storage medium
CN114860411A (en) * 2022-05-17 2022-08-05 北京百度网讯科技有限公司 Multitask learning method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110349215A (en) * 2019-07-10 2019-10-18 北京悉见科技有限公司 A kind of camera position and orientation estimation method and device
US20190362190A1 (en) * 2018-05-28 2019-11-28 Samsung Electronics Co., Ltd. Method and system for dnn based imaging
CN111582201A (en) * 2020-05-12 2020-08-25 重庆理工大学 Lane line detection system based on geometric attention perception
CN111797266A (en) * 2020-07-10 2020-10-20 北京字节跳动网络技术有限公司 Image processing method and apparatus, storage medium, and electronic device
CN111814706A (en) * 2020-07-14 2020-10-23 电子科技大学 Face recognition and attribute classification method based on multitask convolutional neural network
WO2020215985A1 (en) * 2019-04-22 2020-10-29 腾讯科技(深圳)有限公司 Medical image segmentation method and device, electronic device and storage medium
CN112183547A (en) * 2020-10-19 2021-01-05 中国科学院计算技术研究所 Multi-mode data-based multi-task learning method and system
CN112966644A (en) * 2021-03-24 2021-06-15 中国科学院计算技术研究所 Multi-mode multi-task model for gesture detection and gesture recognition and training method thereof

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190362190A1 (en) * 2018-05-28 2019-11-28 Samsung Electronics Co., Ltd. Method and system for dnn based imaging
WO2020215985A1 (en) * 2019-04-22 2020-10-29 腾讯科技(深圳)有限公司 Medical image segmentation method and device, electronic device and storage medium
CN110349215A (en) * 2019-07-10 2019-10-18 北京悉见科技有限公司 A kind of camera position and orientation estimation method and device
CN111582201A (en) * 2020-05-12 2020-08-25 重庆理工大学 Lane line detection system based on geometric attention perception
CN111797266A (en) * 2020-07-10 2020-10-20 北京字节跳动网络技术有限公司 Image processing method and apparatus, storage medium, and electronic device
CN111814706A (en) * 2020-07-14 2020-10-23 电子科技大学 Face recognition and attribute classification method based on multitask convolutional neural network
CN112183547A (en) * 2020-10-19 2021-01-05 中国科学院计算技术研究所 Multi-mode data-based multi-task learning method and system
CN112966644A (en) * 2021-03-24 2021-06-15 中国科学院计算技术研究所 Multi-mode multi-task model for gesture detection and gesture recognition and training method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何霞: "Research on a checkpoint recognition engine based on cascaded multi-task deep learning", 《计算机科学》 (Computer Science) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113885956A (en) * 2021-09-29 2022-01-04 北京百度网讯科技有限公司 Service deployment method and device, electronic equipment and storage medium
CN113885956B (en) * 2021-09-29 2023-08-29 北京百度网讯科技有限公司 Service deployment method and device, electronic equipment and storage medium
CN114446474A (en) * 2021-12-25 2022-05-06 新瑞鹏宠物医疗集团有限公司 Pet disease early warning device, method, electronic equipment and storage medium
CN114860411A (en) * 2022-05-17 2022-08-05 北京百度网讯科技有限公司 Multitask learning method and device, electronic equipment and storage medium
CN114860411B (en) * 2022-05-17 2023-05-05 北京百度网讯科技有限公司 Multi-task learning method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN113378855A (en) Method for processing multitask, related device and computer program product
CN113407850B (en) Method and device for determining and acquiring virtual image and electronic equipment
CN113627536B (en) Model training, video classification method, device, equipment and storage medium
CN112949767A (en) Sample image increment, image detection model training and image detection method
CN113591864B (en) Training method, device and system for text recognition model framework
CN113657269A (en) Training method and device for face recognition model and computer program product
CN113377958A (en) Document classification method and device, electronic equipment and storage medium
CN113780098A (en) Character recognition method, character recognition device, electronic equipment and storage medium
CN114186681A (en) Method, apparatus and computer program product for generating model clusters
CN114494747A (en) Model training method, image processing method, device, electronic device and medium
CN113724398A (en) Augmented reality method, apparatus, device and storage medium
CN113627361A (en) Training method and device for face recognition model and computer program product
CN113360672B (en) Method, apparatus, device, medium and product for generating knowledge graph
CN114842541A (en) Model training and face recognition method, device, equipment and storage medium
CN114093006A (en) Training method, device and equipment of living human face detection model and storage medium
CN114724144A (en) Text recognition method, model training method, device, equipment and medium
CN115312042A (en) Method, apparatus, device and storage medium for processing audio
CN114329164A (en) Method, apparatus, device, medium and product for processing data
CN113850072A (en) Text emotion analysis method, emotion analysis model training method, device, equipment and medium
CN113591709A (en) Motion recognition method, motion recognition device, motion recognition apparatus, motion recognition medium, and computer program product
CN113379750A (en) Semi-supervised learning method of semantic segmentation model, related device and product
CN112948584A (en) Short text classification method, device, equipment and storage medium
CN113742564A (en) Target resource pushing method and device
CN115982466B (en) Method, device, equipment and storage medium for retrieving data
CN114491040B (en) Information mining method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination