CN113537365A - Multitask learning self-adaptive balancing method based on information entropy dynamic weighting - Google Patents

Multitask learning self-adaptive balancing method based on information entropy dynamic weighting

Info

Publication number
CN113537365A
Authority
CN
China
Prior art keywords
task
depth
multitask
learning model
map
Prior art date
Legal status
Granted
Application number
CN202110820646.4A
Other languages
Chinese (zh)
Other versions
CN113537365B (en)
Inventor
王玉峰
丁文锐
肖京
Current Assignee
Beihang University
Original Assignee
Beihang University
Priority date
Filing date
Publication date
Application filed by Beihang University
Priority to CN202110820646.4A
Publication of CN113537365A
Application granted
Publication of CN113537365B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a multi-task learning method based on information entropy dynamic weighting, belonging to the technical field of machine learning. First, an initial multi-task learning model M is built; model inference is performed on an input image to obtain several task output maps, each of which is normalized to yield a corresponding normalized probability map. Then a fixed-weight multi-task loss function is computed from the normalized probability maps and used to preliminarily train the multi-task learning model M. Finally, on the basis of the preliminarily trained model M, a final adaptive multi-task loss function is constructed through an information entropy dynamic weighting algorithm, and the preliminarily trained model is iteratively optimized until it converges, at which point training terminates and the optimized multi-task learning model M1 is obtained. The invention can effectively handle different types of tasks, adaptively balances the relative importance of each task, and is broadly applicable, simple and efficient.

Description

Multitask learning self-adaptive balancing method based on information entropy dynamic weighting
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to a multitask learning self-adaptive balancing method based on information entropy dynamic weighting.
Background
Machine learning is one of the core technologies of artificial intelligence: it improves the performance of computer algorithms through empirical knowledge, realizing intelligent and autonomous learning. However, machine learning techniques generally require a large number of training samples; in particular, the currently popular deep learning models typically need large quantities of labeled samples to train the network. In many applications, some task labels of the training samples are difficult to collect, or manual labeling is time-consuming and laborious. In such cases, multi-task learning can be used to make the most of the limited training samples available for each task.
Multi-task learning aims to jointly learn several related tasks so as to improve the generalization performance of each, and is widely applied in natural language processing, computer vision and other fields. Each task may be a general learning task, such as a supervised task (e.g., classification or regression), an unsupervised task (e.g., clustering), a reinforcement learning task, or a multi-view learning task.
In recent years deep learning has greatly improved the performance of various computer vision tasks, and its combination with multi-task learning, i.e. deep multi-task learning, has made substantial progress by jointly learning several tasks in a single model to obtain better generalization performance and a lower memory footprint. However, current deep multi-task learning still faces the following problems: (1) information exchange among the different subtasks is insufficient, so the advantages of multi-task learning are hard to exploit fully; (2) in most existing multi-task learning (MTL) studies, the loss function is a linear weighting of the subtask losses, which depends on human experience and lacks adaptability.
Current deep multitask learning research is mainly focused on the design of network structure and optimization strategies:
In network structure research, there are two main ways to realize a multi-task learning mechanism in a deep neural network: hard parameter sharing and soft parameter sharing. Hard parameter sharing typically shares hidden layers among all tasks while retaining several task-specific output layers. The more tasks are learned simultaneously, the more the model must find a representation that suits all of them, so hard parameter sharing greatly reduces the risk of overfitting. In soft parameter sharing, by contrast, each task has its own model and parameters, and the distances between model parameters are regularized to encourage the parameters to stay similar.
In optimization strategy research, most multi-task learning work simply fixes the weight of each task at a constant proportion, but this depends heavily on human experience, and in some cases improper weights prevent some subtasks from working properly. Therefore, apart from the design of shared model structures, another line of research focuses on balancing the influence of the different tasks on the network, including uncertainty weighting, gradient normalization algorithms, and dynamic weight averaging strategies.
In summary, since a multi-task model contains several learning tasks, how to adaptively balance the importance of the different tasks is of significant research interest.
Disclosure of Invention
To improve the generalization of multi-task learning models, and based on an analysis of the characteristics of different tasks and the requirements of multi-task model applications, the invention designs, at the level of the model optimization strategy, a multi-task learning adaptive balancing method based on information entropy dynamic weighting: the relative weight of each task's loss function is dynamically adjusted during model training, realizing adaptive training and accurate prediction of the multi-task learning model.
The multitask learning self-adaptive balancing method based on the information entropy dynamic weighting comprises the following specific steps:
Step one, a multi-task learning model M is built; model inference and normalization are performed on an input image with the current model M to obtain normalized probability maps of different types;
the initial multi-task learning model M comprises one shared encoder and three task-specific decoders.
After the multi-task learning model M performs inference on the input image, three pixel-level task outputs are produced: the semantic segmentation output map P_s, the depth estimation output map P_d and the edge detection output map P_b. Each task output map is normalized to obtain a normalized probability map of the corresponding type, specifically:
1) The semantic segmentation output map P_s is processed with a softmax function to obtain the normalized semantic segmentation probability map:
P'_{s,i} = exp(P_{s,i}) / Σ_{j=1}^{S} exp(P_{s,j})
where S is the total number of semantic segmentation categories, i denotes the i-th semantic category in the prediction map, P_{s,i} is the i-th layer of the model output map P_s, and P'_{s,i} denotes the i-th layer of the normalized semantic segmentation probability map P'_s.
2) The edge detection output map P_b is processed with a sigmoid function to obtain the normalized edge detection probability map P'_b:
P'_b = 1 / (1 + exp(-P_b))
3) The depth estimation output map P_d: the depth regression task is converted into a classification task with a logarithmic-space discretization strategy, and the normalized depth classification probability map is obtained with a softmax function.
First, the logarithmic-space discretization strategy divides the depth values of the continuous space into K sub-intervals corresponding to K categories.
Specifically, the depth value interval [D_1, D_2] is mapped to [D_1+1, D_2+1], denoted [D'_1, D'_2], and divided according to the discretized depth thresholds d_k into K sub-intervals {[d_0, d_1], [d_1, d_2], ..., [d_{K-1}, d_K]}.
The discretized depth threshold d_k is defined as:
d_k = exp( ln(D'_1) + k · ln(D'_2 / D'_1) / K ), k = 0, 1, ..., K
Then the depth estimation ground truth is discretized into a depth classification ground truth according to this strategy: when a depth truth value lies in [d_{k-1}, d_k], it is assigned class k, and the depth task branch is trained with the depth classification ground truth.
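For illustration only, this discretization of the depth ground truth can be sketched as follows (PyTorch-style Python; the function name, the depth range [0, 80] and the 1-indexed class convention are assumptions, not prescribed by the invention):

    import math
    import torch

    def depth_to_class(depth: torch.Tensor, D1: float = 0.0, D2: float = 80.0,
                       K: int = 80) -> torch.Tensor:
        """Discretize continuous depth ground truth into K classes (log-space bins)."""
        lo, hi = D1 + 1.0, D2 + 1.0                  # map [D1, D2] to [D'1, D'2]
        shifted = (depth + 1.0).clamp(lo, hi)
        # fractional bin position in [0, K] under d_k = exp(ln D'1 + k*ln(D'2/D'1)/K)
        pos = K * (shifted.log() - math.log(lo)) / (math.log(hi) - math.log(lo))
        # a truth value falling in (d_{k-1}, d_k] is assigned class k (1-indexed)
        return pos.ceil().clamp(min=1, max=K).long()

Under these assumed bounds the bins grow geometrically, so near depths are resolved much more finely than far ones, which is the usual motivation for log-space rather than uniform discretization.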
Finally, the depth classification prediction map obtained in the training stage is processed with a softmax function to obtain the normalized depth classification probability map P'_{d,k}:
P'_{d,k} = exp(P_{d,k}) / Σ_{j=1}^{K} exp(P_{d,j})
where K is the total number of depth classes, k denotes the k-th depth class, P_{d,k} is the k-th layer of the depth classification prediction map, and P'_{d,k} is the normalized k-th layer depth classification probability map.
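Taken together, the three normalizations amount to a channel-wise softmax for the two multi-class outputs and a sigmoid for the binary edge output. A minimal sketch (tensor shapes and class counts are illustrative assumptions):

    import torch
    import torch.nn.functional as F

    S, K = 21, 80                          # assumed class counts, for illustration
    p_s = torch.randn(2, S, 64, 64)        # semantic segmentation logits P_s
    p_d = torch.randn(2, K, 64, 64)        # depth classification logits P_d
    p_b = torch.randn(2, 1, 64, 64)        # edge detection logits P_b

    p_s_norm = F.softmax(p_s, dim=1)       # per-pixel softmax over the S classes
    p_d_norm = F.softmax(p_d, dim=1)       # per-pixel softmax over the K depth bins
    p_b_norm = torch.sigmoid(p_b)          # per-pixel sigmoid (binary softmax)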
Step two, a multi-task loss function is calculated from the normalized probability maps, and the current multi-task learning model M is preliminarily trained;
the method specifically comprises the following steps:
First, the loss corresponding to each type of normalized probability map is calculated with a cross-entropy function.
The cross-entropy loss function L_t is:
L_t = - Σ_{i=1}^{C} y_{t,i} · log(P'_{t,i})
where y_t is the one-hot supervision label corresponding to each task; t = s, d or b, i.e. P'_t is the normalized probability map of the semantic segmentation, depth estimation or edge detection task; C is the total number of categories for the task, and i denotes the i-th category in the prediction map.
Then the equal-weight-sum multi-task loss function L_mtl is constructed from the fixed task weights:
L_mtl = Σ_t L_t = L_s + L_d + L_b
Finally, the multi-task loss function L_mtl is used for gradient backpropagation and parameter updating of the network model; iterative training yields the preliminarily trained multi-task learning model.
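A minimal sketch of one preliminary training iteration under these definitions (function and dictionary-key names are assumptions; the targets are presumed already discretized as described in step one):

    import torch
    import torch.nn.functional as F

    def equal_weight_loss(logits: dict, targets: dict) -> torch.Tensor:
        """L_mtl = L_s + L_d + L_b with equal fixed weights."""
        l_s = F.cross_entropy(logits["seg"], targets["seg"])       # class-index map
        l_d = F.cross_entropy(logits["depth"], targets["depth"])   # discretized depth
        l_b = F.binary_cross_entropy_with_logits(logits["edge"], targets["edge"])
        return l_s + l_d + l_b

    # one preliminary iteration (model, optimizer, images, targets assumed defined):
    # loss = equal_weight_loss(model(images), targets)
    # optimizer.zero_grad(); loss.backward(); optimizer.step()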
Step three, on the basis of the preliminarily trained multi-task learning model M, the final adaptive multi-task loss function L'_mtl is constructed with the information entropy dynamic weighting algorithm;
The method specifically comprises the following steps:
First, the information entropy E_t of each task is calculated from its multi-layer category probability map:
E_t = - (1 / (W·H)) Σ_{w=1}^{W} Σ_{h=1}^{H} Σ_{c=1}^{C} P'_{t,c}(w, h) · log P'_{t,c}(w, h)
where w and h are the row and column coordinates of the probability map, W and H are the maximum row and column lengths, c is the channel index of the probability map, and C is the total number of categories for the task.
Then, relative weight w of each task is distributed by using information entropy valuet
Relative weight wtComprises the following steps:
Figure BDA0003171869960000035
The worse a task's prediction, the higher the uncertainty of its output probability map and the larger the corresponding information entropy. Tasks with poor prediction performance are therefore assigned larger weights, so that the model concentrates its training on them.
Finally, according to the relative weight of each task and the cross entropy loss function LtAnd constructing a final self-adaptive multitask loss function in a weighted summation mode.
Final adaptive multitask loss function L'mtlComprises the following steps:
Figure BDA0003171869960000041
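A sketch of the entropy-weighted loss under the formulas above; the per-pixel averaging inside E_t, the treatment of the binary edge map, and the detaching of the weights from the gradient graph are implementation assumptions:

    import torch

    def entropy(prob: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
        """Mean per-pixel entropy of a normalized (C, H, W) probability map."""
        return -(prob * (prob + eps).log()).sum(dim=0).mean()

    def adaptive_loss(prob_maps, task_losses):
        """L'_mtl = sum_t w_t * L_t with w_t = E_t / sum_t' E_t'."""
        # for the binary edge task, stack [p, 1 - p] into a 2-channel map first
        e = torch.stack([entropy(p) for p in prob_maps])
        w = (e / e.sum()).detach()       # treat weights as constants during backprop
        return (w * torch.stack(task_losses)).sum()

Detaching the weights keeps the dynamic weighting from feeding gradients back into the entropy computation, so only the weighted task losses drive the parameter update.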
Step four, the final adaptive multi-task loss function L'_mtl is backpropagated to obtain the parameter gradients of the current multi-task learning model M; the parameters are updated with a gradient descent algorithm, completing one training iteration;
Step five, after the iterative training a new multi-task learning model M1 is obtained; return to step three for the next iteration until the model M1 converges, then terminate the training.
The invention has the advantages that:
(1) The multi-task learning adaptive balancing method based on information entropy dynamic weighting adopts a discretization strategy to convert regression tasks into classification tasks; it can effectively handle tasks of different types and has broad applicability;
(2) the method computes the information entropy from the prediction maps output by the tasks, without changing the model structure or the parameter update procedure; it is simple, efficient and plug-and-play;
(3) the method dynamically adjusts the weights of the task loss functions based on the information entropy values and adaptively balances the relative importance of the tasks, improving overall performance;
(4) the method effectively extracts the model's general shared features and task-specific features, and completes the training of the multi-task learning model quickly and uniformly.
Drawings
FIG. 1 is an overall flow chart of the multi-task learning adaptive balancing method based on information entropy dynamic weighting according to the present invention;
FIG. 2 is a schematic diagram of a multitasking learning model in the present invention;
FIG. 3 is a schematic diagram of the discretization of the regression task in the present invention.
Detailed Description
The following describes a specific implementation of the invention in detail with reference to the accompanying drawings, taking as an example a multi-task learning network that jointly performs semantic segmentation, depth estimation and edge detection in computer vision.
The invention provides a multi-task learning adaptive balancing method based on information entropy dynamic weighting. During model training, the information entropy algorithm effectively evaluates the prediction result of each task, and a dynamic weighting strategy adjusts the relative task weights so that the multi-task prediction model pays more attention to the tasks with relatively poor performance, realizing adaptive balanced learning across the different tasks.
The invention relates to a multitask learning self-adaptive balancing method based on information entropy dynamic weighting, which comprises the following steps as shown in figure 1:
Step one, initialize the network parameters and train to obtain an initial multi-task learning model.
A multi-task learning network model based on a single encoder and multiple decoders is constructed, as shown in fig. 2, specifically:
the encoder contains the network parameters shared by all tasks and is initialized with a backbone network (e.g., ResNet) pre-trained on ImageNet. Each decoder contains task-specific network parameters; every task has its own decoder, initialized with random parameters. In this embodiment three tasks are to be solved: semantic segmentation, depth estimation and edge detection, so the multi-task learning model comprises one shared encoder and three task-specific decoders.
After the three tasks pass through their respective decoders, three cross-entropy losses L_1, L_2 and L_3 are obtained; with the corresponding relative task weights w_1, w_2 and w_3, the multi-task loss function L_mtl is obtained by weighted summation:
L_mtl = w_1·L_1 + w_2·L_2 + w_3·L_3
Step two, perform model inference and normalization on the input image with the multi-task learning model to obtain normalized probability maps of different types.
After the multi-task learning model performs inference on the input image, three pixel-level task outputs are produced: the semantic segmentation output map P_s, the depth estimation output map P_d and the edge detection output map P_b. Each task output map is normalized to obtain a normalized probability map of the corresponding type, specifically:
1) The semantic segmentation output map P_s is processed with a softmax function to obtain the normalized multi-class semantic segmentation probability map:
P'_{s,i} = exp(P_{s,i}) / Σ_{j=1}^{S} exp(P_{s,j})
where S is the total number of semantic segmentation categories, i denotes the i-th semantic category in the prediction map, P_{s,i} is the i-th layer of the model output map P_s, and P'_{s,i} denotes the normalized i-th layer semantic segmentation probability map.
2) The edge detection output map P_b is processed with a sigmoid function (equivalent to a binary softmax) to obtain the normalized edge detection probability map P'_b:
P'_b = 1 / (1 + exp(-P_b))
3) The depth estimation output map P_d: the depth regression task is converted into a classification task with a logarithmic-space discretization strategy, and the normalized depth classification probability map is obtained with a softmax function.
First, as shown in fig. 3, the logarithmic-space discretization strategy divides the depth values of the continuous space into K sub-intervals corresponding to K categories, specifically:
the depth value interval [D_1, D_2] is mapped to [D_1+1, D_2+1], denoted [D'_1, D'_2], and divided according to the discretized depth thresholds d_k into K sub-intervals {[d_0, d_1], [d_1, d_2], ..., [d_{K-1}, d_K]}.
The discretized depth threshold d_k is defined as:
d_k = exp( ln(D'_1) + k · ln(D'_2 / D'_1) / K ), k = 0, 1, ..., K
Then the depth estimation ground truth is discretized into a depth classification ground truth according to this strategy: when a depth truth value lies in [d_{k-1}, d_k], it is assigned class k, and the depth task branch is trained with the depth classification ground truth.
Finally, the depth classification prediction map obtained in the training stage is processed with a softmax function to obtain the normalized depth classification probability map P'_{d,k}:
P'_{d,k} = exp(P_{d,k}) / Σ_{j=1}^{K} exp(P_{d,j})
where K is the total number of depth classes, k denotes the k-th depth class, P_{d,k} is the k-th layer of the depth classification prediction map, and P'_{d,k} is the normalized k-th layer depth classification probability map.
In this embodiment, K = 80 is taken for the discretization of the depth estimation. The supervision truth of the depth branch is then in classification form, so the depth estimation task is trained directly as depth classification.
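As a quick numeric check of the threshold formula with K = 80 (the depth range [0, 80], giving [D'_1, D'_2] = [1, 81], is an assumed example, not taken from the patent):

    import math

    D1, D2, K = 0.0, 80.0, 80
    lo, hi = D1 + 1.0, D2 + 1.0
    d = [math.exp(math.log(lo) + k * math.log(hi / lo) / K) for k in range(K + 1)]
    print(round(d[1], 3), round(d[40], 3), round(d[80], 3))  # 1.056, 9.0, 81.0

Under these assumed bounds the first bin spans only about 0.06 while the last spans about 4.3, confirming that near depths receive much finer resolution than far ones.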
Step three, preliminarily train the multi-task learning model.
Because the task results predicted by the freshly initialized model have large and unstable errors, the multi-task network model must first be preliminarily trained, specifically:
First, the loss corresponding to each type of normalized probability map is calculated with a cross-entropy function:
L_t = - Σ_{i=1}^{C} y_{t,i} · log(P'_{t,i})
where y_t is the one-hot supervision label corresponding to each task; t corresponds to each task in step two and can be s, d or b, i.e. P'_t is the normalized probability map of the semantic segmentation, depth estimation or edge detection task; C is the total number of categories for the task, and i denotes the i-th category in the prediction map.
Next, the equal-weight-sum multi-task loss function L_mtl is constructed:
L_mtl = Σ_t L_t = L_s + L_d + L_b
during the preliminary training process, the loss function of each task is given equal fixed weight.
Then the multi-task loss function L_mtl is used for gradient backpropagation and parameter updating of the network model; after a certain number of iterations, the resulting multi-task learning model can make preliminary task predictions.
Step four, on the basis of the multi-task learning model obtained by the preliminary training, construct the adaptive multi-task loss function with the information entropy dynamic weighting algorithm and further optimize the training of the model.
The method specifically comprises the following steps:
First, the information entropy E_t of each task is calculated from its multi-layer category probability map:
E_t = - (1 / (W·H)) Σ_{w=1}^{W} Σ_{h=1}^{H} Σ_{c=1}^{C} P'_{t,c}(w, h) · log P'_{t,c}(w, h)
where w and h are the row and column coordinates of the probability map, W and H are the maximum row and column lengths, c is the channel index of the probability map, and C is the total number of categories for the task;
Then the relative weight w_t of each task is assigned according to its information entropy value.
The information entropy reflects the uncertainty of a prediction probability map, so the entropies of the task output probability maps can be used to assign the relative weights:
w_t = E_t / Σ_{t'} E_{t'}
The worse a task's prediction, the higher the uncertainty of its output probability map and the larger the corresponding information entropy. Tasks with poor prediction performance are therefore assigned larger weights, so that the model concentrates its training on them.
Finally, the overall adaptive multi-task loss function is constructed as the weighted sum of the cross-entropy losses L_t with the relative task weights.
The overall adaptive multi-task loss function L'_mtl is:
L'_mtl = Σ_t w_t · L_t
Step five, the overall adaptive multi-task loss function L'_mtl is backpropagated to obtain the model parameter gradients; the model parameters are then updated with a gradient descent algorithm, completing one training iteration;
Step six, after the model parameters are updated a new multi-task learning model is obtained; return to step four for the next iteration until the multi-task learning model converges, then terminate the training.
After each parameter update the prediction performance of every task changes, so the corresponding relative weights also change dynamically, realizing adaptive adjustment of the loss function during network model training.
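Putting the embodiment together, a hypothetical end-to-end loop (reusing the illustrative helpers sketched earlier; per_task_losses and normalize are assumed helper names, and the warm-up epoch count and optimizer choice are likewise assumptions):

    import torch

    def train(model, loader, n_warmup=5, n_epochs=50, lr=1e-3):
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        for epoch in range(n_epochs):
            for images, targets in loader:
                logits = model(images)
                losses = per_task_losses(logits, targets)   # [L_s, L_d, L_b]
                if epoch < n_warmup:                        # preliminary phase
                    loss = torch.stack(losses).sum()        # equal fixed weights
                else:                                       # adaptive phase
                    probs = normalize(logits)               # softmax / sigmoid maps
                    loss = adaptive_loss(probs, losses)     # entropy-weighted sum
                opt.zero_grad(); loss.backward(); opt.step()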
The above embodiment describes only the three specific tasks of semantic segmentation, depth estimation and edge detection, but the method of the invention is not limited to them: it can be applied to other tasks and to cases with three or more tasks, with the multi-task learning model adjusted to the actual situation. Cases involving other tasks, or three or more tasks, fall within the technical problem solved by the invention.

Claims (5)

1. A multitask learning self-adaptive balancing method based on information entropy dynamic weighting is characterized by comprising the following steps:
firstly, an initial multi-task learning model M is built; inference is performed on an input image by the multi-task learning model M to obtain outputs of different types for the different tasks, and each output is normalized to obtain the normalized probability map corresponding to its task;
then, calculating a multitask loss function by using each normalized probability graph, and performing primary training on the multitask learning model M through the multitask loss function;
finally, on the basis of the initially trained multi-task learning model M, a final self-adaptive multi-task loss function is constructed through an information entropy dynamic weighting algorithm, the parameter gradient of the current multi-task learning model M is obtained through a back propagation algorithm, parameter updating is carried out, and one-time iterative training is completed;
after the iterative training, a new multi-task learning model M1 is obtained; inference and normalization are performed on the input image again, and the next iteration is carried out with the adaptive multi-task loss function until the multi-task learning model M1 converges, whereupon the training is terminated.
2. The adaptive balancing method for multitask learning based on information entropy dynamic weighting as claimed in claim 1, wherein the multitask learning model comprises a shared encoder and a decoder corresponding to each specific task.
3. The multitask learning self-adaptive balancing method based on information entropy dynamic weighting as claimed in claim 1, wherein the three task output maps are: the semantic segmentation output map P_s, the depth estimation output map P_d and the edge detection output map P_b; the corresponding normalized probability maps are:
1) the semantic segmentation output map P_s is processed with a softmax function to obtain the normalized semantic segmentation probability map:
P'_{s,i} = exp(P_{s,i}) / Σ_{j=1}^{S} exp(P_{s,j})
where S is the total number of semantic segmentation categories, i denotes the i-th semantic category in the prediction map, P_{s,i} is the i-th layer of the model output map P_s, and P'_{s,i} denotes the i-th layer of the normalized semantic segmentation probability map P'_s;
2) the edge detection output map P_b is processed with a sigmoid function to obtain the normalized edge detection probability map P'_b:
P'_b = 1 / (1 + exp(-P_b));
3) the depth estimation output map P_d: the depth regression task is converted into a classification task with a logarithmic-space discretization strategy, and the normalized depth classification probability map is obtained with a softmax function;
first, the logarithmic-space discretization strategy divides the depth values of the continuous space into K sub-intervals corresponding to K categories, specifically:
the depth value interval [D_1, D_2] is mapped to [D_1+1, D_2+1], denoted [D'_1, D'_2], and divided according to the discretized depth thresholds d_k into K sub-intervals {[d_0, d_1], [d_1, d_2], ..., [d_{K-1}, d_K]};
the discretized depth threshold d_k is defined as:
d_k = exp( ln(D'_1) + k · ln(D'_2 / D'_1) / K ), k = 0, 1, ..., K;
then the depth estimation ground truth is discretized into a depth classification ground truth according to this strategy: when a depth truth value lies in [d_{k-1}, d_k], its class is k, and the depth task branch is trained with the depth classification ground truth;
finally, the depth classification prediction map obtained in the training stage is processed with a softmax function to obtain the normalized depth classification probability map P'_{d,k}:
P'_{d,k} = exp(P_{d,k}) / Σ_{j=1}^{K} exp(P_{d,j})
where K is the total number of depth classes, k denotes the k-th depth class, P_{d,k} is the k-th layer of the depth classification prediction map, and P'_{d,k} is the normalized k-th layer depth classification probability map.
4. The information entropy dynamic weighting-based multitask learning self-adaptive balancing method according to claim 1, wherein the specific process of calculating the multitask loss function and initially training the multitask learning model is as follows:
first, the loss corresponding to each type of normalized probability map is calculated with a cross-entropy function;
the cross-entropy loss function L_t is:
L_t = - Σ_{i=1}^{C} y_{t,i} · log(P'_{t,i})
where y_t is the one-hot supervision label corresponding to each task; t = s, d or b, i.e. P'_t is the normalized probability map of the semantic segmentation, edge detection or depth estimation task; C is the total number of categories for the task, and i denotes the i-th layer category in the prediction map;
then the equal-weight-sum multi-task loss function L_mtl is constructed from the fixed task weights:
L_mtl = Σ_t L_t = L_s + L_d + L_b;
finally, a multi-tasking penalty function L is utilizedmtlAnd performing gradient back transmission and parameter updating of the network model, and performing iterative training to obtain a multi-task learning model after preliminary training.
5. The information entropy dynamic weighting-based multitask learning adaptive balancing method according to claim 1, wherein the specific process for constructing the final adaptive multitask loss function is as follows:
step 501, the information entropy E_t of each task is calculated from its multi-layer category probability map:
E_t = - (1 / (W·H)) Σ_{w=1}^{W} Σ_{h=1}^{H} Σ_{c=1}^{C} P'_{t,c}(w, h) · log P'_{t,c}(w, h)
where w and h are the row and column coordinates of the probability map, W and H are the maximum row and column lengths, c is the channel index of the probability map, and C is the total number of categories for the task;
step 502, the relative weight w_t of each task is assigned according to its information entropy value;
the relative weight w_t is:
w_t = E_t / Σ_{t'} E_{t'};
step 503, the final adaptive multi-task loss function is constructed as the weighted sum of the cross-entropy losses L_t with the relative task weights;
the final adaptive multi-task loss function L'_mtl is:
L'_mtl = Σ_t w_t · L_t.
CN202110820646.4A 2021-07-20 2021-07-20 Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method Active CN113537365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110820646.4A CN113537365B (en) 2021-07-20 2021-07-20 Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110820646.4A CN113537365B (en) 2021-07-20 2021-07-20 Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method

Publications (2)

Publication Number Publication Date
CN113537365A true CN113537365A (en) 2021-10-22
CN113537365B CN113537365B (en) 2024-02-06

Family

ID=78100520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110820646.4A Active CN113537365B (en) 2021-07-20 2021-07-20 Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method

Country Status (1)

Country Link
CN (1) CN113537365B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114714146A (en) * 2022-04-08 2022-07-08 北京理工大学 Method for simultaneously predicting surface roughness and cutter abrasion
WO2023097616A1 (en) * 2021-12-02 2023-06-08 Intel Corporation Apparatus, method, device and medium for loss balancing in multi-task learning
CN117273068A (en) * 2023-09-28 2023-12-22 东南大学 Model initialization method based on linearly expandable learning genes

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451620A (en) * 2017-08-11 2017-12-08 深圳市唯特视科技有限公司 A kind of scene understanding method based on multi-task learning
CN110837836A (en) * 2019-11-05 2020-02-25 中国科学技术大学 Semi-supervised semantic segmentation method based on maximized confidence

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451620A (en) * 2017-08-11 2017-12-08 深圳市唯特视科技有限公司 A kind of scene understanding method based on multi-task learning
CN110837836A (en) * 2019-11-05 2020-02-25 中国科学技术大学 Semi-supervised semantic segmentation method based on maximized confidence

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Y. Wang, et al., "Boundary-aware multitask learning for remote sensing imagery", IEEE *
Zhang Lei; Cao Yueyun; Li Bin; Cui Jialin, "Research on the operational effectiveness evaluation of ship power systems based on a combined weighting method", Ship Science and Technology, no. 03 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023097616A1 (en) * 2021-12-02 2023-06-08 Intel Corporation Apparatus, method, device and medium for loss balancing in multi-task learning
CN114714146A (en) * 2022-04-08 2022-07-08 北京理工大学 Method for simultaneously predicting surface roughness and cutter abrasion
CN117273068A (en) * 2023-09-28 2023-12-22 东南大学 Model initialization method based on linearly expandable learning genes
CN117273068B (en) * 2023-09-28 2024-04-16 东南大学 Model initialization method based on linearly expandable learning genes

Also Published As

Publication number Publication date
CN113537365B (en) 2024-02-06

Similar Documents

Publication Publication Date Title
CN113537365B (en) Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method
CN109948029B (en) Neural network self-adaptive depth Hash image searching method
CN106600059B (en) Intelligent power grid short-term load prediction method based on improved RBF neural network
Kim et al. SplitNet: Learning to semantically split deep networks for parameter reduction and model parallelization
CN108985515B (en) New energy output prediction method and system based on independent cyclic neural network
CN114022693B (en) Single-cell RNA-seq data clustering method based on double self-supervision
Kamruzzaman et al. Medical diagnosis using neural network
CN113554156B (en) Multitask image processing method based on attention mechanism and deformable convolution
CN106897744A (en) A kind of self adaptation sets the method and system of depth confidence network parameter
CN113722980A (en) Ocean wave height prediction method, system, computer equipment, storage medium and terminal
CN114819143A (en) Model compression method suitable for communication network field maintenance
CN115204035A (en) Generator set operation parameter prediction method and device based on multi-scale time sequence data fusion model and storage medium
CN111353534B (en) Graph data category prediction method based on adaptive fractional order gradient
Moriya et al. Evolution-strategy-based automation of system development for high-performance speech recognition
CN114202021A (en) Knowledge distillation-based efficient image classification method and system
CN111753995A (en) Local interpretable method based on gradient lifting tree
CN116451859A (en) Bayesian optimization-based stock prediction method for generating countermeasure network
CN116415177A (en) Classifier parameter identification method based on extreme learning machine
CN113408610B (en) Image identification method based on adaptive matrix iteration extreme learning machine
CN115906959A (en) Parameter training method of neural network model based on DE-BP algorithm
CN113807005A (en) Bearing residual life prediction method based on improved FPA-DBN
CN113408602A (en) Tree process neural network initialization method
KR20210157826A (en) Method for sturcture learning and model compression for deep neural netwrok
CN113033495B (en) Weak supervision behavior identification method based on k-means algorithm
US20220343162A1 (en) Method for structure learning and model compression for deep neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant