CN113537365A - Multitask learning self-adaptive balancing method based on information entropy dynamic weighting - Google Patents
Publication number: CN113537365A · Application: CN202110820646.4A · Authority: CN (China) · Legal status: Granted
Prior art keywords: task, depth, multitask, learning model, map
Classifications
- G06F18/2415 — Classification techniques based on parametric or probabilistic models
- G06N3/08 — Neural networks; learning methods
- Y02D10/00 — Energy efficient computing
Abstract
The invention discloses a multitask learning method based on information entropy dynamic weighting, belonging to the technical field of machine learning. First, an initial multitask learning model M is built; model inference is performed on an input image to obtain a plurality of task output maps, and normalization processing is performed on each task output map to obtain the corresponding normalized probability maps. Then, a fixed-weight multitask loss function is calculated from the normalized probability maps, and the multitask learning model M is preliminarily trained. Finally, on the basis of the preliminarily trained multitask learning model M, a final adaptive multitask loss function is constructed through an information entropy dynamic weighting algorithm, and the preliminarily trained model is iteratively optimized until it converges, at which point training terminates and the optimized multitask learning model M1 is obtained. The invention can effectively handle different types of tasks, adaptively balances the relative importance of each task, and is broadly applicable, simple and efficient.
Description
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to a multitask learning self-adaptive balancing method based on information entropy dynamic weighting.
Background
Machine learning is one of the core technologies of artificial intelligence: it improves the performance of computer algorithms through empirical knowledge, realizing intelligent and autonomous learning. However, machine learning techniques generally require a large number of learning samples; in particular, the currently popular deep learning models generally require a large number of labeled samples to train the network, and in many applications some task labels of training samples are difficult to collect, or are time-consuming and laborious to label manually. In this case, multitask learning can be used to maximize the utilization of the limited training samples of each task.
The multi-task learning aims at jointly learning a plurality of related tasks to improve the generalization performance of each task, and is widely applied to the fields of natural language processing, computer vision and the like. Where each task may be a general learning task such as a supervised task (e.g., a classification or regression problem), an unsupervised task (e.g., a clustering problem), a reinforcement learning task, or a multi-view learning task, among others.
In recent years, deep learning has greatly improved the performance of various computer vision tasks, and combining deep learning with multi-task learning, namely deep multi-task learning, has made great progress by jointly learning multiple tasks in one model to obtain better generalization performance and a lower memory footprint. However, current deep multi-task learning still faces the following problems: (1) information exchange among different subtasks is insufficient, so the advantages of multi-task learning are difficult to exploit fully; (2) the loss function of most existing multi-task learning (MTL) studies is a linear weighting of the subtask losses, which depends on human experience and lacks adaptability.
Current deep multitask learning research is mainly focused on the design of network structure and optimization strategies:
In network structure research, there are two main ways of implementing a multitask learning mechanism in a deep neural network: hard parameter sharing and soft parameter sharing. Hard parameter sharing typically shares hidden layers among all tasks while preserving multiple task-specific output layers. The more tasks that are learned simultaneously, the more the model needs to find a representation suitable for all tasks, so hard parameter sharing greatly reduces the risk of overfitting. In soft parameter sharing, on the other hand, each task has its own model and corresponding parameters, and the distances between model parameters are then regularized to encourage the parameters to be similar.
In optimization strategy research, most of multi-task learning related works simply set the weight of each task to be a fixed proportion, but the method is heavily dependent on human experience, and in some cases, improper weight can cause some subtasks not to work normally. Therefore, different from the design of the structure of the multi-task sharing model, the other part of research focuses on balancing the influence of different tasks on the network, including research on uncertainty weight, gradient normalization algorithm, dynamic weight averaging strategy and the like.
In summary, since the multitask model includes multiple learning tasks, how to adaptively balance the importance among different tasks has important research significance.
Disclosure of Invention
In order to improve the generalization of the multi-task learning model, the invention designs a multi-task learning self-adaptive balancing method based on information entropy dynamic weighting on the model optimization strategy through the analysis of the characteristics of different tasks and the requirements of multi-task model application, namely, the relative weight of each task loss function is dynamically adjusted in the model training process, and the self-adaptive training and accurate prediction of the multi-task learning model are realized.
The multitask learning self-adaptive balancing method based on the information entropy dynamic weighting comprises the following specific steps:
step one, a multitask learning model M is built, model inference and normalization processing are carried out on an input image through the current multitask learning model M, and different types of normalized probability graphs are obtained;
the initial multi-task learning model M comprises one shared encoder and three task-specific decoders.
After the multitask learning model M performs model inference on the input image, three pixel-level task outputs are generated: a semantic segmentation output map P_s, a depth estimation output map P_d, and an edge detection output map P_b. Each task output map is normalized to obtain a different type of normalized probability map, specifically:
1) The semantic segmentation output map P_s is processed with a softmax function to obtain the normalized semantic segmentation probability map:

P'_{s,i} = exp(P_{s,i}) / Σ_{j=1}^{S} exp(P_{s,j})

where S is the total number of semantic segmentation categories, i denotes the i-th semantic category in the prediction map, P_{s,i} is the i-th layer of the model output map P_s, and P'_{s,i} is the i-th layer of the normalized semantic segmentation probability map P'_s.
2) The edge detection output map P_b is processed with a sigmoid function to obtain the normalized edge detection probability map P'_b:

P'_b = 1 / (1 + exp(-P_b))
3) The depth estimation output map P_d: the depth regression task is converted into a classification task by a logarithmic space discretization strategy, and the normalized depth classification probability map is obtained with a softmax function.
First, a logarithmic space discretization strategy is adopted to discretize the depth values of the continuous space into K sub-intervals corresponding to K categories.

Specifically, the depth value interval [D_1, D_2] is mapped to [D_1+1, D_2+1], denoted [D'_1, D'_2], and divided according to discretized depth thresholds d_k into K sub-intervals {[d_0, d_1], [d_1, d_2], ..., [d_{K-1}, d_K]}.

The discretized depth threshold d_k is defined as:

d_k = exp( log D'_1 + (k/K) · log(D'_2 / D'_1) ),  k = 0, 1, ..., K
Then, the depth estimation ground truth is discretized into depth classification ground truth according to this strategy: when a depth ground-truth value falls in (d_{k-1}, d_k], it is assigned class k, and the depth task branch is trained with the depth classification ground truth.
Finally, obtaining a depth classification prediction map in a training stage, and processing by adopting a softmax function to obtain a normalized depth classification probability map P'd,k;
The depth classification probability map is:
wherein K is the total class number of the depth classification, K represents the kth depth class, Pd,kRepresents a k-th layer depth classification prediction map, P'd,kRepresenting a normalized k-th layer depth classification probability map.
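The three normalizations above can be sketched as follows (a minimal NumPy illustration with toy array shapes and random logits standing in for real network outputs; not the patent's implementation):

```python
import numpy as np

def softmax(logits, axis=0):
    """Channel-wise softmax; subtracting the max keeps exp() numerically stable."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(logits):
    """Binary normalization used for the edge detection map."""
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)
P_s = rng.normal(size=(5, 4, 4))   # semantic segmentation logits, S = 5 classes
P_d = rng.normal(size=(80, 4, 4))  # depth classification logits, K = 80 classes
P_b = rng.normal(size=(1, 4, 4))   # single-channel edge detection logits

P_s_norm = softmax(P_s)            # probabilities sum to 1 over the class axis
P_d_norm = softmax(P_d)
P_b_norm = sigmoid(P_b)
```

Each normalized map has the same spatial shape as its logits, with values in (0, 1) suitable for the cross-entropy losses of step two.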
Step two, calculating the multitask loss function from the normalized probability maps, and preliminarily training the current multitask learning model M;
the method specifically comprises the following steps:
First, the loss corresponding to each type of normalized probability map is calculated with a cross entropy function.

The cross entropy loss function L_t is:

L_t = - Σ_{i=1}^{C} y_{t,i} log(P'_{t,i})

where y_t is the one-hot supervision label corresponding to each task; t is s, d or b, i.e. P'_t is the normalized probability map of the semantic segmentation, edge detection or depth estimation task; C is the total number of categories for each task, and i denotes the i-th category in the prediction map.
Then, an equal-weight-sum multitask loss function L is constructed according to the fixed weight of each taskmtlComprises the following steps:
finally, a multi-tasking penalty function L is utilizedmtlAnd performing gradient back transmission and parameter updating of the network model, and performing iterative training to obtain a multi-task learning model after preliminary training.
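The preliminary equal-weight objective can be sketched as follows (random logits and one-hot labels stand in for real network outputs and ground truth; the pixel-averaged reduction is an assumption, since the text leaves it implicit):

```python
import numpy as np

def cross_entropy(y_onehot, p_norm, eps=1e-12):
    # y_onehot, p_norm: (C, H, W); sum over classes, average over pixels
    return float(-(y_onehot * np.log(p_norm + eps)).sum(axis=0).mean())

rng = np.random.default_rng(0)
losses = {}
for task, C in {"seg": 5, "depth": 80, "edge": 2}.items():
    logits = rng.normal(size=(C, 4, 4))
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    p = e / e.sum(axis=0, keepdims=True)                      # softmax
    y = np.eye(C)[rng.integers(0, C, size=(4, 4))].transpose(2, 0, 1)
    losses[task] = cross_entropy(y, p)

L_mtl = losses["seg"] + losses["depth"] + losses["edge"]      # equal fixed weights
```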
Step three, on the basis of the preliminarily trained multitask learning model M, constructing the final adaptive multitask loss function L'_mtl with an information entropy dynamic weighting algorithm.
The method specifically comprises the following steps:
First, the information entropy E_t of each task is calculated from its multi-layer probability map over all categories:

E_t = - (1/(W·H)) Σ_{w=1}^{W} Σ_{h=1}^{H} Σ_{c=1}^{C} P'_{t,c}(w, h) · log P'_{t,c}(w, h)

where w and h are the row and column coordinates of the probability map, W and H are the maximum row and column lengths, c is the channel index of the probability map, and C is the total number of categories for each task.
Then, relative weight w of each task is distributed by using information entropy valuet;
Relative weight wtComprises the following steps:
The worse a task's prediction result, the higher the uncertainty of its output probability map and the larger the corresponding information entropy value. Therefore, a task with poor prediction performance is assigned a larger weight, so that the model focuses more on training that task.
Finally, according to the relative weight of each task and the cross entropy loss function LtAnd constructing a final self-adaptive multitask loss function in a weighted summation mode.
Final adaptive multitask loss function L'mtlComprises the following steps:
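The dynamic weighting step can be sketched as follows. The weight rule used here (weights proportional to map entropy and normalized to sum to the number of tasks) is an illustrative assumption; the method only requires that tasks with more uncertain outputs receive larger weights:

```python
import numpy as np

def map_entropy(p_norm, eps=1e-12):
    # p_norm: (C, H, W) normalized probability map; mean pixel-wise entropy
    return float(-(p_norm * np.log(p_norm + eps)).sum(axis=0).mean())

def entropy_weights(entropies):
    e = np.asarray(entropies, dtype=float)
    return len(e) * e / e.sum()

rng = np.random.default_rng(1)
probs = []
for C in (5, 80, 2):                         # seg, depth, edge channel counts
    logits = rng.normal(size=(C, 8, 8))
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    probs.append(e / e.sum(axis=0, keepdims=True))

E = [map_entropy(p) for p in probs]          # E_t per task
w = entropy_weights(E)                       # w_t per task
L_tasks = np.array([1.2, 0.7, 0.4])          # illustrative per-task losses L_t
L_mtl_adaptive = float((w * L_tasks).sum())  # L'_mtl = sum_t w_t * L_t
```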
Step four, using the final adaptive multitask loss function L'_mtl to perform back propagation to obtain the parameter gradients of the current multitask learning model M, updating the parameters of the current multitask learning model M with a gradient descent algorithm, and completing one iteration of training;
and step five, after the iterative training is finished, obtaining a new multi-task learning model M1, returning to the step three to carry out the next iteration until the multi-task learning model M1 reaches convergence, and terminating the training.
The invention has the advantages that:
(1) The multitask learning self-adaptive balancing method based on information entropy dynamic weighting adopts a discretization strategy to convert the regression task into a classification task; it can effectively handle different types of tasks and has strong algorithmic applicability.
(2) The method calculates the information entropy from the prediction maps output by the tasks, without changing the structural design of the model or the parameter update process, and is simple, efficient and plug-and-play.
(3) The method dynamically adjusts the weights of the task loss functions based on the information entropy values and adaptively balances the relative importance of each task, thereby improving overall task performance.
(4) The method can effectively extract the general shared features and task-specific features of the model, and quickly and uniformly completes the training of the multitask learning model.
Drawings
FIG. 1 is an overall flow chart of the multi-task learning adaptive balancing method based on information entropy dynamic weighting according to the present invention;
FIG. 2 is a schematic diagram of a multitasking learning model in the present invention;
FIG. 3 is a schematic diagram of the discretization of the regression task in the present invention.
Detailed Description
The following describes in detail a specific implementation method of the present invention with reference to the accompanying drawings and taking a multitask learning network in which semantic segmentation, depth estimation and edge detection are jointly implemented in computer vision as an example.
The invention provides a multitask learning self-adaptive balance method based on information entropy dynamic weighting. In the model training process, the information entropy algorithm can effectively evaluate the prediction result of each task, and the relative weight of the tasks is adjusted through a dynamic weighting strategy, so that the multi-task prediction model focuses more on the tasks with relatively poor performance, and the self-adaptive balance learning of different task performances is realized.
The invention relates to a multitask learning self-adaptive balancing method based on information entropy dynamic weighting, which comprises the following steps as shown in figure 1:
Step one, initializing network parameters and training to obtain the initial multitask learning model.
A multitask learning network model based on a single encoder-multiple decoders is constructed, as shown in fig. 2, specifically:
the encoder contains network parameters that are shared by all tasks and is initialized with a skeletal network (e.g., ResNet) pre-trained on ImageNet. The decoder comprises task-specific network parameters, each task corresponds to one decoder, and a random parameter initialization mode is adopted. In this embodiment, three tasks are set to be solved: semantic segmentation, depth estimation and edge detection, the multi-task learning model comprises a shared encoder and three task-specific decoders.
After the three task outputs pass through their respective decoders, three cross entropy losses L_1, L_2 and L_3 are obtained; with the corresponding relative weights w_1, w_2 and w_3 of each task, the multitask loss function L_mtl is obtained by weighted summation:

L_mtl = w_1·L_1 + w_2·L_2 + w_3·L_3
Step two, performing model inference and normalization processing on the input image through the multitask learning model to obtain different types of normalized probability maps.

After the multitask learning model performs model inference on the input image, three pixel-level task outputs are generated: a semantic segmentation output map P_s, a depth estimation output map P_d, and an edge detection output map P_b. Each task output map is normalized to obtain a different type of normalized probability map, specifically:
1) The semantic segmentation output map P_s is processed with a softmax function to obtain the normalized multi-class semantic segmentation probability map:

P'_{s,i} = exp(P_{s,i}) / Σ_{j=1}^{S} exp(P_{s,j})

where S is the total number of semantic segmentation categories, i denotes the i-th semantic category in the prediction map, P_{s,i} is the i-th layer of the model output map P_s, and P'_{s,i} is the i-th layer of the normalized semantic segmentation probability map.
2) The edge detection output map P_b is processed with a sigmoid function (equivalent to a binary softmax) to obtain the normalized edge detection probability map P'_b:

P'_b = 1 / (1 + exp(-P_b))
3) The depth estimation output map P_d: the depth regression task is converted into a classification task by a logarithmic space discretization strategy, and the normalized depth classification probability map is obtained with a softmax function.
First, as shown in fig. 3, a logarithmic space discretization strategy is adopted to discretize the depth values of the continuous space into K sub-intervals corresponding to K categories. Specifically, the depth value interval [D_1, D_2] is mapped to [D_1+1, D_2+1], denoted [D'_1, D'_2], and divided according to discretized depth thresholds d_k into K sub-intervals {[d_0, d_1], [d_1, d_2], ..., [d_{K-1}, d_K]}.

The discretized depth threshold d_k is defined as:

d_k = exp( log D'_1 + (k/K) · log(D'_2 / D'_1) ),  k = 0, 1, ..., K
Then, the depth estimation ground truth is discretized into depth classification ground truth according to this strategy: when a depth ground-truth value falls in (d_{k-1}, d_k], it is assigned class k, and the depth task branch is trained with the depth classification ground truth.
Finally, obtaining a depth classification prediction map in a training stage, and processing by adopting a softmax function to obtain a normalized depth classification probability map P'd,k;
The depth classification probability map is:
wherein K is the total class number of the depth classification, K represents the kth depth class, Pd,kRepresents a k-th layer depth classification prediction map, P'd,kRepresenting a normalized k-th layer depth classification probability map.
In the embodiment of the present invention, the discretization of the depth estimation is performed with K = 80. The supervision ground truth of the depth branch is in classification form, so the depth estimation task is trained directly in the form of depth classification.
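The log-space discretization with K = 80 can be sketched as follows (the threshold formula is the reconstructed form d_k = exp(log D'_1 + (k/K)·log(D'_2/D'_1)); the depth range [0, 10] is an illustrative assumption):

```python
import numpy as np

def depth_thresholds(D1, D2, K):
    """K+1 log-spaced thresholds over the shifted interval [D1+1, D2+1]."""
    D1p, D2p = D1 + 1.0, D2 + 1.0
    k = np.arange(K + 1)
    return np.exp(np.log(D1p) + k * np.log(D2p / D1p) / K)

def discretize_depth(depth, D1, D2, K):
    """Map continuous ground-truth depths to class indices 1..K."""
    d = depth_thresholds(D1, D2, K)
    # digitize(..., right=True) returns k such that d[k-1] < x <= d[k]
    return np.clip(np.digitize(depth + 1.0, d, right=True), 1, K)

d = depth_thresholds(0.0, 10.0, 80)
labels = discretize_depth(np.array([0.0, 5.0, 10.0]), 0.0, 10.0, 80)
```

Log-spaced bins are finer at small depths, matching the fact that near-range depth errors matter more perceptually than far-range ones.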
Step three, performing preliminary training on the multi-task learning model;
because the result error of each task predicted by the initialization model is large and unstable, the multi-task network model needs to be initially trained, specifically:
First, the loss corresponding to each type of normalized probability map is calculated with a cross entropy function:

L_t = - Σ_{i=1}^{C} y_{t,i} log(P'_{t,i})

where y_t is the one-hot supervision label corresponding to each task; t corresponds to each task in step one and can be s, d or b, i.e. P'_t is the normalized probability map of the semantic segmentation, edge detection or depth estimation task; C is the total number of categories for each task, and i denotes the i-th category in the prediction map.
Next, the equal-weight summed multitask loss function L_mtl is constructed:

L_mtl = L_s + L_d + L_b

During the preliminary training, the loss function of each task is given an equal fixed weight.
Then, the multitask loss function L_mtl is used for gradient back-propagation and parameter updating of the network model; after a certain number of iterations, the trained multitask learning model can perform preliminary task prediction.
Step four, on the basis of the multitask learning model obtained through preliminary training, constructing an adaptive multitask loss function with the information entropy dynamic weighting algorithm, and further optimizing and training the multitask learning model.
The method specifically comprises the following steps:
First, the information entropy E_t of each task is calculated from its multi-layer probability map over all categories:

E_t = - (1/(W·H)) Σ_{w=1}^{W} Σ_{h=1}^{H} Σ_{c=1}^{C} P'_{t,c}(w, h) · log P'_{t,c}(w, h)

where w and h are the row and column coordinates of the probability map, W and H are the maximum row and column lengths, c is the channel index of the probability map, and C is the total number of categories for each task;
Then, the relative weight w_t of each task is assigned from the information entropy values. The information entropy reflects the uncertainty of the prediction probability map, so the information entropy of each task's output probability map can be used to allocate the relative weights:

w_t = E_t / Σ_{t'} E_{t'}

The worse a task's prediction result, the higher the uncertainty of its output probability map and the larger the corresponding information entropy value. Therefore, a task with poor prediction performance is assigned a larger weight, so that the model focuses more on training that task.
Finally, according to the relative weight of each task and the cross entropy loss function L_t, the overall adaptive multitask loss function is constructed by weighted summation.

The overall adaptive multitask loss function L'_mtl is:

L'_mtl = Σ_t w_t L_t
Step five, using the overall adaptive multitask loss function L'_mtl to perform back propagation to obtain the model parameter gradients, then updating the model parameters with a gradient descent algorithm to complete one iteration of training;
Step six, after the model parameters are updated, a new multitask learning model is obtained. Return to step four for the next iteration until the multitask learning model converges, then terminate training.
After each network parameter is updated, the prediction performance of each task changes, so that the corresponding relative weight also changes dynamically, and the adaptive adjustment of the loss function in the network model training is realized.
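The overall iteration (steps four to six, repeated until convergence) can be sketched with a toy stand-in for the network: one scalar "parameter" per task, a quadratic per-task loss, and the parameter magnitude as a proxy for the map entropy — all illustrative placeholders, not the patent's model. The sketch shows only the control flow: recompute entropies, recompute weights, then take one gradient-descent step with the weights held fixed:

```python
import numpy as np

rng = np.random.default_rng(2)
params = rng.normal(size=3)                  # one scalar per task (toy "network")
history = []
for it in range(50):
    task_losses = params ** 2                # stand-in for the per-task losses L_t
    entropies = np.abs(params) + 1e-3        # stand-in for the entropies E_t
    w = len(entropies) * entropies / entropies.sum()  # weights grow with entropy
    grad = 2 * w * params                    # gradient of sum_t w_t * L_t (w fixed)
    params = params - 0.05 * grad            # gradient descent update
    history.append(float((w * task_losses).sum()))
```

Because the weights are recomputed from the (detached) outputs each iteration rather than differentiated through, the weighting adapts as task performance changes without altering the backward pass itself.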
The above embodiment describes only three specific tasks (semantic segmentation, depth estimation and edge detection), but the method of the present invention is not limited to these tasks; it can also be applied to other tasks and to cases with more than three tasks, with the multitask learning model adjusted according to the actual situation. Cases involving other tasks, or more than three tasks, fall within the scope of the technical problem solved by the present invention.
Claims (5)
1. A multitask learning self-adaptive balancing method based on information entropy dynamic weighting is characterized by comprising the following steps:
firstly, building an initial multi-task learning model M, deducing an input image through the multi-task learning model M to obtain different types of outputs of different tasks, and respectively carrying out normalization processing to obtain normalized probability maps corresponding to the different tasks;
then, calculating a multitask loss function by using each normalized probability graph, and performing primary training on the multitask learning model M through the multitask loss function;
finally, on the basis of the initially trained multi-task learning model M, a final self-adaptive multi-task loss function is constructed through an information entropy dynamic weighting algorithm, the parameter gradient of the current multi-task learning model M is obtained through a back propagation algorithm, parameter updating is carried out, and one-time iterative training is completed;
after iterative training, a new multi-task learning model M1 is obtained, the input image is deduced and normalized again, the next iteration is carried out by using the self-adaptive multi-task loss function until the multi-task learning model M1 converges, and the training is terminated.
2. The adaptive balancing method for multitask learning based on information entropy dynamic weighting as claimed in claim 1, wherein the multitask learning model comprises a shared encoder and a decoder corresponding to each specific task.
3. The multitask learning self-adaptive balancing method based on information entropy dynamic weighting as claimed in claim 1, characterized in that the three task output maps are: a semantic segmentation output map P_s, a depth estimation output map P_d and an edge detection output map P_b; the corresponding normalized probability maps are:
1) the semantic segmentation output map P_s is processed with a softmax function to obtain the normalized semantic segmentation probability map:

P'_{s,i} = exp(P_{s,i}) / Σ_{j=1}^{S} exp(P_{s,j})

where S is the total number of semantic segmentation categories, i denotes the i-th semantic category in the prediction map, P_{s,i} is the i-th layer of the model output map P_s, and P'_{s,i} is the i-th layer of the normalized semantic segmentation probability map P'_s;
2) the edge detection output map P_b is processed with a sigmoid function to obtain the normalized edge detection probability map P'_b:

P'_b = 1 / (1 + exp(-P_b))
3) the depth estimation output map P_d: the depth regression task is converted into a classification task by a logarithmic space discretization strategy, and the normalized depth classification probability map is obtained with a softmax function;
first, a logarithmic space discretization strategy is adopted to discretize the depth values of the continuous space into K sub-intervals corresponding to K categories, specifically: the depth value interval [D_1, D_2] is mapped to [D_1+1, D_2+1], denoted [D'_1, D'_2], and divided according to discretized depth thresholds d_k into K sub-intervals {[d_0, d_1], [d_1, d_2], ..., [d_{K-1}, d_K]};

the discretized depth threshold d_k is defined as:

d_k = exp( log D'_1 + (k/K) · log(D'_2 / D'_1) ),  k = 0, 1, ..., K
then, the depth estimation ground truth is discretized into depth classification ground truth according to this strategy: when a depth ground-truth value falls in (d_{k-1}, d_k], it is assigned class k, and the depth task branch is trained with the depth classification ground truth;
finally, the depth classification prediction map obtained in the training stage is processed with a softmax function to obtain the normalized depth classification probability map P'_{d,k}:

P'_{d,k} = exp(P_{d,k}) / Σ_{j=1}^{K} exp(P_{d,j})

where K is the total number of depth classes, k denotes the k-th depth class, P_{d,k} is the k-th layer of the depth classification prediction map, and P'_{d,k} is the k-th layer of the normalized depth classification probability map.
4. The information entropy dynamic weighting-based multitask learning self-adaptive balancing method according to claim 1, wherein the specific process of calculating the multitask loss function and initially training the multitask learning model is as follows:
first, the loss corresponding to each type of normalized probability map is calculated with a cross entropy function;

the cross entropy loss function L_t is:

L_t = - Σ_{i=1}^{C} y_{t,i} log(P'_{t,i})

where y_t is the one-hot supervision label corresponding to each task; t is s, d or b, i.e. P'_t is the normalized probability map of the semantic segmentation, edge detection or depth estimation task; C is the total number of categories for each task, and i denotes the i-th layer category in the prediction map;
then, the equal-weight summed multitask loss function L_mtl is constructed with a fixed weight for each task:

L_mtl = L_s + L_d + L_b

finally, the multitask loss function L_mtl is used for gradient back-propagation and parameter updating of the network model, and iterative training yields the preliminarily trained multitask learning model.
5. The information entropy dynamic weighting-based multitask learning adaptive balancing method according to claim 1, wherein the specific process for constructing the final adaptive multitask loss function is as follows:
step 501, calculating the information entropy E_t of each task from its multi-layer probability map over all categories:

E_t = - (1/(W·H)) Σ_{w=1}^{W} Σ_{h=1}^{H} Σ_{c=1}^{C} P'_{t,c}(w, h) · log P'_{t,c}(w, h)

where w and h are the row and column coordinates of the probability map, W and H are the maximum row and column lengths, c is the channel index of the probability map, and C is the total number of categories for each task;
step 502, assigning the relative weight w_t of each task by using the information entropy values;

The relative weight w_t is:

w_t = E_t / ∑_t E_t
step 503, constructing the final adaptive multitask loss function by means of weighted summation, according to the relative weight of each task and the cross entropy loss function L_t;

The final adaptive multitask loss function L'_mtl is:

L'_mtl = -∑_t w_t · L_t.
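The entropy-weighted loss of claim 5 can be sketched as below. The proportional normalization w_t = E_t / ∑ E_t and the positive-loss sign convention are assumptions for illustration; the claim itself only states that the relative weights are assigned from the entropy values:

```python
import numpy as np

def task_entropy(prob, eps=1e-12):
    # Information entropy of a task's normalized probability map,
    # summed over all channels c and pixel coordinates (w, h).
    return -np.sum(prob * np.log(prob + eps))

def entropy_weights(probs):
    # Assign each task a relative weight proportional to its entropy,
    # normalized so the weights sum to 1 (assumed normalization).
    ents = {t: task_entropy(p) for t, p in probs.items()}
    total = sum(ents.values())
    return {t: e / total for t, e in ents.items()}

def adaptive_multitask_loss(probs, labels, eps=1e-12):
    # Weighted sum of the per-task cross entropies (positive-loss sign).
    w = entropy_weights(probs)
    return sum(w[t] * -np.sum(labels[t] * np.log(probs[t] + eps))
               for t in probs)

# Toy example: uniform probability maps give equal entropies,
# hence equal relative weights.
onehot = np.zeros((2, 2, 2)); onehot[0] = 1.0
probs = {'s': np.full((2, 2, 2), 0.5), 'd': np.full((2, 2, 2), 0.5)}
labels = {'s': onehot, 'd': onehot}
w = entropy_weights(probs)
```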
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110820646.4A CN113537365B (en) | 2021-07-20 | 2021-07-20 | Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113537365A true CN113537365A (en) | 2021-10-22 |
CN113537365B CN113537365B (en) | 2024-02-06 |
Family
ID=78100520
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110820646.4A Active CN113537365B (en) | 2021-07-20 | 2021-07-20 | Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113537365B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107451620A (en) * | 2017-08-11 | 2017-12-08 | 深圳市唯特视科技有限公司 | A kind of scene understanding method based on multi-task learning |
CN110837836A (en) * | 2019-11-05 | 2020-02-25 | 中国科学技术大学 | Semi-supervised semantic segmentation method based on maximized confidence |
Non-Patent Citations (2)
Title |
---|
Y. Wang, et al.: "Boundary-aware multitask learning for remote sensing imagery", IEEE * |
Zhang Lei; Cao Yueyun; Li Bin; Cui Jialin: "Research on operational effectiveness evaluation of ship power system based on combination weighting method", Ship Science and Technology, no. 03 |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023097616A1 (en) * | 2021-12-02 | 2023-06-08 | Intel Corporation | Apparatus, method, device and medium for loss balancing in multi-task learning |
CN114714146A (en) * | 2022-04-08 | 2022-07-08 | 北京理工大学 | Method for simultaneously predicting surface roughness and cutter abrasion |
CN117273068A (en) * | 2023-09-28 | 2023-12-22 | 东南大学 | Model initialization method based on linearly expandable learning genes |
CN117273068B (en) * | 2023-09-28 | 2024-04-16 | 东南大学 | Model initialization method based on linearly expandable learning genes |
Also Published As
Publication number | Publication date |
---|---|
CN113537365B (en) | 2024-02-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113537365B (en) | Information entropy dynamic weighting-based multi-task learning self-adaptive balancing method | |
CN109948029B (en) | Neural network self-adaptive depth Hash image searching method | |
CN106600059B (en) | Intelligent power grid short-term load prediction method based on improved RBF neural network | |
Kim et al. | SplitNet: Learning to semantically split deep networks for parameter reduction and model parallelization | |
CN108985515B (en) | New energy output prediction method and system based on independent cyclic neural network | |
CN114022693B (en) | Single-cell RNA-seq data clustering method based on double self-supervision | |
Kamruzzaman et al. | Medical diagnosis using neural network | |
CN113554156B (en) | Multitask image processing method based on attention mechanism and deformable convolution | |
CN106897744A (en) | A kind of self adaptation sets the method and system of depth confidence network parameter | |
CN113722980A (en) | Ocean wave height prediction method, system, computer equipment, storage medium and terminal | |
CN114819143A (en) | Model compression method suitable for communication network field maintenance | |
CN115204035A (en) | Generator set operation parameter prediction method and device based on multi-scale time sequence data fusion model and storage medium | |
CN111353534B (en) | Graph data category prediction method based on adaptive fractional order gradient | |
Moriya et al. | Evolution-strategy-based automation of system development for high-performance speech recognition | |
CN114202021A (en) | Knowledge distillation-based efficient image classification method and system | |
CN111753995A (en) | Local interpretable method based on gradient lifting tree | |
CN116451859A (en) | Bayesian optimization-based stock prediction method for generating countermeasure network | |
CN116415177A (en) | Classifier parameter identification method based on extreme learning machine | |
CN113408610B (en) | Image identification method based on adaptive matrix iteration extreme learning machine | |
CN115906959A (en) | Parameter training method of neural network model based on DE-BP algorithm | |
CN113807005A (en) | Bearing residual life prediction method based on improved FPA-DBN | |
CN113408602A (en) | Tree process neural network initialization method | |
KR20210157826A (en) | Method for sturcture learning and model compression for deep neural netwrok | |
CN113033495B (en) | Weak supervision behavior identification method based on k-means algorithm | |
US20220343162A1 (en) | Method for structure learning and model compression for deep neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||