WO2023085610A1 - Method and electronic device for performing multi-task model learning - Google Patents

Method and electronic device for performing multi-task model learning

Info

Publication number
WO2023085610A1
WO2023085610A1 PCT/KR2022/014988 KR2022014988W WO2023085610A1 WO 2023085610 A1 WO2023085610 A1 WO 2023085610A1 KR 2022014988 W KR2022014988 W KR 2022014988W WO 2023085610 A1 WO2023085610 A1 WO 2023085610A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
model
learning
accuracy
electronic device
Prior art date
Application number
PCT/KR2022/014988
Other languages
English (en)
Korean (ko)
Inventor
조한주
윤하영
Original Assignee
삼성전자 주식회사 (Samsung Electronics Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020220115795A (KR20230068989A)
Application filed by 삼성전자 주식회사 (Samsung Electronics Co., Ltd.)
Publication of WO2023085610A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects

Definitions

  • a method and electronic device for performing multi-task model learning are described.
  • although neural network models that perform each of various vision tasks can achieve an excellent level of accuracy, running several such models on an embedded device may cause problems depending on the device's performance and resources. Accordingly, interest in a multi-task model that can perform multiple tasks efficiently and quickly is growing.
  • a method for performing learning of a multi-task model is provided.
  • the method of performing learning of the multi-task model includes setting, based on the intrinsic accuracy of each task of the multi-task model and the accuracy according to the learning degree of each task, a task weight corresponding to each task in a loss function of the multi-task model, the loss function corresponding to a weighted sum obtained by applying the task weight corresponding to each task to the loss of each task of the multi-task model.
  • the method of performing learning of the multi-task model includes performing learning of the multi-task model by using a loss function of the multi-task model.
  • an electronic device performing learning of a multi-task model includes a memory storing one or more instructions and a processor executing the one or more instructions stored in the memory.
  • the processor sets, based on the intrinsic accuracy of each task of the multi-task model and the accuracy according to the learning degree of each task, the task weight that is applied to the loss of each task in the loss function of the multi-task model.
  • the processor performs learning of the multi-task model using the loss function of the multi-task model by executing the one or more instructions.
  • FIG. 1 is a diagram for explaining a multi-task model.
  • FIG. 2 is a flowchart illustrating a method of performing learning of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 3 is a detailed flowchart of a step of setting a task weight corresponding to each task of a loss function of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 4 is a detailed flowchart of a step of obtaining accuracy of a single-task teaching model corresponding to each task of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 5 is a diagram for explaining a process of performing learning of a single-task teaching model corresponding to each task of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 6 is a detailed flowchart of a step of performing learning of a multi-task model using a loss function of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 7 is a diagram for explaining an example of loss of each task in a process of performing self-learning of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 8 is a diagram for explaining an example of the loss of each task when a task having a ground-truth value is included in the process of performing self-learning of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 9 is a diagram for explaining a result of learning a vision model that performs a multi-task according to a method for learning a multi-task model according to an embodiment of the present disclosure.
  • FIG. 10 is a block diagram illustrating the configuration of an electronic device according to an embodiment of the present disclosure.
  • FIG. 11 is a diagram illustrating an example of an electronic device according to an embodiment of the present disclosure.
  • FIG. 1 is a diagram for explaining a multi-task model.
  • the electronic device 100 may include a mobile device such as a smart phone, smart glasses, a wearable device, a personal digital assistant (PDA), or a mobile scanner, but is not limited thereto.
  • the electronic device 100 may be loaded with various types of neural network models.
  • the electronic device 100 may include at least one neural network model such as a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), or a bidirectional recurrent deep neural network (BRDNN), and these may be used in combination.
  • when the electronic device 100 is an embedded device loaded with various neural network models, problems may occur depending on the resources and performance of the electronic device 100. Accordingly, as shown in FIG. 1, a multi-task model may be loaded into the electronic device 100 so that the tasks otherwise processed by various separate neural network models can be performed efficiently and quickly.
  • the multi-task model may be in the form of a neural network composed of a plurality of layers.
  • the multi-task model may be composed of shared backbone layers and additional layers for each task, each of which takes an output of the shared backbone as an input.
  • the multi-task model may include a feature extractor that extracts features necessary for prediction and a predictor that performs prediction using the features extracted therefrom.
  • in the present disclosure, the multi-task model is described using, as an example, a vision model that performs tasks in the computer vision field; however, the type or function of the multi-task model is not limited thereto, and it may be an artificial intelligence model used in various fields such as speech recognition and natural language processing.
  • because the multi-task model must be trained so that all tasks achieve even performance without being biased toward a specific task, the choice of loss function and training data set is very important.
  • the loss function of the multi-task model may be defined as a weighted sum obtained by applying a task weight corresponding to each task to the loss of each task, as in Equation 1 below (reconstructed here in standard notation from the surrounding description, with $w_i$ the task weight of task $i$, $\mathcal{L}_i$ the loss of task $i$, and $N$ the number of tasks):

[Equation 1] $\mathcal{L}_{total} = \sum_{i=1}^{N} w_i \mathcal{L}_i$
  • the accuracy of each task is greatly influenced by how the task weight corresponding to each task is set.
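As a concrete illustration of Equation 1, the following is a minimal Python sketch of a task-weighted multi-task loss. The task names and per-task criterions are hypothetical placeholders, not taken from the patent:

```python
import torch
import torch.nn as nn

# Hypothetical per-task criterions for a vision multi-task model.
criterions = {
    "detection": nn.CrossEntropyLoss(),
    "segmentation": nn.CrossEntropyLoss(),
    "depth": nn.L1Loss(),
}

def multi_task_loss(outputs, targets, task_weights):
    """Equation 1: L_total = sum_i w_i * L_i, a weighted sum of per-task losses."""
    total = torch.zeros(())
    for task, criterion in criterions.items():
        total = total + task_weights[task] * criterion(outputs[task], targets[task])
    return total
```

How the weights `task_weights` are set is exactly what the following paragraphs address.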
  • the multi-task model can be learned in consideration of both the characteristics and the degree of learning of each task of the multi-task model, all tasks are learned in a balanced way, and even performance of the multi-task model can be expected.
  • the multi-task data set for learning the multi-task model requires a large amount of inputs (e.g., input images) and annotations for various tasks, and a large-scale data set that satisfies these requirements is realistically difficult to secure.
  • accordingly, in an embodiment of the present disclosure, pseudo labels are generated for the tasks of partially labeled inputs for which labels do not exist, and by securing a multi-task data set in this way, learning of the multi-task model may be performed.
  • FIG. 2 is a flowchart illustrating a method of performing learning of a multi-task model according to an embodiment of the present disclosure.
  • the electronic device 100 sets a task weight corresponding to each task in the loss function of the multi-task model, based on the intrinsic accuracy of each task of the multi-task model and the accuracy according to the learning degree of each task.
  • the loss function of the multi-task model corresponds to a weighted sum obtained by applying a task weight corresponding to each task to the loss of each task of the multi-task model.
  • the intrinsic accuracy of each task of the multi-task model means the expected accuracy of each task when the multi-task model is sufficiently trained.
  • the intrinsic accuracy of each task in the multi-task model may be different for each task.
  • even if the learning process is sufficiently performed for each task, the attainable accuracy may differ depending on the type of task. For example, when a model corresponding to a first task undergoes a sufficient learning process, a highly accurate prediction value for the first task may be output.
  • on the other hand, even when a model corresponding to a second task undergoes a sufficient learning process, if learning is difficult or the difficulty of the task is high, the accuracy of the prediction value output for the second task may not be high.
  • the accuracy according to the learning level of each task of the multi-task model means the current accuracy of each task according to the current learning level in the learning process of the multi-task model. Since the degree of influence by learning may be different for each task, the accuracy according to the learning degree of each task may be different for each task. For example, when the difficulty of the task is low and learning is successful, the accuracy according to the learning degree of the task rapidly converges to the intrinsic accuracy. On the other hand, when learning is not performed well due to the high difficulty of the task, the accuracy according to the learning degree of the task slowly converges to the intrinsic accuracy.
  • because the intrinsic accuracy of each task in the multi-task model differs, if only the current accuracy of each task is considered and a higher task weight is given to tasks with lower current accuracy, balanced learning of all tasks cannot be achieved. For example, for a task with low intrinsic accuracy, even if its current accuracy is already close to its intrinsic accuracy, its current accuracy may still be lower than that of other tasks; giving it a high task weight would then cause only that specific task to be learned, so that all tasks are not learned quickly and in a balanced way.
  • the electronic device 100 according to an embodiment of the present disclosure may acquire the intrinsic accuracy of each task in advance, before learning the multi-task model.
  • the intrinsic accuracy of each task may be set to a predetermined value by an administrator in consideration of the characteristics of each task.
  • alternatively, the intrinsic accuracy of each task may be obtained through a predetermined process.
  • FIG. 3 is a detailed flowchart of a step of setting a task weight corresponding to each task of a loss function of a multi-task model according to an embodiment of the present disclosure.
  • the electronic device 100 may obtain the accuracy of the single-task teaching model corresponding to each task of the multi-task model.
  • Each single-task teaching model may correspond to each task of the multi-task model.
  • the accuracy of the single-task teaching model means its accuracy upon completion of learning. When the loss calculated in the learning process of the single-task teaching model is minimized, the single-task teaching model may be regarded as having completed learning. Alternatively, when the accuracy of the learned single-task teaching model satisfies a predetermined condition, for example, when it converges to a specific value, the single-task teaching model may be regarded as having completed learning.
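One simple way to operationalize the "accuracy converges to a specific value" criterion above is a plateau check over recent validation accuracies; the window size and tolerance in this sketch are illustrative assumptions, not values from the patent:

```python
def training_converged(accuracy_history, window=5, tol=1e-3):
    """Treat learning as complete when the validation accuracy has plateaued,
    i.e. the spread of the last `window` measurements is within `tol`."""
    if len(accuracy_history) < window:
        return False
    recent = accuracy_history[-window:]
    return max(recent) - min(recent) <= tol
```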
  • the electronic device 100 may communicate with at least one external device equipped with a single-task teaching model corresponding to each task of the multi-task model.
  • the electronic device 100 may obtain the accuracy of the single-task teaching model corresponding to each task by receiving the accuracy of the single-task teaching model from at least one external device.
  • the electronic device 100 may load a single-task teaching model corresponding to each task of the multi-task model.
  • the electronic device 100 may obtain the accuracy of the single-task teaching model corresponding to each task by calculating the accuracy of the single-task teaching model corresponding to each task. In this regard, it will be described in detail with reference to FIG. 4 .
  • FIG. 4 is a detailed flowchart of a step of obtaining accuracy of a single-task teaching model corresponding to each task of a multi-task model according to an embodiment of the present disclosure.
  • the electronic device 100 may perform learning of the single-task teaching model corresponding to each task by using a dedicated learning data set corresponding to each task (S410).
  • a dedicated training data set may be prepared for the single-task teaching model corresponding to each task.
  • for example, there may be a dedicated training data set for each task such as object detection, segmentation, depth estimation, and optical flow.
  • Each dedicated training data set is a data set specialized for a task and may be a labeled data set having a much smaller size than a data set required to train a multi-task model.
  • FIG. 5 is a diagram for explaining a process of performing learning of a single-task teaching model corresponding to each task of a multi-task model according to an embodiment of the present disclosure.
  • FIG. 5 shows the process of training single-task teaching model A corresponding to task A of the multi-task model, single-task teaching model B corresponding to task B, and single-task teaching model C corresponding to task C.
  • the number of single-task teaching models is not limited to the example shown in FIG. 5 and may be determined according to the number of tasks of the multi-task model.
  • the single-task teaching model A may be learned using a dedicated learning data set A corresponding to task A.
  • dedicated training data set A contains a comparatively small amount of labeled input data sufficient to learn task A.
  • the electronic device 100 may calculate a loss corresponding to a difference between the actual value and the output A for the single-task teaching model A.
  • the electronic device 100 may perform learning of the single-task teaching model A in a direction in which loss is reduced through backpropagation.
  • the single-task teaching model B may be learned using a dedicated learning data set B corresponding to task B.
  • dedicated training data set B contains a comparatively small amount of labeled input data sufficient to learn task B.
  • the electronic device 100 may calculate a loss corresponding to a difference between the actual value and the output B for the single-task teaching model B.
  • the electronic device 100 may perform learning of the single-task teaching model B in a direction in which loss is reduced through backpropagation.
  • the single-task teaching model C may be learned using a dedicated learning data set C corresponding to task C.
  • dedicated training data set C contains a comparatively small amount of labeled input data sufficient to learn task C.
  • the electronic device 100 may calculate a loss corresponding to a difference between the actual value and the output C for the single-task teaching model C.
  • the electronic device 100 may perform learning of the single-task teaching model C in a direction in which loss is reduced through backpropagation.
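The per-teacher procedure described above (compute the loss between the model's output and the labels of its dedicated data set, then backpropagate so the loss decreases) can be sketched as follows; the optimizer, learning rate, and epoch count are illustrative assumptions:

```python
import torch

def train_single_task_teacher(model, dataloader, criterion, epochs=10, lr=1e-3):
    """Train one single-task teaching model on its dedicated labeled data set."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for inputs, labels in dataloader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)  # difference between output and ground truth
            loss.backward()                          # backpropagation
            optimizer.step()                         # update in the direction that reduces the loss
    return model
```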
  • for example, single-task teaching model A may be an object detection model, single-task teaching model B may be a segmentation model, and single-task teaching model C may be a depth estimation model.
  • the electronic device 100 may determine whether learning of the single-task teaching model corresponding to each task is completed (S420). If it is determined that learning is not completed, the electronic device 100 may repeat the learning of the single-task teaching model until it is completed.
  • the electronic device 100 may consider that learning of single-task teaching model A is completed when the loss calculated in the learning process of single-task teaching model A is minimized.
  • likewise, the electronic device 100 may consider that learning of single-task teaching model B is completed when the loss calculated in its learning process is minimized.
  • the electronic device 100 may consider that learning of single-task teaching model C is completed when the loss calculated in its learning process is minimized.
  • the electronic device 100 may measure the accuracy of the learned single-task teaching model corresponding to each task upon completion of learning (S430). For example, the electronic device 100 may measure the accuracy of single-task teaching model A, the accuracy of single-task teaching model B, and the accuracy of single-task teaching model C, each upon completion of learning.
  • based on the accuracy of the single-task teaching model corresponding to each task and the accuracy according to the learning degree of each task, the electronic device 100 may set the task weight corresponding to each task in the loss function of the multi-task model (S320). That is, the accuracy of the single-task teaching model corresponding to each task is obtained and used as the intrinsic accuracy of that task.
  • the accuracy of the single-task teaching model can be viewed as the potential accuracy that can be expected when the corresponding task is performed by the multi-task model; that is, it defines an upper bound on the accuracy attainable by the multi-task model as learning progresses.
  • a task weight corresponding to each task in the loss function of the multi-task model of Equation 1 may be defined as in Equation 2 below (reconstructed from the surrounding description, with $A_i$ the intrinsic accuracy of task $i$ and $a_i$ the accuracy of task $i$ according to its learning degree):

[Equation 2] $w_i \propto A_i - a_i \quad (1)$

  • the electronic device 100 may set the task weight corresponding to each task based on the difference between the intrinsic accuracy $A_i$ of each task and the accuracy $a_i$ according to the learning degree of each task. The intrinsic accuracy $A_i$ of each task may correspond to the accuracy of the single-task teaching model for that task.
  • the electronic device 100 may adjust the task weight corresponding to each task in proportion to the difference in accuracy.
  • as learning progresses and each $a_i$ approaches $A_i$, the accuracy differences shrink, so the task weight of every task decreases, which has the same effect as reducing the learning rate: the update amount of the multi-task model decreases, and the model may no longer be trained and fine-tuned sufficiently. To prevent this underfitting, the first equation (1) of Equation 2 can be normalized as the second equation (2) of Equation 2 so that the sum of the task weights equals the number of tasks $N$:

$w_i = N \cdot \dfrac{A_i - a_i}{\sum_{j=1}^{N} (A_j - a_j)} \quad (2)$

  • furthermore, the second equation (2) of Equation 2 can be transformed by adding a slight margin $m$, as in the third equation (3) of Equation 2, so that the multi-task model can outperform the single-task models (the exact placement of the margin is a reconstruction from the description):

$w_i = N \cdot \dfrac{A_i + m - a_i}{\sum_{j=1}^{N} (A_j + m - a_j)} \quad (3)$
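A minimal sketch of the normalized, margin-augmented difference weighting of Equation 2, using the reconstruction above ($A_i$ intrinsic accuracy, $a_i$ current accuracy, margin $m$); the margin value is an illustrative assumption:

```python
def difference_task_weights(intrinsic_acc, current_acc, margin=0.01):
    """Equation 2: w_i proportional to (A_i + m - a_i), normalized so that the
    weights sum to the number of tasks (preserving the effective learning rate)."""
    gaps = {t: intrinsic_acc[t] + margin - current_acc[t] for t in intrinsic_acc}
    n, total = len(gaps), sum(gaps.values())
    return {t: n * g / total for t, g in gaps.items()}
```

For example, with intrinsic accuracies {A: 0.9, B: 0.7} and current accuracies {A: 0.85, B: 0.4}, task B, which is farther from its attainable accuracy, receives the larger weight.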
  • alternatively, a task weight corresponding to each task in the loss function of the multi-task model of Equation 1 may be defined as in Equation 3 below (again reconstructed from the surrounding description):

[Equation 3] $w_i \propto \left( \dfrac{a_i}{A_i} \right)^{-1} \quad (1)$

  • the electronic device 100 may set the task weight corresponding to each task based on the ratio $a_i / A_i$ of the accuracy according to the learning degree of each task to the intrinsic accuracy of each task. The intrinsic accuracy $A_i$ may correspond to the accuracy of the single-task teaching model for each task, and the ratio $a_i / A_i$ means the relative learning rate according to the learning level of each task.
  • the electronic device 100 may adjust the task weight corresponding to each task in inverse proportion to this ratio: it may increase the task weight of a task whose relative learning rate is low, and decrease the task weight of a task whose relative learning rate is high.
  • as with Equation 2, the first equation (1) of Equation 3 can be normalized as the second equation (2) of Equation 3 so that the task weights sum to the number of tasks:

$w_i = N \cdot \dfrac{A_i / a_i}{\sum_{j=1}^{N} (A_j / a_j)} \quad (2)$
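A corresponding sketch of the ratio-based weighting of Equation 3, under the same reconstruction; the small epsilon guarding against division by zero early in training is an implementation assumption, not part of the patent:

```python
def ratio_task_weights(intrinsic_acc, current_acc, eps=1e-6):
    """Equation 3: w_i inversely proportional to the relative learning rate
    a_i / A_i, normalized so that the weights sum to the number of tasks."""
    inv = {t: intrinsic_acc[t] / max(current_acc[t], eps) for t in intrinsic_acc}
    n, total = len(inv), sum(inv.values())
    return {t: n * v / total for t, v in inv.items()}
```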
  • the task weight of each task is not limited to the equations shown in Equations 2 and 3, and can be modified in various forms.
  • the electronic device 100 performs learning of the multi-task model using the loss function of the multi-task model (S220).
  • to train a multi-task model, a large-scale multi-task training data set is required, but annotating a large amount of input data for a plurality of tasks is very costly in terms of time and money.
  • an example of performing learning of a multi-task model using a partially labeled multi-task learning data set will be described in detail with reference to FIGS. 6 to 8 .
  • FIG. 6 is a detailed flowchart of a step of performing learning of a multi-task model using a loss function of a multi-task model according to an embodiment of the present disclosure.
  • the electronic device 100 may input the augmented multi-task learning data set to a single-task teaching model and a multi-task model corresponding to each task (S610).
  • the multi-task learning data set may be a partially labeled input data set.
  • the electronic device 100 may input the augmented multi-task learning data set to not only the multi-task model but also pre-learned single-task teaching models. For example, when the multi-task learning data set is images, augmentation of the images may be performed through transformation such as rotation, flipping, or shifting. Augmented images may be equally input to a single-task teaching model and a multi-task model corresponding to each task.
  • the single-task teaching model corresponding to each task can generate pseudo labels for tasks for which labels do not exist.
  • the output of the single-task teaching model corresponding to each task may be generated as a pseudo label represented in the form of a probability value according to a soft label method. If a hard label method were used instead, an incorrect ground truth could be generated from the output of the single-task teaching model, and such erroneously generated values can greatly affect the self-learning and accuracy of the multi-task model. To minimize this influence, in one embodiment of the present disclosure the output of the single-task teaching model follows the soft label method. For example, when the multi-task model is a vision model whose tasks are image classification, object detection, and segmentation, pseudo labels are generated according to the soft label method not only for image classification but also for object detection and segmentation.
  • in other words, by inputting the augmented multi-task learning data not only to the multi-task model but also to the single-task teaching model corresponding to each task, pseudo labels can be created dynamically.
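The pseudo-labeling step just described can be sketched as follows: the same augmented batch is fed to every pre-trained single-task teacher, and each teacher's probability-valued output is kept as the soft pseudo label for its task. The flip-only augmentation and the softmax (appropriate for classification-style outputs) are illustrative assumptions:

```python
import torch

def augment(images):
    """Example augmentation: random horizontal flip (rotation or shifting would be analogous)."""
    return torch.flip(images, dims=[-1]) if torch.rand(()) < 0.5 else images

def make_soft_pseudo_labels(images, teachers):
    """Apply one shared augmentation, then collect a soft pseudo label per task."""
    aug = augment(images)
    with torch.no_grad():  # teachers are frozen; no gradients needed
        pseudo = {task: teacher(aug).softmax(dim=1) for task, teacher in teachers.items()}
    return aug, pseudo
```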
  • the electronic device 100 may perform self-learning of the multi-task model so that the weighted sum, obtained by applying the task weight corresponding to each task to each task loss corresponding to the difference between the output of the single-task teaching model for each task and the corresponding output of each task of the multi-task model, is minimized (S620).
  • Each task loss may be defined differently depending on whether there is an actual value of the task.
  • in the case of a task without an actual value (ground truth), the task loss may be determined based on the difference between the pseudo label output by the single-task teaching model corresponding to the task and the output of the multi-task model corresponding to the task.
  • in the case of a task with an actual value, the task loss may be determined based on both the difference between the actual value and the output of the multi-task model corresponding to the task, and the difference between the pseudo label output by the single-task teaching model and that output.
  • FIG. 7 is a diagram for explaining an example of loss of each task in a process of performing self-learning of a multi-task model according to an embodiment of the present disclosure.
  • the electronic device 100 augments a partially labeled multi-task learning data set and inputs the augmented multi-task learning data into a multi-task model and a single-task model corresponding to each task.
  • the multi-task model outputs output A, output B, and output C as prediction values corresponding to each task, and the single-task model corresponding to each task outputs pseudo label A, pseudo label B, and pseudo label C, respectively.
  • the electronic device 100 can calculate the loss function $\mathcal{L}_{total}$ of the multi-task model from the task loss $\mathcal{L}_i$ of each task, corresponding to the difference between the output of the single-task model for that task and the corresponding output of the multi-task model, and the task weight $w_i$ of each task.
  • through backpropagation, the electronic device 100 can perform self-learning of the multi-task model so that the loss function $\mathcal{L}_{total}$ is minimized.
  • in the loss function $\mathcal{L}_{total}$ of the multi-task model, the task weight $w_i$ of each task may be defined based on the accuracy of the single-task teaching model corresponding to each task, as discussed above in the description of FIGS. 3 to 5.
  • the task loss corresponding to each task can be defined as in Equation 4 below (reconstructed from the surrounding description, with $\hat{y}_i$ the pseudo label output by the single-task teaching model for task $i$, $y_i$ the output of the multi-task model for task $i$, and $D(\cdot,\cdot)$ a difference measure):

[Equation 4] $\mathcal{L}_{total} = \sum_{i=1}^{N} w_i \mathcal{L}_i, \qquad \mathcal{L}_i = D(\hat{y}_i, y_i)$

  • that is, each task loss is the difference between the pseudo label output by the single-task teaching model corresponding to the task and the output of the multi-task model for that task.
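A sketch of the self-learning objective of Equation 4, treating each teacher's soft pseudo label as the target; choosing KL divergence as the difference measure $D$ is an assumption for classification-style tasks, not something the patent fixes:

```python
import torch.nn.functional as F

def self_training_loss(student_outputs, pseudo_labels, task_weights):
    """Equation 4: weighted sum of per-task differences between the multi-task
    model's outputs and the teachers' soft pseudo labels."""
    total = 0.0
    for task, pseudo in pseudo_labels.items():
        log_probs = F.log_softmax(student_outputs[task], dim=1)
        total = total + task_weights[task] * F.kl_div(log_probs, pseudo, reduction="batchmean")
    return total
```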
  • FIG. 8 is a diagram for explaining an example of the loss of each task when a task having a ground-truth value is included in the process of performing self-learning of a multi-task model according to an embodiment of the present disclosure.
  • the electronic device 100 augments a partially labeled multi-task learning data set and inputs the augmented multi-task learning data into a multi-task model and a single-task model corresponding to each task.
  • task A and task C have actual values of label A and label C, respectively, while task B has no actual value.
  • the multi-task model outputs output A, output B, and output C as prediction values corresponding to each task, and the single-task model corresponding to each task outputs pseudo label A, pseudo label B, and pseudo label C.
  • the electronic device 100 can calculate the loss function $\mathcal{L}_{total}$ of the multi-task model from the task loss $\mathcal{L}_i$ of each task, corresponding to the sum of the difference $D(\hat{y}_i, y_i)$ between the output of the single-task model for that task and the corresponding output of the multi-task model and, where an actual value exists, the difference $D(y_i^{gt}, y_i)$ between the actual value and that output, together with the task weight $w_i$ of each task.
  • through backpropagation, the electronic device 100 can perform self-learning of the multi-task model so that the loss function $\mathcal{L}_{total}$ is minimized.
  • in the loss function $\mathcal{L}_{total}$ of the multi-task model, the task weight $w_i$ of each task may be defined based on the accuracy of the single-task teaching model corresponding to each task, as discussed above in the description of FIGS. 3 to 5.
  • the task loss corresponding to each task can be defined as in Equation 5 below (reconstructed with the same notation as Equation 4, where $y_i^{gt}$ denotes the actual value of task $i$ when it exists):

[Equation 5] $\mathcal{L}_i = \begin{cases} D(\hat{y}_i, y_i), & \text{if task } i \text{ has no actual value} \\ D(\hat{y}_i, y_i) + D(y_i^{gt}, y_i), & \text{if task } i \text{ has an actual value} \end{cases}$

  • in the case of a task without an actual value, the task loss is determined by the difference between the pseudo label output by the single-task teaching model and the output of the multi-task model for that task. In the case of a task with an actual value, the task loss is determined by the sum of that difference and the difference between the actual value and the output of the multi-task model for that task.
  • comparing Equation 4 and Equation 5, it can be seen that, for a task having an actual value among the tasks of the multi-task model, self-learning of the multi-task model is performed so that the weighted sum obtained by applying the task weight of that task to the additional term $D(y_i^{gt}, y_i)$, the difference between the actual value and the output of that task, is also minimized.
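Extending the previous sketch to Equation 5: when a task has an actual value, a supervised term is added to the pseudo-label term. The use of cross-entropy for the supervised term is an illustrative assumption:

```python
import torch.nn.functional as F

def partially_supervised_loss(student_outputs, pseudo_labels, ground_truths, task_weights):
    """Equation 5: tasks with an actual value add the supervised difference
    D(gt, output) to the pseudo-label difference D(pseudo, output)."""
    total = 0.0
    for task, pseudo in pseudo_labels.items():
        out = student_outputs[task]
        loss = F.kl_div(F.log_softmax(out, dim=1), pseudo, reduction="batchmean")
        if task in ground_truths:  # the partially labeled case of FIG. 8
            loss = loss + F.cross_entropy(out, ground_truths[task])
        total = total + task_weights[task] * loss
    return total
```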
  • FIG. 9 is a diagram for explaining a result of learning a vision model that performs a multi-task according to a method for learning a multi-task model according to an embodiment of the present disclosure.
  • FIG. 9 shows the results of learning according to a method for learning a multi-task model according to an embodiment (PAL, potential-aware multi-task loss) and according to a method (PAL-ST, potential-aware multi-task loss with self-training) in which self-learning of the multi-task model is further performed using pseudo labels generated by the soft label method.
  • compared to other methods, PAL and PAL-ST achieve relatively high accuracy because all tasks are learned in a balanced manner. PAL-ST shows better learning results than PAL by additionally performing self-learning of the multi-task model. Compared to the method of using a separate single teaching model for each task, PAL-ST achieves a high relative accuracy of 97.83% while significantly reducing the amount of computation.
  • FIG. 10 is a block diagram illustrating the configuration of an electronic device 100 according to an embodiment of the present disclosure.
  • an electronic device 100 may include a memory 110 and a processor 120.
  • the memory 110 may store instructions, data structures, and program codes readable by the processor 120 . Operations performed by the processor 120 may be implemented by executing instructions or codes of a program stored in the memory 110 .
  • the memory 110 may include non-volatile memory, including at least one of flash memory, a hard disk, a multimedia card micro type memory, card type memory (e.g., SD or XD memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), magnetic memory, a magnetic disk, and an optical disk, as well as volatile memory such as RAM (Random Access Memory) or SRAM (Static Random Access Memory).
  • the memory 110 may store one or more instructions and/or programs for controlling the electronic device 100 to perform learning of the multi-task model.
  • a model learning module may be stored in the memory 110 .
  • the processor 120 may control operations or functions performed by the electronic device 100 by executing instructions stored in the memory 110 or programmed software modules.
  • the processor 120 may be composed of hardware components that perform arithmetic, logic and input/output operations and signal processing. For example, the processor 120 may call and execute a model learning module stored in the memory 110 .
  • the processor 120 may include, for example, at least one of a central processing unit, a microprocessor, a graphics processing unit, application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), application processors, neural processing units, or artificial-intelligence-dedicated processors designed with a hardware structure specialized for processing AI models, but is not limited thereto. Each processor constituting the processor 120 may be a dedicated processor for performing a predetermined function.
  • the processor 120 may control the overall operations of the electronic device 100 for performing learning of the multi-task model or for processing tasks using the multi-task model, by executing one or more instructions stored in the memory 110.
  • by executing one or more instructions stored in the memory 110, the processor 120 sets the task weight corresponding to each task in the loss function of the multi-task model, based on the intrinsic accuracy of each task of the multi-task model and the accuracy according to the learning degree of each task.
  • the loss function of the multi-task model may correspond to a weighted sum obtained by applying a task weight corresponding to each task to a loss of each task of the multi-task model.
  • the processor 120 may set the task weight corresponding to each task based on the accuracy difference between the intrinsic accuracy of each task and the accuracy according to the learning degree of each task.
  • the processor 120 may adjust the task weight corresponding to each task in proportion to that difference in accuracy.
  • the processor 120 may set the task weight corresponding to each task based on the ratio of the accuracy according to the learning degree of each task to the intrinsic accuracy of each task.
  • the processor 120 may adjust the task weight corresponding to each task in inverse proportion to that ratio.
  • the processor 120 according to an embodiment obtains the accuracy of the single-task teaching model corresponding to each task, and may use the obtained accuracy as the intrinsic accuracy of each task.
  • the processor 120 may perform learning of a single-task teaching model corresponding to each task by using a dedicated learning data set corresponding to each task.
  • the processor 120 may obtain the accuracy of the single-task teaching model corresponding to each task by measuring the accuracy of the learned single-task teaching model corresponding to each task upon completion of learning.
  • the processor 120 performs learning of the multi-task model by using a loss function of the multi-task model by executing one or more instructions stored in the memory 110 .
  • the processor 120 may input the augmented multi-task learning data set to a single-task teaching model and a multi-task model respectively corresponding to each task of the multi-task model.
  • the processor 120 performs self-learning of the multi-task model so that the weighted sum, obtained by applying the task weight corresponding to each task to each task loss corresponding to the difference between the output of the single-task teaching model for each task and the corresponding output of each task of the multi-task model, is minimized.
  • An output of the single-task teaching model corresponding to each task may be a pseudo label represented in the form of a probability value according to a soft label method.
  • in the case of a task having an actual value, the processor 120 performs self-learning of the multi-task model so that the weighted sum, obtained by applying the task weight corresponding to that task to the task loss corresponding to the difference between the actual value and the output of that task, is also minimized.
  • task loss may be determined by comparing the actual value corresponding to the task and each of the pseudo labels output by the single-task teaching model with the output of the multi-task model corresponding to the task.
  • task loss may be determined by comparing a pseudo label output from a single-task teaching model corresponding to the task with an output of the multi-task model corresponding to the task.
  • FIG. 11 is a diagram illustrating an example of an electronic device 100 according to an embodiment of the present disclosure.
  • when the multi-task model loaded in the electronic device 100 is a vision model, as shown in FIG. 11, the electronic device 100 according to an embodiment may include a memory 110, a processor 120, a camera 130, a display 140, and a communication interface 150.
  • a model learning module and a model execution module may be stored in the memory 110 .
  • the processor 120 may load and execute a model learning module or a model execution module from the memory 110 .
  • the camera 130 is a hardware module that acquires images.
  • the camera 130 may capture images such as photos or videos.
  • the camera 130 may include at least one camera module, and may support macro, depth-of-field, telephoto, wide-angle, and ultra-wide-angle functions according to specifications of the electronic device 100 .
  • the display 140 includes an output unit for providing information or images, and may further include an input unit for receiving an input.
  • the output unit may include a display panel and a controller controlling the display panel, and may be implemented with a variety of display devices such as an OLED (Organic Light-Emitting Diode) display, an AM-OLED (Active-Matrix Organic Light-Emitting Diode) display, or an LCD (Liquid Crystal Display).
  • the input unit may receive various types of inputs from a user, and may include at least one of a touch panel, a keypad, and a pen recognition panel.
  • the display 140 may be provided in the form of a touch screen in which a display panel and a touch panel are combined, and may be implemented in a flexible or foldable manner.
  • the communication interface 150 may perform wired/wireless communication with other devices or networks.
  • the communication interface 150 may include a communication circuit or communication module supporting at least one of various wired/wireless communication methods.
  • the communication interface 150 may perform data communication with other devices or networks using at least one of data communication methods including, for example, wired LAN, wireless LAN, Wi-Fi, Bluetooth, ZigBee, Wi-Fi Direct (WFD), Infrared Data Association (IrDA), Bluetooth Low Energy (BLE), Near Field Communication (NFC), Wireless Broadband Internet (WiBro), World Interoperability for Microwave Access (WiMAX), Shared Wireless Access Protocol (SWAP), Wireless Gigabit Alliance (WiGig), and RF communication.
  • the communication interface 150 may transmit to and receive from an external device an artificial intelligence model used to learn the multi-task model (e.g., a multi-task model or a single-task model) or a data set for learning.
  • the communication interface 150 may transmit an image captured through the camera 130 of the electronic device 100 to a server (not shown), or receive an artificial intelligence model or data set learned by the server (not shown).
  • the processor 120 or a first processor constituting the processor 120 may execute a model learning module to learn a multi-task model. Learning of the multi-task model may be performed using multi-task learning data.
  • the processor 120 or the first processor may execute the model learning module to learn a single-task model corresponding to each task of the multi-task model.
  • the learned single-task model can be used to obtain the intrinsic accuracy corresponding to each task, and can also be used for self-learning of the multi-task model.
  • the processor 120 or the first processor may train a multi-task model or a single-task model using the training data set.
  • the processor 120 or the first processor may learn a multi-task model or a single-task model by using learning data related to criteria used for processing each task.
  • the training data set may be classified according to data attribute information such as data generation time or place, size, creator, and data type, and may be used for model learning.
  • the processor 120 or the first processor may acquire a plurality of images for model learning.
  • the processor 120 or the first processor may acquire an image through the camera 130 of the electronic device 100 or acquire an image from an external device through the communication interface 150 .
  • the processor 120 or the first processor may perform a pre-processing process on the obtained image or select input data for learning a model from a training data set.
  • the processor 120 or the first processor may train the multi-task model or the single-task model using a learning algorithm including error back-propagation or gradient descent.
  • the processor 120 or a second processor constituting the processor 120 may execute a model execution module to process each task of the multi-task model.
  • the processor 120 or the second processor may obtain input data necessary for processing each task.
  • the processor 120 or the second processor may input input data to the multi-task model and output a result corresponding to each task.
  • the learned multi-task model may output prediction or inference values corresponding to each task in response to inputs corresponding to each task.
  • the processor 120 or the second processor may provide a predetermined service through an additional processing operation on an output value.
  • one of the model learning module and the model execution module may be installed in the electronic device 100 and the other may be installed in the server.
  • the electronic device 100 and the server may provide information about a model built in a model learning module loaded in one device to a model execution module loaded in another device through communication.
  • the electronic device 100 and the server may provide a data set input to a model execution module loaded in one device to a model learning module loaded in another device through communication.
  • Computer readable media can be any available media that can be accessed by a computer and includes both volatile and nonvolatile media, removable and non-removable media.
  • Computer readable media may include computer storage media and communication media.
  • Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Communication media may typically include computer readable instructions, data structures, or other data in a modulated data signal such as program modules.
  • the computer-readable storage medium may be provided in the form of a non-transitory storage medium.
  • a 'non-transitory storage medium' only means that it is a tangible device and does not contain signals (e.g., electromagnetic waves); this term does not distinguish between the case where data is stored semi-permanently in the storage medium and the case where it is stored temporarily.
  • for example, a 'non-transitory storage medium' may include a buffer in which data is temporarily stored.
  • the method according to various embodiments disclosed in this document may be provided by being included in a computer program product.
  • Computer program products may be traded between sellers and buyers as commodities.
  • a computer program product may be distributed in the form of a device-readable storage medium (e.g., compact disc read-only memory (CD-ROM)), or distributed (e.g., downloaded or uploaded) directly or online through an application store or between two user devices (e.g., smartphones).
  • in the case of online distribution, at least part of the computer program product (e.g., a downloadable app) may be temporarily stored in, or temporarily created on, a device-readable storage medium such as the memory of a manufacturer's server, an application store server, or a relay server.
  • a method for performing learning of a multi-task model is provided.
  • the method of performing learning of the multi-task model includes setting, based on the intrinsic accuracy of each task of the multi-task model and the accuracy according to the learning degree of each task, a task weight corresponding to each task in a loss function of the multi-task model, the loss function corresponding to a weighted sum obtained by applying the task weight corresponding to each task to the loss of each task of the multi-task model.
  • the method of performing learning of the multi-task model includes performing learning of the multi-task model by using a loss function of the multi-task model.
  • the step of setting the task weight corresponding to each task may set the task weight corresponding to each task based on the accuracy difference between the intrinsic accuracy of each task and the accuracy according to the learning degree of each task.
  • the step of setting the task weight corresponding to each task may adjust the task weight corresponding to each task in proportion to that difference in accuracy.
  • the step of setting the task weight corresponding to each task may set the task weight corresponding to each task based on the ratio of the accuracy according to the learning degree of each task to the intrinsic accuracy of each task.
  • the step of setting the task weight corresponding to each task may adjust the task weight corresponding to each task in inverse proportion to that ratio.
  • the step of setting the task weight corresponding to each task may include obtaining the accuracy of the single-task teaching model corresponding to each task, and setting the task weight corresponding to each task in the loss function of the multi-task model based on the obtained accuracy and the accuracy according to the learning degree of each task.
  • the step of acquiring the accuracy of the single-task teaching model corresponding to each task includes the step of learning the single-task teaching model corresponding to each task using a dedicated training data set corresponding to each task.
  • acquiring the accuracy of the single-task teaching model corresponding to each task includes measuring the accuracy of the learned single-task teaching model corresponding to each task when learning is completed.
  • the step of performing learning of the multi-task model includes inputting the augmented multi-task learning data set to the single-task teaching model corresponding to each task and to the multi-task model, respectively.
  • the step of performing learning of the multi-task model also includes performing self-learning of the multi-task model so that the weighted sum, obtained by applying the task weight corresponding to each task to each task loss corresponding to the difference between the output of the single-task teaching model for each task and the corresponding output of the multi-task model, is minimized.
  • the output of the single-task teaching model corresponding to each task is a pseudo label represented in the form of a probability value according to the soft label method.
  • the step of performing self-learning of the multi-task model performs self-learning of the multi-task model so that, in the case of a task having an actual value among the tasks of the multi-task model, the weighted sum obtained by applying the task weight corresponding to that task to the task loss corresponding to the difference between the actual value and the output of that task is also minimized.
  • an electronic device 100 that performs learning of a multi-task model is provided.
  • the electronic device 100 that performs learning of the multi-task model includes a memory 110 that stores one or more instructions and a processor 120 that executes one or more instructions stored in the memory 110 .
  • by executing the one or more instructions, the processor 120 sets, based on the intrinsic accuracy of each task of the multi-task model and the accuracy according to the learning degree of each task, a task weight corresponding to each task in the loss function of the multi-task model, the loss function corresponding to a weighted sum obtained by applying the task weight corresponding to each task to the loss of each task.
  • the processor 120 performs learning of the multi-task model by using the loss function of the multi-task model by executing one or more instructions.
  • by executing the one or more instructions, the processor 120 sets the task weight corresponding to each task based on the accuracy difference between the intrinsic accuracy of each task and the accuracy according to the learning degree of each task.
  • the processor 120 adjusts the task weight corresponding to each task in proportion to that difference in accuracy.
  • by executing the one or more instructions, the processor 120 sets the task weight corresponding to each task based on the ratio of the accuracy according to the learning degree of each task to the intrinsic accuracy of each task.
  • the processor 120 adjusts the task weight corresponding to each task in inverse proportion to that ratio.
  • by executing the one or more instructions, the processor 120 obtains the accuracy of the single-task teaching model corresponding to each task, and uses the obtained accuracy as the intrinsic accuracy of each task.
  • by executing the one or more instructions, the processor 120 performs learning of the single-task teaching model corresponding to each task by using a dedicated training data set corresponding to each task. In addition, the processor 120 acquires the accuracy of the single-task teaching model corresponding to each task by measuring the accuracy of the learned single-task teaching model upon completion of learning.
  • by executing the one or more instructions, the processor 120 inputs the augmented multi-task learning data set to the single-task teaching model corresponding to each task and to the multi-task model, respectively.
  • by executing the one or more instructions, the processor 120 performs self-learning of the multi-task model so that the weighted sum, obtained by applying the task weight corresponding to each task to each task loss corresponding to the difference between the output of the single-task teaching model for each task and the corresponding output of the multi-task model, is minimized.
  • the output of the single-task teaching model corresponding to each task is a pseudo label represented in the form of a probability value according to the soft label method.
  • by executing the one or more instructions, the processor 120 performs self-learning of the multi-task model so that, in the case of a task having an actual value among the tasks of the multi-task model, the weighted sum obtained by applying the task weight corresponding to that task to the task loss corresponding to the difference between the actual value and the output of that task is also minimized.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A method and electronic device for performing multi-task model learning, according to the present disclosure, comprise: setting a task weight corresponding to each task of a loss function of a multi-task model on the basis of the intrinsic accuracy of each task of the multi-task model and the accuracy according to the learning degree of each task; and performing multi-task model learning by using the loss function of the multi-task model.
PCT/KR2022/014988 2021-11-11 2022-10-05 Method and electronic device for performing multi-task model learning WO2023085610A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20210154475 2021-11-11
KR10-2021-0154475 2021-11-11
KR10-2022-0115795 2022-09-14
KR1020220115795A KR20230068989A (ko) 2021-11-11 2022-09-14 Method and electronic device for performing learning of a multi-task model

Publications (1)

Publication Number Publication Date
WO2023085610A1 (fr)

Family

ID=86336351

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/014988 WO2023085610A1 (fr) 2021-11-11 2022-10-05 Method and electronic device for performing multi-task model learning

Country Status (1)

Country Link
WO (1) WO2023085610A1 (fr)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200078531A (ko) * 2017-10-26 2020-07-01 Magic Leap, Inc. Gradient normalization systems and methods for adaptive loss balancing in deep multitask networks
KR102037484B1 (ko) * 2019-03-20 2019-10-28 Lunit Inc. Multi-task learning method and apparatus therefor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GUO, Michelle; HAQUE, Albert; HUANG, De-An; YEUNG, Serena; FEI-FEI, Li: "Dynamic Task Prioritization for Multitask Learning", in: Computer Vision - ECCV 2018, LNCS vol. 11220, Springer, Berlin, Heidelberg, 6 October 2018, pages 282-299, XP047488480, DOI: 10.1007/978-3-030-01270-0_17 *
MICHEL, Paul; RUDER, Sebastian; YOGATAMA, Dani: "Balancing Average and Worst-case Accuracy in Multitask Learning", arXiv.org, Cornell University Library, 12 October 2021, XP091076127 *
LIANG, Sicong; ZHANG, Yu: "A Simple General Approach to Balance Task Difficulty in Multi-Task Learning", arXiv.org, Cornell University Library, 12 February 2020, XP081597847 *

Similar Documents

Publication Publication Date Title
  • WO2018212494A1 Method and device for identifying objects
  • WO2017213398A1 Learning model for salient facial region detection
  • WO2018135881A1 Vision intelligence management for electronic devices
  • WO2019177344A1 Electronic apparatus and control method thereof
  • WO2019231130A1 Electronic device and control method therefor
  • WO2020122432A1 Electronic device and method for displaying a three-dimensional image thereof
  • WO2020060223A1 Device and method for providing application translation information
  • WO2020027454A1 Multi-layered machine learning system to support ensemble learning
  • EP3915063A1 Multi-model structures for classification and intent determination
  • WO2020071854A1 Electronic apparatus and control method thereof
  • WO2019172642A1 Electronic device and method for measuring heart rate
  • EP3698258A1 Electronic apparatus and control method thereof
  • KR20230068989A Method and electronic device for performing learning of a multi-task model
  • WO2022197136A1 System and method for enhancing a machine learning model for audio/video understanding using gated multi-level attention and temporal adversarial training
  • WO2022139327A1 Method and apparatus for detecting unsupported utterances in natural language understanding
  • WO2019190171A1 Electronic device and control method therefor
  • WO2022244997A1 Method and apparatus for processing data
  • WO2020091268A1 Electronic apparatus and control method thereof
  • WO2019194356A1 Electronic device and control method therefor
  • WO2023085610A1 Method and electronic device for performing multi-task model learning
  • WO2024014870A1 Method and electronic device for interactive image segmentation
  • WO2020122513A1 Method for processing a two-dimensional image and device for executing the method
  • WO2022092487A1 Electronic apparatus and control method thereof
  • EP3803797A1 Methods and systems for performing editing operations on media
  • WO2022039494A1 Server for updating a terminal model, and operating method therefor

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22893033

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE