TWI840640B - Semi-supervised learning system and semi-supervised learning method - Google Patents


Info

Publication number
TWI840640B
Authority
TW
Taiwan
Prior art keywords
model, loss function, data, semi, domain
Prior art date
Application number
TW109146113A
Other languages
Chinese (zh)
Other versions
TW202226079A (en)
Inventor
黃詠舜
蘇育正
張晉維
Original Assignee
Delta Electronics, Inc. (台達電子工業股份有限公司)
Filing date
Publication date
Application filed by Delta Electronics, Inc. (台達電子工業股份有限公司)
Priority to TW109146113A
Publication of TW202226079A
Application granted
Publication of TWI840640B


Abstract

A semi-supervised learning method is provided. The method includes the following steps: obtaining source-domain data of one or more source domains and target-domain data of a target domain; training a feature-extraction model using the source-domain data and the target-domain data; calculating a domain-determination loss function, a task loss function, and a semi-supervised loss function respectively by a domain-determination model, a task model, and a semi-supervised learning mechanism; calculating a total loss function according to the domain-determination loss function, the task loss function, and the semi-supervised loss function, and updating weights of the feature-extraction model according to the total loss function; and finishing a training procedure of an overall model in response to the overall model satisfying a model convergence condition, wherein the overall model includes the feature-extraction model, the task model, and the domain-determination model.

Description

Semi-supervised learning system and semi-supervised learning method

The present invention relates to machine learning, and more particularly to a semi-supervised learning system and a semi-supervised learning method.

Once a large amount of labeled data has been collected in a specific application domain (hereinafter the source domain), data-driven analytical modeling techniques can train a model whose predictive performance in that domain approaches or even exceeds human performance. However, when we want to reuse the model for the same prediction task in a new domain (hereinafter the target domain), the data distribution of the new domain often differs from that of the past training data, substantially degrading the model's predictive performance. For example, when the same product is manufactured in multiple factories, an appearance-defect recognition system built from Factory A's production data may underperform when applied directly to Factory B, because differences in shooting angle, lighting, camera model, and the like shift the data distribution.

In view of this, the present invention provides a semi-supervised learning system and a semi-supervised learning method to solve the above problems.

The present invention provides a semi-supervised learning system, comprising: a non-volatile memory for storing a semi-supervised learning application; and a processor for executing the semi-supervised learning application to perform the following steps: obtaining source-domain data of one or more source domains and target-domain data of a target domain; training a feature-extraction model using the source-domain data and the target-domain data; calculating a domain-discrimination loss function, a task loss function, and a semi-supervised loss function of the feature-extraction model using a domain-discrimination model, a task model, and a semi-supervised learning mechanism, respectively; calculating a total loss function according to the domain-discrimination loss function, the task loss function, and the semi-supervised loss function, and updating the weights of the feature-extraction model, the task model, and the domain-discrimination model according to the total loss function; and in response to an overall model satisfying a model convergence condition, ending the training procedure of the overall model, wherein the overall model includes the feature-extraction model, the task model, and the domain-discrimination model.

In some embodiments, the feature-extraction model is a ResNet50 model, the semi-supervised learning mechanism is an unsupervised data augmentation mechanism based on consistency regularization, the task model is a first fully connected layer, and the domain-discrimination model is a second fully connected layer plus a gradient reversal layer.

In some embodiments, the processor assigns a first hyperparameter, a second hyperparameter, and a third hyperparameter to the domain-discrimination loss function, the task loss function, and the semi-supervised loss function, respectively, to calculate the total loss function.

In some embodiments, the source-domain data includes first labeled data and first unlabeled data, and the target-domain data includes second labeled data and second unlabeled data. The processor uses the first labeled data, the first unlabeled data, the second labeled data, and the second unlabeled data to calculate the domain-discrimination loss function, and updates the weights of the feature-extraction model and the domain-discrimination model according to the domain-discrimination loss function. The processor uses the first labeled data and the second labeled data to calculate the task loss function, and updates the weights of the feature-extraction model and the task model according to the task loss function. The processor uses the first unlabeled data and the second unlabeled data to calculate the semi-supervised loss function, and updates the weights of the feature-extraction model according to the semi-supervised loss function. In addition, the model convergence condition indicates that the improvement of the total loss function of the overall model over a fixed number of training epochs and a fixed training time falls below a threshold.

The present invention further provides a semi-supervised learning method, comprising: obtaining source-domain data of one or more source domains and target-domain data of a target domain; training a feature-extraction model using the source-domain data and the target-domain data; calculating a domain-discrimination loss function, a task loss function, and a semi-supervised loss function of the feature-extraction model using a domain-discrimination model, a task model, and a semi-supervised learning mechanism, respectively; calculating a total loss function according to the domain-discrimination loss function, the task loss function, and the semi-supervised loss function, and updating a first weight, a second weight, and a third weight corresponding to the feature-extraction model, the task model, and the domain-discrimination model according to the total loss function; and in response to an overall model satisfying a model convergence condition, ending the training procedure of the overall model, wherein the overall model includes the feature-extraction model, the task model, and the domain-discrimination model.

The following description presents preferred ways of implementing the invention; its purpose is to describe the basic spirit of the invention, but it is not intended to limit the invention. The actual scope of the invention must be determined by reference to the claims that follow.

It must be understood that words such as "comprise" and "include" used in this specification indicate the existence of specific technical features, values, method steps, operations, elements, and/or components, but do not exclude the addition of further technical features, values, method steps, operations, elements, components, or any combination thereof.

Terms such as "first", "second", and "third" used in the claims modify claim elements; they do not indicate a priority order, a precedence relationship, that one element precedes another, or a temporal order in performing method steps. They are only used to distinguish elements with the same name.

FIG. 1 is a schematic diagram of a semi-supervised learning system according to an embodiment of the present invention.

The semi-supervised learning system 100 includes one or more processors 110, a memory unit 130, a storage device 140, and a transmission interface 150. The processor 110 may be, for example, a central processing unit (CPU) or a general-purpose processor, but the invention is not limited thereto.

The memory unit 130 is a random access memory, such as a dynamic random access memory (DRAM) or a static random access memory (SRAM), but the invention is not limited thereto. The storage device 140 is a non-volatile memory, such as a hard disk drive, a solid-state disk, a flash memory, or a read-only memory, but the invention is not limited thereto.

For example, the storage device 140 may store a feature-extraction model 141, a semi-supervised learning mechanism 142, a task model 143, and a domain-discrimination model 144, which may be collectively referred to as a semi-supervised learning application (capable, for example, of executing the semi-supervised learning method of the present invention). The processor 110 loads the feature-extraction model 141, the semi-supervised learning mechanism 142, the task model 143, and the domain-discrimination model 144 into the memory unit 130 and executes them. In some embodiments, the storage device 140 may further store source-domain data and target-domain data obtained from different source domains and from the target domain, where the source-domain data and the target-domain data include both labeled and unlabeled data.

The transmission interface 150 may be, for example, a wired or wireless transmission interface, and the semi-supervised learning system 100 may connect to one or more external devices 20 through the transmission interface 150 and receive source-domain data or target-domain data from the external devices.

In one embodiment, the semi-supervised learning system 100 combines the advantages of domain adaptation and semi-supervised learning. For example, domain adaptation can learn domain-invariant general features from source-domain and target-domain data, and build a discriminative model from the relationship between these general features and the source-domain labels. Because the model makes its judgments from domain-invariant features, its performance is less affected by changes in the target-domain data distribution, so it can be applied to prediction in the target domain. When the training data contains only a small amount of labeled data and a large amount of unlabeled data, the model's learning step usually combines supervised and unsupervised learning techniques. A good semi-supervised learning model performs much better than unsupervised learning alone and approaches the performance of fully supervised learning on labeled data, greatly reducing labeling cost. Therefore, the semi-supervised learning system 100 of the present invention can, through domain adaptation and the semi-supervised learning mechanism, better learn general features that are domain invariant and effective for the target domain, and can still achieve good model performance when the amount of labeled data in the source domain or the target domain is small.

It should be noted that the present invention does not limit the models used by the semi-supervised learning system 100; the models may be built with machine learning, statistical modeling, deep learning, and similar methods. In addition, the semi-supervised learning system 100 does not limit the type of training data used, which may include, but is not limited to, structured data, signal data, images, text, and so on. The training procedure is not limited to a staged approach: different models may be trained separately in multiple stages, or the entire end-to-end model may be trained simultaneously.

When both the source-domain data and the target-domain data are images, a VGG, ResNet, or Inception model may be used as the feature-extraction model 141. In a preferred embodiment, the semi-supervised learning system 100 uses ResNet50 as the feature-extraction model 141, where the feature-extraction model 141 extracts abstract features of the input images (e.g., the unlabeled images in the source-domain data and the target-domain data); these features may be represented as feature vectors and used for subsequent defect-detection processing.

The semi-supervised learning mechanism 142 may be, for example, an entropy-based regularization or consistency regularization technique. In a preferred embodiment, the semi-supervised learning mechanism 142 uses an unsupervised data augmentation mechanism based on consistency regularization, which uses unlabeled samples to compute prediction consistency, ensuring that the task model can discriminate defects on unseen samples.

The task model 143 may be, for example, a fully connected layer, which maps the feature vectors produced by the feature-extraction model 141 to produce a classification result.

The domain-discrimination model 144 may be, for example, a fully connected layer plus a gradient reversal layer, or a generative adversarial network. In a preferred embodiment, the domain-discrimination model 144 uses a fully connected layer plus a gradient reversal layer, which ensures that the abstract features learned by the feature-extraction model are domain invariant, giving more robust judgments on data from different source domains.
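As a rough sketch of the gradient reversal idea described above (not the patent's implementation; the class name and the `lam` reversal-strength parameter are assumptions), the layer acts as the identity in the forward pass and negates and scales the gradient in the backward pass:

```python
class GradientReversal:
    """Sketch of a gradient reversal layer: identity in the forward pass,
    gradient negated (and scaled by `lam`) in the backward pass.
    `lam` is a hypothetical reversal-strength hyperparameter."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, features):
        # Features pass through unchanged to the domain classifier.
        return features

    def backward(self, grad_output):
        # The gradient flowing back into the feature extractor is reversed,
        # pushing the extractor toward features the domain classifier
        # cannot separate, i.e. domain-invariant features.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = grl.forward([0.2, -1.3])    # unchanged features
grads = grl.backward([1.0, -2.0])      # reversed, scaled gradients
```

In autograd frameworks this is typically realized as a custom operator whose backward pass multiplies the incoming gradient by a negative constant; the sketch above only mimics that behavior with explicit lists.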

FIG. 2 is a schematic diagram of the training process of the feature-extraction model according to an embodiment of the present invention.

In one embodiment, the training data of the feature-extraction model 141 includes source-domain data 210 to 21N and target-domain data 220, and each of the source-domain data 210 to 21N and the target-domain data 220 includes both labeled data and unlabeled data. For example, the source-domain data 210 includes labeled data 210A and unlabeled data 210B, the source-domain data 211 includes labeled data 211A and unlabeled data 211B, and so on.

In one embodiment, the processor 110 uses the unlabeled data 210B to 21NB and 220B in the source-domain data 210 to 21N and the target-domain data 220 to train the feature-extraction model 141, for example using self-supervised learning or a generative adversarial network, so that the feature-extraction model 141 can extract abstract descriptive features common to the source domains and the target domain.

The semi-supervised learning system 100 of the present invention can be used in many fields, such as face recognition, object detection, image and speech segmentation, text data (e.g., for dialogue robots and article summarization), signal anomaly detection, component lifetime prediction, and so on, but the invention is not limited to these applications. Accordingly, depending on the data type of the application and how the sensor data are acquired, the source-domain data and the target-domain data are, for example, data that have been pre-processed before being input to the feature-extraction model 141 for training. The source-domain data and the target-domain data can then be used to build the domain-discrimination model 144, which ensures that the features extracted from data of different source domains share a common representation space.

Next, the processor 110 uses the labeled data in the source-domain data 210 to 21N and the target-domain data 220 to train the task model 143, producing a plurality of feature vectors 231. In addition, the semi-supervised learning mechanism 142 uses the unlabeled data in the source-domain data and the target-domain data to assist the training of the task model; its purpose is to find an effective way to extract information from the unlabeled data.

During the training of the feature-extraction model 141, the processor 110 continuously calculates the loss functions of the domain-discrimination model 144, the task model 143, and the semi-supervised learning mechanism 142 from the source-domain data and the target-domain data, namely the domain-discrimination loss function 234, the task loss function 233, and the semi-supervised loss function 232.

The domain-discrimination loss function 234 may be expressed, for example, as $L_{dom} = -\sum_{x} \log P_D(d_x \mid F(x))$, where $x$ is the raw data, $d_x$ is the domain label of $x$, $F$ denotes the feature-extraction model, and $P_D$ denotes the domain-discrimination model. The processor 110 uses the domain-discrimination loss function 234 together with a domain adversarial training mechanism to compute how to update the weights of the domain-discrimination model 144 and the feature-extraction model 141, ensuring that the abstract features extracted by the feature-extraction model 141 have the property of domain invariance.

The task loss function 233 may be expressed, for example, as $L_{task} = -\sum_{x \in L} y^{*} \log P(y \mid x)$, where $x$ is the raw data, $y^{*}$ is the label, and $L$ is the set of labeled data. The processor 110 uses the labeled samples for supervised learning of the task model, and uses the backpropagation algorithm to compute the update direction of the weights of the task model 143 and the feature-extraction model 141, ensuring that the model can make accurate judgments for the task.
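The supervised task-loss term above can be sketched as plain cross-entropy over labeled samples (a minimal illustration; the batch format and function name are assumptions, not the patent's code):

```python
import math

def task_loss(labeled_batch):
    """Mean cross-entropy over labeled samples. Each entry pairs the
    model's predicted class probabilities with a one-hot label y*.
    (Illustrative sketch; the batch format is an assumption.)"""
    total = 0.0
    for probs, y_star in labeled_batch:
        # Only the true class (y > 0) contributes to the cross-entropy.
        total -= sum(y * math.log(p) for y, p in zip(y_star, probs) if y > 0)
    return total / len(labeled_batch)

# Two labeled samples: predicted probabilities vs. one-hot ground truth.
batch = [([0.9, 0.1], [1, 0]),
         ([0.2, 0.8], [0, 1])]
loss = task_loss(batch)  # small, since both predictions favor the true class
```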

The semi-supervised loss function 232 may be expressed, for example, as $L_{ssl} = \sum_{x \in U} D_{KL}\bigl(P(y \mid x)\,\|\,P(y \mid \hat{x})\bigr)$, where $U$ is the set of unlabeled data, $\hat{x}$ is an augmented version of $x$, and $D_{KL}$ is a metric measuring the difference between two distributions (e.g., the KL divergence). The processor 110 uses the unlabeled data to provide information for the feature-extraction model 141 and the task model 143 to learn from, and computes the update direction of the weights of the semi-supervised learning mechanism 142 through the backpropagation algorithm.
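The consistency term above can be sketched as an average KL divergence between the model's predictions on unlabeled samples and on their augmented views (a minimal illustration under an assumed prediction format; identical predictions give zero loss):

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) between two discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(preds_original, preds_augmented):
    """Mean KL between predictions on unlabeled samples and on their
    augmented views (sketch of the consistency-regularization term)."""
    pairs = zip(preds_original, preds_augmented)
    return sum(kl_divergence(p, q) for p, q in pairs) / len(preds_original)

# Two unlabeled samples, predicted before and after data augmentation.
p_orig = [[0.9, 0.1], [0.3, 0.7]]
p_aug = [[0.8, 0.2], [0.35, 0.65]]
loss = consistency_loss(p_orig, p_aug)  # positive: the two views disagree
```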

The processor 110 also assigns different weights, for example $\lambda_1$, $\lambda_2$, and $\lambda_3$, to the domain-discrimination loss function, the task loss function, and the semi-supervised loss function, respectively. The processor 110 can therefore calculate the weighted total loss function SL, as shown in formula (1):

$SL = \lambda_1 L_{dom} + \lambda_2 L_{task} + \lambda_3 L_{ssl}$  (1)

Here the weights $\lambda_1$, $\lambda_2$, and $\lambda_3$ are all non-negative real numbers; $x$ is the raw data; $y^{*}$ is the label; $D_{KL}$ is a metric measuring the difference between two distributions; $L$ is the set of labeled data; and $U$ is the set of unlabeled data. The weights $\lambda_1$, $\lambda_2$, and $\lambda_3$ are decided by the user in advance, and their sum is usually 1. The magnitude of each weight directly reflects the influence of the corresponding loss function, so a larger $\lambda_2$ can be set to increase the importance of the task loss. The values are also related to the amount of data: if the number of labeled samples is very small, a larger $\lambda_2$ can be set to prevent that loss function from losing its influence.
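The weighted combination in formula (1) can be sketched as follows (the default weight values are arbitrary placeholders, not values from the patent):

```python
def total_loss(l_dom, l_task, l_ssl, lam1=0.2, lam2=0.5, lam3=0.3):
    """SL = lam1*L_dom + lam2*L_task + lam3*L_ssl, with non-negative
    weights that usually sum to 1 (the defaults are placeholders)."""
    assert min(lam1, lam2, lam3) >= 0.0
    return lam1 * l_dom + lam2 * l_task + lam3 * l_ssl

# Combine the three per-epoch loss values into one scalar for backprop.
sl = total_loss(l_dom=0.8, l_task=1.2, l_ssl=0.4)
```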

The processor 110 then, according to the computed total loss function SL, uses backpropagation to determine the adjustment direction of the weights of each model in the overall model and updates those weights accordingly. In detail, the processor 110 uses the source-domain data from the different source domains and the target-domain data to calculate the domain-discrimination loss function, and updates the weights of the feature-extraction model and the domain-discrimination model according to the domain-discrimination loss function. The processor 110 uses the labeled data in the source-domain data and the target-domain data to calculate the task loss function, and updates the weights of the feature-extraction model and the task model according to the task loss function. For example, the feature-extraction model is pre-trained with the available data; common implementations include self-supervised learning and generative adversarial networks. During the subsequent multi-source-domain target-task training, the weights of the feature-extraction model are also adjusted along the update direction computed by backpropagation from the overall loss function; at this point the feature-extraction model has no dedicated loss function of its own (i.e., the output of the feature-extraction model is not used directly to compute the total loss function SL).
In addition, the processor 110 uses the unlabeled data in the source-domain data and the target-domain data to calculate the semi-supervised loss function, and updates the weights of the feature-extraction model (and/or the task model) according to the semi-supervised loss function. In some embodiments, depending on the semi-supervised learning mechanism 142 used, the processor 110 uses both the labeled and unlabeled data in the source-domain data and the target-domain data to calculate the semi-supervised loss function, and updates the weights of the feature-extraction model (and/or the task model) accordingly.

Next, the processor 110 repeats the steps of calculating the three loss functions and updating the model weights. When the feature-extraction model 141 satisfies the model convergence condition, the processor 110 ends the training process of the feature-extraction model 141 and obtains the trained feature-extraction model 141. For example, the model convergence condition may be that the overall model has reached a fixed number of training epochs, or a fixed training time, or that the total loss function has reached a target value, but the invention is not limited thereto.
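One possible reading of the improvement-below-threshold convergence condition mentioned earlier can be sketched as follows (the window size and threshold values are assumptions for illustration):

```python
def has_converged(loss_history, window=5, threshold=1e-3):
    """True when the total loss improved by less than `threshold` over
    the last `window` epochs (window/threshold values are assumptions)."""
    if len(loss_history) < window + 1:
        return False  # not enough epochs observed yet
    improvement = loss_history[-window - 1] - loss_history[-1]
    return improvement < threshold

# Total loss per epoch; improvement stalls near the end.
losses = [1.0, 0.6, 0.4, 0.35, 0.349, 0.3489, 0.3488, 0.3488, 0.3488]
stop = has_converged(losses)  # training can stop once improvement stalls
```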

FIGS. 3A-3C are schematic diagrams of source-domain data and target-domain data according to an embodiment of the present invention.

In industrial production, defect detection is a common quality-inspection technique used to ensure that products are free of flaws after processing. In the electronics industry, soldering is a typical process step and plays an important role in the yield of printed-circuit-board (PCB) products, so a corresponding automated optical inspection (AOI) station is often deployed to detect solder defects. However, the defect rate of mature manufacturers is usually very low, so valid labeled samples are few and expensive to obtain. Furthermore, there are many types of printed circuit boards, and the shape and angle of each solder joint differ, so the accuracy of defect detection is insufficient and a large amount of manual re-inspection is required. Therefore, the embodiment of FIGS. 3A-3C uses the semi-supervised learning system 100 to solve the problems faced by defect detection.

For ease of explanation, in this embodiment, before training the feature-extraction model 141, the semi-supervised learning system 100 first obtains varying amounts of first source-domain data, second source-domain data, and target-domain data. The first source-domain data and the second source-domain data are obtained from Factory A and Factory B, respectively, and are inspection-point images of the inspection points (e.g., solder joints) of the model-80 printed circuit board. The target-domain data is obtained from Factory A and consists of inspection-point images of some of the inspection points (e.g., solder joints) of the model-60 printed circuit board. FIG. 3A is one of the inspection-point images in the first source-domain data, FIG. 3B is one of the inspection-point images in the second source-domain data, and FIG. 3C is one of the inspection-point images in the target-domain data.

It should be noted that the imaging conditions of the automated optical inspection equipment in Factory A and Factory B may differ, being affected, for example, by lighting, shooting distance, shooting angle, and exposure time. In addition, there are many types of printed circuit boards and the shape and angle of each solder joint differ, so the inspection-point images captured for the same inspection point on the same model of printed circuit board may differ.

In addition, the goal of the semi-supervised learning system 100 is to use the first source-domain data and the second source-domain data to bring defect detection to the model-60 printed circuit boards produced in Factory A (i.e., the target domain), where defect detection is a binary classification: the judgment for a sample (i.e., an inspection-point image) is either pass or fail. A pass result means the semi-supervised learning system 100 judges the sample to be defect-free; a fail result means the system judges the sample to be defective.

The numbers of good samples, defective samples, unlabeled samples, and total samples in the first source-domain data, the second source-domain data, and the target-domain data are shown in Table 1:

                     First source-domain  Second source-domain  Target-domain
  Good samples       26                   493                   17
  Defective samples  138                  7                     41
  Unlabeled samples  0                    1504                  0
  Total samples      164                  2004                  58

Table 1

After the processor 110 inputs the first source-domain data (e.g., 164 labeled samples), the second source-domain data (e.g., 500 labeled samples and 1504 unlabeled samples), and the target-domain data (e.g., 20 unlabeled samples) into the feature extraction model 141 and completes training through the procedure of the above embodiment, the processor 110 can use three metrics to measure the classification ability of the feature extraction model 141: accuracy, precision, and recall.

For example, the detection outcomes for defective products can be divided into four categories: true positive, true negative, false positive, and false negative, with the corresponding counts denoted TP, TN, FP, and FN, respectively. TP is the number of samples that the feature extraction model 141 judges to be defective and that are actually defective. TN is the number of samples that the model judges to be good and that are actually good. FP is the number of good samples that the model wrongly judges to be defective. FN is the number of defective samples that the model wrongly judges to be good. The above definitions are for defect detection of defective products; a person of ordinary skill in the art can compute the four corresponding counts for good products in a similar way.

For defective products, the accuracy of the feature extraction model 141 = (TP + TN) / (TP + FP + TN + FN), i.e., the overall rate at which the model makes correct judgments. The recall of the feature extraction model 141 = TP / (TP + FN), i.e., the proportion of all defective products that the model correctly identifies as defective. The precision of the feature extraction model 141 = TP / (TP + FP), i.e., the proportion of samples judged defective by the model that are actually defective.
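The three formulas above can be sketched directly in code. This is an illustrative helper (the function name and the example counts are not from the patent); it treats "defective" as the positive class, matching the definitions of TP, TN, FP, and FN given earlier.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, recall, and precision with "defective" as the positive class."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total   # overall rate of correct judgments
    recall = tp / (tp + fn)        # fraction of real defects that were caught
    precision = tp / (tp + fp)     # fraction of flagged samples that are real defects
    return accuracy, recall, precision

# Illustrative counts: 8 defects caught, 30 good parts passed,
# 2 good parts wrongly flagged, 0 defects missed.
acc, rec, prec = binary_metrics(tp=8, tn=30, fp=2, fn=0)
```

Swapping the roles of the classes (good as positive) yields the good-product metrics described next.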

In a similar manner, the processor 110 can also compute the accuracy, recall, and precision for good products. The accuracy for good products is the same as that for defective products; both refer to the overall rate of correct judgments by the feature extraction model 141. The recall for good products is the proportion of all good products that the model correctly identifies as good. The precision for good products is the proportion of samples judged good by the model that are actually good.

Accordingly, the accuracy, recall, and precision for good and defective products are organized as in Table 2:

  Label      Recall                    Precision                    Accuracy
  Good       good-product recall       good-product precision       overall accuracy
  Defective  defective-product recall  defective-product precision

Table 2

Here, two common machine learning methods are provided for comparison with the performance of the semi-supervised learning system 100 of the present invention. Control group 1 adopts a typical supervised learning method, trains with only 47 labeled samples from the target-domain data, and tests on the remaining 11 labeled samples. The scenario of control group 1 represents the accuracy, recall, and precision a machine learning model achieves when many target-domain samples are re-collected and resources are spent labeling them, without referring to similar data from other source domains.

The accuracy, recall, and precision of control group 1 for good and defective products are shown in Table 3:

  Label      Recall  Precision  Accuracy
  Good       0.94    0.94       0.92
  Defective  0.92    0.82

Table 3

Control group 2 adopts domain-adaptation technology, and its machine learning model is trained with the 164 labeled samples of the first source-domain data, the 500 labeled samples and 1504 unlabeled samples of the second source-domain data, and 20 unlabeled samples of the target-domain data. The scenario of control group 2 represents the performance of a machine learning model in the target domain when multiple source domains containing both labeled and unlabeled data are available for reference but only a small number of unlabeled target-domain samples exist.

After the machine learning model of control group 2 is trained, the remaining 38 samples in the target-domain data are used to test it, yielding the accuracy, recall, and precision of control group 2 for good and defective products shown in Table 4:

  Label      Recall  Precision  Accuracy
  Good       0.54    0.58       0.71
  Defective  0.80    0.77

Table 4

The semi-supervised learning system 100 of the present invention uses the same training data as control group 2. After the feature extraction model 141 is trained through the procedure of the above embodiment, the processor 110 tests the model with the remaining 38 samples in the target-domain data, yielding the accuracy, recall, and precision of the semi-supervised learning system 100 for good and defective products shown in Table 5:

  Label      Recall  Precision  Accuracy
  Good       1.00    0.81       0.92
  Defective  0.88    1.00

Table 5

The scenario of the semi-supervised learning system 100 of the present invention shows that, by fully exploiting both the labeled and unlabeled data in the source-domain data and using only a small amount of unlabeled target-domain data for model transfer, the system achieves higher accuracy and precision than the other methods, approaching the performance of control group 1, which builds a machine learning model from a large amount of labeled data, while attaining extremely high precision.

FIG. 4 is a flow chart of a semi-supervised learning method according to an embodiment of the present invention.

In step S410, source-domain data of one or more source domains and target-domain data of a target domain are obtained. For example, the source-domain data of each source domain and the target-domain data of the target domain both include labeled data and unlabeled data.

In step S420, the source-domain data and the target-domain data are used to train a feature extraction model. For example, in the embodiment of Figures 3A-3C, the processor 110 inputs the first source-domain data (e.g., 164 labeled samples), the second source-domain data (e.g., 500 labeled samples and 1504 unlabeled samples), and the target-domain data (e.g., 20 unlabeled samples) into the feature extraction model 141 for training.

In step S430, the domain discrimination loss function is calculated. The domain discrimination loss function can, for example, be expressed as L_domain = -Σ_x log P_D(d_x | F(x)), where x is the raw data, F(x) is the feature that the feature extraction model 141 extracts from x, d_x is the domain label of x, and P_D is the output distribution of the domain discrimination model 144. Based on the domain discrimination loss function, combined with a domain adversarial training mechanism, the processor 110 computes how to update the weights of the domain discrimination model 144 and the feature extraction model 141, thereby ensuring that the abstract features extracted by the feature extraction model 141 have the property of domain invariance.
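The domain discrimination loss of step S430 can be sketched in numpy. This is a minimal illustrative form, not the patent's actual implementation: it simplifies to two domains (source vs. target) and uses binary cross-entropy over the discriminator's outputs; the function name and array interface are assumptions.

```python
import numpy as np

def domain_discrimination_loss(domain_probs, domain_labels):
    """Binary cross-entropy of the domain discriminator.

    domain_probs: predicted probability that each sample comes from the
    target domain; domain_labels: 1 for target-domain samples, 0 for
    source-domain samples.
    """
    p = np.clip(domain_probs, 1e-7, 1 - 1e-7)  # guard against log(0)
    return float(np.mean(-(domain_labels * np.log(p)
                           + (1 - domain_labels) * np.log(1 - p))))
```

In domain adversarial training, a gradient-reversal layer multiplies this loss's gradient by -1 before it reaches the feature extractor, so the discriminator minimizes the loss while the extractor learns features the discriminator cannot separate, i.e., domain-invariant features.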

In step S440, the task loss function is calculated. The task loss function can, for example, be expressed as L_task = -Σ_{(x, y*) ∈ L} log P(y* | F(x)), where x is the raw data, y* is the label, and L is the set of labeled data. The processor 110 uses the labeled samples to perform supervised learning of the task model, and uses the back-propagation algorithm to compute the update direction of the weights of the task model 143 and the feature extraction model 141, ensuring that the model can judge the task accurately.
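The task loss above is the standard negative log-likelihood over the labeled set, which can be sketched as follows (an illustrative helper; the function name and the probability-matrix interface are assumptions, not the patent's code):

```python
import numpy as np

def task_loss(class_probs, labels):
    """Average negative log-likelihood over the labeled set L.

    class_probs: (n, k) array of predicted class distributions;
    labels: (n,) array of integer ground-truth labels y*.
    """
    # pick out the predicted probability of each sample's true class
    p = np.clip(class_probs[np.arange(len(labels)), labels], 1e-7, None)
    return float(np.mean(-np.log(p)))
```

A perfectly confident, correct classifier yields a loss of zero; a uniform binary classifier yields log 2 per sample.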

In step S450, the semi-supervised loss function is calculated. The semi-supervised loss function can, for example, be expressed as L_semi = Σ_{x ∈ U} D_KL( P(y | F(x)) || P(y | F(x')) ), where U is the set of unlabeled data, x' is an augmented version of x, and D_KL is a metric that measures the distance between two distributions (for example, using the KL-divergence algorithm). The processor 110 uses the unlabeled data to provide information for the feature extraction model 141 and the task model 143 to learn from, and computes the update direction of the weights of the feature extraction model 141 through the back-propagation algorithm. In this embodiment, steps S430-S450 may be merged into a single step, executed in any order, or executed simultaneously.
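A consistency loss of this kind (as in unsupervised data augmentation) can be sketched in numpy. The names `kl_divergence` and `semi_supervised_loss` and the list-of-distributions interface are assumptions for illustration:

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) between two discrete probability distributions."""
    p = np.clip(p, 1e-7, 1.0)
    q = np.clip(q, 1e-7, 1.0)
    return float(np.sum(p * np.log(p / q)))

def semi_supervised_loss(preds_original, preds_augmented):
    """Mean consistency loss over the unlabeled set U: the model's
    prediction on each unlabeled sample should match its prediction
    on the augmented version of the same sample."""
    return float(np.mean([kl_divergence(p, q)
                          for p, q in zip(preds_original, preds_augmented)]))
```

When the predictions on the original and augmented samples agree exactly, the loss is zero; any disagreement produces a positive penalty that back-propagation pushes toward zero.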

In step S460, the model weights are updated. For example, according to the calculation result of the total loss function SL, the processor 110 can compute and update, through back-propagation, the first weight, second weight, and third weight of the overall model corresponding to the feature extraction model, the task model, and the domain discrimination model.

In step S470, it is determined whether the overall model satisfies a model convergence condition. If so, step S480 is executed to end the training process of the overall model. If not, the flow returns to step S430 to repeat the steps of calculating the three loss functions and updating the model weights. For example, the model convergence condition may be that the overall model has run for a fixed number of training batches (epochs) or a fixed training time, or that the total loss function has reached a target value, but the present invention is not limited thereto.
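The loop of steps S430-S470 with the three example convergence conditions can be sketched as follows; `model_step` is a placeholder for one pass of loss computation and weight update, and the default thresholds are arbitrary illustrations:

```python
import time

def train(model_step, max_epochs=100, max_seconds=3600.0, loss_target=0.05):
    """Repeat loss computation and weight updates (steps S430-S460) until
    one of the convergence conditions of step S470 is met.

    model_step: callable performing one training epoch and returning the
    total loss SL for that epoch (a hypothetical interface).
    """
    start = time.time()
    for epoch in range(1, max_epochs + 1):
        loss = model_step()                      # S430-S460
        if loss <= loss_target:                  # total loss reached target
            return epoch, "loss_target"
        if time.time() - start >= max_seconds:   # fixed training time
            return epoch, "time_limit"
    return max_epochs, "epoch_limit"             # fixed number of epochs
```

Whichever condition fires first ends the training process (step S480).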

In summary, the present invention provides a semi-supervised learning system and a semi-supervised learning method that combine domain-adaptation technology with a semi-supervised learning mechanism. They use source-domain data and target-domain data for model training, and update the model weights with a domain discrimination loss function, a task loss function, and a semi-supervised loss function, giving the model a better training stage and thus better machine learning performance.

The methods of the present invention, or certain aspects or portions thereof, may take the form of program code embodied in tangible media, such as floppy disks, CD-ROMs, hard drives, or any other machine-readable (e.g., computer-readable) storage medium, wherein, when the program code is loaded into and executed by a machine such as a computer, the machine becomes an apparatus or system for practicing the invention. The methods, systems, and apparatuses of the present invention may also be embodied in the form of program code transmitted over some transmission medium, such as electrical wiring or cabling, fiber optics, or any other form of transmission, wherein, when the program code is received, loaded into, and executed by a machine such as a computer, the machine becomes an apparatus or system for practicing the invention. When implemented on a general-purpose processor, the program code combines with the processor to provide a unique apparatus that operates analogously to application-specific logic circuits.

While the invention has been described above by way of preferred embodiments, they are not intended to limit the scope of the invention. Any person of ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the invention; therefore, the scope of protection of the invention shall be defined by the appended claims.

100: Semi-supervised learning system 110: Processor 130: Memory unit 140: Storage device 141: Feature extraction model 142: Semi-supervised learning mechanism 143: Task model 144: Domain discrimination model 210-21N: Source domain data 220: Target domain data 210A-21NA, 220A: Labeled data 210B-21NB, 220B: Unlabeled data 231: Feature vector 232: Semi-supervised loss function 233: Task loss function 234: Domain discrimination loss function S410-S480: Steps

FIG. 1 is a schematic diagram of a semi-supervised learning system according to an embodiment of the present invention. FIG. 2 is a schematic diagram of a training process of a feature extraction model according to an embodiment of the present invention. FIGS. 3A-3C are schematic diagrams of source-domain data and target-domain data according to an embodiment of the present invention. FIG. 4 is a flow chart of a semi-supervised learning method according to an embodiment of the present invention.

S410-S480: Steps

Claims (10)

1. A semi-supervised learning system, comprising: a non-volatile memory, configured to store a semi-supervised learning application; and a processor, configured to execute the semi-supervised learning application to perform the following steps: obtaining source-domain data of one or more source domains and target-domain data of a target domain; training a feature extraction model using the source-domain data and the target-domain data; calculating a domain discrimination loss function, a task loss function, and a semi-supervised loss function of the feature extraction model using a domain discrimination model, a task model, and a semi-supervised learning mechanism, respectively; calculating a total loss function according to the domain discrimination loss function, the task loss function, and the semi-supervised loss function, and updating a first weight, a second weight, and a third weight corresponding to the feature extraction model, the task model, and the domain discrimination model according to the total loss function; and in response to an overall model satisfying a model convergence condition, ending the training process of the overall model, wherein the overall model comprises the feature extraction model, the task model, and the domain discrimination model; wherein the source-domain data comprise first labeled data and first unlabeled data, and the target-domain data comprise second labeled data and second unlabeled data; wherein the processor uses the first labeled data, the first unlabeled data, the second labeled data, and the second unlabeled data to calculate the domain discrimination loss function; wherein the processor uses the first labeled data and the second labeled data to calculate the task loss function; and wherein the processor uses the first unlabeled data and the second unlabeled data to calculate the semi-supervised loss function.

2. The semi-supervised learning system of claim 1, wherein the feature extraction model is a ResNet50 model, the semi-supervised learning mechanism is a consistency-regularization unsupervised data augmentation mechanism, the task model is a first fully connected layer, and the domain discrimination model is a second fully connected layer plus a gradient reversal layer.

3. The semi-supervised learning system of claim 1, wherein the processor assigns a first hyperparameter, a second hyperparameter, and a third hyperparameter to the domain discrimination loss function, the task loss function, and the semi-supervised loss function, respectively, to calculate the total loss function.

4. The semi-supervised learning system of claim 3, wherein the processor updates the first weight of the feature extraction model and the weight of the domain discrimination model according to the domain discrimination loss function, updates the first weight of the feature extraction model and the second weight of the task model according to the task loss function, and updates the first weight of the feature extraction model according to the semi-supervised loss function.

5. The semi-supervised learning system of claim 2, wherein the model convergence condition is that the overall model has run for a fixed number of training batches (epochs) or a fixed training time, or that the total loss function has reached a target value.

6. A semi-supervised learning method, comprising: obtaining source-domain data of one or more source domains and target-domain data of a target domain; training a feature extraction model using the source-domain data and the target-domain data; calculating a domain discrimination loss function, a task loss function, and a semi-supervised loss function of the feature extraction model using a domain discrimination model, a task model, and a semi-supervised learning mechanism, respectively; calculating a total loss function according to the domain discrimination loss function, the task loss function, and the semi-supervised loss function, and updating a first weight, a second weight, and a third weight corresponding to the feature extraction model, the task model, and the domain discrimination model according to the total loss function; and in response to an overall model satisfying a model convergence condition, ending the training process of the overall model, wherein the overall model comprises the feature extraction model, the task model, and the domain discrimination model; wherein the source-domain data comprise first labeled data and first unlabeled data, and the target-domain data comprise second labeled data and second unlabeled data, and the method further comprises: using the first labeled data, the first unlabeled data, the second labeled data, and the second unlabeled data to calculate the domain discrimination loss function; using the first labeled data and the second labeled data to calculate the task loss function; and using the first unlabeled data and the second unlabeled data to calculate the semi-supervised loss function.

7. The semi-supervised learning method of claim 6, wherein the feature extraction model is a ResNet50 model, the semi-supervised learning mechanism is a consistency-regularization unsupervised data augmentation mechanism, the task model is a first fully connected layer, and the domain discrimination model is a second fully connected layer plus a gradient reversal layer.

8. The semi-supervised learning method of claim 7, further comprising: assigning a first hyperparameter, a second hyperparameter, and a third hyperparameter to the domain discrimination loss function, the task loss function, and the semi-supervised loss function, respectively, to calculate the total loss function.

9. The semi-supervised learning method of claim 8, further comprising: updating the first weight of the feature extraction model according to the domain discrimination loss function; updating the first weight of the feature extraction model and the second weight of the task model according to the task loss function; and updating the first weight of the feature extraction model according to the semi-supervised loss function.

10. The semi-supervised learning method of claim 7, wherein the model convergence condition is that the overall model has run for a fixed number of training batches (epochs) or a fixed training time, or that the total loss function has reached a target value.
TW109146113A 2020-12-25 Semi-supervised learning system and semi-supervised learning method TWI840640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109146113A TWI840640B (en) 2020-12-25 Semi-supervised learning system and semi-supervised learning method


Publications (2)

Publication Number Publication Date
TW202226079A TW202226079A (en) 2022-07-01
TWI840640B true TWI840640B (en) 2024-05-01


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919209B (en) 2019-02-26 2020-06-19 中国人民解放军军事科学院国防科技创新研究院 Domain self-adaptive deep learning method and readable storage medium

