TWM625736U - Ensemble learning predicting system - Google Patents

Ensemble learning predicting system

Info

Publication number
TWM625736U
TWM625736U TW111200543U
Authority
TW
Taiwan
Prior art keywords
predictor
sample
weight
target
sample data
Prior art date
Application number
TW111200543U
Other languages
Chinese (zh)
Inventor
闕壯華
任佳珉
黃博煜
張語軒
Original Assignee
財團法人工業技術研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 財團法人工業技術研究院 filed Critical 財團法人工業技術研究院
Priority to TW111200543U priority Critical patent/TWM625736U/en
Publication of TWM625736U publication Critical patent/TWM625736U/en


Abstract

An ensemble learning predicting system includes a base predictor training model, a predictor weighting function training module, an evaluation module, and a sample weight adjustment module. The predictor weighting function training module is connected to the base predictor training model. The sample weight adjustment module is connected to the evaluation module and to the predictor weighting function training module. The system performs the following operations: based on a plurality of sample weights of a plurality of sample data and the respective identification results of a plurality of predictors on the sample data, establishing respective confidence models of the predictors; based on the confidence models, calculating respective scores of the confidence models and sorting the confidence models by score to select one of them as the target confidence model of the current iteration round; based on the confidence models and the identification results, adjusting the sample weights of the sample data; and, based on the sample data and the adjusted sample weights, establishing the confidence models of the next round, until all iteration rounds are completed and a plurality of target confidence models is obtained. The target confidence models are used to generate a prediction result for test data.

Description

Ensemble Learning Prediction System

The present disclosure relates to an ensemble learning prediction system.

Predicting future events by analyzing historical data plays an important role in manufacturing as well as in other industries. In a smart manufacturing factory, the machines record many production parameters, such as temperature, pressure, and gas flow. These production parameters can serve as sample data for predicting product quality or whether a machine is about to fail.

Ensemble learning is a kind of supervised learning in machine learning. In ensemble learning, multiple predictors (or hypotheses) are combined with individual weights, and the hypotheses with higher confidence are dynamically selected to obtain the prediction result.

Figure 1 shows an example of ensemble learning. As shown in Figure 1, the sample data x is fed to the base predictors h_1(x), h_2(x), and h_3(x), whose weights are w_1, w_2, and w_3, respectively. If the weights are fixed (i.e., they do not vary with x), the prediction result is y = w_1*h_1(x) + w_2*h_2(x) + w_3*h_3(x). If the weights are dynamic (i.e., w_i = g_i(x), so w_i varies with x), the prediction result is y = g_1(x)*h_1(x) + g_2(x)*h_2(x) + g_3(x)*h_3(x). In ensemble learning, dynamic weights usually give better prediction results. However, when the sample distribution is highly complex, the confidence estimates are prone to be inaccurate, and it becomes difficult to train the dynamic weights effectively.
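
To make the contrast concrete, the following sketch (not part of the patent; the predictors and gating functions are arbitrary toy functions) compares a fixed-weight ensemble with a dynamic-weight ensemble in Python:

```python
import numpy as np

# Toy base predictors h_i(x); in practice these are trained models.
def h1(x): return np.sign(x - 0.2)
def h2(x): return np.sign(0.8 - x)
def h3(x): return np.sign(np.sin(6.0 * x))

# Fixed weights: y = w1*h1(x) + w2*h2(x) + w3*h3(x), independent of x.
def predict_fixed(x, w=(0.5, 0.3, 0.2)):
    return w[0] * h1(x) + w[1] * h2(x) + w[2] * h3(x)

# Dynamic weights: each predictor's weight g_i(x) depends on the input x,
# so different predictors dominate in different regions of the sample space.
def g1(x): return np.exp(-((x - 0.1) ** 2) / 0.05)
def g2(x): return np.exp(-((x - 0.9) ** 2) / 0.05)
def g3(x): return np.exp(-((x - 0.5) ** 2) / 0.05)

def predict_dynamic(x):
    return g1(x) * h1(x) + g2(x) * h2(x) + g3(x) * h3(x)

print(predict_fixed(0.45), predict_dynamic(0.45))
```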

Therefore, how to train dynamic weights effectively even when the sample space is complex is one of the key issues in designing an ensemble learning prediction system.

According to the present disclosure, an ensemble learning prediction system is provided, including a base predictor training model, a predictor weight function training module, an evaluation module, and a sample weight adjustment module. The base predictor training model uses a plurality of training data transmitted from a signal source to build a plurality of base predictors, wherein the signal source is a sensor. The predictor weight function training module is connected to the base predictor training model; it initializes a plurality of sample weights of a plurality of sample data and initializes a processing set, where the initialized processing set includes a plurality of predictors corresponding to the base predictors. In a first iteration round, the module builds a plurality of predictor weight functions for the predictors in the processing set according to the sample data and the sample weights, and each predictor in the processing set predicts the sample data so as to produce a prediction result for each of the sample data. The evaluation module evaluates the predictor weight functions and, according to an evaluation result, selects a target predictor weight function from the predictor weight functions built in the first iteration round and selects from the processing set a target predictor corresponding to the target predictor weight function. The sample weight adjustment module is connected to the evaluation module and to the predictor weight function training module, and accordingly updates the processing set and the sample weights of the sample data; the updated processing set does not include the target predictor. According to the updated sample weights of the sample data, the next iteration round is performed to repeat the above operations, selecting another target predictor from the updated processing set and obtaining another target predictor weight function, until all iteration rounds have been performed to select a plurality of target predictor weight functions and a plurality of target predictors, which are integrated into an overall predictor. The overall predictor includes the target predictor weight functions and the target predictors, and a prediction result of the overall predictor is further shown on a display.

In order to provide a better understanding of the above and other aspects of the present disclosure, embodiments are described in detail below with reference to the accompanying drawings:

210-280: steps

300: ensemble learning prediction system

310: base predictor training model

320: predictor weight function training module

330: evaluation module

340: sample weight adjustment module

Figure 1 shows an example of ensemble learning.

Figure 2 shows a flowchart of an ensemble learning prediction method according to an embodiment of the present disclosure.

Figure 3 shows a functional block diagram of an ensemble learning prediction system according to another embodiment of the present disclosure.

The technical terms used in this specification follow the customary usage of the technical field; where some terms are described or defined in this specification, those descriptions or definitions govern their interpretation. Each embodiment of the present disclosure has one or more technical features. Where implementation permits, a person of ordinary skill in the art may selectively implement some or all of the technical features of any embodiment, or selectively combine some or all of the technical features of these embodiments.

Figure 2 shows a flowchart of an ensemble learning prediction method according to an embodiment of the present disclosure, which can be applied to an electronic device having a processor. As shown in Figure 2, in step 210 (training phase), the training data D_train are used to build a set of base predictors h_1, h_2, ..., h_N (N being a positive integer), where each base predictor h_i (i = 1~N) may come from a different algorithm, different hyperparameters, or different sampled data. The present disclosure is not limited thereto.
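
As one possible illustration of step 210, the sketch below builds a small pool of base predictors from toy training data using scikit-learn classifiers; the particular estimators, hyperparameters, and bootstrap resampling are assumptions of the example rather than requirements of the embodiment:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy training data D_train: 800 samples, 2 features, binary labels in {-1, +1}.
X_train = rng.normal(size=(800, 2))
y_train = np.sign(X_train[:, 0] + 0.5 * X_train[:, 1])

def build_base_predictors(X, y):
    """Step 210: build base predictors h_1..h_N from different algorithms,
    hyperparameters, and bootstrap samples of the training data."""
    candidates = [
        DecisionTreeClassifier(max_depth=3, random_state=0),
        DecisionTreeClassifier(max_depth=None, random_state=1),
        LogisticRegression(),
        KNeighborsClassifier(n_neighbors=5),
    ]
    predictors = []
    for model in candidates:
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample
        model.fit(X[idx], y[idx])
        predictors.append(model)
    return predictors

base_predictors = build_base_predictors(X_train, y_train)
```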

In step 220 (training phase), the sample weight of each sample in the validation data D_valid is set (initialized), and the processing set H is set (initialized). The sample weight k_j^(t) of each sample x_j in the validation data D_valid (j = 1~n, where n is the number of samples in the validation data) is initialized (t = 1, where t denotes the iteration round number). For example, the individual sample weights k_j^(1) of all samples x_j (j = 1~n) are all initialized to 1, i.e., k_j^(1) = 1. The initialized processing set H can be expressed as H = {h_1, h_2, ..., h_N}; the predictors included in the processing set H are the predictors that have not yet been selected, and the rule for selecting predictors is described below. In this embodiment, the training data D_train and the validation data D_valid both belong to the sample data. For example, but without limitation, there may be 1000 samples in total, of which 800 serve as training data and the remaining 200 serve as validation data. Of course, the present disclosure is not limited thereto.

In step 230 (training phase), in the first iteration round, the respective predictor weight functions (weighting functions) of the predictors in the processing set H are built according to all the sample data of the validation data D_valid and the individual sample weights of the sample data.

After the predictor weight function of each predictor has been built, each predictor h_i in the processing set H predicts every sample x_j in the validation data D_valid; whether each prediction is correct or wrong is determined, and the prediction results are recorded.

For example, but without limitation, starting from predictor h_1, predictor h_1 predicts every sample x_j in the validation data D_valid, each prediction is judged correct or wrong, and all the prediction results of h_1 are recorded. Next, predictor h_2 predicts every sample x_j in the validation data D_valid, each prediction is judged correct or wrong, and all the prediction results of h_2 are recorded. This is repeated until the last predictor h_N in the processing set H has predicted every sample x_j in the validation data D_valid, with each prediction judged correct or wrong and all the prediction results of h_N recorded.
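
Steps 220 and 230 can be sketched as follows. Since the embodiment does not fix how a predictor weight function is built (see the note in the first-round example below), this sketch simply fits, for each predictor remaining in the processing set, a small regression tree that estimates how likely that predictor is to be correct at a given input, using the current sample weights; the toy validation data, the lambda predictors, and the choice of DecisionTreeRegressor are assumptions of the example:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Toy validation data D_valid and a toy pool of already-trained base predictors.
X_valid = rng.normal(size=(200, 2))
y_valid = np.sign(X_valid[:, 0] + 0.5 * X_valid[:, 1])
base_predictors = [lambda X, s=s: np.sign(X[:, 0] + s * X[:, 1]) for s in (0.1, 0.5, 1.0)]

k = np.ones(len(X_valid))               # sample weights k_j^(1), initialized to 1 (step 220)
H = set(range(len(base_predictors)))    # processing set: indices of unselected predictors

def train_weight_function(h, X, y, sample_weight):
    """Step 230, one possible realization: fit f_i^(t) to predict whether h is
    correct on x, weighted by the current sample weights."""
    correct = (h(X) == y).astype(float)
    f = DecisionTreeRegressor(max_depth=3, random_state=0)
    f.fit(X, correct, sample_weight=sample_weight)
    return f

# Build a weight function for every predictor still in H, and record whether
# each predictor classifies each validation sample correctly.
weight_functions = {i: train_weight_function(base_predictors[i], X_valid, y_valid, k)
                    for i in H}
correct_matrix = {i: (base_predictors[i](X_valid) == y_valid) for i in H}
```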

For example, in the first iteration round (t = 1), the prediction results of predictor h_1 fall into four sets:

R1: the output value of f_1^(1) is high and the prediction of h_1 is correct
R2: the output value of f_1^(1) is low and the prediction of h_1 is correct
R3: the output value of f_1^(1) is high and the prediction of h_1 is incorrect
R4: the output value of f_1^(1) is low and the prediction of h_1 is incorrect

Whether f_1^(1) is high or low is judged by whether the output value of f_1^(1) exceeds a preset threshold (in this embodiment, the output value of f_1^(1) varies with the value of the sample data x). The set R1 is the set of samples that are correctly predicted by predictor h_1 when the output value of f_1^(1) is high; similarly, R2 is the set of samples correctly predicted by h_1 when the output value of f_1^(1) is low; R3 is the set of samples incorrectly predicted by h_1 when the output value of f_1^(1) is high; and R4 is the set of samples incorrectly predicted by h_1 when the output value of f_1^(1) is low. In this embodiment, the symbol f_i^(t) denotes the predictor weight function trained for predictor h_i in the t-th iteration round.

In step 240 (training phase), the predictor weight functions that have been built are evaluated; according to the evaluation result, one target predictor weight function is selected from the predictor weight functions built in this iteration round, one target predictor is selected from the processing set H, and the processing set H is updated accordingly (for example, the selected predictor is removed from the processing set H).

The details of step 240 are, for example, as follows. In each iteration round, when the predictor weight functions are evaluated, the correctness score s_i of each predictor weight function is computed. In this embodiment the score is the sample-weighted proportion of samples on which the prediction result of predictor h_i agrees with the output value of its weight function f_i^(t):

s_i = ( Σ_{x_j in agreement} k_j^(t) ) / ( Σ_{j=1~n} k_j^(t) ),

where agreement means the prediction of h_i on x_j is correct and f_i^(t)(x_j) is high, or the prediction is wrong and f_i^(t)(x_j) is low (of course, the present disclosure is not limited thereto; other possible embodiments may use other score formulas, all within the spirit of the present disclosure). The symbol k_j^(t) denotes the sample weight of the sample data x_j used to train the predictor weight functions in the t-th iteration round. Among all the predictor weight functions, the one with the highest score is found; for example, if f_i has the highest score, then in the t-th iteration round h^(t) = h_i and g^(t) = f_i^(t) are set, and h_i is removed from H. For example, if f_2 has the highest score in the current iteration round, then h_2 and f_2 are selected and h_2 is removed from H. Here h^(t) and g^(t) respectively denote the target predictor and the target predictor weight function selected in the t-th iteration round. In other words, in this embodiment the correctness score of a predictor weight function indicates whether the prediction results of the predictor are consistent with the output values of the predictor weight function: the higher the score, the more consistent the prediction results are with the outputs of the weight function; conversely, the lower the score, the less consistent they are.
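
Under that reading of the score (an assumption, since the original formula appears only as an image in the patent), a minimal NumPy sketch of the scoring in step 240 could look like this:

```python
import numpy as np

def score_weight_function(f_values, correct, k, threshold=0.5):
    """Assumed reconstruction of the score s_i: the sample-weighted fraction of
    samples on which the predictor and its weight function agree, i.e. a correct
    prediction with a high f-value or a wrong prediction with a low f-value."""
    high = f_values > threshold
    consistent = (correct & high) | (~correct & ~high)
    return np.sum(k * consistent) / np.sum(k)

# Toy inputs: f-values of one weight function, correctness of its predictor,
# and the current sample weights.
f_values = np.array([0.9, 0.2, 0.8, 0.4, 0.7])
correct  = np.array([True, False, True, True, False])
k        = np.ones(5)
print(score_weight_function(f_values, correct, k))   # 0.6

# The target of the round is the predictor in H whose weight function scores
# highest; that predictor becomes h^(t), its weight function becomes g^(t),
# and it is removed from H.
```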

In step 250 (training phase), the individual sample weights of the sample data in the validation data D_valid are updated; the details are described below. If not all iteration rounds have been completed (step 260), the flow returns to step 230 for the next iteration round. Otherwise, if all iteration rounds have been completed, all the target predictor weight functions have been built and trained, the training phase ends, and the overall predictor can be obtained accordingly (step 270).

The overall predictor is obtained from the set of base predictors that have been built and their individual predictor weight functions. In the test phase, the test data x is input to the built overall predictor to obtain the prediction result y (step 270). In this embodiment, the obtained prediction result can further be shown on a display (not shown). In addition, in this embodiment, the test data (used in the test phase), the training data D_train (used in the training phase), and the validation data D_valid (used in the training phase) are generated, for example, from a signal source, which is, for example but without limitation, a sensor. For example, in a smart factory, sensors can be used to sense production parameters such as temperature, humidity, and pressure.

In the t-th iteration round, the sample weight of each sample x_j is updated to k_j^(t+1) according to the following table, where c > 1 and α3 > α4 > α2 > α1. The updated sample weights are used to train the predictor weight functions in the (t+1)-th iteration round.

h^(t) correct on x_j, g^(t)(x_j) high:  k_j^(t+1) = k_j^(t) * c^α1
h^(t) correct on x_j, g^(t)(x_j) low:   k_j^(t+1) = k_j^(t) * c^α2
h^(t) wrong on x_j,   g^(t)(x_j) high:  k_j^(t+1) = k_j^(t) * c^α3
h^(t) wrong on x_j,   g^(t)(x_j) low:   k_j^(t+1) = k_j^(t) * c^α4

That is, when the sample weight of each sample x_j is updated: if the sample x_j is correctly predicted by the (selected) predictor h^(t) in the t-th iteration round and the value of the corresponding predictor weight function g^(t) of h^(t) is high, the sample weight of x_j is updated to k_j^(t+1) = k_j^(t) * c^α1 (i.e., lowered). If the sample x_j is correctly predicted by h^(t) in the t-th iteration round and the value of g^(t) is low, the sample weight of x_j is updated to k_j^(t+1) = k_j^(t) * c^α2 (i.e., lowered). If the sample x_j is incorrectly predicted by h^(t) in the t-th iteration round and the value of g^(t) is high, the sample weight of x_j is updated to k_j^(t+1) = k_j^(t) * c^α3 (i.e., raised). If the sample x_j is incorrectly predicted by h^(t) in the t-th iteration round and the value of g^(t) is low, the sample weight of x_j is updated to k_j^(t+1) = k_j^(t) * c^α4 (i.e., raised).

In other words, in this embodiment, samples that are mispredicted by the selected predictor h^(t) have their sample weights for the next round raised, while samples that are correctly predicted by the selected predictor h^(t) have their sample weights for the next round lowered.
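
A short NumPy sketch of this update rule, using the constants of the worked example below (c = 1.5 and (α1, α2, α3, α4) = (-1, 0, 2, 1)); the boolean arrays mark, per sample, whether the selected predictor was correct and whether its weight function output was high:

```python
import numpy as np

def update_sample_weights(k, correct, g_high, c=1.5, alphas=(-1, 0, 2, 1)):
    """Step 250: multiply each sample weight by c**alpha, where alpha depends on
    whether the selected predictor h^(t) was correct on the sample and whether
    its weight function g^(t) was high or low there."""
    a1, a2, a3, a4 = alphas
    alpha = np.where(correct & g_high, a1,           # correct, g high -> lower
            np.where(correct & ~g_high, a2,          # correct, g low  -> lower
            np.where(~correct & g_high, a3, a4)))    # wrong, g high/low -> raise
    return k * (c ** alpha)

k = np.ones(5)
correct = np.array([True, True, False, True, False])
g_high  = np.array([True, True, False, False, True])
print(update_sample_weights(k, correct, g_high))
# [0.666..., 0.666..., 1.5, 1.0, 2.25], the multipliers used in the first-round example below
```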

In other words, in this embodiment, how to update the sample weights of the sample data is decided by evaluating the consistency between the predictor weight function and the identification results of the selected predictor. When the consistency between the predictor weight function and the identification result of the selected predictor is high, the sample weight of the sample is lowered; conversely, when the consistency is low, the sample weight is raised. Put differently, in this embodiment the weights are adjusted according to whether the prediction result of the predictor is correct and whether the prediction result of the predictor is consistent with the output value of the weight function.

To explain further: when the identification result of the selected predictor on a sample is correct and the predictor is assigned a high predictor-weight-function value there, the consistency is high. When the identification result on the sample is wrong and the predictor is assigned a low predictor-weight-function value there, the consistency is also high. When the identification result on the sample is correct but the predictor is assigned a low predictor-weight-function value there, the consistency is low. When the identification result on the sample is wrong but the predictor is assigned a high predictor-weight-function value there, the consistency is low.

An example is now given to describe the approach of this embodiment more clearly. For ease of explanation, suppose three iteration rounds are performed (i.e., t = 3 rounds in total), the validation data D_valid includes five samples x_1-x_5, and let c = 1.5 and (α1, α2, α3, α4) = (-1, 0, 2, 1). The initialized processing set is H = {h_1, h_2, h_3}, and the initialized sample weights of the samples are (k_1^(1), k_2^(1), k_3^(1), k_4^(1), k_5^(1)) = (1, 1, 1, 1, 1).

In the first iteration round, the predictor weight functions f_1^(1), f_2^(1), and f_3^(1) of the predictors are built according to the samples and their sample weights; the details of how the weight functions are built are not particularly limited here.

The score s of each predictor weight function in the first iteration round is then calculated. Suppose the identification results of predictor h_1 on each of the five samples (whether the prediction is correct, and whether f_1^(1) is high or low there) are known.

f 1 (1)的分數s 1如下:

Figure 111200543-A0305-02-0012-8
Then the fraction s 1 of f 1 (1) is as follows:
Figure 111200543-A0305-02-0012-8

After the respective scores of the predictor weight functions have been calculated, suppose that f_2^(1) has the highest score in the first iteration round. This means that the predictor weight function f_2^(1) built in the first iteration round is the best one, so in the first iteration round the target predictor is chosen to be predictor h_2 and the target predictor weight function is chosen to be f_2^(1); that is, h^(1) = h_2 and g^(1) = f_2^(1) (h^(1) and g^(1) respectively denote the target predictor and the target predictor weight function selected in the first iteration round), and the processing set is updated to H = {h_1, h_3} (i.e., h_2 is removed from H). Moreover, according to the results of the first iteration round, the sample weights of the samples (to be used in the second iteration round) are adjusted as follows: (k_1^(2), k_2^(2), k_3^(2), k_4^(2), k_5^(2)) = (k_1^(1)*1.5^-1, k_2^(1)*1.5^-1, k_3^(1)*1.5^1, k_4^(1)*1.5^0, k_5^(1)*1.5^2).
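
Evaluated numerically (all first-round weights equal 1), the update gives:

```python
c = 1.5
k1 = [1, 1, 1, 1, 1]                  # weights before the update
exponents = [-1, -1, 1, 0, 2]         # per-sample exponents chosen in round 1
k2 = [w * c ** e for w, e in zip(k1, exponents)]
print(k2)   # [0.666..., 0.666..., 1.5, 1.0, 2.25]
```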

In the second iteration round, the predictor weight functions f_1^(2) and f_3^(2) of the predictors are built according to the samples and their sample weights (k_1^(2), k_2^(2), k_3^(2), k_4^(2), k_5^(2)) (f_2^(2) no longer needs to be built).

Similarly to the above, the scores s_1 and s_3 of the predictor weight functions f_1^(2) and f_3^(2) of the predictors h_1 and h_3 in the processing set H = {h_1, h_3} are calculated in the second iteration round. Suppose the identification results of predictor h_3 on each of the five samples (whether the prediction is correct, and whether f_3^(2) is high or low there) are known.

The score s_3 of f_3^(2) then follows from the score formula of step 240.

After the respective scores of the predictor weight functions have been calculated, suppose that f_3^(2) has the highest score in the second iteration round. This means that the predictor weight function f_3^(2) built in the second iteration round is the best one, so in the second iteration round the target predictor is chosen to be h_3 and the target predictor weight function to be f_3^(2); that is, h^(2) = h_3 and g^(2) = f_3^(2) (h^(2) and g^(2) respectively denote the target predictor and the target predictor weight function selected in the second iteration round), and the processing set is updated to H = {h_1} (i.e., h_3 is removed from H). Moreover, according to the results of the second iteration round, the sample weights of the validation data (to be used in the third iteration round) are adjusted as follows: (k_1^(3), k_2^(3), k_3^(3), k_4^(3), k_5^(3)) = (k_1^(2)*1.5^0, k_2^(2)*1.5^1, k_3^(2)*1.5^2, k_4^(2)*1.5^-1, k_5^(2)*1.5^-1) = (k_1^(1)*1.5^-1*1.5^0, k_2^(1)*1.5^-1*1.5^1, k_3^(1)*1.5^1*1.5^2, k_4^(1)*1.5^0*1.5^-1, k_5^(1)*1.5^2*1.5^-1).

In the third iteration round, the predictor weight function f_1^(3) of predictor h_1 is built according to the samples and their sample weights (k_1^(3), k_2^(3), k_3^(3), k_4^(3), k_5^(3)). Therefore, in the third iteration round, predictor h_1 and predictor weight function f_1^(3) are selected; that is, h^(3) = h_1 and g^(3) = f_1^(3) (h^(3) and g^(3) respectively denote the target predictor and the target predictor weight function selected in the third iteration round), and the processing set is updated to H = ∅ (i.e., h_1 is removed from H). The construction of the overall predictor is thus complete.

Suppose there is a piece of test data x_test whose corresponding prediction result y belongs to one of two classes (1 or -1). Substituting x_test into h^(i)(x) and g^(i)(x) gives the following values: (h^(1)(x_test), h^(2)(x_test), h^(3)(x_test)) = (1, -1, -1)

(g^(1)(x_test), g^(2)(x_test), g^(3)(x_test)) = (0.3, 0.7, 0.6)

The following ensemble methods can then be used to obtain the prediction result:

Method 1: set the threshold to 0.5 and, starting from g^(1) and proceeding in order (g^(1), g^(2), g^(3)), select the first weight function whose output value exceeds the threshold. In the example above, the output value of g^(1)(x_test) is 0.3, which does not exceed the threshold, so weight function g^(2) is considered next. The output value of g^(2)(x_test) is 0.7, which exceeds the threshold, so weight function g^(2) is selected. The value -1 obtained from the corresponding predictor h^(2)(x_test) of weight function g^(2) is therefore taken as the ensemble prediction result for the test data x_test (i.e., y = h^(2)(x_test) = -1).

In other words, in method 1, the function output value is computed for each predictor weight function; the first weight function whose output value exceeds the threshold is given the highest weight, and the remaining weight functions are given zero weight.
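
A direct transcription of method 1 with the example values above (the fallback when no weight function exceeds the threshold is an assumption, since the embodiment does not specify that case):

```python
def ensemble_method_1(h_values, g_values, threshold=0.5):
    """Return the prediction of the first predictor whose weight-function output
    exceeds the threshold; fall back to the last predictor if none does."""
    for h, g in zip(h_values, g_values):
        if g > threshold:
            return h
    return h_values[-1]

h_values = (1, -1, -1)        # h^(1)(x_test), h^(2)(x_test), h^(3)(x_test)
g_values = (0.3, 0.7, 0.6)    # g^(1)(x_test), g^(2)(x_test), g^(3)(x_test)
print(ensemble_method_1(h_values, g_values))   # -1: g^(2) is the first to exceed 0.5
```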

Method 2: normalize the g^(i) values and take a weighted average: (0.3*1 + 0.7*(-1) + 0.6*(-1)) / (0.3 + 0.7 + 0.6) = -0.625

Since the result of the weighted average is less than 0, -1 is taken as the ensemble prediction result for the test data x_test (y = -1). In other words, in method 2, the function output values of the predictor weight functions are normalized and used as the integration weights of the predictors.
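
Method 2 can be transcribed the same way; normalizing by the sum of the g-values reproduces the -0.625 computed above:

```python
def ensemble_method_2(h_values, g_values):
    """Weighted average of the predictor outputs, with the normalized
    weight-function outputs as weights; the sign gives the predicted class."""
    total = sum(g_values)
    score = sum(g * h for g, h in zip(g_values, h_values)) / total
    return score, (1 if score >= 0 else -1)

print(ensemble_method_2((1, -1, -1), (0.3, 0.7, 0.6)))   # (-0.625, -1)
```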

Figure 3 shows a functional block diagram of an ensemble learning prediction system according to another embodiment of the present disclosure. The ensemble learning prediction system 300 includes a base predictor training model 310, a predictor weight function training module 320, an evaluation module 330, and a sample weight adjustment module 340, which can implement the ensemble learning prediction method of the embodiments described above. The ensemble learning prediction system 300 may further include a display (not shown) for showing the prediction result. In this embodiment, the ensemble learning prediction system 300 is, for example, an electronic device with information processing capability (such as a server), and the base predictor training model 310, the predictor weight function training module 320, the evaluation module 330, and the sample weight adjustment module 340 may be implemented by hardware circuits with information processing capability (such as a central processing unit).
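
One way to mirror the block diagram of Figure 3 in code is the skeleton below; the class and method names are assumptions of this sketch and are not prescribed by the embodiment:

```python
class BasePredictorTrainer:            # 310: builds base predictors from D_train
    def build(self, D_train): ...

class WeightFunctionTrainer:           # 320: initializes weights and H, trains f_i^(t)
    def train_round(self, D_valid, k, H): ...

class Evaluator:                       # 330: scores the f_i^(t), picks h^(t) and g^(t)
    def select_target(self, weight_functions, predictions, k): ...

class SampleWeightAdjuster:            # 340: updates k and removes h^(t) from H
    def update(self, k, H, target, correct, g_high): ...

class EnsembleLearningPredictingSystem:
    """Wires the four modules together (Figure 3). After all iteration rounds the
    selected (h^(t), g^(t)) pairs form the overall predictor, whose prediction
    result would be shown on a display."""
    def __init__(self):
        self.base_trainer = BasePredictorTrainer()
        self.wf_trainer = WeightFunctionTrainer()
        self.evaluator = Evaluator()
        self.weight_adjuster = SampleWeightAdjuster()
```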

In the ensemble learning of the embodiments of the present disclosure, the weight functions are built by taking the diversity of the predictors (prediction models) into account, using complementarity as the basis, together with sample weight adjustment and a sequentially optimized weight-function training mechanism. In this way, the embodiments can effectively find predictors suited to individual regions of the sample space and dynamically train the weight functions, thereby improving the overall prediction performance. Here, the "sequentially optimized weight-function training mechanism" means that the target weight functions are not all found within a single iteration round; instead, the individual target weight functions are found round by round (as above, g^(t) is found within the t-th iteration round).

The embodiments of the present disclosure produce a specific effect on a computer (predicting results through target weight functions generated round by round), rather than merely using the computer as a tool. That is, the embodiments not only use a computer but also use rules of a specific type to achieve the specific effect of improving an ensemble learning prediction system.

The field specifically addressed by the above embodiments is, for example but without limitation, computer prediction systems. Whereas traditional computer prediction systems cannot train effective predictor weight functions in a complex sample space, the embodiments of the present disclosure use a sequentially optimized weight-function estimation module and corresponding steps to select the target predictor and the target predictor weight function round by round. Hence, even in a complex sample space, the embodiments can still effectively train the predictor weight functions (because the plurality of target predictor weight functions is not obtained within a single round, but round by round).

In summary, although the present disclosure has been disclosed above by way of embodiments, they are not intended to limit the present disclosure. A person having ordinary knowledge in the technical field to which the present disclosure pertains may make various changes and modifications without departing from the spirit and scope of the present disclosure. Therefore, the scope of protection of the present disclosure shall be determined by the appended claims.

300: ensemble learning prediction system

310: base predictor training model

320: predictor weight function training module

330: evaluation module

340: sample weight adjustment module

Claims (7)

1. An ensemble learning prediction system, comprising: a base predictor training model, which uses a plurality of training data transmitted from a signal source to build a plurality of base predictors, wherein the signal source is a sensor; a predictor weight function training module, connected to the base predictor training model, which initializes a plurality of sample weights of a plurality of sample data and initializes a processing set, the initialized processing set including a plurality of predictors corresponding to the base predictors, and which, in a first iteration round, builds a plurality of predictor weight functions for the predictors in the processing set according to the sample data and the sample weights, each of the predictors in the processing set respectively predicting the sample data to produce a prediction result for each of the sample data; an evaluation module, which evaluates the predictor weight functions and, according to an evaluation result, selects a target predictor weight function from the predictor weight functions built in the first iteration round and selects from the processing set a target predictor corresponding to the target predictor weight function; and a sample weight adjustment module, connected to the evaluation module and to the predictor weight function training module, which accordingly updates the processing set and the sample weights of the sample data, the updated processing set not including the target predictor; wherein, according to the updated sample weights of the sample data, a next iteration round is performed to repeat the above operations, selecting another target predictor from the updated processing set and obtaining another target predictor weight function, until all iteration rounds have been performed to select a plurality of target predictor weight functions and a plurality of target predictors, which are integrated into an overall predictor, wherein the overall predictor includes the target predictor weight functions and the target predictors, and a prediction result of the overall predictor is further shown on a display.

2. The system according to claim 1, wherein, at initialization, the predictor weight function training module initializes the sample weights k_j^(t) of the sample data x_j, j = 1~n, where n is the number of the sample data and t is the iteration round number; the initialized processing set H can be expressed as H = {h_1, h_2, ..., h_N}, where h_1, h_2, ..., h_N denote the base predictors, and the processing set H includes the predictors that have not been selected.

3. The system according to claim 2, wherein the evaluation module: in each of the iteration rounds, evaluates an individual score of each of the predictor weight functions, the score representing whether the individual prediction results of the corresponding predictor on the sample data are consistent with the output values of the corresponding predictor weight function on those samples, wherein, when the identification result of the corresponding predictor for a particular sample is correct and the output value of the predictor weight function of that predictor for the particular sample is higher than a threshold, the consistency is high; when the identification result of the corresponding predictor for the particular sample is wrong and the output value of the predictor weight function of that predictor for the particular sample is lower than the threshold, the consistency is high; when the identification result of the corresponding predictor for the particular sample is correct and the output value of the predictor weight function of that predictor for the particular sample is lower than the threshold, the consistency is low; and when the identification result of the corresponding predictor for the particular sample is wrong and the output value of the predictor weight function of that predictor for the particular sample is higher than the threshold, the consistency is low; and, among the predictor weight functions, finds the predictor weight function having the highest score as the target predictor weight function of the iteration round, setting h^(t) = h_i and g^(t) = f_i^(t), where h^(t) and g^(t) respectively denote the target predictor and the target predictor weight function selected in the t-th iteration round.

4. The system according to claim 1, wherein, when updating the processing set, the selected target predictor is removed from the processing set.

5. The system according to claim 4, wherein, when updating the sample weight of the sample data x_j, the sample weight adjustment module: lowers the sample weight of the sample data x_j if, in a t-th iteration round, the sample data x_j is correctly predicted by the target predictor h^(t) and the value of the predictor weight function g^(t) of the predictor h^(t) is high; lowers the sample weight of the sample data x_j if, in the t-th iteration round, the sample data x_j is correctly predicted by the target predictor h^(t) and the value of the predictor weight function g^(t) of the predictor h^(t) is low; raises the sample weight of the sample data x_j if, in the t-th iteration round, the sample data x_j is incorrectly predicted by the target predictor h^(t) and the value of the predictor weight function g^(t) of the predictor h^(t) is high; and raises the sample weight of the sample data x_j if, in the t-th iteration round, the sample data x_j is incorrectly predicted by the target predictor h^(t) and the value of the predictor weight function g^(t) of the predictor h^(t) is low.

6. The system according to claim 1, wherein, in the overall predictor, an individual function output value is calculated for each of the target predictor weight functions, the target predictor weight function whose function output value exceeds a threshold is given the highest weight, and the remaining weight functions are given zero weight.

7. The system according to claim 1, wherein, in the overall predictor, the individual function output values of the predictor weight functions are normalized and used as the integration weights of the target predictors.
TW111200543U 2018-11-09 2018-11-09 Ensemble learning predicting system TWM625736U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW111200543U TWM625736U (en) 2018-11-09 2018-11-09 Ensemble learning predicting system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW111200543U TWM625736U (en) 2018-11-09 2018-11-09 Ensemble learning predicting system

Publications (1)

Publication Number Publication Date
TWM625736U true TWM625736U (en) 2022-04-11

Family

ID=82198458

Family Applications (1)

Application Number Title Priority Date Filing Date
TW111200543U TWM625736U (en) 2018-11-09 2018-11-09 Ensemble learning predicting system

Country Status (1)

Country Link
TW (1) TWM625736U (en)

Similar Documents

Publication Publication Date Title
CN111178548B (en) Ensemble learning prediction method and system
Wang et al. Resampling-based ensemble methods for online class imbalance learning
JP2018049390A (en) Characteristic value estimation device and characteristic value estimation method
CN105488539B (en) The predictor method and device of the generation method and device of disaggregated model, power system capacity
CN110352349A (en) Abnormal sound detection device, abnormality degree computing device, abnormal sound generating means, abnormal sound detection learning device, abnormal signal detection device, abnormal signal detection learning device and their method and program
EP3364157A1 (en) Method and system of outlier detection in energy metering data
JP6718500B2 (en) Optimization of output efficiency in production system
JP2020198092A (en) Method and system for unsupervised anomaly detection and cause explanation with majority voting for high-dimensional sensor data
JP6950504B2 (en) Abnormal candidate extraction program, abnormal candidate extraction method and abnormal candidate extraction device
JP7276487B2 (en) Creation method, creation program and information processing device
JP7207540B2 (en) LEARNING SUPPORT DEVICE, LEARNING SUPPORT METHOD, AND PROGRAM
CN112420125A (en) Molecular attribute prediction method and device, intelligent equipment and terminal
JP2019191769A (en) Data discrimination program and data discrimination device and data discrimination method
CN117041017A (en) Intelligent operation and maintenance management method and system for data center
CN115296984A (en) Method, device, equipment and storage medium for detecting abnormal network nodes
JP2019067299A (en) Label estimating apparatus and label estimating program
CN108984851B (en) Weighted Gaussian model soft measurement modeling method with time delay estimation
CN116492634B (en) Standing long jump testing method based on image visual positioning
JP7272455B2 (en) DETECTION METHOD, DETECTION PROGRAM AND INFORMATION PROCESSING DEVICE
TWM625736U (en) Ensemble learning predicting system
JP6233432B2 (en) Method and apparatus for selecting mixed model
JPWO2019235611A1 (en) Analyzer, analysis method and program
JPWO2019235608A1 (en) Analyzer, analysis method and program
CN113079168B (en) Network anomaly detection method and device and storage medium
CN112597699B (en) Social network rumor source identification method integrated with objective weighting method