TWI815592B

TWI815592B - Yield estimation apparatus and method

Info

Publication number: TWI815592B
Application number: TW111129591A
Authority: TW
Inventors: 賴柏榕
Original assignee: 財團法人資訊工業策進會
Priority date: 2022-08-05
Filing date: 2022-08-05
Publication date: 2023-09-11
Also published as: CN117575664A; TW202407588A

Abstract

A yield estimation apparatus and method. The apparatus stores target data and non-target data corresponding to a prediction target, and the non-target data includes a plurality of feature fields and a plurality of temporal feature fields. Based on the feature fields and the temporal feature fields contained in the non-target data, the apparatus trains a gated recurrent unit (GRU) encoder to generate a non-target feature extraction result. Based on the non-target feature extraction result and the target data, the apparatus trains a gated recurrent unit decoder to generate a yield estimation result, where the target data corresponds to a historical sales volume of the prediction target.

Description

Production volume estimation device and method

The present invention relates to a production volume estimation device and method. Specifically, the present invention relates to a production volume estimation device and method that can improve the accuracy of production volume estimation.

In recent years, technologies and applications related to big data have developed rapidly. Enterprise supply chains often build estimation models to estimate production volume data, so that the lead time of finished products can be shortened to the delivery schedule expected by customers.

However, actual data (for example, historical product sales data) forms time-dependent sequences that are highly unstable and non-linear. If an enterprise relies only on simple linear models (for example, SVM or ARIMA) or basic machine learning models (for example, LSTM or CNN_LSTM), it is usually difficult to capture the correlations and changes of the data over time. As a result, the production volume cannot be estimated accurately, which often leads to prediction errors.

In view of this, providing a production volume estimation technology that can improve the accuracy of production volume estimation is a goal the industry urgently needs to pursue.

One object of the present invention is to provide a production volume estimation device. The production volume estimation device includes a storage and a processor, and the processor is electrically connected to the storage. The storage is used to store target data and non-target data corresponding to a prediction target, where the non-target data includes a plurality of feature fields and a plurality of temporal feature fields. Based on the feature fields and the temporal feature fields included in the non-target data, the processor trains a gated recurrent unit encoder to generate a non-target feature extraction result. Based on the non-target feature extraction result and the target data, the processor trains a gated recurrent unit decoder to generate a production volume estimation result, where the target data corresponds to a historical sales volume of the prediction target.

Another object of the present invention is to provide a production volume estimation method for an electronic device. The production volume estimation method is executed by the electronic device and includes the following steps: training a gated recurrent unit encoder, based on a plurality of feature fields included in non-target data corresponding to a prediction target and a feature convolution, to generate an extracted feature value in a non-target feature extraction result; training the gated recurrent unit encoder, based on a plurality of temporal feature fields included in the non-target data and a temporal convolution, to generate an extracted temporal feature value in the non-target feature extraction result; and training a gated recurrent unit decoder, based on the non-target feature extraction result and target data, to generate a production volume estimation result, where the target data corresponds to a historical sales volume of the prediction target.

The production volume estimation technology provided by the present invention (including at least the device and the method) generates a non-target feature extraction result from the feature fields and the temporal feature fields included in the non-target data corresponding to the prediction target, and generates a target feature extraction result from the target data corresponding to the prediction target. The production volume estimation result is generated by considering the non-target feature extraction result and the target feature extraction result together. Because the technology considers the temporal correlations and changes of the features of both the target data and the non-target data, it can improve the accuracy of production volume estimation and address the inaccuracy of the conventional technology.

The detailed technology and embodiments of the present invention are described below with reference to the drawings, so that a person having ordinary skill in the art to which the present invention belongs can understand the technical features of the claimed invention.

The production volume estimation device and method provided by the present invention are explained below through embodiments. These embodiments do not restrict the present invention to being implemented only in the environments, applications, or manners they describe. The description of the embodiments therefore serves only to explain the present invention and not to limit its scope. It should be understood that, in the following embodiments and drawings, elements not directly related to the present invention are omitted and not shown, and that the size of each element and the size ratios between elements are illustrative only and do not limit the scope of the present invention.

The first embodiment of the present invention is a production volume estimation device 1, whose schematic architecture is depicted in Figure 1. The production volume estimation device 1 includes a storage 11 and a processor 13, and the processor 13 is electrically connected to the storage 11. The storage 11 may be a memory, a Universal Serial Bus (USB) disk, a hard disk, an optical disc, a flash drive, or any other storage medium or circuit with the same function known to a person having ordinary skill in the art to which the present invention belongs. The processor 13 may be any of various processing units, a central processing unit (CPU), a microprocessor, or another computing device known to a person having ordinary skill in the art to which the present invention belongs.

In some embodiments, the production volume estimation device 1 further includes a transceiver interface for receiving data. Specifically, the transceiver interface is an interface capable of receiving and transmitting data, or any other such interface known to a person having ordinary skill in the art to which the present invention belongs. The transceiver interface may receive data from sources such as external devices, external web pages, and external applications.

In this embodiment, as shown in Figure 1, the storage 11 is used to store target data TD and non-target data NTD corresponding to a prediction target, where the non-target data NTD includes a plurality of feature fields and a plurality of temporal feature fields.

For example, when the prediction target of the production volume estimation device 1 is a beverage A, the target data TD stored in the storage 11 is the sales volume of beverage A over a past period (for example, the daily sales data of the past week). The non-target data NTD consists of other parameters that may affect the sales of beverage A, such as weather conditions, temperature, apparent temperature, rainfall, probability of rain, relative humidity, and comfort level.

For ease of understanding, a concrete example is given in the schematic diagram of the non-target data NTD in Figure 2. As shown in Figure 2, the feature fields in the non-target data NTD include fields such as date, time, weather conditions, temperature, apparent temperature, probability of rain, relative humidity, and comfort level. The temporal feature fields in the non-target data NTD include a plurality of fields corresponding to a time series, for example the individual time intervals of Saturday, July 17, Sunday, July 18, and so on.
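As an illustrative sketch only, the non-target data NTD described above can be pictured as a table with one row per time interval and one column per feature field. The field names and numeric values below are hypothetical and are not taken from Figure 2; the two helper functions merely show the two views of the same table that the later feature convolution and temporal convolution operate on.

```python
# Hypothetical layout of the non-target data NTD: rows are time intervals,
# columns are feature fields. All names and values are made up for illustration.
FEATURE_FIELDS = ["temperature", "apparent_temperature", "rain_probability", "relative_humidity"]
TIME_POINTS = ["07-17 Sat", "07-18 Sun", "07-19 Mon"]

NTD = [
    [31.0, 35.0, 0.20, 0.70],  # 07-17 Sat
    [33.0, 38.0, 0.10, 0.65],  # 07-18 Sun
    [29.0, 32.0, 0.60, 0.85],  # 07-19 Mon
]


def feature_vectors(table):
    """Return one vector per feature field: that field's values over all time points."""
    return [list(column) for column in zip(*table)]


def temporal_vectors(table):
    """Return one vector per time point: all feature values at that time point."""
    return [list(row) for row in table]


if __name__ == "__main__":
    for name, vec in zip(FEATURE_FIELDS, feature_vectors(NTD)):
        print(name, vec)
    for label, vec in zip(TIME_POINTS, temporal_vectors(NTD)):
        print(label, vec)
```

The column view shows how one category evolves over time, and the row view shows how the categories relate to each other at a single time point.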

The operation of the first embodiment is first described briefly. The present invention mainly operates in two stages: an encoder operation stage and a decoder operation stage. In this embodiment, the encoder operation stage extracts features from the information in the non-target data NTD and inputs the non-target feature extraction result to the decoder. The decoder operation stage additionally incorporates the feature extraction information of the target data TD, and generates the production volume estimation result based on the non-target feature extraction result and the target feature extraction result.

It should be noted that, in this embodiment, the present invention uses an encoder (for example, a GRU encoder) and a decoder (for example, a GRU decoder) to find the fine-grained features of the temporal correlations and changes between the non-target data NTD and the target data TD, and uses these features to predict the production volume estimation result at future time points.

In some embodiments, the processor 13 may further merge the features extracted by the encoder and the decoder and use them as the input layer of a convolutional neural network (CNN). After the convolution layers and pooling layers automatically extract features, the result is passed to the fully connected layers (FC), and the final production volume estimation result is obtained at the output layer.

In this embodiment, the processor 13 first performs feature extraction on the non-target data NTD using a feature convolution and a temporal convolution. It should be understood that the feature fields refer to the different categories of information in the non-target data NTD, while the temporal feature fields refer to the individual time points in the non-target data NTD. The processor 13 extracts features from both the feature fields and the temporal feature fields of the non-target data NTD, so that the model can see how each category in the non-target data NTD (for example, weather conditions, temperature, apparent temperature, probability of rain, relative humidity, and comfort level) behaves over a period of time, as well as how the categories relate to one another at the same time point.

It should be noted that the conventional approach of convolving the non-target data directly considers all of the data at once, which may cause the model to overlook certain details. In contrast, the operation of the present invention can see finer-grained features and can attend to features that differ by category and by time.

For ease of understanding, please also refer to the operation flow diagram in Figure 4 for the following paragraphs. It should be understood that Figure 4 only illustrates some embodiments of the present invention and explains its technical features, and is not intended to limit the scope of protection of the present invention.

Specifically, in this embodiment, the processor 13 trains a gated recurrent unit (GRU) encoder EN, based on the feature fields and the temporal feature fields included in the non-target data NTD, to generate a non-target feature extraction result. The processor 13 then trains a gated recurrent unit decoder DE, based on the non-target feature extraction result and the target data TD, to generate the production volume estimation result, where the target data TD corresponds to the historical sales volume of the prediction target.

In some embodiments, the gated recurrent unit encoder EN includes a feature convolution FC and a temporal convolution TC.

Specifically, the processor 13 performs the feature convolution FC on the feature fields in the non-target data NTD to generate a first extracted feature value corresponding to each of the feature fields. The processor 13 performs the temporal convolution TC on the temporal feature fields in the non-target data NTD to generate a first extracted temporal feature value corresponding to each of the temporal feature fields.

For ease of understanding, please refer to Figure 3A. The processor 13 may split the non-target data NTD into one-dimensional vectors, one per feature field F (for example, feature F1, feature F2, and feature F3), and then multiply them by the weight values learned by the model (for example, convolution kernel K1) to obtain the features of each feature field (for example, channel CH1, channel CH2, and channel CH3). Finally, the extracted features are merged to obtain the non-target data features extracted with respect to the feature fields.

In addition, please refer to Figure 3B. The processor 13 may split the non-target data NTD into one-dimensional vectors, one per temporal feature field TF (for example, feature F1, feature F2, and feature F3), and then multiply them by the weight values learned by the model (for example, convolution kernel K1) to obtain the features of each temporal feature field (for example, channel CH1, channel CH2, and channel CH3). Finally, the extracted features are merged to obtain the non-target data features extracted with respect to the temporal feature fields.

It should be noted that, in this example, the convolution operation serves to highlight and strengthen the relationships between features. A 1x1 filter (that is, convolution kernel K1) is chosen so that each block is emphasized and focused on individually, without favoring any one of them. The filter is also referred to as the weights of the convolution layer.
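The following is a minimal sketch of the splitting and 1x1 convolution described for Figures 3A and 3B, under simplifying assumptions: the kernel is a single scalar weight with an optional bias, each resulting vector is treated as one channel, and "merging" is simply keeping the channels together in a list. The function names and the sample table are hypothetical; in practice the kernel weight would be learned during training rather than passed in.

```python
def conv1x1(vectors, kernel_weight, bias=0.0):
    """Apply a 1x1 kernel to every one-dimensional vector.

    A 1x1 kernel looks at one element at a time, so each element is
    scaled (and shifted) independently of its neighbours.
    """
    return [[kernel_weight * x + bias for x in vec] for vec in vectors]


def feature_convolution(table, kernel_weight):
    """Split the table by feature field (columns) and convolve each column."""
    columns = [list(col) for col in zip(*table)]
    return conv1x1(columns, kernel_weight)  # one channel per feature field


def temporal_convolution(table, kernel_weight):
    """Split the table by time point (rows) and convolve each row."""
    rows = [list(row) for row in table]
    return conv1x1(rows, kernel_weight)  # one channel per time point


if __name__ == "__main__":
    # Hypothetical table: 3 time points (rows) x 2 feature fields (columns).
    table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(feature_convolution(table, 0.5))
    print(temporal_convolution(table, 2.0))
```

Because the kernel covers a single element, no position is mixed with its neighbours at this stage; relating positions to each other is left to the later self-attention and GRU operations.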

In some embodiments, the processor 13 may use the query, key, and value generated by the convolution operation to compute the correlation between the features and denoise the extracted features, so that the model pays more attention to highly correlated features.

Specifically, the processor 13 performs a first self-attention operation SA on the first extracted feature values to generate a plurality of second extracted feature values. In addition, the processor 13 performs a second self-attention operation SA on the first extracted temporal feature values to generate a plurality of second extracted temporal feature values.

In some embodiments, as shown in Figure 4, the feature convolution operation FCO performed by the processor 13 includes the feature convolution FC and the self-attention operation SA, and the temporal convolution operation TCO includes the temporal convolution TC and the self-attention operation SA.

For example, the processor 13 may first multiply the query by the key, scale the product so that it does not become too large, and then use the softmax function to compute the correlation between the two, which yields values between 0 and 1. Finally, multiplying these values by the original value reduces and weakens the values of weakly correlated features, and increases and strengthens the values of strongly correlated features.
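The steps in the preceding paragraph correspond to what is commonly called scaled dot-product attention. The sketch below assumes that variant, with scaling by the square root of the key dimension; the source does not state the exact scaling factor, so that choice is an assumption. The queries, keys, and values are taken as plain lists here, whereas in the described embodiment they would come from the convolution operation.

```python
import math


def softmax(scores):
    """Numerically stable softmax; outputs are in (0, 1) and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Query-key products, scaled so they do not grow with the vector size.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the values: strongly related entries contribute more.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[10.0], [20.0]]
    print(self_attention(q, k, v))  # weighted toward the first value
```

In the usage example the query matches the first key more closely, so the output lies nearer to the first value than to the second.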

In some embodiments, to achieve a better extraction effect, the processor 13 deepens the extraction part by repeating it for a plurality of iterations (for example, a preset number of N iterations, where N is a positive integer). Specifically, the processor 13 performs the feature convolution FC and the first self-attention operation SA a plurality of times to generate the plurality of second extracted feature values. In addition, the processor 13 performs the temporal convolution TC and the second self-attention operation SA a plurality of times to generate the plurality of second extracted temporal feature values.

In some embodiments, as shown in Figure 4, a residual connection may be added to avoid the gradient explosion problem caused by an excessive number of model layers. The features extracted by the multiple layers are added to the original non-target data to reinforce the original features.

In some embodiments, to prevent the summed values from becoming too large, batch normalization is added. After the values are normalized, they are fed into the gated recurrent unit GRU so that the model can learn how the features change over continuous time. Specifically, the processor 13 performs a first batch normalization operation BN and a first gated recurrent unit GRU operation on the second extracted feature values to generate a plurality of third extracted feature values. In addition, the processor 13 performs a second batch normalization operation BN and a second gated recurrent unit GRU operation on the second extracted temporal feature values to generate a plurality of third extracted temporal feature values.
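As a rough sketch of the residual addition and batch normalization steps, the code below adds extracted features back onto the original data and then normalizes each column across the batch to zero mean and unit variance. The learnable scale and shift parameters of a full batch normalization layer are deliberately omitted, and the function names and the `eps` value are assumptions made for this illustration.

```python
import math


def residual_add(extracted, original):
    """Element-wise sum of the extracted features and the original data."""
    return [
        [e + o for e, o in zip(e_row, o_row)]
        for e_row, o_row in zip(extracted, original)
    ]


def batch_normalize(batch, eps=1e-5):
    """Normalize each column across the batch to zero mean and unit variance."""
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    variances = [
        sum((x - m) ** 2 for x in col) / n
        for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / math.sqrt(v + eps) for x, m, v in zip(row, means, variances)]
        for row in batch
    ]


if __name__ == "__main__":
    original = [[1.0, 10.0], [3.0, 30.0]]
    extracted = [[0.5, 5.0], [0.5, 5.0]]
    summed = residual_add(extracted, original)
    print(batch_normalize(summed))
```

After normalization the two columns are on a comparable scale even though the raw values differ by an order of magnitude, which is the stated purpose of this step before the data enters the GRU.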

In some embodiments, the processor 13 generates the non-target feature extraction result based on the third extracted feature values and the third extracted temporal feature values.

The processor 13 then adds the two branches together and passes the result to the decoder DE for subsequent training. In some embodiments, to extract features more precisely, the decoder DE may further process the non-target data NTD by feeding it into the same model architecture as the encoder EN. As shown in Figure 4, the processor 13 may merge the two and then extract features from the merged data again in an extraction operation EX.

Specifically, the non-target data NTD is fed into the extraction operation EX, which has the same model architecture as the encoder EN. Along the way the data passes through pooling, where max pooling takes the maximum value (that is, finds the most relevant part) so that important information is not lost. Fully connected layers then flatten the data and convert all feature matrices into vectors, which may cover one or more days of data. The resulting data is then merged with the corresponding target data TD, and finally enters a gated recurrent unit GRU, which again observes the short-term and long-term feature correlations. It should be understood that the output of the decoder DE may be one or more pieces of data.
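A simplified sketch of the pooling, flattening, and merging steps is shown below. It assumes non-overlapping one-dimensional max pooling windows and treats "merging with the target data" as appending the corresponding target value to the flattened feature vector; both are assumptions, since the source does not fix the window size or the merge operation. The weights of the fully connected layers are left out, so only the reshaping aspect of that step is illustrated.

```python
def max_pool_1d(vec, window):
    """Non-overlapping max pooling: keep the largest value in each window."""
    return [max(vec[i:i + window]) for i in range(0, len(vec) - window + 1, window)]


def flatten(matrix):
    """Convert a feature matrix (list of rows) into a single vector."""
    return [x for row in matrix for x in row]


def merge_with_target(feature_vector, target_value):
    """Append the corresponding target value (e.g. a day's sales) to the features."""
    return feature_vector + [target_value]


if __name__ == "__main__":
    # Hypothetical extracted channels for one day.
    channels = [[1.0, 3.0, 2.0, 5.0], [4.0, 0.0, 6.0, 1.0]]
    pooled = [max_pool_1d(ch, 2) for ch in channels]
    vector = flatten(pooled)
    print(merge_with_target(vector, 120.0))
```

The merged vector is what would subsequently be passed to the GRU in the decoder.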

In some embodiments, for the target data TD, the processor 13 uses gated recurrent units GRU to train on the target data TD of the first n time points (that is, time points T1, T2, ..., Tn). Then, at time point Tn+1, the features extracted from the non-target data NTD are added and merged according to the time-series relationship of the data, so that the model can learn the features shared between the non-target data NTD and the target data TD and thereby improve prediction accuracy.

Specifically, in some embodiments, the gated recurrent unit decoder includes a plurality of third gated recurrent units GRU, and the processor 13 inputs the historical sales volumes corresponding to a plurality of time points into the third gated recurrent units GRU, respectively, to generate a target feature extraction result.

Specifically, in some embodiments, the gated recurrent unit decoder further includes a fourth gated recurrent unit GRU, and the processor 13 inputs the non-target feature extraction result and the target feature extraction result into the fourth gated recurrent unit GRU to generate the production volume estimation result.
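To make the decoder flow more concrete, the sketch below uses a scalar GRU step with the common update-gate and reset-gate formulation, where the new state is `(1 - z) * h + z * candidate` (some libraries swap the roles of `z` and `1 - z`). It is one possible reading of the description, not the patented implementation: the "third" GRUs are modelled as a single weight-sharing GRU unrolled over time points T1 to Tn, and the "fourth" GRU takes the non-target feature as its input and the target state as its hidden state. All parameter names and values are hypothetical, and real inputs would normally be normalized and vector-valued.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU step; p holds hypothetical learned weights and biases."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    candidate = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    return (1.0 - z) * h + z * candidate


def decode(sales_history, non_target_feature, p_history, p_final):
    """Run the history through a GRU, then combine with the non-target feature."""
    h = 0.0
    for sales in sales_history:  # time points T1 .. Tn
        h = gru_step(sales, h, p_history)
    # h now plays the role of the target feature extraction result.
    # At T(n+1) the non-target feature is fed in alongside that state.
    return gru_step(non_target_feature, h, p_final)


if __name__ == "__main__":
    keys = ["wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh"]
    p_history = dict.fromkeys(keys, 0.1)  # placeholder values, not trained weights
    p_final = dict.fromkeys(keys, 0.2)
    print(decode([0.3, 0.5, 0.4], 0.7, p_history, p_final))
```

The returned state stays in the open interval (-1, 1), so in a full model it would still need to be mapped back to a sales quantity, for example by an output layer.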

It should be noted that, in the present disclosure, the non-target data NTD enters both the encoder EN and the decoder DE for feature extraction, covering the correlations among the various features as well as the correlations across time. For example, the season may be correlated with the off-peak and peak periods, the demand over the next three months may be correlated with the ordering cycle three months later, and the current inventory level may be correlated with the purchase cost per item. In addition, the target data TD passes through multiple layers of gated recurrent units GRU to capture feature changes over both short and long time spans. For example, the product sales volume over a past period may be related to the sales volume of the current day, and last summer may be related to this summer.

In some embodiments, the output data of the decoder DE may be further input into an additional convolutional neural network (CNN) model architecture. The convolution operation strengthens the feature relationships, the pooling layer uses global average pooling to make the result more even, and the fully connected layer flattens the data and converts all feature matrices into vectors (for example, data for one or more days). Finally, the data is fed into the neural network of the fully connected layers for classification, and the product sales volume for one or more future days is predicted. It should be understood that what a convolutional neural network learns is the weights of the convolution layers (that is, the filters), which mainly serve to highlight features. The purpose of pooling is to compress the data and extract the strongest features. The fully connected layers are the hidden layers and output layer of a multilayer perceptron and can perform classification.
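The final stage described above can be sketched, under simplifying assumptions, as global average pooling over each channel followed by a single fully connected layer. The weights, biases, and channel values below are hypothetical placeholders; the sketch outputs one number per output unit, which could stand for the estimated sales volume of one future day.

```python
def global_average_pool(channels):
    """Reduce each channel to the mean of its values."""
    return [sum(channel) / len(channel) for channel in channels]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


if __name__ == "__main__":
    channels = [[1.0, 3.0], [2.0, 6.0]]  # hypothetical convolution outputs
    pooled = global_average_pool(channels)
    # One output unit per forecast day; here a single day is estimated.
    print(dense(pooled, weights=[[0.5, 0.25]], biases=[1.0]))
```

Adding more rows to `weights` and entries to `biases` would yield estimates for several future days at once.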

As the above description shows, the production volume estimation device 1 provided by the present invention generates a non-target feature extraction result from the feature fields and the temporal feature fields included in the non-target data corresponding to the prediction target, and generates a target feature extraction result from the target data corresponding to the prediction target. The production volume estimation result is generated by considering the non-target feature extraction result and the target feature extraction result together. Because the technology considers the temporal correlations and changes of the features of both the target data and the non-target data, it can improve the accuracy of production volume estimation and address the inaccuracy of the conventional technology.

The second embodiment of the present invention is a production volume estimation method, whose flowchart is depicted in Figure 5. The production volume estimation method 500 is applicable to an electronic device, for example the production volume estimation device 1 described in the first embodiment. The production volume estimation method 500 generates the production volume estimation result through steps S501 to S505.

In step S501, the electronic device trains a gated recurrent unit encoder, based on a plurality of feature fields included in the non-target data corresponding to the prediction target and a feature convolution, to generate an extracted feature value in the non-target feature extraction result. Next, in step S503, the electronic device trains the gated recurrent unit encoder, based on a plurality of temporal feature fields included in the non-target data and a temporal convolution, to generate an extracted temporal feature value in the non-target feature extraction result.

Finally, in step S505, the electronic device trains a gated recurrent unit decoder, based on the non-target feature extraction result and the target data, to generate the production volume estimation result, where the target data corresponds to the historical sales volume of the prediction target.

In some embodiments, the gated recurrent unit encoder includes the feature convolution and the temporal convolution.

In some embodiments, the production volume estimation method 500 further includes the following steps: performing the feature convolution on the feature fields in the non-target data to generate a first extracted feature value corresponding to each of the feature fields; and performing the temporal convolution on the temporal feature fields in the non-target data to generate a first extracted temporal feature value corresponding to each of the temporal feature fields.

In some embodiments, the production volume estimation method 500 further includes the following steps: performing a first self-attention operation on the first extracted feature values to generate a plurality of second extracted feature values; and performing a second self-attention operation on the first extracted temporal feature values to generate a plurality of second extracted temporal feature values.

In some embodiments, the production volume estimation method 500 further includes the following steps: performing the feature convolution and the first self-attention operation a plurality of times to generate the plurality of second extracted feature values; and performing the temporal convolution and the second self-attention operation a plurality of times to generate the plurality of second extracted temporal feature values.

In some embodiments, the production volume estimation method 500 further includes the following steps: performing a first batch normalization operation and a first gated recurrent unit operation on the second extracted feature values to generate a plurality of third extracted feature values; and performing a second batch normalization operation and a second gated recurrent unit operation on the second extracted temporal feature values to generate a plurality of third extracted temporal feature values.
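The batch normalization followed by a gated recurrent unit (GRU) operation can be sketched as below. This is a toy under stated assumptions: the hidden state is a single scalar for readability, the weights are fixed hypothetical values rather than trained ones, batch normalization omits the learned scale and shift, and the sequence itself is treated as the batch. The update rule follows one common GRU convention, h' = (1 - z) * h + z * candidate.

```python
import math


def batch_normalize(values, eps=1e-5):
    """Normalize to zero mean and (approximately) unit variance; no learned scale/shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One GRU update with scalar input `x` and scalar hidden state `h`."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])          # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])          # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * cand


def gru_run(sequence, p, h0=0.0):
    """Feed the sequence through the GRU and collect the hidden state at each step."""
    h = h0
    states = []
    for x in sequence:
        h = gru_step(x, h, p)
        states.append(h)
    return states


# Hypothetical, untrained parameters and hypothetical second extracted values.
params = {"wz": 0.5, "uz": 0.5, "bz": 0.0,
          "wr": 0.5, "ur": 0.5, "br": 0.0,
          "wh": 0.5, "uh": 0.5, "bh": 0.0}
second_values = [2.0, 4.0, 6.0, 8.0]
third_values = gru_run(batch_normalize(second_values), params)
```

Normalizing before the recurrent step keeps the inputs in a range where the sigmoid and tanh gates are not saturated, which is one usual motivation for placing batch normalization ahead of a GRU.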

In some embodiments, the production volume estimation method 500 further includes the following step: generating the non-target feature extraction result based on the third extracted feature values and the third extracted temporal feature values.

In some embodiments, the gated recurrent unit decoder includes a plurality of third gated recurrent units, and the production volume estimation method 500 further includes the following step: inputting the historical sales volumes corresponding to a plurality of time points into the third gated recurrent units, respectively, to generate a target feature extraction result.

In some embodiments, the gated recurrent unit decoder further includes a fourth gated recurrent unit, and the production volume estimation method 500 further includes the following step: inputting the non-target feature extraction result and the target feature extraction result into the fourth gated recurrent unit to generate the production volume estimation result.
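The decoder described in the two preceding paragraphs can be sketched as a forward pass only; training is not shown. Several choices below are assumptions rather than details from the text: the third gated recurrent units are modeled as one GRU cell with shared weights unrolled over the time points, the target feature extraction result is taken to be the final hidden state, the hidden state is a scalar, the sales history is pre-scaled, and a linear readout (`w_out`, `b_out`) maps the bounded GRU output to a production volume. All names and values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_params(n_inputs, value=0.5):
    """Hypothetical, untrained weights: every weight is `value`, every bias is 0."""
    p = {}
    for gate in ("z", "r", "h"):
        p["w" + gate] = [value] * n_inputs  # input weights
        p["u" + gate] = value               # recurrent weight
        p["b" + gate] = 0.0                 # bias
    return p


def gru_step(xs, h, p):
    """One GRU update with a vector input `xs` and a scalar hidden state `h`."""
    def pre(gate, h_in):
        weighted = sum(w * x for w, x in zip(p["w" + gate], xs))
        return weighted + p["u" + gate] * h_in + p["b" + gate]

    z = sigmoid(pre("z", h))           # update gate
    r = sigmoid(pre("r", h))           # reset gate
    cand = math.tanh(pre("h", r * h))  # candidate state
    return (1.0 - z) * h + z * cand


def decode(non_target_result, sales_history, p_third, p_fourth, w_out=1.0, b_out=0.0):
    # Third GRUs: consume the historical sales volume at T1, T2, ..., Tn in order.
    h = 0.0
    for sales in sales_history:
        h = gru_step([sales], h, p_third)
    target_result = h  # assumed: final hidden state is the target feature extraction result

    # Fourth GRU: combine the non-target and target feature extraction results.
    h4 = gru_step([non_target_result, target_result], 0.0, p_fourth)

    # Assumed linear readout from the bounded hidden state to a volume estimate.
    return w_out * h4 + b_out


sales_history = [0.2, 0.4, 0.6]  # hypothetical, already scaled to [0, 1]
estimate = decode(0.3, sales_history, make_params(1), make_params(2), w_out=100.0)
```

In a trained model the parameters of both stages and the readout would be fitted jointly against observed production or sales figures; the fixed values here only make the data flow visible.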

In addition to the above steps, the second embodiment can also perform all the operations and steps of the production volume estimation device 1 described in the first embodiment, has the same functions, and achieves the same technical effects. A person having ordinary skill in the art to which the present invention pertains can directly understand how the second embodiment performs these operations and steps based on the first embodiment, so the details are not repeated here.

It should be noted that, in the specification and claims of the present invention, certain terms (including extracted feature value, extracted temporal feature value, self-attention operation, batch normalization operation, gated recurrent unit operation, and so on) are preceded by "first", "second", "third", or "fourth". These ordinals are used only to distinguish different instances of the terms. For example, "first" and "second" in the first self-attention operation and the second self-attention operation only indicate the self-attention operations used in different operations.

In summary, the production volume estimation technology (including at least the device and the method) provided by the present invention generates a non-target feature extraction result separately from the feature fields and the temporal feature fields included in the non-target data corresponding to the prediction target, and generates a target feature extraction result from the target data corresponding to the prediction target. The production volume estimation result is generated by considering the non-target feature extraction result and the target feature extraction result together. Because the technology considers the temporal correlation and variation of the features of both the target data and the non-target data, it can improve the accuracy of production volume estimation and address the inaccuracy of the conventional technology.

The above embodiments are only intended to illustrate some implementations of the present invention and to explain its technical features, not to limit the scope of protection of the present invention. Any change or equivalent arrangement that a person having ordinary skill in the art can readily accomplish falls within the scope claimed by the present invention, and the scope of protection of the present invention is defined by the claims.

1: production volume estimation device
11: storage
13: processor
TD: target data
NTD: non-target data
F: feature field
TF: temporal feature field
F1, F2, F3: features
K1: convolution kernel
CH1, CH2, CH3: channels
EN: encoder
DN: decoder
FCO: feature convolution operation
FC: feature convolution
TCO: temporal convolution operation
TC: temporal convolution
SA: self-attention operation
BN: batch normalization operation
GRU: gated recurrent unit
EX: feature extraction operation
T1, T2, ..., Tn: time points
500: production volume estimation method
S501, S503, S505: steps

FIG. 1 is a schematic diagram of the architecture of the production volume estimation device of the first embodiment; FIG. 2 is a schematic diagram of the non-target data of the first embodiment; FIG. 3A is a schematic diagram of the feature convolution of the first embodiment; FIG. 3B is a schematic diagram of the temporal convolution of the first embodiment; FIG. 4 is a schematic diagram of the operational flow of some embodiments; and FIG. 5 is a partial flowchart of the production volume estimation method of the second embodiment.

Domestic deposit information (please note in order of depository institution, date, and number): None. Foreign deposit information (please note in order of depository country, institution, date, and number): None.


Claims

1. A production volume estimation device, comprising: a storage configured to store target data and non-target data corresponding to a prediction target, wherein the non-target data includes a plurality of feature fields and a plurality of temporal feature fields; and a processor electrically connected to the storage and configured to perform the following operations: training a gated recurrent unit encoder, based on the feature fields and the temporal feature fields included in the non-target data, to generate a non-target feature extraction result; and training a gated recurrent unit decoder, based on the non-target feature extraction result and the target data, to generate a production volume estimation result, wherein the target data corresponds to a historical sales volume of the prediction target.

2. The production volume estimation device of claim 1, wherein the gated recurrent unit encoder includes a feature convolution and a temporal convolution.

3. The production volume estimation device of claim 2, wherein the processor further performs the following operations: performing the feature convolution on the feature fields in the non-target data to generate a first extracted feature value corresponding to each of the feature fields; and performing the temporal convolution on the temporal feature fields in the non-target data to generate a first extracted temporal feature value corresponding to each of the temporal feature fields.

4. The production volume estimation device of claim 3, wherein the processor further performs the following operations: performing a first self-attention operation on the first extracted feature values to generate a plurality of second extracted feature values; and performing a second self-attention operation on the first extracted temporal feature values to generate a plurality of second extracted temporal feature values.

5. The production volume estimation device of claim 3, wherein the processor further performs the following operations: performing the feature convolution a plurality of times and a first self-attention operation a plurality of times to generate a plurality of second extracted feature values; and performing the temporal convolution a plurality of times and a second self-attention operation a plurality of times to generate a plurality of second extracted temporal feature values.

6. The production volume estimation device of claim 4, wherein the processor further performs the following operations: performing a first batch normalization operation and a first gated recurrent unit operation on the second extracted feature values to generate a plurality of third extracted feature values; and performing a second batch normalization operation and a second gated recurrent unit operation on the second extracted temporal feature values to generate a plurality of third extracted temporal feature values.

7. The production volume estimation device of claim 6, wherein the processor further performs the following operation: generating the non-target feature extraction result based on the third extracted feature values and the third extracted temporal feature values.

8. The production volume estimation device of claim 1, wherein the gated recurrent unit decoder includes a plurality of third gated recurrent units, and the processor further performs the following operation: inputting the historical sales volumes corresponding to a plurality of time points into the third gated recurrent units, respectively, to generate a target feature extraction result.

9. The production volume estimation device of claim 8, wherein the gated recurrent unit decoder further includes a fourth gated recurrent unit, and the processor further performs the following operation: inputting the non-target feature extraction result and the target feature extraction result into the fourth gated recurrent unit to generate the production volume estimation result.

10. A production volume estimation method for use in an electronic device, the production volume estimation method being executed by the electronic device and comprising the following steps: training a gated recurrent unit encoder, based on a feature convolution and a plurality of feature fields included in non-target data corresponding to a prediction target, to generate an extracted feature value in a non-target feature extraction result; training the gated recurrent unit encoder, based on a temporal convolution and a plurality of temporal feature fields included in the non-target data, to generate an extracted temporal feature value in the non-target feature extraction result; and training a gated recurrent unit decoder, based on the non-target feature extraction result and target data, to generate a production volume estimation result, wherein the target data corresponds to a historical sales volume of the prediction target.