TWI557576B

TWI557576B - Method and System for Predicting Calculation of Timing Data

Info

Publication number: TWI557576B
Application number: TW103128062A
Authority: TW
Inventors: Yung Sen Hsin; Chuang Fa Yang; Kun Hua Tsai
Original assignee: Chunghwa Telecom Co Ltd
Priority date: 2014-08-15
Filing date: 2014-08-15
Publication date: 2016-11-11
Also published as: TW201606524A

Description

Method and system for predicting calculation of time series data

This invention primarily protects a cloud resource analysis service. Besides being used in cloud environments, the service is also planned for hicloud cloud computing, virtual private clouds, or standalone sale. It makes it convenient and effective to evaluate whether existing cloud resources are being used efficiently, and it analyzes cloud resource information to provide functions such as monitoring and predictive alerting.

In simply designed third-party software, whenever a calculation is needed the system fetches the data on demand, for example through REST, and handles the whole computing load in one pass. As computing demand grows, the number of REST calls grows with it, raising the load on both the parent software and the system itself. Current third-party software therefore stores historical data locally to support calculation. However, as more and more raw data is collected, search performance degrades, and when a single calculation takes several hours the availability of the system becomes very low. Data should be composable, that is, stored in tiers: rarely used raw data should be moved out and compressed into archive files, then brought back in when needed, and a relay data layer should be built on top of the raw data layer. Relay data (pre-aggregated intermediate data) extracts summarized information; its data granularity is coarser, but there is less of it than raw data. In addition, under a simple architecture most calculations depend on database performance, so as the data grows its availability becomes very low, and when many different users query concurrently the computing load increases further. Users' calculation requirements should be predictable; if the system only reacts passively to user queries, it cannot handle a large number of simultaneous calculations effectively.

In view of the shortcomings of the conventional approaches described above, the inventors sought to improve on them and, after years of dedicated research, developed the present method and system for predictive calculation of time series data.

The invention provides a method and system for predictive calculation of time series data. It manages the collected time series data and members in multi-level groups, and during group calculations it can quickly retrieve and reuse relay data, saving system resources and time.

The invention proposes a system for predictive calculation of time series data, comprising a user, a multi-level group and time series data management system, and a time series data source. The multi-level group and time series data management system comprises a data source management module, a time series data collection and analysis module, a data database, a multi-level group management module, a group calculation demand input module, a directory management module, a relay data calculation module, a calculation result output module, and a self-learning module; the self-learning module comprises a common directory loading unit and a pre-calculation unit. The user operates the multi-level group and time series data management system and defines the time series data source through the data source management module. The time series data collection and analysis module collects the raw data sent by the time series data source, stores it in the data database, analyzes its composition, and creates selectable members. The user selects members and manages groups through the multi-level group management module, and starts a calculation through the group calculation demand input module. According to the user's calculation requirement, the directory management module creates a directory or retrieves an existing one. Following the directory definition, the relay data calculation module performs the preliminary calculation for the multi-level groups and their members; the directory definition can be adjusted through its time precision. When the preliminary calculation is finished, the calculation result output module computes the final result and outputs it to the user. The whole process is monitored by the self-learning module: the common directory loading unit analyzes usage behavior and preloads frequently used directories into a cache list so that directories are retrieved more efficiently, and the pre-calculation unit predicts users' calculation requirements and computes relay data in advance for reuse.

The invention proposes a method for predictive calculation of time series data, comprising the following steps: Step 1, the user issues a calculation request for a group; Step 2, the group calculation demand input module analyzes the calculation request and builds a directory; Step 3, the directory management module searches the common directories; Step 4, the common directory loading unit applies a fuzzy concept to compute, for each of the other common directories, a difference coefficient (1 minus the similarity coefficient) and deducts it from that directory's usage rate; Step 5, the directory is searched for in the data database; Step 6, the directory is created; Step 7, the directory management module adds 1 to the usage rate of the retrieved directory; Step 8, the relay data calculation module schedules the calculation work according to the group members; Step 9, the data database is searched for existing relay data; Step 10, the relay data calculation module performs the calculation and mounts the computed data under the directory; Step 11, once all relay data has been computed, the calculation result output module performs the final calculation and outputs the result.

Compared with conventional techniques, the method and system for predictive calculation of time series data of the invention have the following advantages:

1. The invention provides a mechanism for reusing computed data across multi-level groups. When the system retrieves the time series data of group members, it first performs a preliminary calculation to produce relay data and stores that relay data in the database for reuse. Reuse can occur both in repeated user queries and in multi-level group calculations, which saves system computing resources and user waiting time.

2. The invention provides a mechanism for fast retrieval and calculation of multi-level group data. Relay data is built from a preliminary calculation on members' raw data, and a directory is created for that relay data according to the rules of the user's query. The directory lets the system locate relay data quickly and reuse it.

3. The invention provides a mechanism for loading part of the relay data directories into a cache. As the volume of relay data grows, so does the number of directories. The system therefore learns from its usage history, loads the most frequently used directories, and uses a fuzzy concept to adjust the list of most frequently used directories by means of difference coefficients.

4. With relay data in place, raw data can be moved out and backed up, reducing the data load on the database. Besides enabling fast queries, the directory design provides a relay data deletion mechanism: when a directory is deleted, its associated relay data is deleted with it.

5. The invention provides a mechanism for predicting user requirements and calculating in advance. Based on usage history, the system predicts from the rise and fall of each directory's usage and computes relay data ahead of time so that it can be reused quickly in the next calculation. Pre-calculation also relieves the performance problems caused by a large number of simultaneous calculations.

100‧‧‧User

200‧‧‧Multi-level group and time series data management system

201‧‧‧Data source management module

202‧‧‧Time series data collection and analysis module

203‧‧‧Data database

204‧‧‧Multi-level group management module

205‧‧‧Group calculation demand input module

206‧‧‧Directory management module

207‧‧‧Relay data calculation module

208‧‧‧Calculation result output module

209‧‧‧Self-learning module

2091‧‧‧Common directory loading unit

2092‧‧‧Pre-calculation unit

300‧‧‧Time series data source

S201~S211‧‧‧Method steps

S301~S304‧‧‧Pre-calculation requirement steps

The technical content, purpose, and effects of the invention can be further understood from the detailed description and the accompanying drawings, in which: FIG. 1 is a system architecture diagram of the method and system for predictive calculation of time series data of the invention; FIG. 2 is a step diagram of the method and system; FIG. 3 is a step diagram of the pre-calculation requirement of the method and system.

To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.

The invention is further described below with reference to the drawings:

Referring to FIG. 1, the invention proposes a system for predictive calculation of time series data, comprising a user 100, a multi-level group and time series data management system 200, and a time series data source 300.

The multi-level group and time series data management system 200 comprises a data source management module 201, a time series data collection and analysis module 202, a data database 203, a multi-level group management module 204, a group calculation demand input module 205, a directory management module 206, a relay data calculation module 207, a calculation result output module 208, and a self-learning module 209; the self-learning module 209 comprises a common directory loading unit 2091 and a pre-calculation unit 2092.

The user 100 operates the multi-level group and time series data management system 200 and defines the time series data source 300 through the data source management module 201. The time series data collection and analysis module 202 collects the raw data sent by the time series data source 300, stores it in the data database 203, analyzes its composition, and creates selectable members. The user 100 selects members and manages groups through the multi-level group management module 204, and starts a calculation through the group calculation demand input module 205. According to the calculation requirement of the user 100, the directory management module 206 creates a directory or retrieves an existing one. Following the directory definition, the relay data calculation module 207 performs the preliminary calculation for the multi-level groups and their members; the directory definition can be adjusted through its time precision. When the preliminary calculation is finished, the calculation result output module 208 computes the final result and outputs it to the user 100. The whole process is monitored by the self-learning module 209: the common directory loading unit 2091 analyzes usage behavior and preloads frequently used directories into a cache list so that directories are retrieved more efficiently, and the pre-calculation unit 2092 predicts users' calculation requirements and computes relay data in advance for reuse.

The method for predictive calculation of time series data proposed by the invention is shown in FIG. 2 and FIG. 3. Its steps are as follows:

Step 1, S201: the user issues a calculation request for a group. That is, the user issues the request for one group through the multi-level group management module.

Step 2, S202: the group calculation demand input module analyzes the calculation request and builds a directory. The request should include the group, the time interval of the calculation, the data granularity, the result to be calculated (a single value or a series), and the time definition (which times are excluded and which are allowed). A directory is built from these requirements. The result to be calculated is one of the options the system offers, such as a maximum, a minimum, an average, or a time series of minimum values. Each directory is generated from one group member and one calculation requirement, and a directory built this way can be reused by different groups.

Step 3, S203: the directory management module searches the common directories. The directory built in Step 2 is looked up among the common directories by comparing the contents of every directory definition. The outcome is either that the directory is found or that no directory is found.

Step 4, S204: the common directory loading unit applies a fuzzy concept to compute, for each of the other common directories, a difference coefficient (1 minus the similarity coefficient) and deducts it from that directory's usage rate. That is, the directory found in Step 3 is compared with each common directory, and the resulting difference coefficient is deducted from the usage rate of the common directory being compared. The similarity coefficient is obtained by using the Dice coefficient to compute the similarity of each definition in the directory and averaging these values to give the similarity between the common directory and the directory it is compared with. Difference coefficients are computed only against the common directories (the cache), not by fetching every directory from the data database.

Step 5, S205: the directory is searched for in the data database. When Step 3 finds no directory, this step searches the data database, comparing directories in the same way as Step 3. The outcome is either that the directory is found or that no directory is found.

Step 6, S206: the directory is created. When Step 5 finds no directory, this step creates the directory in the data database with an initial usage rate of zero and outputs it.

Step 7, S207: the directory management module adds 1 to the usage rate of the retrieved directory. That is, the usage rate of the directory obtained through Step 4 or Step 6 is increased by 1, the data database is updated, and the directory is output.

Step 8, S208: the relay data calculation module schedules the calculation work according to the group members. Using the directory passed in from Step 7 and the group and members the user wants to calculate, the calculation period is cut into small segments according to the data granularity defined in the directory, and the relay data calculation is scheduled for each segment. This step can also accept pre-calculation requests passed in by another component. It is only responsible for scheduling the relay data calculations; when different users request calculations on the same directory at the same time, the work is queued. The calculation period must be longer than the directory's data granularity so that it can be cut into segments.

Step 9, S209: the data database is searched for existing relay data. For each piece of calculation work passed in from Step 8, the data database 203 is searched to check whether it has already been calculated. The outcome is either "not calculated" or "already calculated".

Step 10, S210: the relay data calculation module performs the calculation and mounts the computed data under the directory. For the uncalculated work passed in from step S209, the relay data is calculated, and the result is mounted under the relevant directory and stored in the data database 203. When raw data is converted into relay data, the directory's definitions and exclusion lists can be enabled or disabled by adjusting the time precision.

Step 11, S211: once all relay data has been computed, the calculation result output module performs the final calculation and outputs the result. That is, the relay data needed for this calculation is retrieved from the data database, with no values missing, all of it is consolidated, the result is computed according to the user's calculation requirement, and the result is output.
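Steps 8 through 11 amount to slicing the requested period by the directory's granularity, computing only the slices that have no relay data yet, and combining the stored slices at the end. The following is a minimal sketch of that idea, not the patent's implementation. The names `slice_period`, `compute_with_reuse`, and `relay_store`, the use of an in-memory dict in place of the data database 203, and the choice of a maximum as the requested result are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def slice_period(start, end, granularity):
    """Cut [start, end) into consecutive slices of length `granularity` (S208)."""
    # The text requires the calculation period to exceed the data granularity.
    if end - start <= granularity:
        raise ValueError("calculation period must be longer than the granularity")
    slices = []
    cursor = start
    while cursor < end:
        slices.append((cursor, min(cursor + granularity, end)))
        cursor += granularity
    return slices


def compute_with_reuse(raw, start, end, granularity, relay_store):
    """Return (overall maximum, number of slices newly computed).

    raw          -- list of (timestamp, value) samples for one member
    relay_store  -- hypothetical dict standing in for the relay data mounted
                    under one directory: (slice_start, slice_end) -> slice maximum
    """
    slices = slice_period(start, end, granularity)
    newly_computed = 0
    for lo, hi in slices:
        if (lo, hi) in relay_store:          # S209: already calculated, reuse it
            continue
        values = [v for t, v in raw if lo <= t < hi]
        relay_store[(lo, hi)] = max(values) if values else None   # S210
        newly_computed += 1
    partials = [relay_store[s] for s in slices]
    if any(p is None for p in partials):     # S211: no missing values allowed
        raise ValueError("relay data is missing for at least one slice")
    return max(partials), newly_computed
```

Calling `compute_with_reuse` twice with the same period and the same `relay_store` computes every slice the first time and none the second time, which is the reuse effect the steps describe.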

The steps above cover the whole path from the user entering a calculation requirement to the completed calculation, including reuse, fast calculation, and pre-calculation. First, in step S201 the user issues a calculation request for a group (for example, the average value of the group over a period of time). Step S202 analyzes the request and builds a directory based on the group members and the time requirements entered by the user. Step S203 searches for this directory. If an identical directory is found, the process moves to step S204, which follows a fuzzy-based concept: it computes a difference coefficient for every common directory and deducts that coefficient from the directory's usage rate. For example, if the common directories are [a, b, c, d] and the user's current request is b, the difference coefficients of a versus b, c versus b, and d versus b are computed, and each value is then deducted from the usage rate of a, c, and d respectively. The difference coefficient is computed as (1 - similarity coefficient). If step S203 does not find the directory, step S205 searches the database; if the directory is found there, the process moves to step S204 as described above, and if it is still not found, step S206 creates the directory. Once the required directory has been obtained, step S207 adds 1 to its usage rate, and step S208 begins scheduling the relay data calculations for the group members. Step S209 checks whether the required calculation already exists in the database; if it does not, step S210 calculates the relay data. When all relay data has been computed, step S211 performs the final calculation and outputs the result.
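The [a, b, c, d] example can be traced in code. This is a sketch under stated assumptions: each directory is modeled as a dict that maps a definition name to a set of values, the Dice coefficient is the standard set form 2|X∩Y| / (|X| + |Y|), and the function names `dice`, `similarity`, and `apply_request` are invented here. The patent text does not specify how a directory definition is turned into a set.

```python
def dice(x, y):
    """Standard Dice coefficient of two sets; two empty sets count as identical."""
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def similarity(dir_x, dir_y):
    """Average the per-definition Dice coefficients of two directories.

    Assumes each directory has at least one definition.
    """
    keys = dir_x.keys() | dir_y.keys()
    total = sum(dice(dir_x.get(k, set()), dir_y.get(k, set())) for k in keys)
    return total / len(keys)


def apply_request(cache, usage, requested):
    """S204 then S207: deduct difference coefficients, then count the hit."""
    target = cache[requested]
    for name, definition in cache.items():
        if name != requested:
            # difference coefficient = 1 - similarity coefficient
            usage[name] -= 1.0 - similarity(definition, target)
    usage[requested] += 1
```

With this scheme a common directory that closely resembles the requested one loses almost nothing, while an unrelated one loses a full point, so directories far from current demand drift out of the common list.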

In step S204, after waiting for a defined period, the data database is searched and the directories with the highest usage rates are retrieved to update the common directory table. This step is a recurring loop that keeps refreshing the common directory table in the cache according to usage rate. The concept resembles the Least Recently Used (LRU) page replacement algorithm, but in addition to the conventional increase in usage rate, the implementation also deducts difference coefficients from usage rates so that the common directory table more closely matches the needs of most users.
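The refresh of the common directory table can be sketched as a top-N selection by usage rate. The function name `refresh_common_directories`, the dict `db_usage` standing in for the data database 203, and the fixed `cache_size` are assumptions; the text does not state how large the table is or how the wait period is scheduled.

```python
import heapq


def refresh_common_directories(db_usage, cache_size):
    """Return the `cache_size` directory names with the highest usage rates.

    db_usage -- hypothetical mapping: directory name -> usage rate
    The result is ordered from highest to lowest usage.
    """
    return heapq.nlargest(cache_size, db_usage, key=db_usage.get)
```

A scheduler would call this function once per wait period and replace the cached table with the result.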

The pre-calculation requirement mentioned in step S208 is shown in FIG. 3. Its steps are as follows:

Step 1, S301: after waiting for a defined period, the pre-calculation unit records the usage rates of the current common directories. That is, the usage rates are recorded at regular intervals. The record count and the historical usage data are kept in the cache, not in the data database.

Step 2, S302: check whether the number of records has reached the defined count. If the number of records is below the defined count (at least three), the unit keeps waiting for the next recording; if the count is reached, the prediction step follows.

Step 3, S303: if the defined count is reached, the pre-calculation unit predicts the usage rate of each common directory whose number of historical usage records meets the defined number, and retrieves the directories with the highest predictions. That is, for each common directory with enough historical usage records (at least three), the growth values of its historical usage rate are computed and this growth series is used for prediction. Once the next growth value has been predicted, the directories are sorted and the top three (possibly fewer) are output. If none of the common directories has enough history to be predicted, no prediction is made and the predicted directory list is empty. The prediction is computed only for the common directories (the cache), not by fetching every directory from the data database.

Step 4, S304: based on the predicted directories, the data database is searched, the corresponding group members are retrieved, pre-calculation is carried out, and the process returns to Step 1. That is, for each predicted directory the corresponding group members are retrieved from the data database and a pre-calculation notification is sent to the relay data calculation module. If the list output by Step 3 is empty, no pre-calculation is performed and the next cycle continues. A pre-calculation notification is issued only if the data granularity of the predicted directory is smaller than the waiting time of Step 1.

Step 5: if the defined count has not been reached, return to Step 1.
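The selection logic of steps S302 and S303 can be sketched as follows. The constants mirror the minimums stated in the text (three records, top three directories). The function names and the `predict_next` callback are assumptions; the callback is left pluggable so that any predictor over the growth series can be supplied.

```python
MIN_RECORDS = 3   # minimum number of usage records per directory (from the text)
TOP_N = 3         # at most three directories are selected (from the text)


def growth_values(history):
    """Differences between consecutive usage records, oldest first."""
    return [later - earlier for earlier, later in zip(history, history[1:])]


def select_for_precalc(usage_history, predict_next):
    """Return up to TOP_N directory names ranked by predicted next growth.

    usage_history -- hypothetical mapping: directory name -> list of recorded
                     usage rates, oldest first (kept in the cache)
    predict_next  -- function mapping a growth series to a predicted next value
    """
    predictions = {}
    for name, history in usage_history.items():
        if len(history) < MIN_RECORDS:
            continue   # newly cached directory, not enough history yet
        predictions[name] = predict_next(growth_values(history))
    ranked = sorted(predictions, key=predictions.get, reverse=True)
    return ranked[:TOP_N]
```

An empty return value corresponds to the case in which no pre-calculation notification is sent and the loop simply waits for the next cycle.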

In summary, the procedure is a recurring loop. First, in step S301, after waiting for a defined period, the pre-calculation unit records the usage rates of the current common directories. In step S302, once the number of records reaches the defined count, step S303 predicts the usage rate of the common directories whose number of historical usage records meets the defined number and retrieves the directories with the highest predictions. Because step S204 keeps updating the common directories, and each newly added common directory starts with zero usage records, step S303 predicts only for directories that have accumulated enough records. The prediction uses grey prediction, which can make fast predictions from a small number of discrete data points. After the next usage rate has been predicted, the predicted values are sorted and the directories with the highest predicted usage are selected. Step S304 then searches the data database for the group members of each predicted directory and notifies step S208 to perform the pre-calculation.
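The text names grey prediction without giving its formulation. One common grey model is GM(1,1); the sketch below assumes that model and is not taken from the patent. The function name `gm11_next` is invented, the input is assumed to be a strictly positive series with at least three points, and the parameters are fitted by ordinary least squares.

```python
import math


def gm11_next(series):
    """Predict the next value of a positive series with a GM(1,1) grey model."""
    n = len(series)
    if n < 3 or any(v <= 0 for v in series):
        raise ValueError("need at least three strictly positive values")

    # Accumulated generating series x1 and its adjacent means z.
    x1, total = [], 0.0
    for v in series:
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = series[1:]

    # Least-squares fit of y = -a * z + b.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    a = -slope

    if abs(a) < 1e-12:
        return b   # no development trend: the model predicts a constant

    c = series[0] - b / a

    def x1_hat(k):
        # Time response of the accumulated series.
        return c * math.exp(-a * k) + b / a

    # Restore the original series by differencing the accumulated prediction.
    return x1_hat(n) - x1_hat(n - 1)
```

For a series that grows by a roughly constant ratio the model extrapolates that growth, and for a flat series it returns the flat value, which suits the ranking of directories by predicted usage.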

Following the flows of FIG. 1, FIG. 2, and FIG. 3, a preferred example is as follows. When the user 100 enters three time series data source definitions [source1, source2, source3], the time series data collection and analysis module 202 begins collecting data continuously, for example: source1[object1, object2], source2[object1], source3[object1]. Each object contains a sequence of time series data [raw data1 in timestamp1, raw data2 in timestamp2...] that keeps growing as time passes, as shown in Formula 1:

The time series data collection and analysis module 202 analyzes the source-object-raw data relationships and writes them into the data database 203. The multi-level group management module 204 presents these relationships so that the user 100 can manage groups. For example, the user creates two groups [group1, group2], where group1 is [member1(object1 in source1), member2(object1 in source3)] and group2 is [member1(group1)]. Note that the member of group2 is group1; this is the multi-level group design. Each member can have its own directories, as shown in Formula 2:
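The group1/group2 example implies that a calculation on group2 must eventually reach the objects inside group1. A minimal sketch of that resolution follows. The representation is an assumption: a nested group is referenced by its name as a string, a leaf member is a (source, object) tuple, and `resolve_members` is a name invented here.

```python
def resolve_members(group, groups, _seen=frozenset()):
    """Return the set of (source, object) leaves reachable from `group`.

    groups -- hypothetical mapping: group name -> list of members, where a
              member is either another group's name (str) or a
              (source, object) tuple
    """
    if group in _seen:
        raise ValueError(f"cyclic group membership at {group!r}")
    seen = _seen | {group}
    leaves = set()
    for member in groups[group]:
        if isinstance(member, str):      # nested group: descend one level
            leaves |= resolve_members(member, groups, seen)
        else:                            # leaf: an object within a source
            leaves.add(member)
    return leaves
```

For the example in the text, resolving group2 yields the same two objects as resolving group1, so relay data already computed for group1's members can serve both groups.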

The user 100 enters a calculation requirement through the group calculation demand input module 205. For example, the user wants the maximum raw data value of group2 from 2014-04-13 00:00:00 to 2014-04-19 23:59:59 and wants to see on which day the maximum occurred, but the calculation should use only the raw data from 08:00 to 16:59 each day. The group calculation demand input module 205 analyzes the information entered by the user and builds the relay data directory, which is expressed as shown in Table 1 below:
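Table 1 is not reproduced in this text, so the exact directory layout is unknown. As a hedged illustration only, the request above could be captured in a small immutable record, one per group member, so that two directories with identical definitions compare equal. Every field name below is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DirectoryDefinition:
    """Hypothetical directory definition; field names are illustrative."""
    member: str              # one group member, e.g. "object1 in source1"
    start: datetime          # start of the requested time interval
    end: datetime            # end of the requested time interval
    granularity: str         # data granularity, e.g. "day"
    result: str              # requested result, e.g. "max"
    valid_hours: frozenset   # hours of the day included in the calculation


def build_directories(members, start, end, granularity, result, valid_hours):
    """Build one directory definition per member for a single request."""
    hours = frozenset(valid_hours)
    return [
        DirectoryDefinition(m, start, end, granularity, result, hours)
        for m in members
    ]
```

Because the dataclass is frozen, instances are hashable and compare by content, which matches the idea that a directory search compares the contents of the definitions.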

The directory is used to slice the data. Because the user wants to know the day on which the maximum occurs, the data granularity of the directory is one day, and the range 2014-04-13 00:00:00 to 2014-04-19 23:59:59 is split into seven daily slices. The year, month, week, hour, and minute definitions are all valid. Only the day definition is restricted: hours 08-16 are valid and the remaining hours 00-07 and 17-23 are invalid. The exclusion lists for year, month, day, hour, minute, and second are empty. Each directory is generated from one group member and one calculation requirement.
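The slicing described above can be sketched as follows. The function name `daily_slices` and its parameters are hypothetical; the sketch assumes a daily granularity and a single valid hour window per day, which is the case in this example.

```python
from datetime import date, datetime, time, timedelta

def daily_slices(start, end, first_valid_hour=8, last_valid_hour=16):
    """Split [start, end] into per-day slices limited to the valid hours.

    Returns a list of (day, slice_start, slice_end) tuples.
    """
    slices = []
    day = start.date()
    while day <= end.date():
        # Clip the valid window of this day to the requested range.
        lo = max(start, datetime.combine(day, time(first_valid_hour, 0, 0)))
        hi = min(end, datetime.combine(day, time(last_valid_hour, 59, 59)))
        if lo <= hi:
            slices.append((day, lo, hi))
        day += timedelta(days=1)
    return slices
```

For the range in the example this yields seven slices, each covering 08:00:00 to 16:59:59 of one day.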

The directory management module 206 first searches the common directory list (a cache) for this directory. If it is not found there, the module searches the data database 203, and if it is not found there either, the directory is created. If an identical directory is found in the common directory list or in the data database 203, the difference coefficient calculation begins. For example, suppose the common directory list is currently [directory 1, directory 2, directory 3, ..., directory 10] and the directory found is directory 11 (found in the data database 203). The common directory loading unit 2091 then calculates the difference coefficient of directory 1 against directory 11, directory 2 against directory 11, and so on up to directory 10 against directory 11: difference coefficient = 1 - similarity coefficient (Formula 3)
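The lookup order described above (common directory list, then data database, then creation) can be sketched as follows. The dictionaries standing in for the cache and the database, the record layout, and the function `find_or_create` are assumptions made for illustration.

```python
def find_or_create(key, cache, database):
    """Look a directory up in the cache, then the database, else create it.

    `key` is any hashable description of (group member, calculation requirement).
    Returns (directory, where), with where in {"cache", "database", "created"}.
    """
    if key in cache:
        return cache[key], "cache"
    if key in database:
        return database[key], "database"
    directory = {"key": key, "usage": 0.0}  # hypothetical directory record
    database[key] = directory
    return directory, "created"
```

Only the first two outcomes lead to the difference coefficient calculation; a newly created directory has nothing to be compared against yet.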

Because the difference coefficient is (1 - similarity coefficient), the similarity coefficient must be calculated first. Following the fuzzy-based concept, two objects are not merely equal or unequal; a degree of similarity between them can be calculated. In this example the similarity coefficient is the Dice coefficient, as shown in Formula 4.

Taking directory 1 against directory 11, the similarity coefficient of the day definition is calculated as follows:

Directory 1 (example):

Day (definition): 08-09 (valid), Rest (invalid)

...

Directory 11 (example):

Day (definition): 08-16 (valid), Rest (invalid)

...

Directory 1 (day definition) → Directory 11 (day definition)

Day definition (Dice similarity coefficient):

This yields a value of 0.08. The similarity coefficients for the remaining year, month, week, hour, and minute definitions, for the exclusion lists, and for the data granularity are calculated in the same way. Once all the similarity coefficients have been obtained, they are averaged, and the average is the similarity coefficient of directory 11 against directory 1, as shown in Formula 5.
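As an illustration of the general idea, the sketch below uses the standard set-based Dice coefficient, 2|A∩B| / (|A| + |B|), and averages it over the fields of two directories. The way the patent encodes a definition before applying the coefficient is not shown in this text, so the representation of a day definition as a set of valid hours is an assumption, and the sketch is not expected to reproduce the value 0.08 quoted above. The names `dice` and `directory_similarity` are hypothetical.

```python
def dice(a, b):
    """Standard Dice coefficient of two sets: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumption: two empty fields count as identical
    return 2 * len(a & b) / (len(a) + len(b))

def directory_similarity(d1, d2):
    """Average the per-field Dice coefficients of two directories.

    Each directory is a dict mapping a field name to a set of values.
    """
    fields = sorted(set(d1) | set(d2))
    if not fields:
        return 1.0
    total = sum(dice(d1.get(f, set()), d2.get(f, set())) for f in fields)
    return total / len(fields)

# Day definitions from the example, modelled as sets of valid hours (assumption).
directory1 = {"day": {8, 9}, "excluded_hours": set()}
directory11 = {"day": set(range(8, 17)), "excluded_hours": set()}
```

Under this encoding the day fields share two valid hours out of 2 + 9, so their Dice coefficient is 4/11, and the difference coefficient is one minus the averaged similarity.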

After the similarity coefficients of all common directories against directory 11 have been obtained, the difference coefficients follow from Formula 3. For example, suppose the difference coefficients of directory 1 through directory 10 against directory 11 are [0.8, 0.9, 0.1, 0.2, 0.2, 0.1, 0.5, 0.2, 0.9, 0.4]. Each directory's usage rate is reduced by its difference coefficient: if directory 1 originally has a usage rate of 12.9, subtracting 0.8 leaves 12.1, and the other common directories are handled likewise. Finally, the usage rate of the retrieved directory 11 is increased by 1, as in step S207. After a waiting period, the common directory list is re-sorted and updated by usage rate. The purpose is to learn user behavior and adjust the common directories dynamically. Because the deduction uses difference coefficients, only the common directories in the cache need to be compared, not every directory in the data database, which reduces the computational cost of the comparison.
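The usage rate update can be sketched as follows. The function names and the dictionary layout are hypothetical, and the fixed cache size passed to `refresh_common` is an assumption; the text only says the list is re-sorted by usage rate after a waiting period.

```python
def update_usage(usage, differences, found):
    """Deduct each difference coefficient, then add 1 to the found directory.

    usage:       {directory name: usage rate}
    differences: {common directory name: difference coefficient vs. `found`}
    """
    updated = dict(usage)
    for name, diff in differences.items():
        updated[name] = updated[name] - diff
    updated[found] = updated.get(found, 0.0) + 1.0
    return updated

def refresh_common(usage, size):
    """Return the `size` directory names with the highest usage rate."""
    return sorted(usage, key=usage.get, reverse=True)[:size]
```

With directory 1 at 12.9 and a difference coefficient of 0.8, the updated usage rate is 12.1, as in the example.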

Step S208 then schedules the relay data calculations for the group. In this example the member of group2 is group1, and because group1 is itself a group, the calculation must in turn be carried out for group1; this is the characteristic of multi-level groups. The members of group1 are objectA (object1 in source1) and objectB (object1 in source3). For group1 and these two objects, the directories (based on the current calculation requirement) are searched for relay data already calculated between 2014-04-13 00:00:00 and 2014-04-19 23:59:59. With the directory structure of this patent, relay data that has already been calculated can be reused, as shown in Table 2 below.

The already calculated data may be the result of an earlier user's calculation on group1. The intervals for which relay data still has to be calculated are therefore: objectA{2014-04-16, 2014-04-17, 2014-04-18, 2014-04-19}, objectB{2014-04-16, 2014-04-17, 2014-04-18, 2014-04-19}, and group1{2014-04-16, 2014-04-17, 2014-04-18, 2014-04-19}.
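The reuse step can be sketched as a set difference between the requested days and the days that already have relay data. The function `days_to_compute` is hypothetical, and the assumption that 2014-04-13 through 2014-04-15 are already calculated is inferred from the remaining intervals listed above, not read from Table 2.

```python
from datetime import date, timedelta

def days_to_compute(start, end, already_computed):
    """List the days in [start, end] that have no relay data yet."""
    days = []
    day = start
    while day <= end:
        if day not in already_computed:
            days.append(day)
        day += timedelta(days=1)
    return days

# Assumed cache content for objectA, objectB, and group1 alike.
cached = {date(2014, 4, 13), date(2014, 4, 14), date(2014, 4, 15)}
remaining = days_to_compute(date(2014, 4, 13), date(2014, 4, 19), cached)
```

Here `remaining` holds the four days from 2014-04-16 to 2014-04-19, so only those slices need new calculations.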

Relay data is the data produced by a preliminary calculation over each data granularity after the data has been sliced. In this example user 100 wants the maximum over a period (2014-04-13 00:00:00 to 2014-04-19 23:59:59) and the day on which it occurs, so the system splits the period into several smaller intervals (in units of days) and calculates the maximum of each interval (day) in detail. The data reuse mechanism and the relay data calculation take place in steps S209 and S210. A piece of relay data is derived from a group of raw data. When raw data is processed, the directory's definitions and exclusion lists can be adjusted through the time precision, and the strictness applied to the raw data varies with that precision. For example, when the time precision is minutes, the minute definition and the second exclusion list of the directory are both disabled. When the relevant raw data is collected for calculation, the time precision of the directory can therefore be adjusted to suit the system design requirements. In this example the time precision is seconds (the smallest unit), so all definitions and exclusion lists in the directory apply to the raw data.

Once all relay data has been calculated, step 411 performs the final consolidation and outputs the result. In this example, the maxima of objectA and objectB are consolidated and averaged into each data granularity of group1; the maxima of the data granularities of group1 are then compared, and finally the overall maximum and the date (day) on which it occurs are output, which completes the calculation.
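The final consolidation can be sketched as follows, under one reading of the paragraph above: the group's value for a day is the mean of its members' daily maxima, and the output is the day with the largest group value. That reading, the function `consolidate`, and the sample numbers are assumptions made for illustration.

```python
def consolidate(member_daily_max):
    """Combine per-member daily maxima into a group result.

    member_daily_max: {member name: {day: maximum of that day}}
    Returns (best_day, best_value) for the group.
    """
    if not member_daily_max:
        raise ValueError("no members to consolidate")
    members = list(member_daily_max.values())
    days = set(members[0])
    for m in members[1:]:
        days &= set(m)  # only days for which every member has relay data
    if not days:
        raise ValueError("members share no common day")
    group_daily = {d: sum(m[d] for m in members) / len(members) for d in days}
    # Iterating in sorted order makes ties resolve to the earliest day.
    best_day = max(sorted(group_daily), key=group_daily.get)
    return best_day, group_daily[best_day]

# Made-up relay data for two days.
relay = {
    "objectA": {"2014-04-13": 10.0, "2014-04-14": 30.0},
    "objectB": {"2014-04-13": 20.0, "2014-04-14": 50.0},
}
```

With these sample values the group values are 15.0 and 40.0, so the reported day is 2014-04-14.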

In step S208, the pre-calculation unit 2092 predicts users' calculation requirements. After the system has been running for some time, changes in directory usage rates reflect the calculation requirements of most users. As shown in FIG. 3, the pre-calculation unit 2092 therefore periodically records the usage rates of the common directories and, once a certain number of recordings (at least three) has been made, performs the prediction. Prediction starts for each common directory once the number of usage rate records for that directory reaches the required count (at least three). This example uses grey prediction because it can forecast quickly from only a few data points. First, consecutive historical usage rates are subtracted to obtain the increment series. Grey prediction then requires the accumulated generating series, calculated as in Formula 6.

Grey modeling requires the grey difference equation, as shown in Formula 7: increment(h) + slope * z(h) = intercept, h = 1, 2, ..., q, q < g (Formula 7)

z is the background value series and λ is the background value parameter, usually set to 0.5, as shown in Formula 8: z(h) = λ * accumulation(h) + (1 - λ) * accumulation(h-1), λ = 0.5, h = 2, ..., q, q < g (Formula 8)

The slope and intercept parameters of Formula 7 are obtained by the least squares method, and the time response function follows by mathematical derivation, as shown in Formula 9.

Finally, the predicted series is obtained step by step through the inverse accumulated generating operation. To predict values several steps ahead, only the parameter of Formula 9 needs to be adjusted, as shown in Formula 10.
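The grey prediction steps above can be sketched with the textbook GM(1,1) model. Formulas 7 and 8 are taken from the text; the accumulation, the time response function, and the inverse accumulation follow the standard GM(1,1) forms, which are assumed to correspond to Formulas 6, 9, and 10. The function `gm11_forecast` and its parameter names are hypothetical. GM(1,1) is normally applied to non-negative series, so negative increments may give poor forecasts.

```python
import math

def gm11_forecast(increments, steps=1, lam=0.5):
    """Forecast the next `steps` values of `increments` with GM(1,1)."""
    n = len(increments)
    if n < 3:
        raise ValueError("need at least three values")
    # Accumulated generating series: running sum of the increments.
    accumulation, total = [], 0.0
    for value in increments:
        total += value
        accumulation.append(total)
    # Background values z(h) = lam*acc(h) + (1-lam)*acc(h-1), as in Formula 8.
    z = [lam * accumulation[h] + (1 - lam) * accumulation[h - 1] for h in range(1, n)]
    y = list(increments[1:])
    # Least squares fit of y = -slope*z + intercept, i.e. Formula 7 rearranged.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(a * b for a, b in zip(z, y))
    denom = m * szz - sz * sz
    if denom == 0:
        raise ValueError("background values are degenerate")
    gradient = (m * szy - sz * sy) / denom
    slope = -gradient
    intercept = (sy - gradient * sz) / m
    if abs(slope) < 1e-12:
        return [intercept] * steps  # flat series: forecast the constant level
    c = increments[0] - intercept / slope

    def accumulation_hat(i):
        # Standard time response function, 0-based index i.
        return c * math.exp(-slope * i) + intercept / slope

    # Inverse accumulation: differences of consecutive fitted accumulations.
    return [accumulation_hat(n - 1 + p) - accumulation_hat(n - 2 + p)
            for p in range(1, steps + 1)]
```

Calling `gm11_forecast(series, steps=1)` corresponds to predicting one step ahead; a larger `steps` extends the forecast without refitting the parameters.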

In this example only the next step needs to be predicted, that is, p = 1. After the next usage rate growth values have been predicted, the common directories are sorted and the three with the largest growth are selected (there may be fewer than three) for pre-calculation. The data database is first searched for the group members corresponding to the predicted directories, and those members are pre-calculated; these are steps S304 and S209. The benefit of pre-calculation can be seen in Table 2 above: relay data that has already been calculated may be reused in a later user calculation, which speeds up output.
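Selecting the directories to pre-calculate can be sketched as a sort on the predicted growth. The function `top_growth` and the sample values are hypothetical; the text does not say whether directories with non-positive predicted growth are excluded, so this sketch keeps them.

```python
def top_growth(predicted_growth, limit=3):
    """Return up to `limit` directory names with the largest predicted growth."""
    ranked = sorted(predicted_growth.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:limit]]

# Made-up one-step-ahead growth predictions.
predictions = {"directory1": 0.5, "directory2": 2.0, "directory3": 1.0, "directory4": 1.5}
```

When fewer than three directories have enough history to be predicted, the result is simply shorter.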

Once the predicted directories have been produced and the relay data calculation module 207 has been notified, the module decides the relay data calculation schedule from the data granularity of each predicted directory and the waiting time of the pre-calculation unit 2092. For example, if the waiting time of the pre-calculation unit 2092 is set so that step S301 runs monthly, a predicted directory can be pre-calculated only if its data granularity is finer than a month. Suppose step S208 runs a prediction on 2014-04-01, obtains directory 11, and notifies the relay data calculation module 207. Directory 11 has a data granularity of one day, so the module pre-calculates every day from 2014-04-01 to 2014-04-30 for the members corresponding to the predicted directory. On 2014-05-01, component 6 performs the next prediction, and so on.
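The scheduling rule can be sketched as follows for the case in the example, a daily granularity with a monthly waiting time. The granularity ordering, the function `precompute_days`, and its parameters are assumptions made for illustration; other combinations are left unimplemented.

```python
import calendar
from datetime import date, timedelta

# Hypothetical ordering of granularities from finest to coarsest.
GRANULARITY_RANK = {"second": 0, "minute": 1, "hour": 2, "day": 3,
                    "week": 4, "month": 5, "year": 6}

def precompute_days(prediction_date, granularity, wait="month"):
    """Days to pre-calculate until the next prediction run.

    Returns an empty list when the granularity is not finer than the wait.
    """
    if GRANULARITY_RANK[granularity] >= GRANULARITY_RANK[wait]:
        return []
    if granularity != "day" or wait != "month":
        raise NotImplementedError("sketch covers daily slices in a monthly window only")
    last_day = calendar.monthrange(prediction_date.year, prediction_date.month)[1]
    end = date(prediction_date.year, prediction_date.month, last_day)
    days = []
    day = prediction_date
    while day <= end:
        days.append(day)
        day += timedelta(days=1)
    return days
```

For a prediction made on 2014-04-01 with daily granularity this returns the thirty days of April 2014.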

The detailed description above specifically describes one feasible embodiment of the present invention. That embodiment is not intended to limit the patent scope of the invention; any equivalent implementation or modification that does not depart from the technical spirit of the invention shall fall within the patent scope of this case.

In summary, this case is not only innovative in its technical concept but also provides the several effects described above, which conventional methods cannot achieve. It fully satisfies the statutory requirements of novelty and inventive step for an invention patent. The application is therefore filed in accordance with the law, and approval of this invention patent application is respectfully requested.

100‧‧‧User

201‧‧‧Data source management module

203‧‧‧Data database

206‧‧‧Directory management module

207‧‧‧Relay data calculation module

209‧‧‧Self-learning module

2092‧‧‧Pre-calculation unit

300‧‧‧Time series data source

Claims

A system for predicting and calculating time series data, the system comprising a data source management module, a time series data collection and analysis module, a data database, a multi-level group management module, a group computing demand input module, a directory management module, a relay data calculation module, a calculation output module, and a self-learning module, wherein: the data source management module receives source settings entered by a user so that the time series data collection and analysis module collects the raw data transmitted from the time series data sources; the time series data collection and analysis module analyzes the raw data transmitted from the time series data sources, identifies data object associations, and writes them to the data database; the multi-level group management module retrieves the data object associations from the data database and lets the user create groups with a multi-level structure, in which a group may contain another group and a member may belong to different groups; the multi-level groups include group structures with a single level or a single group member; the group computing demand input module identifies the user's calculation requirement and builds a directory, the directory being generated from one group member and one calculation requirement; the directory management module searches for the directory built by the group computing demand input module, adds 1 to the usage rate of the directory found, calculates the difference coefficient between that directory and each of the other common directories, and deducts each difference coefficient from the usage rate of the compared directory, wherein the difference coefficients are calculated against the common directories only, rather than by retrieving all directories from the data database; the relay data calculation module calculates relay data for the directory retrieved by the directory management module and the group to be calculated; the calculation output module consolidates the relay data completed by the relay data calculation module and outputs the result; and the self-learning module periodically updates the common directories and predicts the user's calculation requirements.

The system for predicting and calculating time series data according to claim 1, wherein the relay data and the directories can be shared and reused by different groups.

The system for predicting and calculating time series data according to claim 1, wherein the self-learning module further comprises a common directory loading unit and a pre-calculation unit, wherein: the common directory loading unit periodically updates the common directories by sorting the directories in the data database by usage rate; and the pre-calculation unit retrieves the historical usage rates of the common directories, uses the growth series to predict the next usage growth value of each directory, and pre-calculates the user's frequent calculation requirements for reuse.

The system for predicting and calculating time series data according to claim 3, wherein the number of historical usage rate records of the common directory is at least three.

The system for predicting and calculating time series data according to claim 3, wherein the pre-calculation unit notifies the relay data calculation module of the predicted common directory, and pre-calculation is performed for the predicted directory and its related members.

The system for predicting and calculating time series data according to claim 3, wherein the pre-calculation is performed for the common directories only, rather than by retrieving all directories from the data database.

The system for predicting and calculating time series data according to claim 5, wherein the relay data calculation module determines a relay data calculation schedule based on a data granularity of the predicted directory and a waiting time of the pre-calculation unit.

A method for predicting and calculating time series data, comprising the following steps: Step 1: a user issues a calculation requirement for a group; Step 2: the group computing demand input module analyzes the calculation requirement and builds a directory; Step 3: the directory management module searches the common directories; Step 4: the common directory loading unit uses the fuzzy concept to calculate the difference coefficient (1 - similarity coefficient) against each of the other common directories and deducts it from their usage rates; Step 5: the directory is searched for in the data database; Step 6: the directory is created; Step 7: the directory management module adds 1 to the usage rate of the retrieved directory; Step 8: the relay data calculation module schedules the calculation requirements according to the group members, wherein the calculation requirements include: Step 1. after waiting for a defined time, the pre-calculation unit periodically records the usage rates of the current common directories; Step 2. determining whether the number of recordings has reached the defined count; Step 3. if the defined count has been reached, the pre-calculation unit predicts, from the historical usage rates recorded for the current common directories, the usage rate of each directory whose record count meets the defined count, and selects the directories with the highest predictions; Step 4. based on the predicted directories, the data database is searched, the corresponding group members are retrieved and pre-calculated, and the flow returns to Step 1; Step 5. if the defined count has not been reached, the flow returns to Step 1; Step 9: the data database is searched for relay data that has already been calculated; Step 10: the relay data calculation module begins the calculation and attaches the calculated data to the directory; Step 11: when the relay data has been calculated, the calculation output module performs the final consolidation and outputs the result.