TWI712950B - Data processing method and apparatus

Info

Publication number: TWI712950B
Application number: TW108120547A
Authority: TW
Inventors: 楊璨瑜
Original assignee: 和碩聯合科技股份有限公司
Priority date: 2019-06-13
Filing date: 2019-06-13
Publication date: 2020-12-11
Also published as: TW202046091A

Abstract

The present invention provides a data processing method for importing streaming time series data that includes raw data into a Not Only SQL (NoSQL) database. The method includes the following steps: generating, from the raw data, a first time series data point that includes two-tuple data, a first test time, and a first test result; determining whether the NoSQL database already contains a second time series data point that includes the same two-tuple data; if so, generating, from the first and second time series data points, a third time series data point that includes the two-tuple data, a third test time, and a third test result, and replacing the second time series data point with the third time series data point in the NoSQL database; and if not, storing the first time series data point in the NoSQL database.

Description

Data processing method and device

The present invention relates to a data processing method and device, and in particular to a method and device that can directly and efficiently process non-sequential (out-of-order) time series data.

During manufacturing, a product is routed through stations with different functions (assembly stations, test stations, and repair stations). Each product flows along the production line through these stations, and because repeated testing may be required, each product generates a set of time series data at every station it passes through.

In existing practice, time series data that covers too many objects or spans too long a sequence consumes a large amount of storage space. Moreover, when the time series data arrives out of order, it must be sorted again, which greatly reduces computation speed. How to improve the way the data is computed so that non-sequential data is processed efficiently and quickly while storage space is saved, thereby overcoming the above drawbacks, has become one of the important problems to be solved in this field.

The technical problem to be solved by the present invention is to provide, in view of the shortcomings of the prior art, a data processing method that can handle both sequential and non-sequential time series data, whose computational efficiency is not degraded by long time series, and that uses data storage space effectively.

To solve the above technical problem, one technical solution adopted by the present invention is to provide a data processing method for importing at least one stream of time series data into a non-relational database, where the stream of time series data includes raw data. The data processing method includes: generating, by a processing unit, a first time series data point from the raw data, the first time series data point including at least two-tuple data, a first test time, and a first test result; and determining, by the processing unit, whether a second time series data point including the two-tuple data is already stored in the non-relational database. If the processing unit determines that the second time series data point is already stored in the non-relational database, a logic unit further performs the following steps: comparing the first test time of the first time series data point with the second test time of the second time series data point; taking the later of the first test time and the second test time as a third test time; performing a logic operation on the first test result and the second test result of the second time series data point to generate a third test result; generating a third time series data point from the third test time and the third test result, where the third time series data point includes the two-tuple data, the third test time, and the third test result; and replacing the second time series data point with the third time series data point and storing it in the non-relational database.

To solve the above technical problem, another technical solution adopted by the present invention is to provide a data processing device for importing at least one stream of time series data into a non-relational database, where the at least one stream of time series data includes raw data. The data processing device includes a processing unit and a logic unit. The processing unit generates a first time series data point from the raw data, the first time series data point including at least two-tuple data, a first test time, and a first test result, and determines whether a second time series data point including the two-tuple data is already stored in the non-relational database. The logic unit is coupled to the processing unit. If the logic unit determines that the second time series data point is already stored in the non-relational database, the logic unit compares the first test time of the first time series data point with the second test time of the second time series data point, takes the later of the first test time and the second test time as a third test time, performs a logic operation on the first test result and the second test result of the second time series data point to generate a third test result, generates a third time series data point from the third test time and the third test result, and replaces the second time series data point with the third time series data point and stores it in the non-relational database. The third time series data point includes the two-tuple data, the third test time, and the third test result.

One beneficial effect of the present invention is that the data processing method it provides, through the technical solutions of "using the two-tuple of each product number and each station to build a multi-level structure for the data stream" and "comparing only two records that share the same two-tuple and updating the product status in a timely manner", greatly improves computational efficiency and reduces the storage space the system requires.

For a further understanding of the features and technical content of the present invention, refer to the following detailed description and drawings. The drawings are provided for reference and illustration only and are not intended to limit the present invention.

Referring to FIG. 1, FIG. 1 is a block flowchart of a data processing method according to an embodiment of the present invention. The data processing method of this embodiment is used to manage, in real time, the test times and test results of products at different test stations. The method is carried out mainly by a processing unit and a logic unit executing a program stored in a non-transitory computer-readable medium. Notably, the processing unit does not need to perform any logic operation; it only needs to be able to format the raw data. The logic unit can be any firmware or software capable of logic operations, for example a processor, an arithmetic unit, instructions in memory, or a programming language, and the present invention places no limitation on this. The product in this embodiment is a mobile phone, but the kinds of products to which the present invention applies are not limited to this.

The data processing method of FIG. 1 may include the following steps:

Step S1: import the product number of a product into the non-relational database. Step S2: import the first test time and the first test result of the product at a test station into the non-relational database. Generally, after a product is tested at multiple test stations, the test time and test result of each test are recorded. To track and control product yield, the product number of the product itself (for example a production serial number, model, or barcode) and the test station (the number or name of the station), together with the test time and test result, form a stream of time series data that is imported into the non-relational database for storage, as shown in steps S1 and S2 of FIG. 1. Notably, in practical applications the data processing method of the present invention is not limited to test stations. It can also be applied to repair stations or assembly lines and, as disclosed in this embodiment, to product status tracking for test and repair.

Step S3: the processing unit combines the product number and the test station into two-tuple data. The processing unit then forms the first time series data point of the product from the two-tuple data (product number and test station), the first test time, and the first test result, and stores it in the non-relational database. More precisely, referring further to FIG. 2, the data format of a time series data point in this embodiment consists of three parts in order: a key field, a test time field, and a value field. For example, the key field of the first time series data point 210 stores the two-tuple data A20170912X01_ST1, that is, the product number "A20170912X01" and the test station "ST1"; the time field of the first time series data point 210 stores the first test time "20170921081820"; and the value field of the first time series data point 210 stores the first test result and/or first repair record "Pass". Similarly, the key field of the second time series data point 220 stores the two-tuple data A20170912X01_ST1; the time field of the second time series data point 220 stores the second test time "20170921093854"; and the value field of the second time series data point 220 stores the second test result and/or second repair record "Pass".

From the above, the key fields of the first time series data point 210 and the second time series data point 220 hold the same two-tuple data, which indicates that the same product was tested at the same test station. The time fields and value fields of the two data points, however, can each hold different values, representing the same product tested at the same test station at different times with possibly different test results. Notably, unlike conventional time series data processing methods, the data processing method of the present invention first combines the two-tuple data of the product (product number and test station) and places it in the key field of the time series data point before storing the data point in the non-relational database, which facilitates the subsequent comparison.
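The three-field format described for FIG. 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `DataPoint`, `make_key`, and `parse_time` are hypothetical, and the underscore separator and 14-digit timestamp layout are inferred from the example values "A20170912X01_ST1" and "20170921081820".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataPoint:
    """One time series data point: key field, time field, value field."""
    key: str        # two-tuple data, "<product number>_<test station>"
    test_time: str  # 14-digit timestamp, assumed to be YYYYMMDDhhmmss
    value: str      # test result and/or repair record, e.g. "Pass"


def make_key(product_number: str, station: str) -> str:
    """Combine product number and test station into the key field (step S3)."""
    return f"{product_number}_{station}"


def parse_time(test_time: str) -> datetime:
    """Parse the time field so that two test times can be compared."""
    return datetime.strptime(test_time, "%Y%m%d%H%M%S")


# The two data points from the example (reference numerals 210 and 220).
point_210 = DataPoint(make_key("A20170912X01", "ST1"), "20170921081820", "Pass")
point_220 = DataPoint(make_key("A20170912X01", "ST1"), "20170921093854", "Pass")

print(point_210.key)
print(point_210.key == point_220.key)
print(parse_time(point_210.test_time) < parse_time(point_220.test_time))
```

Because the product number and station live together in the key field, a single equality check on `key` is enough to tell whether two data points describe the same product at the same station.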

Step S4: the processing unit determines whether a second time series data point including the two-tuple data is already stored in the non-relational database. Step S5: if the processing unit determines that the second time series data point is already stored in the non-relational database, the logic unit further performs a logic operation. Step S6: conversely, if the processing unit determines that the second time series data point is not stored in the non-relational database, the processing unit stores the first time series data point in the non-relational database.

Put simply, determining in step S4 whether a second time series data point including the two-tuple data is already stored amounts to determining whether the non-relational database already holds a test result for the same product at the same test station (that is, with the same two-tuple data) from a different time. If the non-relational database already holds a second time series data point including that two-tuple (product and test station), the product has already been tested at that test station and the test time and test result have been recorded. In other words, the second time series data point has the same two-tuple as the first time series data point, a second test time (different from the first test time of the first time series data point), and a second test result. If the second time series data point is already stored, the method proceeds to step S5 and the logic unit performs the logic operation. If it is not stored, the product has been tested at this test station only this once, so the method proceeds directly to step S6 and the processing unit stores the first time series data point in the non-relational database.

Note that the two-tuple data of the present invention is a combination of product and test station, so "including the two-tuple data" above requires both the same product number and the same test station. The same product may be tested at different test stations, in which case the two-tuple data differs. Furthermore, because the data processing method of the present invention first combines the two-tuple data of the product (product number and test station) and places it in the key field of the time series data point, the processing unit in this comparison step only needs to compare the key fields of the time series data points already in the non-relational database with the key field of the first time series data point. The logic operation on the remaining fields is performed only after the keys are confirmed to be identical, which saves the time otherwise spent sorting and comparing data and allows the current status of the product to be computed and updated in real time.
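The branch between steps S5 and S6 can be sketched with a plain dictionary standing in for the non-relational database. This is an assumption made for illustration only; the names `store`, `needs_merge`, and `handle`, the station "ST2", and the timestamp "20170921100000" are hypothetical and do not come from the source.

```python
# A plain dict stands in for the non-relational database (assumption).
# Keys are "<product number>_<station>"; values are (test_time, result).
store = {"A20170912X01_ST1": ("20170921081820", "Pass")}


def needs_merge(key: str) -> bool:
    """Step S4: True if a data point with exactly this key is already stored."""
    return key in store


def handle(key: str, test_time: str, result: str) -> str:
    """Route a new data point to step S5 (merge) or step S6 (store)."""
    if needs_merge(key):
        # Step S5: the logic unit would merge the new and stored points here.
        return "S5"
    # Step S6: first record for this product at this station, store it as is.
    store[key] = (test_time, result)
    return "S6"


# Same product, same station: a stored point exists, so a merge is needed.
print(handle("A20170912X01_ST1", "20170921093854", "Pass"))
# Same product, different station: the key differs, so the point is stored.
print(handle("A20170912X01_ST2", "20170921100000", "Pass"))
```

Only the key is inspected in step S4; the time and value fields are not read until a matching key has been found.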

The logic operation includes the following steps. Step S51: compare the first test time of the first time series data point with the second test time of the second time series data point. Step S52: take the later of the first test time and the second test time as the third test time. Step S53: generate a third time series data point that includes the two-tuple data, the third test time, and a third test result. If the processing unit determines that the second time series data point is already stored, the logic unit compares the time fields and the value fields of the first and second time series data points. First, the logic unit compares the first test time and the second test time held in the time fields of the two data points and generates the third test time (that is, the updated test time) from them. The third test time is the later of the first test time and the second test time. That is, if the first test time is earlier than the second test time, the third test time equals the second test time; if the first test time is later than the second test time, the third test time equals the first test time. In other words, when the same product is tested at the same test station at different times, only the later time is recorded.
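Steps S51 and S52 reduce to keeping the later of two timestamps. A minimal sketch, assuming the time field uses the 14-digit layout seen in the example value "20170921081820" (the function name `third_test_time` and the constant `TIME_FORMAT` are hypothetical):

```python
from datetime import datetime

TIME_FORMAT = "%Y%m%d%H%M%S"  # assumed layout of values such as "20170921081820"


def third_test_time(first_time: str, second_time: str) -> str:
    """Steps S51 and S52: compare the two test times and keep the later one."""
    first = datetime.strptime(first_time, TIME_FORMAT)
    second = datetime.strptime(second_time, TIME_FORMAT)
    return first_time if first > second else second_time


# The order of the arguments does not matter; the later time is always kept.
print(third_test_time("20170921081820", "20170921093854"))
print(third_test_time("20170921093854", "20170921081820"))
```

The source does not say what happens when the two test times are equal; in this sketch either argument yields the same value in that case.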

Referring further to FIG. 3, FIG. 3 is the truth table of the test results. When the logic unit performs the logic operation on the first test result and the second test result held in the value fields of the first and second time series data points, it produces one of the following results as the third test result: pass, fail, or retest pass. In the table of FIG. 3, A is the test result with the earlier test time, B is the test result with the later test time, and X is the result of the logic operation (the third test result). If A is pass and B is pass, X is pass. If A is pass and B is fail, X is fail. If A is fail and B is pass, X is retest pass. If A is fail and B is fail, X is fail. Continuing the same logic, if A is pass and B is retest pass, X is retest pass. If A is retest pass and B is pass, X is retest pass. If A is retest pass and B is fail, X is fail. If A is fail and B is retest pass, X is retest pass. Notably, although retest pass means that the product passed its most recent test, it carries a different meaning from pass when judging product yield.
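The eight cases listed for FIG. 3 can be written as a lookup table. This is a sketch under stated assumptions: the names `TRUTH_TABLE` and `merge_results` and the string spellings of the three results are hypothetical. The pair (retest pass, retest pass) is not among the listed cases, so the sketch leaves it undefined rather than guessing.

```python
PASS, FAIL, RETEST_PASS = "Pass", "Fail", "Retest pass"

# (earlier result A, later result B) -> merged result X, as listed for FIG. 3.
TRUTH_TABLE = {
    (PASS, PASS): PASS,
    (PASS, FAIL): FAIL,
    (FAIL, PASS): RETEST_PASS,
    (FAIL, FAIL): FAIL,
    (PASS, RETEST_PASS): RETEST_PASS,
    (RETEST_PASS, PASS): RETEST_PASS,
    (RETEST_PASS, FAIL): FAIL,
    (FAIL, RETEST_PASS): RETEST_PASS,
}


def merge_results(earlier: str, later: str) -> str:
    """Look up X for the pair (A, B); pairs not listed raise KeyError."""
    return TRUTH_TABLE[(earlier, later)]


print(merge_results(FAIL, PASS))
print(merge_results(PASS, FAIL))
```

Note that the table is keyed by test time order, not arrival order: the caller has to decide which result is the earlier one before the lookup.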

Referring to FIGS. 4A and 4B, these figures show the third test result obtained at different time points R1, R2, and R3 for the truth table of FIG. 3, taking as an example the case where the earlier test result A is fail and the later test result B is pass. If the time series data point of the product is retrieved from the non-relational database at time R1, the product has not yet been tested, so the third test result X is none. If it is retrieved at time R2, the product has been tested only once and the result was fail, so the third test result X is fail. If it is retrieved at time R3, the product has been tested twice at that test station with different results, and the logic operation of this embodiment yields retest pass as the third test result X. This shows that although the last test result of the product at the test station was a pass, an earlier test had failed.

After the third test time and the third test result of the third time series data point have been generated, the logic unit performs step S54: replace the second time series data point with the third time series data point and store it in the non-relational database. Put simply, this updates the test data of the product at the test station. As noted above, the value field can also record the repair history of the product at a repair station; the logic operation and data processing method are the same as described above and are not repeated here.
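Steps S3 through S54 can be put together in one short sketch that also reproduces the R1, R2, R3 example (fail first, then pass). It is an illustration under assumptions, not the patented implementation: a dict stands in for the non-relational database, `ingest` is a hypothetical name, the timestamps attached to the fail and pass results are illustrative, and an incoming point with an equal timestamp is treated as the later one.

```python
PASS, FAIL, RETEST_PASS = "Pass", "Fail", "Retest pass"

# (earlier result, later result) -> merged result, per the cases listed for FIG. 3.
TRUTH_TABLE = {
    (PASS, PASS): PASS, (PASS, FAIL): FAIL,
    (FAIL, PASS): RETEST_PASS, (FAIL, FAIL): FAIL,
    (PASS, RETEST_PASS): RETEST_PASS, (RETEST_PASS, PASS): RETEST_PASS,
    (RETEST_PASS, FAIL): FAIL, (FAIL, RETEST_PASS): RETEST_PASS,
}

store = {}  # stand-in database (assumption): key -> (test_time, result)


def ingest(product_number, station, test_time, result):
    """Import one raw record and return the data point now stored for its key."""
    key = f"{product_number}_{station}"            # S3: build the two-tuple key
    if key not in store:                           # S4: is there a stored point?
        store[key] = (test_time, result)           # S6: no, store the new point
        return store[key]
    old_time, old_result = store[key]
    # S51/S52: fixed-width digit strings order the same way as the times they encode.
    if test_time >= old_time:
        earlier, later, third_time = old_result, result, test_time
    else:
        earlier, later, third_time = result, old_result, old_time
    third_result = TRUTH_TABLE[(earlier, later)]   # S53: merge the two results
    store[key] = (third_time, third_result)        # S54: replace the stored point
    return store[key]


key = "A20170912X01_ST1"
snapshots = [store.get(key)]                                   # R1: not tested yet
ingest("A20170912X01", "ST1", "20170921081820", FAIL)
snapshots.append(store.get(key))                               # R2: one failed test
ingest("A20170912X01", "ST1", "20170921093854", PASS)
snapshots.append(store.get(key))                               # R3: retested
print(snapshots)
```

At every point in time the store holds at most one data point per key, so reading the current status of a product at a station is a single lookup.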

One beneficial effect of the present invention is that the data processing method it provides, through the technical solutions of "using the two-tuple of each product number and each station to build a multi-level structure for the data stream" and "comparing only two records that share the same two-tuple and updating the product test status in a timely manner", greatly improves computational efficiency. In addition, the non-relational database stores only the latest test time and the updated test result of each product. Compared with conventional techniques, which store every test time and every test result of each product, the data processing method of the present invention reduces the storage space required.

More specifically, conventional data processing methods must first load all data into memory or store it in a database, filter out all test data of the product at the same station, compare the test records one by one to establish their order, and only after obtaining the correct ordering of all test data can they determine the final test result. The data processing method of the present invention only needs to compare two test records and updates the current test status of the product in real time. It can handle both sequential and non-sequential time series data, its computational efficiency is not degraded by long time series, and it uses data storage space effectively. Furthermore, the method does not need to wait until all test data of the product at the station is available to know the test status of the product; instead, it can update the current test result at different points in time, which in practice gives a more effective grasp of the exact state of the product.
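The claim that out-of-order data needs no sorting can be checked on a small example. The sketch below merges three illustrative test records (fail, then pass, then pass) in every possible arrival order and collects the final stored points. It demonstrates one case only and is not a proof for all inputs; the names `merge` and `fold`, the timestamps, and the results are assumptions made for the example.

```python
from itertools import permutations

PASS, FAIL, RETEST_PASS = "Pass", "Fail", "Retest pass"

# (earlier result, later result) -> merged result, per the cases listed for FIG. 3.
TRUTH_TABLE = {
    (PASS, PASS): PASS, (PASS, FAIL): FAIL,
    (FAIL, PASS): RETEST_PASS, (FAIL, FAIL): FAIL,
    (PASS, RETEST_PASS): RETEST_PASS, (RETEST_PASS, PASS): RETEST_PASS,
    (RETEST_PASS, FAIL): FAIL, (FAIL, RETEST_PASS): RETEST_PASS,
}


def merge(stored, incoming):
    """Merge two (test_time, result) points into one, keeping the later time."""
    (_, r_early), (t_late, r_late) = sorted([stored, incoming], key=lambda p: p[0])
    return (t_late, TRUTH_TABLE[(r_early, r_late)])


def fold(points):
    """Feed points one at a time, keeping a single stored point throughout."""
    state = None
    for point in points:
        state = point if state is None else merge(state, point)
    return state


# Illustrative records for one product at one station.
tests = [
    ("20170921081820", FAIL),
    ("20170921093854", PASS),
    ("20170921101500", PASS),
]

# Every arrival order ends in the same stored point for this example.
finals = {fold(order) for order in permutations(tests)}
print(finals)
```

Each step compares only the stored point with the incoming one, so memory use stays constant per key no matter how many tests have been run.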

The content disclosed above is only a preferred feasible embodiment of the present invention and does not limit the scope of the claims of the present invention. All equivalent technical changes made using the description and drawings of the present invention are therefore included within the scope of the claims of the present invention.

None

FIG. 1 is a block flowchart of the data processing method of the present invention.

FIG. 2 is a schematic diagram of the format of a time series data point of the present invention.

FIG. 3 is a schematic diagram of the truth table of the test results of time series data points.

FIGS. 4A and 4B are schematic diagrams of the test results obtained at different time points for FIG. 3.

Claims

A data processing method for importing at least one stream of time series data into a non-relational database, the at least one stream of time series data including raw data, the data processing method comprising: generating, by a processing unit, a first time series data point according to the raw data, the first time series data point including at least two-tuple data, a first test time, and a first test result; and determining, by the processing unit, whether a second time series data point including the two-tuple data is already stored in the non-relational database, and if the processing unit determines that the second time series data point is already stored in the non-relational database, further performing the following steps by a logic unit: comparing the first test time of the first time series data point with a second test time of the second time series data point; taking the later of the first test time and the second test time as a third test time; performing a logic operation on the first test result and a second test result of the second time series data point to generate a third test result; generating a third time series data point according to the third test time and the third test result, wherein the third time series data point includes the two-tuple data, the third test time, and the third test result; and replacing the second time series data point with the third time series data point and storing it in the non-relational database.

The data processing method of claim 1, further comprising: if the processing unit determines that the second time series data point is not stored in the non-relational database, storing, by the processing unit, the first time series data point in the non-relational database.

The data processing method of claim 1, wherein the two-tuple data includes a product number and a station.

The data processing method of claim 1, wherein the data format of each of the first time series data point, the second time series data point, and the third time series data point consists of a key field, a time field, and a value field, wherein the key field of the first time series data point includes the two-tuple data, the time field of the first time series data point includes the first test time, and the value field of the first time series data point includes the first test result.

The data processing method of claim 4, wherein the step of determining whether the second time series data point is already stored in the non-relational database includes only checking whether the non-relational database contains the value of the key field of the first time series data point.

The data processing method of claim 1, wherein each of the first test result, the second test result, and the third test result includes one of the following: pass, fail, and retest pass.

The data processing method of claim 6, wherein: if the first test result is pass and the second test result is pass, the third test result is pass; if the first test result is pass, the second test result is fail, and the first test time is earlier than the second test time, the third test result is fail; if the first test result is pass, the second test result is fail, and the second test time is earlier than the first test time, the third test result is retest pass; and if the first test result is fail and the second test result is fail, the third test result is fail.

The data processing method of claim 6, wherein: if the first test result is retest pass and the second test result is pass, the third test result is retest pass; if the first test result is retest pass, the second test result is fail, and the first test time is earlier than the second test time, the third test result is fail; if the first test result is retest pass, the second test result is fail, and the second test time is earlier than the first test time, the third test result is retest pass; and if the first test result is pass and the second test result is retest pass, the third test result is retest pass.

The data processing method according to item 1 of the scope of patent application, wherein the non-relational database includes one of the following: HBase, Dynamo, BigTable, Cassandra, Hypertable, and MongoDB.

A data processing device for importing at least one stream time series data into a non-relational database, the at least one stream time series data including raw data, the data processing device comprising: a processing unit that generates a first time series data point according to the raw data, the first time series data point including at least a two-tuple data, a first test time, and a first test result, and that determines whether the non-relational database already contains a second time series data point including the two-tuple data; and a logic unit, coupled to the processing unit, wherein, if the logic unit determines that the second time series data point already exists in the non-relational database, the logic unit compares the first test time of the first time series data point with a second test time of the second time series data point, takes the later of the first test time and the second test time as a third test time, performs a logic operation on the first test result and a second test result of the second time series data point to generate a third test result and a third time series data point, and replaces the second time series data point with the third time series data point and stores it in the non-relational database; wherein the third time series data point includes the two-tuple data, the third test time, and the third test result.

The data processing device according to item 10 of the scope of patent application, wherein, if the processing unit determines that the second time series data point is not stored in the non-relational database, the processing unit further stores the first time series data point in the non-relational database.
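The store-or-replace flow recited in items 10 and 11 can be sketched as follows. This is an illustration under stated assumptions, not the claimed device: a plain Python dictionary stands in for the non-relational database, the function name `upsert` is hypothetical, test times are integers, and the logic operation on the two test results is supplied by the caller as the `combine` argument.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]      # two-tuple data, e.g. (product number, station category)
Stored = Tuple[int, str]   # (test time, test result)


def upsert(
    store: Dict[Key, Stored],
    key: Key,
    first_time: int,
    first_result: str,
    combine: Callable[[str, int, str, int], str],
) -> Stored:
    """Store the first data point, or merge it with the one already stored.

    `store` is a dict standing in for the database (an assumption).
    `combine(first_result, first_time, second_result, second_time)` is a
    caller-supplied logic operation returning the third test result.
    """
    existing = store.get(key)
    if existing is None:
        # No second data point for this key: store the first one as-is.
        store[key] = (first_time, first_result)
        return store[key]

    second_time, second_result = existing
    # The later of the two test times becomes the third test time.
    third_time = max(first_time, second_time)
    third_result = combine(first_result, first_time, second_result, second_time)
    # Replace the second data point with the third, so one row per key remains.
    store[key] = (third_time, third_result)
    return store[key]
```

Because each key holds a single merged data point, a record that arrives out of order is folded into the stored row in this sketch without re-sorting earlier records.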

The data processing device according to item 10 of the scope of patent application, wherein the two-tuple data includes a product number and a station category.

The data processing device according to item 10 of the scope of patent application, wherein the data format of each of the first time series data point, the second time series data point, and the third time series data point is composed of a key value field, a time value field, and a value field; the key value field of the first time series data point includes the two-tuple data, the time value field of the first time series data point includes the first test time, and the value field of the first time series data point includes the first test result.

The data processing device according to item 13 of the scope of patent application, wherein the step of the processing unit determining whether the second time series data point already exists in the non-relational database includes the processing unit only checking whether the non-relational database includes the value of the key value field of the first time series data point.
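The three-field data format of item 13 and the key-only existence check of item 14 can be illustrated with a short sketch. The class name `TimeSeriesPoint`, the function name `already_stored`, the use of integer test times, and the dictionary standing in for the non-relational database are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Key = Tuple[str, str]  # (product number, station category)


@dataclass(frozen=True)
class TimeSeriesPoint:
    key: Key     # key value field: the two-tuple data
    time: int    # time value field: the test time (integer, assumed)
    value: str   # value field: the test result


def already_stored(store: Dict[Key, TimeSeriesPoint], point: TimeSeriesPoint) -> bool:
    """Check existence by key value field only.

    The time value field and value field of `point` are never compared,
    so the check does not depend on when the stored test was run.
    """
    return point.key in store
```

In this sketch, two data points for the same product number and station category count as the same row regardless of their test times or results.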

The data processing device according to item 10 of the scope of patent application, wherein each of the first test result, the second test result, and the third test result includes one of the following: test passed, test failed, and retest passed.

The data processing device according to item 15 of the scope of patent application, wherein: if the first test result is test passed and the second test result is test passed, the third test result is test passed; if the first test result is test passed, the second test result is test failed, and the first test time is earlier than the second test time, the third test result is test failed; if the first test result is test passed, the second test result is test failed, and the second test time is earlier than the first test time, the third test result is retest passed; and if the first test result is test failed and the second test result is test failed, the third test result is test failed.

The data processing device according to item 16 of the scope of patent application, wherein: if the first test result is retest passed and the second test result is test passed, the third test result is retest passed; if the first test result is retest passed, the second test result is test failed, and the first test time is earlier than the second test time, the third test result is test failed; if the first test result is retest passed, the second test result is test failed, and the second test time is earlier than the first test time, the third test result is retest passed; and if the first test result is test passed and the second test result is retest passed, the third test result is retest passed.

The data processing device according to item 10 or 11 of the scope of patent application, wherein the non-relational database includes one of the following: HBase, Dynamo, BigTable, Cassandra, Hypertable, and MongoDB.