TW202032395A - Data calculation method and engine - Google Patents

Info

Publication number
TW202032395A
TW202032395A TW108132569A TW108132569A TW202032395A TW 202032395 A TW202032395 A TW 202032395A TW 108132569 A TW108132569 A TW 108132569A TW 108132569 A TW108132569 A TW 108132569A TW 202032395 A TW202032395 A TW 202032395A
Authority
TW
Taiwan
Prior art keywords
target
node
data
input parameters
current layer
Prior art date
Application number
TW108132569A
Other languages
Chinese (zh)
Other versions
TWI723535B (en)
Inventor
趙亮星雲
Original Assignee
香港商阿里巴巴集團服務有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 香港商阿里巴巴集團服務有限公司
Publication of TW202032395A
Application granted
Publication of TWI723535B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/903 - Querying
    • G06F16/90335 - Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a data calculation method and engine. The method comprises: receiving a data calculation request that includes identifiers of a plurality of target data views; determining, according to preset DAG configurations corresponding to data views, the current-layer DS node of each target data view and the input parameters of that current-layer DS node; determining a plurality of current-layer target DS nodes and their input parameters according to the current-layer DS nodes and their input parameters, such that the current-layer target DS nodes contain no pair consisting of a first DS node and a second DS node where the first DS node is the same as the second DS node and the input parameters of the first DS node are the same as those of the second DS node; executing each current-layer target DS node according to its input parameters; and determining the current-layer data calculation result of each target data view according to the execution results of the current-layer target DS nodes and the DAG configuration corresponding to each target data view.

Description

Data calculation method and engine

The present invention relates to the field of computer technology, and in particular to a data calculation method and engine.

During the operation of a business system, a large amount of data is generated. In practice, developers typically write scripts tailored to their own needs and use them to perform calculations on the data the business system produces; the results can be used, for example, to analyze user needs.

For example, consider two query processes, each implemented by its own script. The first queries the frequently used IP of user 1 and then the devices commonly used with that IP; the second queries the frequently used IP of user 1 and then the most recent time that IP was used. Each query is carried out by executing its script in full.

Both query processes contain the step "query the frequently used IP of user 1". Because the entire query is packaged in a single block of code, the data "frequently used IP of user 1" has to be queried twice in practice. Repeatedly querying the same data increases the IO consumption of the business system.

In view of this, embodiments of the present invention provide a data calculation method and engine that can reduce the IO consumption of a business system.

In a first aspect, an embodiment of the present invention provides a data calculation method, including:
receiving a data calculation request, wherein the data calculation request includes identifiers of a number of target data views;
determining, according to preset DAG (Directed Acyclic Graph) configurations corresponding to data views, the current-layer DS (Data Source) node of each target data view and the input parameters of that current-layer DS node;
determining, according to the current-layer DS nodes and their input parameters, a number of current-layer target DS nodes and their input parameters, wherein the current-layer target DS nodes contain no pair consisting of a first DS node and a second DS node where the first DS node is the same as the second DS node and the input parameters of the first DS node are the same as those of the second DS node;
executing each current-layer target DS node according to its input parameters; and
determining the current-layer data calculation result of each target data view according to the execution results of the current-layer target DS nodes and the DAG configuration corresponding to each target data view.

In a second aspect, an embodiment of the present invention provides a data calculation engine, including:
a receiving unit, configured to receive a data calculation request, wherein the data calculation request includes identifiers of a number of target data views;
a determining unit, configured to determine, according to preset DAG configurations corresponding to data views, the current-layer DS node of each target data view and the input parameters of that current-layer DS node;
a merging unit, configured to determine, according to the current-layer DS nodes and their input parameters, a number of current-layer target DS nodes and their input parameters, wherein the current-layer target DS nodes contain no pair consisting of a first DS node and a second DS node where the first DS node is the same as the second DS node and the input parameters of the first DS node are the same as those of the second DS node;
an execution unit, configured to execute each current-layer target DS node according to its input parameters; and
a calculation unit, configured to determine the current-layer data calculation result of each target data view according to the execution results of the current-layer target DS nodes and the DAG configuration corresponding to each target data view.

At least one of the above technical solutions adopted by the embodiments of the present invention can achieve the following beneficial effects. The method abstracts data calculation into data-fetching logic and data processing logic, where the data-fetching logic is implemented by DS nodes (the data source layer) and the data processing logic is implemented by the DAG configuration (the data view layer). When a data calculation request is received, the method collects the DS nodes (IO nodes) layer by layer according to the DAG configuration and executes the deduplicated DS nodes, which reduces the number of accesses to the business system and lowers its IO consumption.

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the scope of protection of the present invention.

As shown in Fig. 1, an embodiment of the present invention provides a data calculation method, which may include the following steps:

Step 101: Receive a data calculation request, where the data calculation request includes the identifiers of a number of target data views.

The data calculation request also includes the input parameters of the first-layer DS node of each target data view. A data calculation request may target one or more data views.

Step 102: According to the preset DAG configurations corresponding to data views, determine the current-layer DS node of each target data view and the input parameters of that current-layer DS node.

The DAG configuration is the representation of a data view. A DAG is a layered structure, which makes it convenient to determine the target DS nodes of each layer. A DAG configuration may include multiple layers, and each layer can be processed using the method provided in step 102.
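As a minimal sketch of step 102, a DAG configuration can be modeled as an ordered sequence of layers, each naming the DS node that layer needs. The Python below is illustrative only: `DagConfig`, `DAG_CONFIGS`, `current_layer_nodes`, and the view and node identifiers are assumptions made for this example and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DagConfig:
    """Hypothetical layered DAG configuration for one data view."""
    view_id: str
    layers: tuple  # layers[i] is the name of the DS node used at layer i


# Assumed preset configurations, keyed by data view identifier.
DAG_CONFIGS = {
    "view_a": DagConfig("view_a", ("ds_x", "ds_y")),
    "view_b": DagConfig("view_b", ("ds_x", "ds_z")),
}


def current_layer_nodes(view_ids, layer, inputs):
    """Return one (view_id, ds_node, input_params) tuple per target view.

    `inputs[view_id]` holds the input parameters of that view's DS node
    at the given layer.
    """
    return [
        (view_id, DAG_CONFIGS[view_id].layers[layer], inputs[view_id])
        for view_id in view_ids
    ]


nodes = current_layer_nodes(
    ["view_a", "view_b"], 0, {"view_a": "user-1", "view_b": "user-1"}
)
print(nodes)
```

Keeping the layers in an ordered sequence means the same lookup can be repeated for each layer in turn, which is the property the text relies on when it says step 102 applies to every layer.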
In the embodiment of the present invention, for the first layer, the input parameters of a DS node are the input parameters of the data view's first-layer DS node carried in the data calculation request; for every layer other than the first, the input parameters of a DS node are the data calculation result of the previous layer.

Fig. 2 shows the DAG configuration corresponding to one data view. The business purpose of this data view is, given an input user ID, to obtain the frequently used IPs associated with that user ID and then, from that IP list, to obtain the total number of accounts that have ever logged in from those IPs.

The DAG configuration includes two layers. The execution task of the first-layer DS node is "fetch the user's list of frequently used IPs", and its input parameter is the user ID. The execution task of the second-layer DS node is "the number of accounts that have appeared on the IP", and its input parameter is the data calculation result of the first layer. The final data calculation result is "the number of accounts that can be associated with the user ID's frequently used IPs".

Note that in the DAG configuration the first-layer DS node and the second-layer DS node sit at different layers of the DAG tree; that is, each data view itself has multiple layers to be calculated. Logically, however, both the first-layer and the second-layer DS nodes belong to the data source layer.

Step 103: According to the current-layer DS nodes and their input parameters, determine a number of current-layer target DS nodes and their input parameters, where the current-layer target DS nodes contain no pair consisting of a first DS node and a second DS node such that the first DS node is the same as the second DS node and the input parameters of the first DS node are the same as those of the second DS node.
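The merging rule of step 103 can be sketched as deduplication on the pair (DS node, input parameters). This is a hedged illustration, not the patent's implementation: `merge_duplicate_nodes`, the tuple layout, and the node and view names are all hypothetical, and input parameters are assumed to be hashable values such as strings or tuples.

```python
def merge_duplicate_nodes(layer_nodes):
    """Keep one entry per distinct (DS node, input parameters) pair.

    `layer_nodes` is a list of (view_id, ds_node, input_params) tuples.
    Returns the deduplicated target pairs in first-seen order, plus a map
    from each view to the target pair whose result it should read later.
    """
    targets = []
    seen = set()
    view_to_target = {}
    for view_id, ds_node, params in layer_nodes:
        key = (ds_node, params)  # params must be hashable
        if key not in seen:
            seen.add(key)
            targets.append(key)
        view_to_target[view_id] = key
    return targets, view_to_target


targets, view_to_target = merge_duplicate_nodes([
    ("v1", "DS_A", "u1"),
    ("v2", "DS_B", "u1"),
    ("v3", "DS_B", "u1"),  # same node, same input: merged with v2's node
    ("v4", "DS_B", "u2"),  # same node, different input: kept separate
])
print(targets)  # [('DS_A', 'u1'), ('DS_B', 'u1'), ('DS_B', 'u2')]
```

Two nodes are merged only when both the node and its input match; the same node with a different input still needs its own execution, which is why the key is the pair rather than the node name alone.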
Note that the first DS node and the second DS node denote a pair of duplicate nodes, that is, two DS nodes whose execution tasks are the same and whose input parameters are also the same; the current-layer target DS nodes must not contain such a pair.

When the DAG configuration includes multiple layers, the target DS nodes and their input parameters must be determined for each layer. Step 103 is described in detail below, taking the first layer of the three DAG configurations shown in Figs. 3 to 5 as an example.

The first-layer DS nodes of the three DAG configurations are DS4, DS1, and DS1, respectively, and the input parameter of each is the user ID. Because DS1 in Fig. 4 is the same as DS1 in Fig. 5 and both take the user ID as input, DS1 in Fig. 4 and DS1 in Fig. 5 can be merged and executed once. The first-layer target DS nodes are therefore DS4 and DS1, each with the user ID as its input parameter. After merging, there are fewer current-layer target DS nodes (two) than there were current-layer DS nodes before merging (three).

Step 104: Execute each current-layer target DS node according to its input parameters.

In the prior art, the large environmental differences between online business systems and offline business systems force the data calculation logic to be defined separately for each: for the same data requirement (covering complex data-construction logic such as data fetching and data processing), two independent implementations have to be developed, one per environment. This makes development expensive and labor-intensive, and makes true equivalence of the data logic hard to achieve.

In view of this, depending on the environment in which it is applied, the method covers the following two cases:

Case 1: the environment is an online environment.
In this case, step 104 specifically includes:
A1: Call the TR service interface.
A2: Provide the input parameters of the current-layer target DS node to the TR service interface, so that the TR service interface obtains the data matching those input parameters.

Case 2: the environment is an offline environment.
In this case, step 104 specifically includes: filtering out, from the offline database, the data that matches the input parameters of the current-layer target DS node.

Taking the DAG configuration shown in Fig. 2 as an example, for the first-layer DS node in an online environment, step 104 can be implemented by calling a TR service interface: return IpService.queryIpList(userId). In an offline environment, step 104 can be implemented with an SQL statement: select ip from table1 where userId="userId".

For the second-layer DS node in an online environment, step 104 can be implemented by calling a TR service interface: return IpService.queryUserIdCount(ipList). In an offline environment, step 104 can be implemented with an SQL statement: select count(userId) from table1 where ip in ipList.

In the embodiment of the present invention, the data calculation method supports configuration-driven IO merging, which can save as much IO consumption as possible for each business system. In addition, the data engine needs to be configured only once and then applies to both online and offline environments, which can greatly reduce data development costs and improve the consistency of online and offline data.

Although the DS nodes must be adapted separately to the online and offline environments, the following two factors keep this process simple and controllable, without adding development complexity:

(1) A DS node contains only the most basic data-fetching logic and no complex processing logic, so its offline and online implementations are easy to align.
(2) In data calculation scenarios, the basic data logic is usually a small set; most data is derived from it through processing.

Note that, to improve data calculation efficiency, the current-layer target DS nodes are executed concurrently in step 104.

Step 105: Determine the current-layer data calculation result of each target data view according to the execution results of the current-layer target DS nodes and the DAG configuration corresponding to each target data view.

Step 105 specifically includes:

B1: Determine the execution result corresponding to each target data view according to the execution results of the current-layer target DS nodes and the DAG configuration corresponding to each target data view.

The DS nodes of each layer can be determined from the DAG configuration of a target data view. From those DS nodes, the target DS node corresponding to the target data view can be determined, and the execution result of that target DS node is the execution result corresponding to the target data view.

The execution result corresponding to a target data view falls into one of two types. One is success: the current-layer target DS node corresponding to the target data view obtains data matching its input parameters within the preset execution time range. The other is failure: the current-layer target DS node corresponding to the target data view does not obtain data matching its input parameters within the execution time range.

B2: Perform data calculation according to the execution result and the DAG configuration corresponding to each target data view to obtain the current-layer data calculation result of each target data view, where the data calculations of different target data views are executed serially.
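Returning to the two environments described for step 104, the idea that one DS node has an online and an offline data-fetching path can be sketched as follows. This is a hypothetical illustration: `FakeIpService` merely stands in for the TR service interface, an in-memory SQLite table stands in for the offline database, and the rows are invented. Only the table and column names follow the example statements in the text.

```python
import sqlite3


class FakeIpService:
    """Stand-in for the online TR service interface (hypothetical)."""
    _ips = {"user-1": ["10.0.0.1", "10.0.0.2"]}

    def query_ip_list(self, user_id):
        return list(self._ips.get(user_id, []))


def make_offline_db():
    """Build an in-memory table playing the role of the offline database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (userId TEXT, ip TEXT)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?)",
        [("user-1", "10.0.0.1"), ("user-1", "10.0.0.2"), ("user-2", "10.0.0.9")],
    )
    return conn


def fetch_common_ips(user_id, env, service=None, conn=None):
    """One DS node with two environment-specific data-fetching paths."""
    if env == "online":
        return service.query_ip_list(user_id)
    if env == "offline":
        rows = conn.execute(
            "SELECT ip FROM table1 WHERE userId = ? ORDER BY ip", (user_id,)
        )
        return [ip for (ip,) in rows]
    raise ValueError(f"unknown environment: {env}")


online = fetch_common_ips("user-1", "online", service=FakeIpService())
offline = fetch_common_ips("user-1", "offline", conn=make_offline_db())
print(online == offline)
```

Because the node does nothing beyond fetching, the two paths are short enough to compare side by side, which is the alignment argument made in point (1) above. The offline query binds the user ID as a parameter instead of embedding it in the statement text.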
In the embodiment of the present invention, the time spent executing the target DS nodes is bounded by the preset execution time range, which improves the efficiency of data calculation. The execution time range prevents the data calculation of one target data view from stalling, so the data calculation of the other target data views is not affected. If a DS node does not produce a result within the execution time range, its calculation is deferred to the subsequent DS parameter preparation stage, where it is computed serially.

For the two types of execution result described above, data calculation according to the execution result and the DAG configuration corresponding to each target data view divides into the following two cases:

(1) When the current-layer target DS node corresponding to the target data view obtains data matching its input parameters within the preset execution time range, data calculation is performed according to that data and the DAG configuration corresponding to the target data view.

(2) When the current-layer target DS node corresponding to the target data view does not obtain data matching its input parameters within the execution time range, the current-layer target DS node corresponding to the target data view is re-executed with the same input parameters; when the re-executed node obtains data matching its input parameters within the execution time range, data calculation is performed according to the DAG configuration corresponding to the target data view.
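The combination of concurrent execution, a preset execution time range, and serial re-execution on timeout can be sketched with Python's standard `concurrent.futures` module. This is a simplified illustration under stated assumptions: `run_layer`, `demo_fetch`, and the node names are hypothetical, and the serial retry here runs without its own time limit, which the text does not specify.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_layer(targets, fetch, timeout_s):
    """Run target DS nodes concurrently; retry serially any that miss the deadline.

    `targets` is a list of (ds_node, input_params) pairs and `fetch` is a
    callable fetch(ds_node, input_params) that performs the IO.
    """
    results = {}
    pool = ThreadPoolExecutor(max_workers=max(1, len(targets)))
    futures = {pool.submit(fetch, node, params): (node, params)
               for node, params in targets}
    done, not_done = wait(futures, timeout=timeout_s)
    for fut in done:
        results[futures[fut]] = fut.result()
    # Do not block on stragglers. A running thread cannot be cancelled,
    # so its late result is simply ignored.
    pool.shutdown(wait=False)
    for fut in not_done:
        key = futures[fut]
        results[key] = fetch(*key)  # serial re-execution with the same input
    return results


calls = []


def demo_fetch(node, params):
    """Hypothetical fetch: DS_SLOW is slow only on its first call."""
    calls.append(node)
    if node == "DS_SLOW" and calls.count("DS_SLOW") == 1:
        time.sleep(0.6)
    return f"{node}({params})"


out = run_layer([("DS_FAST", "u1"), ("DS_SLOW", "u1")], demo_fetch, timeout_s=0.2)
print(out)
```

In this run `DS_FAST` finishes inside the time budget, while `DS_SLOW` misses it and is fetched a second time in the calling thread, so the layer as a whole is not held up by the slow first attempt.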
Of course, in practice, when the current-layer target DS node corresponding to a target data view does not obtain data matching its input parameters within the execution time range, the data calculation process of that target data view may instead be terminated. Note that terminating the data calculation process of one target data view does not affect the data calculation processes of the other target data views.

The method abstracts data calculation into data-fetching logic and data processing logic, where the data-fetching logic is implemented by DS nodes (the data source layer) and the data processing logic is implemented by the DAG configuration (the data view layer). When a data calculation request is received, the method collects the DS nodes (IO nodes) layer by layer according to the DAG configuration and executes the deduplicated DS nodes, which reduces the number of accesses to the business system and lowers its IO consumption.

The data calculation method is described in detail below, taking the DAG configurations of the three data views shown in Figs. 3 to 5 as an example. The method includes:

S1: Receive a data calculation request, where the data calculation request includes the identifiers of a number of target data views and the input parameters of the first-layer DS node of each target data view.

Assume that the DAG configuration shown in Fig. 3 corresponds to data view 1, the DAG configuration shown in Fig. 4 corresponds to data view 2, and the DAG configuration shown in Fig. 5 corresponds to data view 3.

The data calculation request includes the target data view identifiers 1, 2, and 3; the input parameter of each corresponding first-layer DS node is the user ID.
S2: Determine the first-layer DS node of each target data view and its input parameters according to the preset DAG configuration corresponding to the data views. The first-layer DS node of target data view 1 is DS4, and the corresponding input parameter is the user ID; the first-layer DS node of target data view 2 is DS1, and the corresponding input parameter is the user ID; the first-layer DS node of target data view 3 is DS1, and the corresponding input parameter is the user ID. S3: Determine a number of first-layer target DS nodes and their input parameters according to each first-layer DS node and its input parameters, where the first-layer target DS nodes contain no first DS node and second DS node such that the first DS node is the same as the second DS node and the input parameters of the first DS node are the same as those of the second DS node. Here, the first-layer target DS nodes are DS1 and DS4, and the corresponding input parameter of each is the user ID. S4: Execute each first-layer target DS node according to its input parameters. Taking target data view 1 as an example, when the environment is an online environment, S4 specifically includes: invoking the TR service interface and providing the user ID to the TR service interface, so that the TR service interface can obtain data that matches the user ID. When the environment is an offline environment, S4 specifically includes: filtering out the data matching the user ID from the offline database. S5: Determine the execution result corresponding to each target data view according to the execution result of each first-layer target DS node and the DAG configuration corresponding to each target data view. The execution result corresponding to target data view 1 is the execution result of DS4, and the execution result corresponding to target data view 2 and target data view 3 is the execution result of DS1.
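Steps S2 to S5 for the first layer can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names `merge_ds_nodes` and `execute_layer`, the dictionary-based layout, and the sample user ID are hypothetical, and input parameter values are assumed to be hashable so they can form part of a deduplication key.

```python
def merge_ds_nodes(layer_nodes):
    """S3: deduplicate DS nodes that share both node name and input parameters.

    layer_nodes maps view_id -> (ds_name, params_dict).
    Returns (targets, view_to_key), where targets maps a dedup key to
    (ds_name, params_dict) and view_to_key maps each view to its key.
    """
    targets = {}
    view_to_key = {}
    for view_id, (ds_name, params) in layer_nodes.items():
        key = (ds_name, tuple(sorted(params.items())))
        targets.setdefault(key, (ds_name, params))
        view_to_key[view_id] = key
    return targets, view_to_key


def execute_layer(layer_nodes, ds_registry):
    """S4 and S5: run each target DS node once, then hand results back per view.

    ds_registry maps a DS node name to a callable taking the params dict.
    """
    targets, view_to_key = merge_ds_nodes(layer_nodes)
    results = {
        key: ds_registry[ds_name](params)
        for key, (ds_name, params) in targets.items()
    }
    return {view_id: results[key] for view_id, key in view_to_key.items()}


# First layer of the example: view 1 uses DS4, views 2 and 3 use DS1.
first_layer = {
    1: ("DS4", {"user_id": "u42"}),
    2: ("DS1", {"user_id": "u42"}),
    3: ("DS1", {"user_id": "u42"}),
}
```

With this layout DS1 is executed once even though two target data views depend on it, which is the reduction in accesses to the business system that the method aims for.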
S6: Perform data calculation according to the execution result and the DAG configuration corresponding to each target data view to obtain the first-layer data calculation result of each target data view, where the data calculations corresponding to different target data views are executed serially. The three target data views above are calculated serially, but the specific order is not limited; for example, the first-layer data calculation results of the three target data views may be calculated in the order of target data views 1, 2, and 3. Taking target data view 1 as an example, when DS4 obtains data that matches its input parameters within the preset execution time range, data calculation is performed based on that data and the DAG configuration corresponding to target data view 1. Here, the data calculation can be data filtering (filter), data verification, etc. When DS4 does not obtain data matching the user ID within the execution time range, DS4 corresponding to target data view 1 is re-executed according to the user ID corresponding to target data view 1; when DS4 then obtains data matching the user ID within the execution time range, data calculation is performed according to the DAG configuration corresponding to target data view 1. After the first-layer data calculation of target data view 1 is completed, the first-layer data calculations of target data view 2 and target data view 3 are performed in sequence. S7: Determine the second-layer DS node of each target data view and its input parameters according to the preset DAG configuration corresponding to the data views.
The second-layer DS node of target data view 1 is DS2, and the corresponding input parameter is the first-layer data calculation result; the second-layer DS node of target data view 2 is DS2, and the corresponding input parameter is the first-layer data calculation result; the second-layer DS node of target data view 3 is DS3, and the corresponding input parameter is the first-layer data calculation result. S8: Determine a number of second-layer target DS nodes and their input parameters according to each second-layer DS node and its input parameters, where the second-layer target DS nodes contain no first DS node and second DS node such that the first DS node is the same as the second DS node and the input parameters of the first DS node are the same as those of the second DS node. Here, the second-layer target DS nodes are DS2 and DS3, and the corresponding input parameters are the data calculation results of the previous layer. S9: Execute each second-layer target DS node according to its input parameters. Taking target data view 1 as an example, when the environment is an online environment, S9 specifically includes: invoking the TR service interface and providing the first-layer data calculation result to the TR service interface, so that the TR service interface can obtain data that matches the first-layer data calculation result. When the environment is an offline environment, S9 specifically includes: filtering out data matching the first-layer data calculation result from the offline database. S10: Determine the execution result corresponding to each target data view according to the execution result of each second-layer target DS node and the DAG configuration corresponding to each target data view.
The execution result corresponding to target data view 1 and target data view 2 is the execution result of DS2, and the execution result corresponding to target data view 3 is the execution result of DS3. S11: Perform data calculation according to the execution result and the DAG configuration corresponding to each target data view to obtain the second-layer data calculation result of each target data view, where the data calculations corresponding to different target data views are executed serially. The second-layer data calculation results of the three target data views are calculated in the order of target data views 1, 2, and 3. Taking target data view 1 as an example, when DS2 obtains data that matches the first-layer data calculation result within the preset execution time range, data calculation is performed based on that data and the DAG configuration corresponding to target data view 1. Here, the data calculation can be data deduplication, data verification, etc. When DS2 does not obtain data that matches the first-layer data calculation result within the execution time range, DS2 corresponding to target data view 1 is re-executed according to the first-layer data calculation result corresponding to target data view 1; when DS2 then obtains data that matches the first-layer data calculation result within the execution time range, data calculation is performed according to the DAG configuration corresponding to target data view 1. After the second-layer data calculation of target data view 1 is completed, the second-layer data calculations of target data view 2 and target data view 3 are performed in sequence.
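The layer-by-layer walk through the three example views can be sketched in Python as follows. This is an illustrative sketch only: the function `run_views`, the representation of a DAG configuration as a list of `(ds_name, compute)` pairs, and the stand-in DS callables are assumptions, and each layer's input is assumed to be a hashable value so that it can serve as part of the deduplication key.

```python
def run_views(view_configs, request_params, ds_registry):
    """Process all target data views one layer at a time.

    view_configs: view_id -> list of (ds_name, compute) pairs, one per layer,
        where compute turns a DS execution result into that layer's
        data calculation result.
    request_params: view_id -> input parameter of the first-layer DS node.
    ds_registry: ds_name -> callable taking the input parameter.
    Returns view_id -> data calculation result of the view's last layer.
    """
    current = dict(request_params)
    depth = max(len(layers) for layers in view_configs.values())
    for level in range(depth):
        active = [v for v, layers in view_configs.items() if level < len(layers)]
        # Collect this layer's DS nodes and execute each distinct
        # (node, input parameter) pair only once.
        executed = {}
        for v in active:
            ds_name = view_configs[v][level][0]
            key = (ds_name, current[v])
            if key not in executed:
                executed[key] = ds_registry[ds_name](current[v])
        # Data calculation runs serially, one target data view after another;
        # its result becomes the input parameter of the next layer.
        for v in active:
            ds_name, compute = view_configs[v][level]
            current[v] = compute(executed[(ds_name, current[v])])
    return current


def take_payload(result):
    """Hypothetical first-layer calculation: keep only the payload field."""
    return result[1]


def keep(result):
    """Hypothetical second-layer calculation: pass the result through."""
    return result


# View 1: DS4 then DS2; view 2: DS1 then DS2; view 3: DS1 then DS3.
example_views = {
    1: [("DS4", take_payload), ("DS2", keep)],
    2: [("DS1", take_payload), ("DS2", keep)],
    3: [("DS1", take_payload), ("DS3", keep)],
}
```

Note that DS2 is shared by views 1 and 2 in the second layer only when their first-layer data calculation results are equal; if the results differ, the deduplication key differs and DS2 runs once per distinct input.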
As shown in Figure 6, a data calculation engine includes: a receiving unit 601, configured to receive a data calculation request, where the data calculation request includes identifiers of a number of target data views; a determining unit 602, configured to determine the current-layer DS node of each target data view and its input parameters according to a preset directed acyclic graph (DAG) configuration corresponding to the data views; a merging unit 603, configured to determine a number of current-layer target DS nodes and their input parameters according to each current-layer DS node and its input parameters, where the current-layer target DS nodes contain no first DS node and second DS node such that the first DS node is the same as the second DS node and the input parameters of the first DS node are the same as those of the second DS node; an execution unit 604, configured to execute each current-layer target DS node according to its input parameters; and a calculation unit 605, configured to determine the current-layer data calculation result of each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view. In an embodiment of the present invention, the calculation unit 605 is configured to determine the execution result corresponding to each target data view according to the execution result of each current-layer target DS node and the DAG configuration corresponding to each target data view, and to perform data calculation according to the execution result and the DAG configuration corresponding to each target data view to obtain the current-layer data calculation result of each target data view, where the data calculations corresponding to different target data views are executed serially.
In an embodiment of the present invention, the calculation unit 605 is configured to perform data calculation according to the obtained data and the DAG configuration corresponding to the target data view when the current-layer target DS node corresponding to the target data view obtains data that matches its input parameters within a preset execution time range. In an embodiment of the present invention, the calculation unit 605 is further configured to, when the current-layer target DS node corresponding to the target data view does not obtain data that matches its input parameters within the execution time range, re-execute that current-layer target DS node according to its input parameters, and, when the re-executed node obtains data that matches its input parameters within the execution time range, perform data calculation according to the DAG configuration corresponding to the target data view. In an embodiment of the present invention, when the environment is an online environment, the execution unit 604 is configured to call the TR service interface and provide the input parameters of the current-layer target DS node to the TR service interface, so that the TR service interface can obtain data that matches the input parameters of the current-layer target DS node. In an embodiment of the present invention, when the environment is an offline environment, the execution unit 604 is configured to filter out data matching the input parameters of the current-layer target DS node from the offline database.
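The online and offline behaviors of the execution unit can be sketched in Python as follows. The source does not specify the TR service interface or the offline database schema, so everything here is a hypothetical stand-in: `tr_service` is any callable that accepts the input parameters, and the offline database is modeled as a list of dictionaries.

```python
def execute_ds_node(params, environment, tr_service=None, offline_rows=()):
    """Execute one current-layer target DS node in the given environment.

    params: dict of input parameters for the DS node.
    environment: "online" or "offline".
    tr_service: hypothetical stand-in for the TR service interface (a callable).
    offline_rows: hypothetical stand-in for the offline database (list of dicts).
    """
    if environment == "online":
        if tr_service is None:
            raise ValueError("online execution needs a TR service callable")
        # Hand the input parameters to the service, which returns matching data.
        return tr_service(params)
    if environment == "offline":
        # Keep only the rows whose fields match every input parameter.
        return [
            row for row in offline_rows
            if all(row.get(name) == value for name, value in params.items())
        ]
    raise ValueError(f"unknown environment: {environment!r}")


# Hypothetical offline data for illustration.
sample_rows = [
    {"user_id": "u42", "amount": 10},
    {"user_id": "u7", "amount": 25},
    {"user_id": "u42", "amount": 3},
]
```

Keeping the environment switch inside a single execution function means the merging and calculation steps do not need to know where the data came from.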
The embodiment of the present invention provides a data computing device, including a processor and a memory. The memory is used to store execution instructions, and the processor is used to execute the execution instructions stored in the memory to implement the method of any of the above embodiments. In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to the circuit structure of diodes, transistors, switches, etc.) or a software improvement (an improvement to a method flow). However, with the development of technology, many of today's improvements to method flows can be regarded as direct improvements to the hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be realized by a hardware entity module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. Designers program the device themselves to "integrate" a digital system onto a single PLD, without asking a chip manufacturer to design and produce a dedicated integrated circuit chip. Moreover, nowadays, instead of manually making integrated circuit chips, this programming is mostly done with "logic compiler" software, which is similar to the software compiler used in program development.
The source code before compilation must also be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). At present, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. It should also be clear to those skilled in the art that a hardware circuit implementing the logic method flow can easily be obtained by doing a little logic programming of the method flow in one of the above hardware description languages and programming it into an integrated circuit. The controller can be implemented in any suitable manner. For example, the controller can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include but are not limited to the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art also know that, in addition to implementing the controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller realizes the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like.
Therefore, such a controller can be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component. Or even, the devices for realizing various functions can be regarded as both software modules implementing the method and structures within the hardware component. The systems, devices, modules, or units explained in the above embodiments may be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, the computer can be, for example, a personal computer, a laptop computer, a mobile phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices. For convenience of description, the above device is described with its functions divided into various units. Of course, when implementing this application, the functions of the units can be implemented in one or more pieces of software and/or hardware. Those skilled in the art should understand that the embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk memory, CD-ROM, optical memory, etc.) containing computer-usable program code. The present invention is described with reference to flowcharts and/or block diagrams of methods, equipment (systems), and computer program products according to embodiments of the present invention.
It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce means for realizing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram. These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, where the instruction device realizes the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram. These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, such that the instructions executed on the computer or other programmable equipment provide steps for realizing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram. In a typical configuration, the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-permanent memory, random access memory (RAM), and/or non-volatile memory among computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media. Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by computing devices. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves. It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, product, or equipment that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, product, or equipment. Without further restriction, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, product, or equipment that includes the element.
This application can be described in the general context of computer-executable instructions executed by a computer, such as a program module. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types. This application can also be implemented in a distributed computing environment. In these distributed computing environments, remote processing devices connected through a communication network perform tasks. In a distributed computing environment, program modules can be located in local and remote computer storage media including storage devices. The various embodiments in this specification are described in a progressive manner, and the same or similar parts between the various embodiments can be referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, as for the system embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for related parts, please refer to the part of the description of the method embodiment. The above descriptions are only examples of this application and are not used to limit this application. For those skilled in the art, this application can have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of this application shall be included in the claims of this application.

101~105: Method steps; 601: Receiving unit; 602: Determining unit; 603: Merging unit; 604: Execution unit; 605: Calculation unit

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative work. [Figure 1] is a flowchart of a data calculation method provided by an embodiment of the present invention; [Figure 2] is a schematic structural diagram of a DAG configuration provided by an embodiment of the present invention; [Figure 3] is a schematic structural diagram of another DAG configuration provided by an embodiment of the present invention; [Figure 4] is a schematic structural diagram of yet another DAG configuration provided by an embodiment of the present invention; [Figure 5] is a schematic structural diagram of a further DAG configuration provided by an embodiment of the present invention; [Figure 6] is a schematic structural diagram of a data calculation engine provided by an embodiment of the present invention.

Claims (12)

一種資料計算方法,包括: 接收資料計算請求,其中,該資料計算請求中包括:若干目標資料視圖的標識; 根據預設的與資料視圖相對應的有向無環圖DAG配置,確定各個該目標資料視圖的當前層資料源DS節點和該當前層DS節點的輸入參數; 根據各個該當前層DS節點及其輸入參數,確定若干當前層目標DS節點及其輸入參數,其中,該若干當前層目標DS節點中不存在第一DS節點和第二DS節點,該第一DS節點與第二DS節點相同、且該第一DS節點的輸入參數與該第二DS節點的輸入參數相同; 根據各個該當前層目標DS節點的輸入參數,執行各個該當前層目標DS節點; 根據各個該當前層目標DS節點的執行結果和各個該目標資料視圖對應的DAG配置,確定各個該目標資料視圖的當前層的資料計算結果。A data calculation method, including: Receive a data calculation request, where the data calculation request includes: the identification of a number of target data views; Determine the current layer data source DS node of each target data view and the input parameters of the current layer DS node according to the preset directed acyclic graph DAG configuration corresponding to the data view; According to each of the current layer DS nodes and their input parameters, determine a number of current layer target DS nodes and their input parameters, where the first DS node and the second DS node do not exist in the number of current layer target DS nodes, and the first DS The node is the same as the second DS node, and the input parameters of the first DS node are the same as the input parameters of the second DS node; Execute each current layer target DS node according to the input parameters of each current layer target DS node; According to the execution result of each target DS node of the current layer and the DAG configuration corresponding to each target data view, the data calculation result of the current layer of each target data view is determined. 
如請求項1所述的資料計算方法, 所述根據各個該當前層目標DS節點的執行結果和各個該目標資料視圖對應的DAG配置,確定各個該目標資料視圖的當前層的資料計算結果,包括: 根據各個該當前層目標DS節點的執行結果和各個該目標資料視圖對應的DAG配置,確定各個該目標資料視圖對應的執行結果; 根據各個該目標資料視圖對應的執行結果及DAG配置進行資料計算,得到各個該目標資料視圖的當前層的資料計算結果,其中,不同目標資料視圖對應的資料計算串列執行。The data calculation method described in claim 1, According to the execution result of each target DS node of the current layer and the DAG configuration corresponding to each target data view, determining the data calculation result of the current layer of each target data view includes: Determine the execution result corresponding to each target data view according to the execution result of each target DS node of the current layer and the DAG configuration corresponding to each target data view; Data calculation is performed according to the execution result and DAG configuration corresponding to each target data view, and the data calculation result of the current layer of each target data view is obtained. Among them, the data calculation corresponding to different target data views is executed in series. 如請求項2所述的資料計算方法, 所述根據各個該目標資料視圖對應的執行結果及DAG配置進行資料計算,包括: 當該目標資料視圖對應的當前層目標DS節點在預設的執行時間範圍內得到與其輸入參數相匹配的資料時,根據該資料和該目標資料視圖對應的DAG配置進行資料計算。The data calculation method described in claim 2, The data calculation according to the execution result and DAG configuration corresponding to each target data view includes: When the current-level target DS node corresponding to the target data view obtains data matching its input parameters within the preset execution time range, data calculation is performed according to the data and the DAG configuration corresponding to the target data view. 
如請求項3所述的資料計算方法,進一步包括: 當該目標資料視圖對應的當前層目標DS節點在該執行時間範圍內未得到與其輸入參數相匹配的資料時, 根據該目標資料視圖對應的當前層目標DS節點的輸入參數,重新執行該目標資料視圖對應的當前層目標DS節點; 當該目標資料視圖對應的當前層目標DS節點在該執行時間範圍內得到與其輸入參數相匹配的資料時,根據該目標資料視圖對應的DAG配置進行資料計算。The data calculation method described in claim 3 further includes: When the target DS node of the current layer corresponding to the target data view does not obtain data that matches its input parameters within the execution time range, According to the input parameters of the current layer target DS node corresponding to the target data view, re-execute the current layer target DS node corresponding to the target data view; When the current-level target DS node corresponding to the target data view obtains data that matches its input parameters within the execution time range, data calculation is performed according to the DAG configuration corresponding to the target data view. 如請求項1所述的資料計算方法, 當所處環境為線上環境時, 所述根據各個該當前層目標DS節點的輸入參數,執行各個該當前層目標DS節點,包括: 調用TR服務介面; 將該當前層目標DS節點的輸入參數提供給該TR服務介面,以使該TR服務介面獲取與該當前層目標DS節點的輸入參數相匹配的資料。The data calculation method described in claim 1, When the environment is online, The executing each current-layer target DS node according to the input parameters of each current-layer target DS node includes: Call the TR service interface; The input parameters of the current-level target DS node are provided to the TR service interface, so that the TR service interface obtains data that matches the input parameters of the current-level target DS node. 如請求項1-5中任一項所述的資料計算方法, 當所處環境為離線環境時, 所述根據各個該當前層目標DS節點的輸入參數,執行各個該當前層目標DS節點,包括: 從離線資料庫中篩選出與該當前層目標DS節點的輸入參數相匹配的資料。The data calculation method described in any one of claims 1-5, When the environment is offline, The executing each current-layer target DS node according to the input parameters of each current-layer target DS node includes: Filter out data matching the input parameters of the target DS node of the current layer from the offline database. 
一種資料計算引擎,包括: 接收單元,用於接收資料計算請求,其中,該資料計算請求中包括:若干目標資料視圖的標識; 確定單元,用於根據預設的與資料視圖相對應的有向無環圖DAG配置,確定各個該目標資料視圖的當前層資料源DS節點和該當前層DS節點的輸入參數; 合併單元,用於根據各個該當前層DS節點及其輸入參數,確定若干當前層目標DS節點及其輸入參數,其中,該若干當前層目標DS節點中不存在第一DS節點和第二DS節點,該第一DS節點與第二DS節點相同、且該第一DS節點的輸入參數與該第二DS節點的輸入參數相同; 執行單元,用於根據各個該當前層目標DS節點的輸入參數,執行各個該當前層目標DS節點; 計算單元,用於根據各個該當前層目標DS節點的執行結果和各個該目標資料視圖對應的DAG配置,確定各個該目標資料視圖的當前層的資料計算結果。A data calculation engine, including: The receiving unit is configured to receive a data calculation request, where the data calculation request includes: the identifiers of a number of target data views; The determining unit is used to determine the current layer data source DS node of each target data view and the input parameters of the current layer DS node according to the preset directed acyclic graph DAG configuration corresponding to the data view; The merging unit is used to determine a number of current layer target DS nodes and their input parameters according to each current layer DS node and its input parameters, wherein the first DS node and the second DS node do not exist among the current layer target DS nodes , The first DS node and the second DS node are the same, and the input parameters of the first DS node are the same as the input parameters of the second DS node; The execution unit is configured to execute each target DS node of the current layer according to the input parameters of each target DS node of the current layer; The calculation unit is used to determine the data calculation result of the current layer of each target data view according to the execution result of each target DS node of the current layer and the DAG configuration corresponding to each target data view. 

The data calculation engine according to claim 7, wherein the calculation unit is configured to: determine the execution result corresponding to each target data view according to the execution results of the current-layer target DS nodes and the DAG configuration corresponding to each target data view; and perform data calculation according to the execution result and the DAG configuration corresponding to each target data view, to obtain the current-layer data calculation result of each target data view, wherein the data calculations corresponding to different target data views are executed serially.

The data calculation engine according to claim 8, wherein the calculation unit is configured to, when the current-layer target DS node corresponding to the target data view obtains data matching its input parameters within a preset execution time range, perform data calculation according to that data and the DAG configuration corresponding to the target data view.

The data calculation engine according to claim 9, wherein the calculation unit is further configured to, when the current-layer target DS node corresponding to the target data view does not obtain data matching its input parameters within the execution time range, re-execute that current-layer target DS node according to its input parameters, and, when the current-layer target DS node then obtains data matching its input parameters within the execution time range, perform data calculation according to the DAG configuration corresponding to the target data view.

The data calculation engine according to claim 7, wherein, when the engine runs in an online environment, the execution unit is configured to call the TR service interface and provide the input parameters of the current-layer target DS node to the TR service interface, so that the TR service interface obtains data matching the input parameters of the current-layer target DS node.

The data calculation engine according to any one of claims 7 to 11, wherein, when the engine runs in an offline environment, the execution unit is configured to filter out, from an offline database, data matching the input parameters of the current-layer target DS node.
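
The re-execution behaviour and the offline filtering described in the claims above might be sketched as follows. This is a hedged illustration only: `make_offline_fetcher`, `execute_with_retry`, the `max_attempts` bound, and the in-memory list standing in for the offline database are all assumptions that do not appear in the source. An online fetcher would wrap a call to the TR service interface instead, which is not shown because its signature is not given. The sketch also measures elapsed time after the call returns rather than interrupting a slow call.

```python
import time
from typing import Any, Callable, Dict, List, Optional

# Hypothetical fetcher type: returns matching rows, or None when nothing matched.
Fetcher = Callable[[Dict[str, Any]], Optional[List[dict]]]


def make_offline_fetcher(rows: List[dict]) -> Fetcher:
    """Offline case: filter an in-memory stand-in for the offline database."""

    def fetch(params: Dict[str, Any]) -> Optional[List[dict]]:
        matched = [
            row for row in rows
            if all(row.get(key) == value for key, value in params.items())
        ]
        return matched or None

    return fetch


def execute_with_retry(
    fetch: Fetcher,
    params: Dict[str, Any],
    time_limit_s: float,
    max_attempts: int = 3,  # assumed bound; the claims do not state one
) -> Optional[List[dict]]:
    """Execute a DS node and re-execute it with the same input parameters
    when no matching data is obtained within the execution time range."""
    for _ in range(max_attempts):
        started = time.monotonic()
        data = fetch(params)
        elapsed = time.monotonic() - started
        if data is not None and elapsed <= time_limit_s:
            return data  # data arrived in time; calculation can proceed
    return None  # every attempt failed or exceeded the time range
```

Passing the fetcher as a parameter keeps the retry logic identical in both environments; only the data source behind the DS node changes.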
TW108132569A 2019-02-19 2019-09-10 Data calculation method and engine TWI723535B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910125629.1 2019-02-19
CN201910125629.1A CN110020004B (en) 2019-02-19 2019-02-19 Data calculation method and engine

Publications (2)

Publication Number Publication Date
TW202032395A (en) 2020-09-01
TWI723535B (en) 2021-04-01

Family

ID=67189027

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108132569A TWI723535B (en) 2019-02-19 2019-09-10 Data calculation method and engine

Country Status (3)

Country Link
CN (1) CN110020004B (en)
TW (1) TWI723535B (en)
WO (1) WO2020168901A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI801287B (en) * 2021-07-20 2023-05-01 奧義智慧科技股份有限公司 Event visualization device and related computer program product for generating hierarchical directed acyclic graph

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN110020004B (en) * 2019-02-19 2020-08-07 阿里巴巴集团控股有限公司 Data calculation method and engine
CN110781180B (en) * 2019-09-05 2022-08-30 腾讯科技(深圳)有限公司 Data screening method and data screening device

Family Cites Families (13)

Publication number Priority date Publication date Assignee Title
US8260768B2 (en) * 2010-01-29 2012-09-04 Hewlett-Packard Development Company, L.P. Transformation of directed acyclic graph query plans to linear query plans
US20120158768A1 (en) * 2010-12-15 2012-06-21 Microsoft Corporation Decomposing and merging regular expressions
CN102541875B (en) * 2010-12-16 2014-04-16 北京大学 Access method, device and system for relational node data of directed acyclic graph
CN102571752B (en) * 2011-12-03 2014-12-24 山东大学 Service-associative-index-map-based quality of service (QoS) perception Top-k service combination system
CN103123652A (en) * 2013-03-14 2013-05-29 曙光信息产业(北京)有限公司 Data query method and cluster database system
CN103150219B (en) * 2013-04-03 2016-08-10 重庆大学 Heterogeneous resource system is avoided the fast worktodo distribution method of deadlock
KR101621490B1 (en) * 2014-08-07 2016-05-17 (주)그루터 Query execution apparatus and method, and system for processing data employing the same
CN105677752A (en) * 2015-12-30 2016-06-15 深圳先进技术研究院 Streaming computing and batch computing combined processing system and method
CN106815027B (en) * 2017-01-22 2020-06-09 山东鲁能软件技术有限公司 High-elasticity computing platform for power grid multi-dimensional service composite computing
CN106960004A (en) * 2017-02-15 2017-07-18 浙江大学 A kind of analysis method of multidimensional data
CN107133257A (en) * 2017-03-21 2017-09-05 华南师范大学 A kind of similar entities recognition methods and system based on center connected subgraph
CN109063056A (en) * 2018-07-20 2018-12-21 阿里巴巴集团控股有限公司 A kind of data query method, system and terminal device
CN110020004B (en) * 2019-02-19 2020-08-07 阿里巴巴集团控股有限公司 Data calculation method and engine

Also Published As

Publication number Publication date
CN110020004B (en) 2020-08-07
TWI723535B (en) 2021-04-01
CN110020004A (en) 2019-07-16
WO2020168901A1 (en) 2020-08-27

Similar Documents

Publication Publication Date Title
TWI718375B (en) Data processing method and equipment based on blockchain
TWI698813B (en) Method and device for data storage and query based on blockchain
CN107450972B (en) Scheduling method and device and electronic equipment
TWI748175B (en) Data processing method, device and equipment
TWI710916B (en) Database status determination method, consistency verification method and device
TWI723535B (en) Data calculation method and engine
TWI709931B (en) Method, device and electronic equipment for detecting indicator abnormality
US9996394B2 (en) Scheduling accelerator tasks on accelerators using graphs
TWI735845B (en) Method, device and equipment for data synchronization
WO2018045753A1 (en) Method and device for distributed graph computation
WO2021000570A1 (en) Model loading method and system, control node and execution node
CN107203465B (en) System interface testing method and device
US11379260B2 (en) Automated semantic tagging
TWI679581B (en) Method and device for task execution
TW201915867A (en) Virtual card opening method and system, payment system, and card issuing system
TW201944314A (en) Payment process configuration and execution method, apparatus and device
WO2020199709A1 (en) Method and system for refershing cascaded cache, and device
WO2023151436A1 (en) Sql statement risk detection
CN109656946B (en) Multi-table association query method, device and equipment
US20170220669A1 (en) Method and device for determining a category directory, and an automatic classification method and device
US11176161B2 (en) Data processing method, apparatus, and device
US20240036924A1 (en) Techniques for execution orchestration with topological dependency relationships
CN111078435A (en) Service processing method and device and electronic equipment
CN111198689B (en) Code execution method, device and computer readable storage medium
CN107645541B (en) Data storage method and device and server