TW201918910A

TW201918910A - Data statistics method and apparatus

Info

Publication number: TW201918910A
Application number: TW107130573A
Authority: TW
Inventors: 王華忠
Original assignee: 香港商阿里巴巴集團服務有限公司
Priority date: 2017-10-31
Filing date: 2018-08-31
Publication date: 2019-05-16
Also published as: CN109726363B; WO2019085656A1; TWI689828B; CN109726363A

Abstract

Provided are a data statistics method and apparatus. The method comprises: generating a first parameter and a second parameter for each data identifier, where, if the piece of first data corresponding to a data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated from the first parameter and the piece of first data; sending each data identifier and the corresponding first parameter and second parameter to a cooperative data party; receiving a cooperative party calculation value returned by the cooperative data party, the cooperative party calculation value being obtained by the cooperative data party from the selected first parameters or second parameters; and removing the calculated value of the first parameters from the cooperative party calculation value to obtain the required statistical value.

Description

Data statistics method and device

The present disclosure relates to the field of network technology, and in particular to a data statistics method and apparatus.

In the era of big data, there are many data silos. For example, the data of one natural person may be scattered across different enterprises, and because of competition and user privacy concerns, enterprises do not fully trust one another. This creates obstacles for statistical work that requires data cooperation between enterprises. How to use the data held by both parties to complete statistical calculations while fully protecting each enterprise's core data privacy, without disclosing either party's private data, has become an urgent problem. At present, however, there is no good solution.

In view of this, the present disclosure provides a data statistics method and apparatus that realize secure two-party computation while protecting the data privacy of both data owners.

Specifically, one or more embodiments of this specification are implemented through the following technical solutions.

In a first aspect, a data statistics method is provided. The method is used to perform data statistics by combining the data of a local data party and a cooperative data party. The local data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. The method includes:

generating a first parameter and a second parameter for each data identifier, where, if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated from the first parameter and the first data;

sending each data identifier, together with the first parameter and the second parameter corresponding to that data identifier, to the cooperative data party;

receiving a cooperative party calculation value returned by the cooperative data party, where the cooperative party calculation value is obtained by the cooperative data party from the selected first parameters or second parameters: if the second data corresponding to a data identifier participates in the data statistics, the cooperative data party selects the second parameter, and otherwise it selects the first parameter; and

removing the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.

In a second aspect, a data statistics method is provided. The method is used to perform data statistics between a local data party and a statistical data party. The statistical data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the local data party has second data corresponding to the same data identifiers. The method includes:

receiving the data identifiers sent by the statistical data party, together with the first parameter and the second parameter corresponding to each data identifier, where the second parameter is calculated from the first parameter and the first data when the first data corresponding to the data identifier participates in the data statistics, and is otherwise equal to the first parameter;

selecting the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is data that participates in the data statistics locally, and otherwise selecting the first parameter corresponding to that data identifier;

performing a statistical calculation on the selected first and second parameters to obtain a cooperative party calculation value; and

sending the cooperative party calculation value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.

In a third aspect, a data statistics apparatus is provided. The apparatus is used to perform data statistics by combining the data of a local data party and a cooperative data party. The local data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the cooperative data party has a plurality of second data corresponding to the data identifiers. The apparatus includes:

a parameter generation module, configured to generate a first parameter and a second parameter for each data identifier, where, if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated from the first parameter and the first data;

a data sending module, configured to send each data identifier, together with the first parameter and the second parameter corresponding to that data identifier, to the cooperative data party;

a data receiving module, configured to receive the cooperative party calculation value returned by the cooperative data party, where the cooperative party calculation value is obtained by the cooperative data party from the selected first parameters or second parameters: if the second data corresponding to a data identifier participates in the data statistics, the cooperative data party selects the second parameter, and otherwise it selects the first parameter; and

a statistical processing module, configured to remove the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.

In a fourth aspect, a data statistics apparatus is provided. The apparatus is used to perform data statistics between a local data party and a statistical data party. The statistical data party has a plurality of first data for which a statistical value is to be calculated, the plurality of first data respectively correspond to different data identifiers, and the local data party has second data corresponding to the same data identifiers. The apparatus includes:

a parameter receiving module, configured to receive the data identifiers sent by the statistical data party, together with the first parameter and the second parameter corresponding to each data identifier, where the second parameter is calculated from the first parameter and the first data when the first data corresponding to the data identifier participates in the data statistics, and is otherwise equal to the first parameter;

a parameter selection module, configured to select the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is data that participates in the data statistics locally, and otherwise to select the first parameter corresponding to that data identifier;

a statistical calculation module, configured to perform a statistical calculation on the selected first and second parameters to obtain a cooperative party calculation value; and

a value sending module, configured to send the cooperative party calculation value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.

In a fifth aspect, a data statistics device is provided. The device includes a memory, a processor, and computer instructions stored in the memory and executable on the processor. When executing the instructions, the processor implements the following steps:

generating a first parameter and a second parameter for each data identifier, where, if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated from the first parameter and the first data;

sending each data identifier, together with the first parameter and the second parameter corresponding to that data identifier, to the cooperative data party;

receiving a cooperative party calculation value returned by the cooperative data party, where the cooperative party calculation value is obtained by the cooperative data party from the selected first parameters or second parameters: if the second data corresponding to a data identifier participates in the data statistics, the cooperative data party selects the second parameter, and otherwise it selects the first parameter; and

removing the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.

In a sixth aspect, a data statistics device is provided. The device includes a memory, a processor, and computer instructions stored in the memory and executable on the processor. When executing the instructions, the processor implements the following steps:

receiving the data identifiers sent by the statistical data party, together with the first parameter and the second parameter corresponding to each data identifier, where the second parameter is calculated from the first parameter and the first data when the first data corresponding to the data identifier participates in the data statistics, and is otherwise equal to the first parameter;

selecting the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is data that participates in the data statistics locally, and otherwise selecting the first parameter corresponding to that data identifier;

performing a statistical calculation on the selected first and second parameters to obtain a cooperative party calculation value; and

sending the cooperative party calculation value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value.

With the data statistics method and apparatus of one or more embodiments of this specification, first and second parameters are generated to obfuscate the real data, so that when these parameters are sent to the cooperative data party, the cooperative data party does not learn the local party's real data. Moreover, the cooperative party calculation value returned by the cooperative data party is determined by that party's own data filtering condition, and the local party does not learn which data the cooperative data party selected. Secure two-party computation over the combined data is thereby achieved while protecting the data privacy of both data owners.
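For the summation case, the cancellation behind the final "remove the first parameters" step can be written out explicitly. The notation is an assumption introduced here for illustration and does not appear in the original: $x_i$ is the first data for identifier $i$, $r_i$ is its first parameter, $s_i$ is its second parameter, $a_i, c_i \in \{0, 1\}$ indicate whether the record participates at the local data party and at the cooperative data party respectively, and the second parameter is assumed to be formed by addition.

```latex
\begin{align*}
s_i &= r_i + a_i x_i
  && \text{(second parameter; equals } r_i \text{ when } a_i = 0\text{)} \\
C &= \sum_i \bigl( (1 - c_i)\, r_i + c_i\, s_i \bigr)
   = \sum_i r_i + \sum_i a_i c_i x_i
  && \text{(cooperative party calculation value)} \\
S &= C - \sum_i r_i = \sum_i a_i c_i x_i
  && \text{(statistical value)}
\end{align*}
```

Only records that pass both parties' filters ($a_i = c_i = 1$) contribute to $S$, and the local party sees only the aggregate $C$, not the individual choices $c_i$.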

To help those skilled in the art better understand the technical solutions in one or more embodiments of this specification, the technical solutions are described clearly and completely below with reference to the drawings of those embodiments. The described embodiments are only some of the embodiments, not all of them. All other embodiments obtained by a person of ordinary skill in the art from one or more embodiments of this specification without creative effort shall fall within the scope of protection of the present disclosure.

In the era of big data, data may be stored in a vertical mode, in which multiple data owners hold different attribute information about the same entity. For example, a natural person's auto insurance score may be held by one institution while that person's claim amounts are held by another.
Storing data in this vertical mode means that some statistical calculations involve multiple data owners, who must cooperate to complete a single statistic. However, because of competition between enterprises or privacy protection considerations, each enterprise's data secrets must not be disclosed.

The examples of the present disclosure aim to compute statistics over the data of different data owners without revealing any owner's private data. The method is described in detail below using an example application scenario.

Application scenario:

In one example, there are two data sources, data source A and data source B. Suppose data source A is a data institution and data source B is an insurance institution, and the two data sources each store different information about the same vehicle owners.

Data source A: suppose data source A stores the auto insurance score of each vehicle owner. The auto insurance score is obtained from precise profiling and risk analysis of the owner; the higher the score, the lower the risk. Table 1 gives an example of the data structure in which data source A stores auto insurance scores.

Table 1: Data structure of data source A

Data source B: suppose data source B stores the claim information of each vehicle owner, which may include, for example, the number of claims and the claim amount. Table 2 gives an example of the data structure of each owner's record stored by data source B.

Table 2: Data structure of data source B

Based on the above application scenario, statistics can be computed jointly from the data of data source A and data source B.
For example, the statistical requirement may be "compute the total number of claims made by female users whose auto insurance score is greater than 500". Whether "the auto insurance score is greater than 500" must be determined from the data of data source A, while "female user" and "number of claims" are stored in data source B. This statistic therefore requires the data of data source A and data source B to be combined.

In the description of the data statistics method in one or more embodiments of this specification, the data source that holds the data to be aggregated is called the statistical data party, and the other data source is called the cooperative data party. For example, in the task "compute the total number of claims made by female users whose auto insurance score is greater than 500", the number of claims is the data to be aggregated, so data source B is the statistical data party and data source A is the cooperative data party.

The statistical data party and the cooperative data party each store different information about the same vehicle owner. The owner information stored by the statistical data party that is to be aggregated (for example, the number of claims) is called the first data, and the owner information stored by the cooperative data party that takes part in the statistics (for example, the auto insurance score) is called the second data. In addition, the ID card number idcard_no, which appears in both data source A and data source B, is called the data identifier. The statistical data party (for example, data source B) stores the first data corresponding to a data identifier, and the cooperative data party (for example, data source A) stores the second data corresponding to the same data identifier.

FIG. 1 illustrates the flow of a data statistics method, which may include the following steps.

In step 100, the statistical data party generates a first parameter and a second parameter for each data identifier.

For example, the first parameter may be a random number, or a value calculated from a random number, such as one half of the random number.

The value of the second parameter may be determined by the data filtering condition. If the first data corresponding to the data identifier satisfies the local data filtering condition and therefore participates in the data statistics, the second parameter may be calculated from the first parameter and the first data, for example as their sum. If the first data corresponding to the data identifier does not satisfy the local data filtering condition, the second parameter may be set equal to the first parameter. In actual implementations, the second parameter is not limited to the sum of the first data and the first parameter; other calculations may also be used.

In step 102, the statistical data party sends its local data identifiers, together with the first parameter and the second parameter corresponding to each data identifier, to the cooperative data party.

In step 104, the cooperative data party selects parameters: if the second data corresponding to a data identifier is data that participates in the data statistics locally, it selects the second parameter corresponding to that data identifier; otherwise, it selects the first parameter corresponding to that data identifier.
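Steps 100 to 104 can be sketched in Python as follows. This is a minimal illustration rather than the implementation described by the source: the function names, the 32-bit bound on the random numbers, and the use of predicates to express each party's filtering condition are all assumptions, and no transport or cryptographic layer between the two parties is modelled.

```python
import secrets
from typing import Callable, Dict, Tuple

# identifier -> (first parameter, second parameter)
Params = Dict[str, Tuple[int, int]]


def generate_params(first_data: Dict[str, int],
                    passes_local_filter: Callable[[str], bool]) -> Params:
    """Step 100 (statistical data party): one parameter pair per identifier."""
    params: Params = {}
    for ident, value in first_data.items():
        first = secrets.randbelow(2**32)  # random number; the bound is assumed
        # Sum variant: second = first + first data if the record participates,
        # otherwise second = first.
        second = first + value if passes_local_filter(ident) else first
        params[ident] = (first, second)
    return params


def select_params(params: Params,
                  participates: Callable[[str], bool]) -> Dict[str, int]:
    """Step 104 (cooperative data party): choose one parameter per identifier."""
    return {
        ident: (second if participates(ident) else first)
        for ident, (first, second) in params.items()
    }


if __name__ == "__main__":
    # Hypothetical toy data: only "u2" passes the statistical party's filter.
    params = generate_params({"u1": 4, "u2": 9}, lambda ident: ident == "u2")
    chosen = select_params(params, lambda ident: True)
    print(chosen["u2"] - params["u2"][0])  # the masked contribution of "u2"
```

Expressing both filters as predicates keeps each party's condition local: the statistical data party never sees the cooperative party's predicate, only the aggregate computed from the selected values.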
For example, after receiving the data identifiers sent by the statistical data party and the corresponding first and second parameters, the cooperative data party selects parameters in this step, and the selected parameters take part in the processing of the subsequent step 106.

The cooperative data party applies its local data filtering condition: if the second data corresponding to a data identifier satisfies the filtering condition and therefore participates in the data statistics, the second parameter is selected; otherwise, if the second data does not satisfy the filtering condition and does not participate in the data statistics, the first parameter is selected.

In step 106, the cooperative data party performs a statistical calculation on the selected first and second parameters to obtain the cooperative party calculation value. For example, when the statistic to be obtained is a sum, the selected first and second parameters are added together; for other statistics, the selected parameters may be combined by other corresponding calculations.

In step 108, the cooperative data party sends the cooperative party calculation value to the statistical data party.

In step 110, the statistical data party removes the calculated value of the first parameters from the cooperative party calculation value to obtain the statistical value. For example, it may subtract the sum of all the first parameters from the cooperative party calculation value.

The flow of FIG. 1 uses the oblivious transfer (OT) protocol. OT is a privacy-preserving two-party communication protocol that lets the two parties transmit messages in a way that obscures the choice made: the receiver obtains some of the messages input by the sender obliviously, so the receiver's privacy is protected from the sender.

For example, in FIG. 1 the statistical data party sends all data identifiers and the corresponding first and second parameters to the cooperative data party. The statistical data party has in fact already set the second parameters to different values according to its local data filtering condition, but from the cooperative data party's point of view it receives all the data identifiers, so the statistical data party's filtering result is not disclosed. Furthermore, the statistical data party obfuscates its real data by means of the two parameters: the first and second parameters sent to the cooperative data party are not the real first data, so no private data is leaked. Conversely, from the statistical data party's point of view, the cooperative party calculation value it receives reflects the selection made by the cooperative data party after filtering, but the statistical data party cannot tell which data the cooperative data party selected, so the cooperative data party's data is also protected.

Based on the data structure shown in Table 1, suppose data source A holds the auto insurance score data in Table 3, where idcard_no is the owner's ID card number and score is that owner's auto insurance score.

Table 3: Data of data source A

Based on the data structure shown in Table 2, suppose data source B holds the data in Table 4.

Table 4: Data of data source B

Based on Tables 3 and 4, the task is to compute the total number of claims made by female users whose auto insurance score is greater than 500. The data to be aggregated, the number of claims, is stored in data source B.
The claims-count column in Table 4 can be called the "statistical column", that is, the data in this column is to be summed. The filter condition "auto insurance score greater than 500" is located in data source A (the second data serves as a filter condition for obtaining the statistical value), and the filter condition "female" is located in data source B; that is, filter conditions can exist in both data sources. Through data cooperation between data source A and data source B, the number of claims can be summed to obtain the statistical value.

FIG. 2 illustrates the process of summation statistics between data source A and data source B, which may include:

In step 200, data source B generates a random number for each row of data, and generates M0 and M1 according to its data filtering conditions.

In this step, taking the data shown in Table 4 as an example, the column of claim counts is the statistical column, and the values 3, 7, and 6 are the first data in the statistical column. A random number is generated for each row of data: assume that the random number corresponding to 1234567 is t1, the random number corresponding to 2345678 is t2, and the random number corresponding to 3456789 is t3. According to the local data filtering condition "female user", the owners with idcard_no 2345678 and 3456789 meet the condition, so their rows are first data participating in this data statistic; the owner with idcard_no 1234567 does not meet the filtering condition and does not participate in the data statistics. On this basis, with each first data in the statistical column denoted by b, the first parameter and the second parameter corresponding to each idcard_no can be generated.
The first parameter may be the aforementioned random number corresponding to each idcard_no, and the second parameter may be the sum of that random number and the first data b corresponding to the idcard_no, where b is first data participating in the statistics. As shown in the example in Table 5 below, a random number t is generated for each row of data, and the true value in the statistical column is b. Each row of data is traversed: if the row meets the local filtering conditions, M0=t and M1=t+b are generated; if it does not, M0=M1=t is generated.

Table 5 M0 and M1 of each row of data

The M0 and M1 generated in this step obfuscate the real statistical data with random values. Even if the cooperating data party receives the M0 and M1 corresponding to an idcard_no, it cannot know the true statistical data corresponding to that idcard_no. For example, even if t2 and t2+7 corresponding to the data identifier 2345678 are received, the true value of b cannot be known. In addition, the random numbers t1, t2, and t3 corresponding to the data identifiers may differ from one another.

In step 202, data source B sends the data identifier of each row of data, and the M0 and M1 corresponding to the data identifier, to data source A.

In step 204, according to its local data filtering conditions, data source A selects M1 if the second data corresponding to the data identifier participates in the data statistics, and otherwise selects M0.

For example, data source A may determine, according to the filtering condition "auto insurance score greater than 500", whether the second data (score in Table 3) corresponding to each data identifier idcard_no is greater than 500.
If the score corresponding to an idcard_no is greater than 500, "t+b" in Table 5 is selected; otherwise, if the score is not greater than 500, "t" in Table 5 is selected. For example, for idcard_no 1234567, the auto insurance score in Table 3 is 490, which does not satisfy the filter condition "auto insurance score greater than 500", so the M0 corresponding to 1234567 in Table 5 is selected, that is, t1. For idcard_no 2345678, the auto insurance score in Table 3 is 501, which satisfies the filter condition, so the M1 corresponding to 2345678 in Table 5 is selected, that is, t2+7. Similarly, for idcard_no 3456789, t3+6 is selected.

In step 206, data source A accumulates the selected parameters to obtain an accumulated value. For example, the accumulated value may be M=t1+t2+7+t3+6. The accumulated value is the partner calculated value.

In step 208, data source A sends the accumulated value to data source B.

In step 210, data source B subtracts the sum of the M0 values from the accumulated value to obtain the statistical value. After receiving the accumulated value, data source B subtracts the sum of all the random numbers M0 to obtain the total number of claims to be counted. For example, M-(t1+t2+t3)=13 is calculated as the final statistical value, where M is the accumulated value.

In this example, after data source B receives the accumulated value, it does not know whether M0 or M1 was selected for each row on the side of data source A, since only a single accumulated value is received; similarly, data source A cannot know which rows were filtered to participate in the statistics on the side of data source B, since it only receives two parameters for each data identifier.
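The arithmetic of steps 200 to 210 can be traced with a minimal Python sketch. It is an illustration under stated assumptions rather than the implementation of the embodiment: the function names are hypothetical, the score 620 for idcard_no 3456789 is an assumed value (the text only implies that it is greater than 500), and the oblivious transfer itself is not implemented, so both parameters are simply handed to the code playing data source A.

```python
import secrets

# Data source B: idcard_no -> (claims count, passes the "female" filter).
# The flags follow the walkthrough: only 1234567 fails B's filter.
SOURCE_B = {
    "1234567": (3, False),
    "2345678": (7, True),
    "3456789": (6, True),
}

# Data source A: idcard_no -> auto insurance score.
# 490 and 501 come from the text; 620 is an assumed value above 500.
SOURCE_A = {"1234567": 490, "2345678": 501, "3456789": 620}


def generate_parameters(rows):
    """Step 200 (B): build (M0, M1) for every identifier."""
    params = {}
    for ident, (claims, passes_filter) in rows.items():
        t = secrets.randbelow(2**32)  # random mask t for this row
        m0 = t
        m1 = t + claims if passes_filter else t  # M1 = t + b only if B's filter holds
        params[ident] = (m0, m1)
    return params


def select_and_accumulate(params, scores, threshold=500):
    """Steps 204-206 (A): pick M1 if A's filter holds, else M0, then sum."""
    total = 0
    for ident, (m0, m1) in params.items():
        total += m1 if scores[ident] > threshold else m0
    return total


def recover_statistic(accumulated, params):
    """Step 210 (B): subtract the sum of all M0 values."""
    return accumulated - sum(m0 for m0, _ in params.values())


params = generate_parameters(SOURCE_B)                  # step 200
accumulated = select_and_accumulate(params, SOURCE_A)   # steps 202-206
result = recover_statistic(accumulated, params)         # steps 208-210
print(result)  # 13, matching M-(t1+t2+t3)=13 in the walkthrough
```

Whatever random masks are drawn, they cancel in the final subtraction, so only the claim counts of rows that pass both filters (7 and 6) remain.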
This method therefore does not reveal the detailed data of either party during the calculation, and efficiently completes the summation statistics over both parties' data.

The process shown in FIG. 2 above takes as an example a statistical value that is the sum of multiple first data, for example the total number of claims. In other examples, the data statistics method of one or more embodiments of this specification can also be applied to other statistical calculations; for example, the statistical value can be the average of multiple first data. Taking "the average number of claims of female users whose auto insurance score is greater than 500" as an example, the processing flow shown in FIG. 2 can still be used; the difference is that different first and second parameters are used. For example, when a row of data does not meet the local filtering conditions, the first and second parameters generated for the corresponding data identifier can be M0=M1=t; when a row of data meets the local filtering conditions, the second parameter can be the first parameter plus half of the first data. Taking the data identifier 2345678 in Table 5 as an example, the generated M0 may be t2 and the generated M1 may be "t2+7/2". Alternatively, the first parameter may be generated as half of a random number, such as "t2/2", with the corresponding second parameter being "(t2+7)/2". As shown in Table 6 below:

Table 6 M0 and M1 when computing an average

After data source B receives the accumulated value M sent by data source A, assuming that data source A selects M1 for the last two rows of data (corresponding to the data identifiers 2345678 and 3456789), the statistical value can still be obtained as M-(t1+t2+t3)=6.5.

In order to implement the above method, one or more embodiments of this specification also provide a data statistics device. As shown in FIG. 3, the device may include a parameter generation module 31, a data sending module 32, a data receiving module 33, and a statistical processing module 34.

The parameter generation module 31 is used to generate a first parameter and a second parameter corresponding to each data identifier; if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter; otherwise, the second parameter is calculated based on the first parameter and the first data.

The data sending module 32 is used to send each data identifier, and the first parameter and the second parameter corresponding to the data identifier, to the cooperating data party.

The data receiving module 33 is used to receive the partner calculated value returned by the cooperating data party, the partner calculated value being obtained by the cooperating data party according to the selected first parameters or second parameters; if the second data corresponding to a data identifier participates in the data statistics, the cooperating data party selects the second parameter; otherwise, the cooperating data party selects the first parameter.

The statistical processing module 34 is used to remove the calculated value of the first parameters from the partner calculated value to obtain the statistical value.

In one example, the plurality of first data are located in the same statistical column of the local data source.

In one example, the parameter generation module 31, when calculating the second parameter based on the first parameter and the first data, is specifically used to sum the first parameter and the first data to obtain the second parameter.
The statistical processing module 34, when removing the calculated value of the first parameters from the partner calculated value, is specifically used to subtract the sum of the first parameters from an accumulated value, the accumulated value being obtained by the cooperating data party by accumulating the selected first parameters or second parameters.

In one example, when summing the first parameter and the first data to obtain the second parameter, the parameter generation module 31 is specifically used as follows: if the first data corresponding to the data identifier satisfies the data filtering conditions that determine participation in the data statistics, then, when the statistical value is the sum of multiple first data, the first parameter is a random number and the second parameter is the sum of the random number and the first data.

In one example, when summing the first parameter and the first data to obtain the second parameter, the parameter generation module 31 is specifically used as follows: if the first data corresponding to the data identifier satisfies the data filtering conditions that determine participation in the data statistics, then, when the statistical value is the average of multiple first data, the second parameter is the first parameter plus one half of the first data.

In order to implement the above method, one or more embodiments of this specification also provide a data statistics device. As shown in FIG. 4, the device may include a parameter receiving module 41, a parameter selection module 42, a statistical calculation module 43, and a value sending module 44.
The parameter receiving module 41 is used to receive the data identifiers sent by the statistical data party, and the first parameter and the second parameter corresponding to each data identifier; when the first data corresponding to a data identifier participates in the data statistics, the second parameter is calculated based on the first parameter and the first data; otherwise, the second parameter is equal to the first parameter.

The parameter selection module 42 is used to select the second parameter corresponding to a data identifier if the second data corresponding to that data identifier is local data participating in the data statistics, and otherwise to select the first parameter corresponding to the data identifier.

The statistical calculation module 43 is used to perform a statistical calculation on the selected first parameters and second parameters to obtain the partner calculated value.

The value sending module 44 is used to send the partner calculated value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the partner calculated value to obtain the statistical value.

For convenience of description, the above devices are described with their functions divided into modules. Of course, when implementing one or more embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware.

The order of execution of the steps shown in the above method embodiments is not limited to the order in the flowcharts. In addition, each step can be implemented in software, hardware, or a combination thereof; for example, those skilled in the art can implement it as software code, which can be computer-executable instructions capable of implementing the logical functions corresponding to the steps.
When implemented in software, the executable instructions can be stored in memory and executed by the processor in the device.

For example, corresponding to the above method, one or more embodiments of this specification also provide a data statistics device, which is used to perform data statistics by combining the data of a local data party and a cooperating data party. The local data party has a plurality of first data over which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the cooperating data party has a plurality of second data corresponding to the data identifiers. The device may include a processor, a memory, and computer instructions stored in the memory and executable on the processor. The processor executes the instructions to implement the following steps:

generating a first parameter and a second parameter corresponding to each data identifier, where, if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated based on the first parameter and the first data;

sending each data identifier, and the first parameter and the second parameter corresponding to the data identifier, to the cooperating data party;

receiving the partner calculated value returned by the cooperating data party, the partner calculated value being obtained by the cooperating data party according to the selected first parameters or second parameters, where, if the second data corresponding to a data identifier participates in the data statistics, the cooperating data party selects the second parameter, and otherwise the cooperating data party selects the first parameter;

removing the calculated value of the first parameters from the partner calculated value to obtain the statistical value.
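As a rough illustration of how the processor steps above might be organized in software, the following Python sketch wraps them in a single class and reuses the claim counts from the earlier walkthrough. The names StatisticalParty, partner_stub, and weight are hypothetical and do not come from the specification. The weight of 1/2 only reproduces the averaging example, in which two rows are selected; how the divisor would be chosen in general is not stated in the text. The cooperating data party is replaced by a stub, and no oblivious transfer is performed.

```python
import secrets
from fractions import Fraction


class StatisticalParty:
    """Hypothetical wrapper around the steps executed by the processor."""

    def __init__(self, first_data, participates, weight=Fraction(1)):
        # first_data: identifier -> value in the statistical column
        # participates: identifier -> True if the row passes the local filter
        # weight: 1 for a sum; 1/2 matches the two-row averaging example
        self.first_data = first_data
        self.participates = participates
        self.weight = Fraction(weight)
        self._masks = {}

    def generate_parameters(self):
        """Return identifier -> (first parameter, second parameter)."""
        out = {}
        for ident, value in self.first_data.items():
            t = Fraction(secrets.randbelow(2**32))  # random first parameter
            self._masks[ident] = t
            if self.participates[ident]:
                second = t + self.weight * value
            else:
                second = t  # non-participating rows: both parameters are equal
            out[ident] = (t, second)
        return out

    def finish(self, partner_value):
        """Remove the sum of the first parameters from the partner value."""
        return partner_value - sum(self._masks.values())


def partner_stub(params, selected_ids):
    """Stand-in for the cooperating party: second parameter for selected ids."""
    return sum(p[1] if ident in selected_ids else p[0]
               for ident, p in params.items())


claims = {"1234567": 3, "2345678": 7, "3456789": 6}
female = {"1234567": False, "2345678": True, "3456789": True}
chosen_by_partner = {"2345678", "3456789"}  # rows whose score exceeds 500

sum_party = StatisticalParty(claims, female)
total = sum_party.finish(
    partner_stub(sum_party.generate_parameters(), chosen_by_partner))

avg_party = StatisticalParty(claims, female, weight=Fraction(1, 2))
average = avg_party.finish(
    partner_stub(avg_party.generate_parameters(), chosen_by_partner))

print(total, average)
```

Fractions are used so that the halved values stay exact; with these inputs the two results correspond to the 13 and 6.5 of the summation and averaging examples.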
For example, corresponding to the above method, one or more embodiments of this specification also provide a data statistics device, which is used to perform data statistics between a local data party and a statistical data party. The statistical data party has a plurality of first data over which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the local data party has second data corresponding to the same data identifiers. The device may include a processor, a memory, and computer instructions stored in the memory and executable on the processor. The processor executes the instructions to implement the following steps:

receiving the data identifiers sent by the statistical data party, and the first parameter and the second parameter corresponding to each data identifier, where, when the first data corresponding to a data identifier participates in the data statistics, the second parameter is calculated based on the first parameter and the first data, and otherwise the second parameter is equal to the first parameter;

if the second data corresponding to a data identifier is local data participating in the data statistics, selecting the second parameter corresponding to the data identifier, and otherwise selecting the first parameter corresponding to the data identifier;

performing a statistical calculation on the selected first parameters and second parameters to obtain a partner calculated value;

sending the partner calculated value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the partner calculated value to obtain the statistical value.

The device or module explained in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function.
A typical implementation device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email sending and receiving device, a game console, a tablet computer, a wearable device, or any combination of these devices.

Those skilled in the art should understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk memory, CD-ROM, optical memory, etc.) containing computer-usable program code.

These computer program instructions can also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, the instruction device implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operating steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further restrictions, an element defined by the phrase "includes a..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.

One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types. One or more embodiments of this specification can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.

The embodiments in this specification are described in a progressive manner; for the same or similar parts, the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the server device embodiment is basically similar to the method embodiment, its description is relatively brief, and for relevant parts reference can be made to the description of the method embodiment.

The foregoing describes specific embodiments of this specification. Other embodiments are within the scope of the appended claims.
In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The above are only preferred embodiments of one or more embodiments of this specification and are not intended to limit the present disclosure. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present disclosure shall be included in the scope of protection of the present disclosure.

31‧‧‧Parameter generation module

32‧‧‧Data sending module

33‧‧‧Data receiving module

34‧‧‧Statistical processing module

41‧‧‧Parameter receiving module

42‧‧‧Parameter selection module

43‧‧‧Statistical calculation module

44‧‧‧Value sending module

In order to explain more clearly the technical solutions of one or more embodiments of this specification or of the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in one or more embodiments of this specification; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort. FIG. 1 is a flowchart of a data statistics method provided by one or more embodiments of this specification; FIG. 2 is a flowchart of data summation statistics provided by one or more embodiments of this specification; FIG. 3 is a schematic structural diagram of a data statistics device provided by one or more embodiments of this specification; FIG. 4 is a schematic structural diagram of a data statistics device provided by one or more embodiments of this specification.

Claims

A data statistics method, applied to performing data statistics by combining the data of a local data party and a cooperating data party, the local data party having a plurality of first data over which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the cooperating data party having a plurality of second data corresponding to the data identifiers, the method comprising: generating a first parameter and a second parameter corresponding to each data identifier, wherein, if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated based on the first parameter and the first data; sending each data identifier, and the first parameter and the second parameter corresponding to the data identifier, to the cooperating data party; receiving a partner calculated value returned by the cooperating data party, the partner calculated value being obtained by the cooperating data party according to the selected first parameters or second parameters, wherein, if the second data corresponding to a data identifier participates in the data statistics, the cooperating data party selects the second parameter, and otherwise the cooperating data party selects the first parameter; and removing the calculated value of the first parameters from the partner calculated value to obtain the statistical value.

The method according to claim 1, wherein calculating the second parameter based on the first parameter and the first data comprises: obtaining the second parameter by summing the first parameter and the first data; and wherein removing the calculated value of the first parameters from the partner calculated value comprises: the partner calculated value being an accumulated value obtained by the cooperating data party by accumulating the selected first parameters or second parameters, subtracting the sum of the first parameters from the accumulated value.

The method according to claim 2, wherein obtaining the second parameter by summing the first parameter and the first data comprises: if the first data corresponding to the data identifier satisfies the data filtering conditions that determine participation in the data statistics, then, when the statistical value is the sum of the plurality of first data, the first parameter is a random number and the second parameter is the sum of the random number and the first data.

The method according to claim 2, wherein obtaining the second parameter by summing the first parameter and the first data comprises: if the first data corresponding to the data identifier satisfies the data filtering conditions that determine participation in the data statistics, then, when the statistical value is the average of the plurality of first data, the second parameter is the first parameter plus one half of the first data.

A data statistics method, used to perform data statistics between a local data party and a statistical data party, the statistical data party having a plurality of first data over which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the local data party having second data corresponding to the same data identifiers, the method comprising: receiving the data identifiers sent by the statistical data party, and the first parameter and the second parameter corresponding to each data identifier, wherein, when the first data corresponding to a data identifier participates in the data statistics, the second parameter is calculated based on the first parameter and the first data, and otherwise the second parameter is equal to the first parameter; if the second data corresponding to a data identifier is local data participating in the data statistics, selecting the second parameter corresponding to the data identifier, and otherwise selecting the first parameter corresponding to the data identifier; performing a statistical calculation on the selected first parameters and second parameters to obtain a partner calculated value; and sending the partner calculated value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the partner calculated value to obtain the statistical value.

A data statistics device, used to perform data statistics by combining the data of a local data party and a cooperating data party, the local data party having a plurality of first data over which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the cooperating data party having a plurality of second data corresponding to the data identifiers, the device comprising: a parameter generation module, used to generate a first parameter and a second parameter corresponding to each data identifier, wherein, if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated based on the first parameter and the first data; a data sending module, used to send each data identifier, and the first parameter and the second parameter corresponding to the data identifier, to the cooperating data party; a data receiving module, used to receive a partner calculated value returned by the cooperating data party, the partner calculated value being obtained by the cooperating data party according to the selected first parameters or second parameters, wherein, if the second data corresponding to a data identifier participates in the data statistics, the cooperating data party selects the second parameter, and otherwise the cooperating data party selects the first parameter; and a statistical processing module, used to remove the calculated value of the first parameters from the partner calculated value to obtain the statistical value.

The device according to claim 6, wherein the parameter generation module, when calculating the second parameter based on the first parameter and the first data, is specifically used to sum the first parameter and the first data to obtain the second parameter; and the statistical processing module, when removing the calculated value of the first parameters from the partner calculated value, is specifically used to subtract the sum of the first parameters from an accumulated value, the accumulated value being obtained by the cooperating data party by accumulating the selected first parameters or second parameters.

The device according to claim 7, wherein the parameter generation module, when summing the first parameter and the first data to obtain the second parameter, is specifically used as follows: if the first data corresponding to the data identifier satisfies the data filtering conditions that determine participation in the data statistics, then, when the statistical value is the sum of the plurality of first data, the first parameter is a random number and the second parameter is the sum of the random number and the first data.

The device according to claim 7, wherein the parameter generation module, when summing the first parameter and the first data to obtain the second parameter, is specifically used as follows: if the first data corresponding to the data identifier satisfies the data filtering conditions that determine participation in the data statistics, then, when the statistical value is the average of the plurality of first data, the second parameter is the first parameter plus one half of the first data.

A data statistics device, used to perform data statistics between a local data party and a statistical data party, the statistical data party having a plurality of first data over which a statistical value is to be calculated, the plurality of first data respectively corresponding to different data identifiers, and the local data party having second data corresponding to the same data identifiers, the device comprising: a parameter receiving module, used to receive the data identifiers sent by the statistical data party, and the first parameter and the second parameter corresponding to each data identifier, wherein, when the first data corresponding to a data identifier participates in the data statistics, the second parameter is calculated based on the first parameter and the first data, and otherwise the second parameter is equal to the first parameter; a parameter selection module, used to select the second parameter corresponding to a data identifier if the second data corresponding to the data identifier is local data participating in the data statistics, and otherwise to select the first parameter corresponding to the data identifier; a statistical calculation module, used to perform a statistical calculation on the selected first parameters and second parameters to obtain a partner calculated value; and a value sending module, used to send the partner calculated value to the statistical data party, so that the statistical data party removes the calculated value of the first parameters from the partner calculated value to obtain the statistical value.

A data statistics device, comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the instructions, implements the following steps: generating a first parameter and a second parameter corresponding to each data identifier, wherein, if the first data corresponding to the data identifier does not participate in the data statistics, the second parameter is equal to the first parameter, and otherwise the second parameter is calculated based on the first parameter and the first data; sending each data identifier, and the first parameter and the second parameter corresponding to the data identifier, to a partner data party; receiving a partner calculated value returned by the partner data party, the partner calculated value being obtained by the partner data party from the selected first parameters or second parameters, wherein, if the second data corresponding to the data identifier participates in the statistics, the partner data party selects the second parameter, and otherwise the partner data party selects the first parameter; and removing the calculated value of each first parameter from the partner calculated value to obtain the statistical value.
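The steps on the statistical data party side can be illustrated with a minimal Python sketch for the case where the statistical value is a sum. The function names, the use of a random integer as the first parameter, and the example filter are assumptions made for illustration and are not taken from the claims. The partner data party is replaced here by a one-line stand-in, and the sketch does not model how the parameter pair is protected from the partner.

```python
import secrets


def generate_parameters(first_data, participates):
    """Build {data_id: (first_param, second_param)} for each data identifier.

    first_data: {data_id: numeric first data}
    participates: predicate deciding whether a first data value joins the statistics
    """
    params = {}
    for data_id, value in first_data.items():
        # Assumption: the first parameter is a random mask.
        first_param = secrets.randbelow(2**32)
        if participates(value):
            second_param = first_param + value  # computed from first parameter and first data
        else:
            second_param = first_param          # equal to the first parameter
        params[data_id] = (first_param, second_param)
    return params


def recover_statistic(partner_value, params):
    """Remove the sum of all first parameters from the partner calculated value."""
    return partner_value - sum(first for first, _ in params.values())


# Hypothetical example data: only values >= 50 participate on this side.
first_data = {"u1": 100, "u2": 250, "u3": 40}
params = generate_parameters(first_data, lambda v: v >= 50)

# Stand-in for the partner: it selects the second parameter for u1 and u3 only.
partner_selected = {"u1", "u3"}
partner_value = sum(
    second if data_id in partner_selected else first
    for data_id, (first, second) in params.items()
)

total = recover_statistic(partner_value, params)
print(total)
```

In this example only `u1` participates on both sides, so the recovered sum is the first data of `u1`; `u3` is selected by the partner but fails the local filter, and `u2` passes the local filter but is not selected by the partner.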

A data statistics device, comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the instructions, implements the following steps: receiving the data identifier sent by the statistical data party, and the first parameter and the second parameter corresponding to the data identifier, wherein, when the first data corresponding to the data identifier participates in the data statistics, the second parameter is calculated based on the first parameter and the first data, and otherwise the second parameter is equal to the first parameter; if the second data corresponding to the data identifier is local data participating in the data statistics, selecting the second parameter corresponding to the data identifier, and otherwise selecting the first parameter corresponding to the data identifier; performing statistical calculation based on the selected first and second parameters to obtain a partner calculated value; and sending the partner calculated value to the statistical data party, so that the statistical data party removes the calculated value of each first parameter from the partner calculated value to obtain the statistical value.
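The selection and calculation steps on the partner side can be sketched as follows, again assuming the statistical value is a sum. The function name, the hand-picked parameter values, and the membership-tier filter are hypothetical. Treating an identifier with no local second data as not participating is also an assumption of this sketch.

```python
def partner_calculated_value(received, second_data, participates):
    """Sum one selected parameter per data identifier.

    received: {data_id: (first_param, second_param)} sent by the statistical data party
    second_data: {data_id: local second data}
    participates: predicate deciding whether local second data joins the statistics
    """
    total = 0
    for data_id, (first_param, second_param) in received.items():
        local = second_data.get(data_id)
        if local is not None and participates(local):
            total += second_param  # local data participates: select the second parameter
        else:
            total += first_param   # otherwise: select the first parameter
    return total


# Hypothetical parameters: u1 and u2 carry first data 100 and 250; u3 did not participate.
received = {"u1": (7, 107), "u2": (11, 261), "u3": (5, 5)}
second_data = {"u1": "gold", "u2": "basic", "u3": "gold"}

value = partner_calculated_value(received, second_data, lambda tier: tier == "gold")
print(value)  # 107 + 11 + 5 = 123
```

The statistical data party would then subtract the first parameters (7 + 11 + 5 = 23) from 123 and obtain 100, the first data of the single identifier that participates on both sides.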