WO2023185052A1 - Smart contract-based calculation, update and read method and apparatus, and electronic device

Info

Publication number: WO2023185052A1
Application number: PCT/CN2022/135439
Authority: WO
Inventors: 周晨辉; 闫莺
Original assignee: 蚂蚁区块链科技(上海)有限公司
Priority date: 2022-03-30
Filing date: 2022-11-30
Publication date: 2023-10-05
Also published as: CN114693450A

Abstract

A smart contract-based calculation method. A smart contract for performing approximate calculation is deployed on a blockchain. The method comprises: receiving a smart contract call transaction for a smart contract initiated by a calculation initiating party, the smart contract call transaction comprising calculation parameters corresponding to approximate calculation, and the calculation parameters comprising a data identifier of a data collection participating in the approximate calculation; in response to the smart contract call transaction, calling a sampling logic comprised in the smart contract call transaction, dividing the data collection corresponding to the data identifier into an outlier data subset consisting of a plurality of outlier data samples and a non-outlier data subset consisting of a plurality of non-outlier data samples, and performing sampling for the non-outlier data samples in the non-outlier data subset; and calling a calculation logic comprised in the smart contract call transaction, performing accurate calculation for the outlier data samples in the outlier data subset, performing approximate calculation on the non-outlier data samples obtained by sampling, and combining the results of accurate calculation and approximate calculation.

Description

Smart contract-based calculation, update and read methods and apparatuses, and electronic device

This application claims priority to Chinese patent application No. 202210332050.4, filed with the China Patent Office on March 30, 2022 and entitled "Smart Contract-Based Calculation, Update, Reading Methods and Devices, Electronic Equipment", the entire contents of which are incorporated herein by reference.

Technical field

One or more embodiments of this specification relate to the field of blockchain technology, and in particular to a smart contract-based calculation method and apparatus, and an electronic device.

Background

Blockchain is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. In a blockchain system, data blocks are linked in chronological order into a chained data structure, and cryptography ensures that the resulting distributed ledger cannot be tampered with or forged. Owing to characteristics such as decentralization, tamper resistance, and autonomy, blockchain has attracted growing attention and is applied more and more widely.

Summary of the invention

This specification proposes a calculation method based on smart contracts, which is applied to node devices in the blockchain. Smart contracts for performing approximate calculations are deployed on the blockchain. The method includes:

Receiving a smart contract call transaction initiated by a calculation initiator for the smart contract, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;

In response to the smart contract call transaction, calling the sampling logic included in the smart contract call transaction, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset;

Further calling the calculation logic contained in the smart contract call transaction to perform exact calculation on the outlier data samples in the outlier data subset and approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combining the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.

This specification also proposes a computing device based on smart contracts, which is applied to node equipment in the blockchain. Smart contracts for performing approximate calculations are deployed on the blockchain. The device includes:

A receiving module, which receives a smart contract call transaction initiated by a calculation initiator for the smart contract, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;

A sampling module, which, in response to the smart contract call transaction, calls the sampling logic included in the smart contract call transaction, divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;

A calculation module, which further calls the calculation logic contained in the smart contract call transaction to perform exact calculation on the outlier data samples in the outlier data subset and approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combines the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.

In the above technical solution, when a smart contract is called to perform approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time consumed by the approximate calculation without sacrificing the accuracy of the approximate calculation result, which improves the computational efficiency of the approximate calculation on the data set. Moreover, during the approximate calculation, the outlier data in the data set is not sampled and then approximated; instead, exact calculation is performed on it directly without sampling. Therefore, when the data set contains outlier data, the impact of these outlier data samples on the accuracy of the approximate calculation result is avoided, which preserves the accuracy of the approximate calculation for the data set as far as possible.

Brief description of the drawings

Figure 1 is a flow chart of a smart contract-based calculation method provided by an exemplary embodiment;

Figure 2 is a flow chart of an optimization solution method provided by an exemplary embodiment;

Figure 3 is a schematic structural diagram of an electronic device provided by an exemplary embodiment;

Figure 4 is a block diagram of a smart contract-based computing device provided by an exemplary embodiment.

Detailed description

Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of this specification. Rather, they are merely examples of apparatus and methods consistent with some aspects of one or more embodiments of this specification as detailed in the appended claims.

It should be noted that in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, a method may include more or fewer steps than described in this specification. In addition, a single step described in this specification may be broken down into multiple steps for description in other embodiments, and multiple steps described in this specification may be combined into a single step for description in other embodiments.

With the continuous development of smart contract technology, when smart contracts are used to interface with businesses, they gradually take on part of the business-related computation.

For example, in practical applications, smart contracts deployed on the blockchain to interface with businesses can include, in addition to business logic related to the business, logic for calculating business data related to the business. This allows users to complete business-related calculations on the blockchain by calling the smart contract.

When smart contracts are used to calculate business-related data sets, the total calculation time usually depends on the time it takes to perform I/O operations on each piece of data and the time it takes to calculate the above set of data in batches.

For example, in practical applications, taking a business-related data set that is pre-stored on the blockchain as an example, the total time taken by a smart contract to calculate the data set can usually be expressed by the following formula:

In the above formula, i denotes the i-th piece of data in the data set; IO_i denotes the time taken by the I/O operation on the i-th piece of data; and Operation_i denotes the time taken to perform the calculation on the i-th piece of data when the data set is processed in batch.
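Read together with the symbol definitions above, the total time can plausibly be written as a sum over the data items. The following is a sketch under that assumption; the symbol T for the total time is introduced here for illustration and is not taken from the specification.

```latex
T = \sum_{i} \left( IO_i + Operation_i \right)
```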

It should be noted that when data sets are stored on the blockchain, they are usually stored item by item, in the form of key-value pairs, in the storage medium carried by the blockchain node device. Therefore, a data set stored on the blockchain can usually only be read one item at a time from that storage medium, based on the key of each item.

In some application scenarios that require high privacy and security for data computing, the above smart contract can also be deployed in the TEE (Trusted Execution Environment) carried by the blockchain node device.

In this case, the data in the data set usually needs to be stored in encrypted form. When a smart contract is then used to calculate a business-related data set, the total calculation time usually depends on the time taken to perform I/O operations on each piece of data, the time taken to decrypt each piece of data, and the time taken to calculate the data set in batch.

Among them, in the above formula, Operation _i represents the time taken to decrypt the i-th piece of data in the data set.
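Under the same reading as the non-encrypted case, the TEE scenario adds a per-item decryption term to the sum. The following is a sketch under that assumption; the symbols T and Decrypt_i are names chosen here so that the decryption time stays distinct from the calculation time, and they are not taken from the specification.

```latex
T = \sum_{i} \left( IO_i + Decrypt_i + Operation_i \right)
```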

From the above introduction, it is not difficult to see that, in the scenario of using smart contracts to calculate business-related data sets, if the data set contains a relatively large amount of data, using the smart contract to calculate the data set and obtain an exact result is very time-consuming.

In practical applications, in some business scenarios, precise calculation results for business-related data may not be required, but some loss in calculation accuracy can be tolerated.

For example, in the calculation scenario of calculating the average age of users, in most cases, accurate calculation results are not required. Usually, only an approximate calculation is needed to obtain an average age range.

Based on this, this specification proposes a technical solution that introduces approximate calculation and data sampling mechanisms into smart contracts to improve the computational efficiency of calculating business-related data.

When implemented, a smart contract for data calculation can be deployed on the blockchain, and the smart contract can contain approximate calculation logic for approximate calculation and sampling logic for data sampling. The calculation initiator can initiate a smart contract call transaction to call the smart contract to perform approximate calculation on the data set participating in the calculation. The smart contract call transaction may include calculation parameters corresponding to the approximate calculation, and the calculation parameters may include the data identifier of the data set participating in the approximate calculation.

When a node device in the blockchain receives the smart contract call transaction initiated by the calculation initiator, it can, in response to the transaction, call the sampling logic contained in the smart contract call transaction, divide the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset. After the sampling is completed, the approximate calculation logic contained in the smart contract can be further called to perform exact calculation on the outlier data samples in the outlier data subset and approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and the results of the exact calculation and the approximate calculation are combined as the approximate calculation result for the data set.

In the above technical solution, when a smart contract is called to perform approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time consumed by the approximate calculation without sacrificing the accuracy of the approximate calculation result, which improves the computational efficiency of the approximate calculation on the data set.

Moreover, during the approximate calculation, the outlier data in the data set is not sampled and then approximated; instead, exact calculation is performed on it directly without sampling. Therefore, when the data set contains outlier data, the impact of these outlier data samples on the accuracy of the approximate calculation result is avoided, which preserves the accuracy of the approximate calculation for the data set as far as possible.
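The overall flow described above (split, sample, exact calculation, approximate calculation, combine) can be illustrated with a minimal Python sketch for an approximate sum. The function name `approximate_sum`, the caller-supplied `is_outlier` predicate, and the scale-up estimator (sample mean multiplied by the size of the non-outlier subset) are assumptions made for illustration; the specification does not prescribe a particular estimator or combination rule.

```python
import random
import statistics


def approximate_sum(values, sample_size, is_outlier, rng):
    """Approximate sum(values): exact on outliers, scaled sample on the rest.

    Hypothetical helper; the estimator used here is an assumption.
    """
    outliers = [v for v in values if is_outlier(v)]
    non_outliers = [v for v in values if not is_outlier(v)]

    # Exact calculation over every outlier sample (no sampling).
    exact_part = sum(outliers)

    # Approximate calculation: sample the non-outlier subset and scale the
    # sample mean up to the size of the whole non-outlier subset.
    if non_outliers:
        n = min(sample_size, len(non_outliers))
        sample = rng.sample(non_outliers, n)
        approx_part = statistics.fmean(sample) * len(non_outliers)
    else:
        approx_part = 0.0

    # Combine both partial results into the approximate result for the set.
    return exact_part + approx_part
```

Because the single large value is handled exactly, it cannot distort the estimate by being included in, or left out of, the sample by chance.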

Please refer to Figure 1, which is a flow chart of a smart contract-based calculation method provided by an exemplary embodiment. The method is applied to node devices in the blockchain; wherein a smart contract for performing approximate calculations is deployed on the blockchain, and the method includes the following steps:

Step 102: Receive a smart contract call transaction for the smart contract initiated by the calculation initiator, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;

The above-mentioned calculation initiator may specifically be a party with data calculation requirements. For example, in one example, the above calculation initiator may be a user with data calculation requirements. In another example, in a scenario based on smart contracts and business docking, the calculation initiator can also be an off-chain business system with data calculation requirements.

A smart contract for data calculation can be deployed on the blockchain. The smart contract contains execution logic corresponding to the contract code, which can specifically include approximate calculation logic for approximate calculation and sampling logic for data sampling. In this way, the logic for approximate calculation and data sampling can be introduced into the smart contract.

Among them, it should be noted that the sampling method used for the above data sampling is not particularly limited in this specification; for example, random sampling (Random Sampling), stratified sampling (Stratified Sampling), etc. can be used.

The above-mentioned calculation initiator can call the above-mentioned smart contract to perform approximate calculations on the data set participating in the calculation by initiating a smart contract call transaction.

For example, take the case where the calculation initiator is a user and the blockchain uses an account model. In this case, the smart contract can be understood as a contract account on the blockchain to which the contract code is anchored. The user can register an external account on the blockchain, initiate a smart contract call transaction through the external account, and submit the transaction to the connected blockchain node device to call the smart contract.

Among them, it should be noted that in the above-mentioned smart contract call transaction, the calculation parameters corresponding to the approximate calculation may be included; the calculation parameters may include the data identifier of the data set participating in the approximate calculation.

When the calculation initiator initiates the smart contract call transaction, if the calculation initiator connects to a blockchain node directly, it can package a smart contract call transaction and submit it point-to-point to the connected blockchain node device. If the calculation initiator accesses the blockchain through the blockchain connection service provided by a BaaS (Blockchain as a Service) platform, it can generate a call request for the smart contract and submit the call request to the BaaS platform; the BaaS platform then packages a smart contract call transaction based on the call parameters carried in the call request and submits it to the blockchain node device.

The blockchain node device can receive the smart contract call transaction initiated by the calculation initiator and, upon receiving it, can respond to the transaction by calling the smart contract on the blockchain to perform approximate calculation on the data set.

Step 104: In response to the smart contract call transaction, call the sampling logic contained in the smart contract call transaction, divide the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset;

After the blockchain node device receives the smart contract call transaction initiated by the calculation initiator, it can respond to the transaction by calling the sampling logic contained in the smart contract to sample the data samples in the data set corresponding to the data identifier.

It should be noted that after receiving the smart contract call transaction initiated by the calculation initiator, the blockchain node device usually needs to work with the other blockchain nodes participating in consensus, based on the consensus algorithm supported by the blockchain, to perform consensus processing on the smart contract call transaction and on its execution result. Since this specification does not involve improvements to the consensus process of the blockchain, the consensus processing of the smart contract call transaction and its execution result is not described in detail in this specification.

In an embodiment shown, before calling the sampling logic contained in the smart contract to sample the data samples in the data set corresponding to the data identifier, the blockchain node device may first obtain the data identifier contained in the smart contract call transaction and read the data set participating in the approximate calculation based on the data identifier.

When reading the data set participating in the approximate calculation based on the data identification, it can be read from the blockchain or from outside the chain, which is not particularly limited in this specification.

In one implementation, a certificate deposit of the data set can be made on the blockchain in advance.

For example, a certificate deposit contract for data certificate deposit can be deployed on the blockchain. Before calling the smart contract for calculation, the calculation initiator can package a certificate deposit transaction and publish the data set that needs to participate in the calculation to the certificate deposit contract for certificate deposit.

As another example, the execution logic corresponding to the contract code contained in the smart contract may include data storage logic in addition to the approximate calculation logic and the sampling logic. That is to say, in addition to being used for approximate calculation, the smart contract also has its own data storage function. In this case, before calling the smart contract for calculation, the calculation initiator can likewise package a certificate deposit transaction and publish the data set that needs to participate in the calculation to the smart contract in advance for certificate deposit. Subsequently, the smart contract can read the deposited data set from its own contract storage space to perform the approximate calculation.

In this case, the blockchain node device can obtain the data set corresponding to the data identifier stored on the blockchain based on the data identifier. For example, the data identifier may be a certificate deposit hash returned by the blockchain node after the data set has been successfully deposited on the blockchain.
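As a rough illustration of using a hash as the data identifier, the following Python sketch stores a data set under the SHA-256 digest of its serialized form and reads it back by that identifier. The class `DepositContract`, its in-memory dictionary storage, and the JSON serialization are hypothetical stand-ins for illustration; the specification does not define how the hash is computed or how contract storage is organized.

```python
import hashlib
import json


class DepositContract:
    """In-memory stand-in for a deposit contract's key-value storage."""

    def __init__(self):
        self._storage = {}

    def deposit(self, records):
        # Canonical JSON so that equal data sets map to the same identifier.
        payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
        data_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        self._storage[data_id] = list(records)
        # The returned hash plays the role of the data identifier.
        return data_id

    def read(self, data_id):
        # Look the data set up by the identifier carried in the call
        # transaction; raises KeyError if nothing was deposited under it.
        return self._storage[data_id]
```

A calculation initiator would then place the returned identifier in the calculation parameters of the smart contract call transaction.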

In another implementation, the data set can also be pre-stored in an off-chain database connected to the blockchain. In this case, the smart contract can obtain the data set corresponding to the data identifier from the off-chain database through its corresponding oracle program.

The oracle program can specifically be a centralized oracle program or a decentralized oracle program. When it is a centralized oracle program, it may be an oracle service program deployed on a service device outside the chain. When it is a decentralized oracle program, it may be an oracle contract deployed on the blockchain that interfaces with the smart contract. It should be noted that since this specification does not involve improvements to the oracle program, the specific process by which the smart contract obtains the data set corresponding to the data identifier from the off-chain database through its corresponding oracle program is not described in detail in this specification.

For the calculation parameters included in the above-mentioned smart contract call transaction, in addition to the data identifiers of the above-mentioned data sets mentioned above, in practical applications, they can also include other forms of parameters related to approximate calculations.

In an embodiment shown, the above calculation parameters may specifically include various types of parameters shown in the following table:

Parameter type	Parameter meaning
Data set ID	The data set participating in the approximate calculation
Calculation type ID	The type of approximate calculation to be performed
Error value	The tolerable calculation error of the approximate calculation
Confidence probability	The expected accuracy of the approximate calculation
Sampling algorithm ID	The specified type of sampling algorithm

Among them, it should be noted that except for the data set ID, other parameters in the above table are optional parameters.

For example, if the calculation parameters in the smart contract call transaction do not contain the calculation type ID, the smart contract is allowed to use the default calculation type to perform approximate calculation on the data set. If the calculation parameters do not contain an error value, the tolerable calculation error is 0. If the calculation parameters do not contain the confidence probability, the confidence probability is 100%, that is, the expected accuracy of the approximate calculation is 100%; in this case, the smart contract performs exact calculation on the data set and no longer performs approximate calculation.
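The optional parameters and their defaults described above can be sketched as a small Python data class. The class name `CalcParams`, the field names, and the method `requires_exact` are hypothetical; treating a zero error tolerance as a request for exact calculation is also an assumption of this sketch, since the text only states the exact-calculation fallback for a missing confidence probability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CalcParams:
    """Hypothetical container for the calculation parameters in the table."""

    data_set_id: str                             # required
    calc_type_id: Optional[str] = None           # None: contract default type
    error: float = 0.0                           # missing: tolerable error is 0
    confidence: float = 1.0                      # missing: 100% confidence
    sampling_algorithm_id: Optional[str] = None  # None: contract default

    def requires_exact(self) -> bool:
        # 100% confidence means exact calculation, per the text above.
        # Treating a zero error tolerance the same way is an assumption.
        return self.confidence >= 1.0 or self.error <= 0.0
```

With only the data set ID supplied, `requires_exact()` returns `True`, which matches the fallback to exact calculation described above.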

In an embodiment shown, when the blockchain node device calls the sampling logic contained in the smart contract to sample the data set corresponding to the data identifier, in order to prevent outlier data in the data set from affecting the final approximate calculation result, the data set can be divided into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and then only the data samples in the non-outlier data subset are sampled.

The outlier data in the data set can be computed by the smart contract.

In one embodiment shown, the sampling logic included in the smart contract may further include logic for performing outlier calculation on the data set. In this case, when the blockchain node device calls the sampling logic contained in the smart contract to sample the data set corresponding to the data identifier, it can specifically execute the outlier calculation logic on the data samples in the data set to determine the outlier data samples and non-outlier data samples contained in the data set, and then create an outlier data subset from the determined outlier data samples and a non-outlier data subset from the determined non-outlier data samples.

The outlier calculation on the data samples in the data set usually refers to using a statistical algorithm to identify data samples in the data set that differ markedly from the other data samples. The specific statistical method is not particularly limited in this specification. For example, the median of the values corresponding to the data samples in the data set can be calculated, and the data samples whose values deviate significantly from the median can then be selected as outlier data samples.
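The median-based example can be sketched in Python as follows. The specification does not say how "deviates significantly" is measured; this sketch assumes the median absolute deviation (MAD) as the yardstick, with a hypothetical `threshold` parameter, and the function name `split_outliers` is likewise an illustration only.

```python
import statistics


def split_outliers(values, threshold=3.0):
    """Split values into (outliers, non_outliers) by distance from the median.

    Assumption: a value is an outlier when its distance from the median
    exceeds `threshold` times the median absolute deviation (MAD).
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)

    if mad == 0:
        # At least half of the values equal the median; treat any other
        # value as an outlier.
        def is_outlier(v):
            return v != median
    else:
        def is_outlier(v):
            return abs(v - median) / mad > threshold

    outliers = [v for v in values if is_outlier(v)]
    non_outliers = [v for v in values if not is_outlier(v)]
    return outliers, non_outliers
```

For a list of user ages in which one entry is 120, the call separates that entry from the rest, so that only the remaining ages are sampled.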

Of course, in practical applications, the outlier data in the data set can also be manually labeled in advance. In this case, the sampling logic contained in the smart contract may specifically include logic for filtering outlier data from the data set. When the blockchain node device calls the sampling logic contained in the smart contract to sample the data set corresponding to the data identifier, it can specifically execute the outlier filtering logic on the data samples in the data set to determine the outlier data samples and non-outlier data samples contained in the data set, and then create an outlier data subset from the determined outlier data samples and a non-outlier data subset from the determined non-outlier data samples.
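When the outliers are labeled in advance, the filtering step reduces to a lookup. The following Python sketch assumes the data set is a mapping from sample identifier to value and that the labels arrive as a set of sample identifiers; both representations and the name `split_by_labels` are hypothetical.

```python
def split_by_labels(samples, outlier_ids):
    """Split {sample_id: value} into (outliers, non_outliers) by labels.

    `outlier_ids` is assumed to hold the identifiers labeled as outliers.
    """
    outliers = {k: v for k, v in samples.items() if k in outlier_ids}
    non_outliers = {k: v for k, v in samples.items() if k not in outlier_ids}
    return outliers, non_outliers
```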

In an embodiment shown, before the data samples in the non-outlier data subset are sampled, the number of samples to be drawn from the non-outlier data subset can be calculated first, and the data samples in the non-outlier data subset are then sampled according to the calculated number of samples.

In one embodiment shown, Hoeffding's inequality is generally used to describe an upper bound on the probability that a random variable deviates from its expected value. In the scenario of approximate calculation, the number of samples can be treated as the random variable, the error value of the approximate calculation as the deviation from the expected value, and the confidence probability of the approximate calculation as the upper bound on the probability. Therefore, in this specification, Hoeffding's inequality can be used to derive the mathematical relationship between the number of samples, the error value of the approximate calculation, and the confidence probability of the approximate calculation.

Among them, when Hoeffding's inequality is used to describe the mathematical relationship between the above-mentioned sampling number, the error value of the above-mentioned approximate calculation and the confidence probability of the above-mentioned approximate calculation, Hoeffding's inequality is expressed as the following formula:

In the above formula, H denotes Hoeffding's inequality; n_g denotes the number of samples; b_g and a_g denote the maximum value and minimum value of the data samples in the data set, respectively; δ denotes the confidence probability; ε_g denotes the error value corresponding to the approximate calculation; and N_g denotes the total number of data samples in the data set.

The mathematical relationship between the above-mentioned sampling number derived based on the above-mentioned formula, the error value of the above-mentioned approximate calculation and the confidence probability of the above-mentioned approximate calculation can be expressed by the following formula:
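For orientation, the textbook two-sided form of Hoeffding's inequality for the mean of n_g independent samples bounded in [a_g, b_g], and the sample size it implies, can be written as below. This is the standard statement, given here as an assumption for illustration: it does not involve N_g, and it writes the failure probability as 1 - δ on the reading that δ is the confidence probability as defined above. The formula used in the specification may differ, for example by a finite-population correction in N_g.

```latex
P\left( \left| \bar{X} - \mathbb{E}[\bar{X}] \right| \ge \varepsilon_g \right)
  \le 2 \exp\left( - \frac{2 n_g \varepsilon_g^{2}}{(b_g - a_g)^{2}} \right)

% Requiring the right-hand side to be at most 1 - \delta gives
n_g \ge \frac{(b_g - a_g)^{2}}{2 \varepsilon_g^{2}} \, \ln\frac{2}{1 - \delta}
```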

The above mathematical relationship can be maintained in the smart contract in advance. When the blockchain node device calls the smart contract to calculate the number of samples to be drawn from the non-outlier data subset, it can obtain, from the calculation parameters in the smart contract call transaction, the confidence probability δ and the error value ε_g corresponding to the approximate calculation, and then input the obtained confidence probability δ and error value ε_g into the maintained mathematical relationship to obtain the number of samples corresponding to the non-outlier data subset.
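A minimal Python sketch of such a sample-size calculation is given below. It assumes the textbook Hoeffding bound for a sample mean (sample size at least (b - a)^2 ln(2 / (1 - δ)) / (2 ε^2)), caps the result at the size of the subset, and falls back to the full subset when the error is 0 or the confidence is 100%. The function name and parameters are hypothetical, and the relationship maintained in the actual contract may differ.

```python
import math


def hoeffding_sample_size(error, confidence, value_min, value_max, population):
    """Number of samples for a mean estimate under the textbook Hoeffding bound.

    Assumption: `confidence` is the confidence probability, so the allowed
    failure probability is 1 - confidence.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if error <= 0.0 or confidence >= 1.0:
        # No tolerance for error: read every sample (exact calculation).
        return population

    value_range = value_max - value_min
    bound = value_range ** 2 * math.log(2.0 / (1.0 - confidence)) / (2.0 * error ** 2)
    # Never ask for more samples than the subset contains.
    return min(math.ceil(bound), population)
```

For example, with values between 0 and 100, an error of 1 and a confidence probability of 95%, the bound asks for 18445 samples, so sampling only pays off when the subset is considerably larger than that.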

It should be noted that the sampling method used to sample data samples in the non-outlier data subset is not particularly limited in this specification; for example, random sampling or stratified sampling may be used.

In the following embodiments, random sampling and stratified sampling will be used as examples to describe in detail the sampling process for non-outlier data samples in non-outlier data subsets.

In one embodiment, if random sampling is used for the data samples in the non-outlier data subset, the blockchain node device, when calling the smart contract to randomly sample the non-outlier data subset based on the calculated sampling number, can first obtain a random number used for random sampling, and then randomly sample the non-outlier data samples in the non-outlier data subset based on the obtained random number, to obtain the calculated number of data samples.

The random numbers are used to control the randomness of the non-outlier data samples sampled from the non-outlier data subset. In practical applications, the obtained random numbers can be used to determine which non-outlier data samples need to be sampled from the non-outlier data subset. For example, a random number can represent the sample identifier of the non-outlier data sample to be sampled; during random sampling, the non-outlier data sample whose sample identifier equals the value of the random number is extracted from the non-outlier data subset to complete the data sampling.
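The use of random numbers as sample identifiers can be sketched as follows. This is a minimal illustration under stated assumptions: the sample identifier is taken to be the position of the sample in a list, and a seeded pseudo-random generator stands in for whichever random number source is used. The function name is hypothetical.

```python
import random


def random_sample(samples, sample_count, seed):
    """Pick sample_count distinct samples; each random number is a sample index."""
    if sample_count > len(samples):
        raise ValueError("sample_count exceeds the size of the subset")
    # Seeded generator: the same seed reproduces the same draw.
    rng = random.Random(seed)
    # Distinct random indices, used here as the sample identifiers.
    identifiers = rng.sample(range(len(samples)), sample_count)
    return [samples[i] for i in identifiers]


if __name__ == "__main__":
    subset = [10.5, 11.0, 9.8, 10.1, 10.9, 9.5]
    print(random_sample(subset, 3, seed=42))
```

Seeding the generator is a design choice for this sketch: it lets every node that executes the same call with the same seed arrive at the same sampled data.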

It should be noted that the random numbers can be generated on the blockchain or obtained from outside the chain; the specific method of obtaining them is not particularly limited in this specification.

The following are several specific methods for obtaining random numbers shown in this specification:

In one way shown, a random function for generating random numbers can be pre-deployed on the blockchain. For example, in practical applications, the random function can be deployed on the blockchain as an independent smart contract, or deployed as execution logic contained in the smart contract used for approximate calculation. In this case, a random number can be generated on the blockchain by calling the random function.

In another way shown, the blockchain node device can be equipped with a Trusted Execution Environment (TEE). A random number seed for generating random numbers can be maintained in this trusted execution environment in advance. In this case, random numbers can be generated in the trusted execution environment based on the random number seed.

In the third way shown, a target data parameter that can serve as a random number seed can be obtained from the data parameters related to the data maintained by the smart contract used for approximate calculation, and a random number can then be generated in the smart contract based on the obtained target data parameter. For example, unique parameters maintained by the smart contract, such as the hash value of a historical block and the generation timestamp of a historical block, can be used as a random number seed to calculate the random number in the smart contract.

In the fourth way shown, the random numbers described above can be generated off-chain. In this case, the above-mentioned smart contract can also obtain the random number generated outside the chain through the oracle program corresponding to the smart contract.

In the fifth way shown, a random number seed for generating the random numbers can be generated outside the chain. In this case, the smart contract can obtain the random number seed generated outside the chain through the oracle program corresponding to the smart contract, and then generate a random number in the smart contract based on the obtained random number seed.

In the sixth method shown, the random number seed generated outside the chain can also be carried as a calculation parameter in the smart contract call transaction. In this case, the random number seed generated off-chain included in the smart contract call transaction can be obtained, and then a random number can be generated in the smart contract based on the obtained random number seed.

The above are several common implementation methods for obtaining random numbers. It should be emphasized that, in practical applications, methods other than those listed above can obviously also be used to obtain random numbers; they are not listed one by one in this specification.
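As an illustration of the seed-based ways listed above, the following sketch derives a seed from a historical block hash and its generation timestamp and then produces random numbers from it. The hashing scheme, the function names, and the use of a general-purpose pseudo-random generator are all assumptions made for the example; the specification does not prescribe them.

```python
import hashlib
import random


def seed_from_block(block_hash_hex, block_timestamp):
    """Hypothetical seed derivation: SHA-256 over block hash and timestamp."""
    payload = bytes.fromhex(block_hash_hex) + block_timestamp.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def random_numbers(seed, count, upper_bound):
    """Generate count integers in [0, upper_bound) from the given seed."""
    rng = random.Random(seed)
    return [rng.randrange(upper_bound) for _ in range(count)]


if __name__ == "__main__":
    seed = seed_from_block("ab" * 32, 1648598400)
    print(random_numbers(seed, 5, 1000))
```

The same seed always yields the same sequence, so a seed obtained from an oracle program or carried in the calculation parameters could be fed to `random_numbers` in the same way. Python's `random` module is not a cryptographically secure generator, so this sketch only illustrates the data flow.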

In one embodiment, if stratified sampling is used for the data samples in the non-outlier data subset, the non-outlier data subset usually needs to be divided into several buckets, and data is then sampled from these buckets. Therefore, when the blockchain node device calls the smart contract to calculate the sampling numbers required for stratified sampling, it can obtain the confidence probability δ corresponding to the approximate calculation and the error value ε_k corresponding to each bucket from the calculation parameters carried by the smart contract call transaction, and then input the confidence probability δ and the error value ε_k of each bucket into the maintained mathematical relationship, to obtain the sampling number corresponding to each bucket of the non-outlier data subset.

It should be noted that, when performing stratified sampling on the non-outlier data subset, the number K of buckets to be divided and the error value ε_k corresponding to each bucket can be specified by the calculation initiator and carried as calculation parameters in the smart contract call transaction. In this case, in addition to the total error ε_g specified by the calculation initiator for the approximate calculation of the data set, the calculation parameters also need to carry the K error values ε_k corresponding to the buckets.

Alternatively, when performing stratified sampling on the non-outlier data subset, the number K of buckets to be divided and the error value ε_k corresponding to each bucket can be optimal values calculated independently on the chain by the smart contract.

In an embodiment shown, the number K of buckets to be divided and the error value ε _k corresponding to each bucket can be the optimal value solved by the above-mentioned smart contract using the optimization solution method.

In this case, when the blockchain node device calls the sampling logic contained in the smart contract to perform stratified sampling on the non-outlier data samples in the non-outlier data subset, it can first use the optimization solution method to solve for the optimal number K of buckets to be divided for stratified sampling of the non-outlier data subset and the optimal error value ε_k corresponding to each bucket. It then inputs the confidence probability δ corresponding to the approximate calculation, taken from the calculation parameters in the smart contract call transaction, together with the solved optimal error value ε_k of each bucket, into the maintained mathematical relationship, to obtain the optimal sampling number corresponding to each bucket of the non-outlier data subset. Then, based on the solved optimal number K and the optimal sampling numbers, stratified sampling can be performed on the data samples in the non-outlier data subset.

It should be noted that the specific type of optimization solution method used by the smart contract is not particularly limited in this specification. In practical applications, those skilled in the art can flexibly adopt different optimization algorithms based on actual needs; for example, commonly used optimization algorithms such as gradient descent can be used.

Among them, for the optimization solution method, it is usually necessary to set a clear constraint. In practical applications, the above constraints can usually be set based on specific solution goals.

In the scenario of stratified sampling of the above-mentioned non-outlier data subsets, the optimal solution goals may include finding the optimal number of divided buckets, finding the optimal error value corresponding to each bucket, and so on. Then, in practical applications, the above constraints can be set for the above optimization solution method based on the above optimization objectives.

In an embodiment shown, based on the above solution objective, setting constraints for the above optimization solution method may specifically be:

A weighted average calculation is performed on the error values corresponding to each bucket, and the weighted average error value obtained is the smallest and no greater than the total error value corresponding to the approximate calculation for the above-mentioned non-outlier data subset.

For example, the above constraints can be expressed as the following formula:

In the above formula, ε_g represents the total error value corresponding to the approximate calculation for the non-outlier data subset; N_k represents the number of samples sampled from the k-th bucket; and N represents the total number of samples sampled from the non-outlier data subset.
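One reading of the constraint described in words above, written with the symbols just defined, is given below. It is offered as an assumption consistent with that description (weights N_k/N, K buckets, ε_k the error value of the k-th bucket) and is not necessarily the exact formula of this specification.

```latex
\[
\min_{K,\;\varepsilon_1,\dots,\varepsilon_K}\;
\sum_{k=1}^{K} \frac{N_k}{N}\,\varepsilon_k
\qquad \text{subject to} \qquad
\sum_{k=1}^{K} \frac{N_k}{N}\,\varepsilon_k \;\le\; \varepsilon_g ,
\qquad N = \sum_{k=1}^{K} N_k .
\]
```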

The following describes the specific algorithm flow using the above constraints to solve the optimal number of buckets required for stratified sampling and the optimal error value of each bucket through the accompanying drawings and specific embodiments.

Please refer to Figure 2. Figure 2 is a flow chart of an optimization solution method shown in this specification, including the following execution steps:

Step 201: Initialize the value of i, where i represents the initial number of samples included in each bucket. After step 201, the following steps are executed iteratively:

Step 202: Adjust the value of i;

The adjustment range of the i value can be set flexibly and is not specifically limited in this specification.

Step 203: Divide the non-outlier data subset into several buckets each containing i samples;

Step 204: Use the confidence probability δ (i.e., the confidence probability δ included in the calculation parameters carried by the smart contract call transaction) and the adjusted i value (i.e., the number of samples corresponding to each bucket) as calculation parameters, input them into the mathematical relationship for calculation to obtain the error value corresponding to each bucket, and perform a weighted average calculation on the error values corresponding to the buckets to obtain the weighted average error value;

Among them, it should be noted that if it is the first round of iteration, after step 204 is executed, steps 202 to 204 will be executed again to execute the second round of iteration.

Step 205: Determine whether the weighted average error value is not greater than the total error value (that is, the error value corresponding to the approximate calculation contained in the calculation parameters carried by the smart contract call transaction) and is less than the weighted average error value calculated in the previous round of iteration (that is, the weighted average error value calculated based on the i value before the adjustment in this round); if not, re-execute steps 202 to 205 as the next round of iteration, and repeat this process until the optimization algorithm converges and a weighted average error value that satisfies the above constraints is calculated, at which point the iteration stops.

Step 206: After stopping the iteration, obtain the optimal i value when the calculated weighted average error value satisfies the above constraints;

Step 207: Determine, based on the optimal i value, the optimal number of buckets to be divided when performing stratified sampling on the non-outlier data subset, and input the confidence probability and the optimal i value into the mathematical relationship again for calculation, to obtain the optimal error value corresponding to each bucket.

It should be noted that the constraints set in the above embodiment for the optimization solution method based on the solution objectives are only one specific implementation. In practical applications, other forms of constraints can obviously also be set for the optimization solution method based on the solution objectives.

In this specification, after the smart contract uses the optimization solution method to solve for the optimal number of buckets to be divided and the optimal error value corresponding to each bucket, stratified sampling can be performed on the non-outlier data subset based on the optimal number and the optimal error value.
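The iteration in steps 201 to 207 can be sketched as follows. This is a rough illustration under several assumptions: the maintained mathematical relationship is replaced by the classical Hoeffding form solved for the error value, the value range of the whole subset is used for every bucket, i is adjusted by a fixed positive step, and the search gives up when i reaches the size of the subset. All function names and defaults are hypothetical.

```python
import math


def hoeffding_error(delta, n, value_range):
    # Assumed stand-in for the maintained relationship, solved for the error.
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def weighted_average_error(values, i, delta):
    """Steps 203-204: buckets of i samples, per-bucket error, weighted average."""
    total = len(values)
    value_range = max(values) - min(values)
    # Bucket sizes only; the last bucket may hold fewer than i samples.
    sizes = [min(i, total - start) for start in range(0, total, i)]
    error = sum(size / total * hoeffding_error(delta, size, value_range)
                for size in sizes)
    return error, len(sizes)


def solve_bucket_size(values, delta, total_error, i_init=1, step=1):
    """Return (optimal i, bucket count, weighted error), or None if not found."""
    i = i_init                                    # Step 201
    i += step                                     # first round: Steps 202-204
    previous, _ = weighted_average_error(values, i, delta)
    while i + step <= len(values):
        i += step                                 # Step 202
        error, buckets = weighted_average_error(values, i, delta)  # Steps 203-204
        if error <= total_error and error < previous:              # Step 205
            return i, buckets, error              # Steps 206-207
        previous = error
    return None


if __name__ == "__main__":
    data = [float(v) for v in range(100)]
    print(solve_bucket_size(data, delta=0.05, total_error=20.0))
```

Returning `None` when no i satisfies the constraint is a choice made for this sketch; the specification only describes stopping once the constraint is satisfied.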

In one embodiment, when the smart contract performs stratified sampling on the non-outlier data subset based on the optimal number and the optimal error value, the non-outlier data subset can first be divided into several buckets according to the optimal number; for example, assuming that the optimal number is K, the non-outlier data subset can be divided into K buckets. Then, the data samples in each divided bucket can be sampled according to the optimal sampling number.

Among them, the specific sampling method used to sample the data samples in each bucket according to the above-mentioned optimal sampling number is not particularly limited in this specification.

For example, if random sampling is used to sample the data samples in each bucket according to the optimal sampling number, a random number used for random sampling can first be obtained, and the non-outlier data samples in each bucket can then be randomly sampled based on the obtained random number, to obtain non-outlier data samples corresponding to the calculated optimal sampling number. For the specific way of obtaining the random number, reference may be made to the description of the previous embodiment, which is not repeated here.
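A minimal sketch of dividing a subset into K buckets and randomly sampling each bucket is given below. It assumes that buckets are contiguous ranges of the sorted values, that the same sampling number applies to every bucket, and that a seeded pseudo-random generator supplies the random numbers; none of these choices is fixed by this specification, and the function name is hypothetical.

```python
import random


def stratified_sample(values, bucket_count, per_bucket_count, seed):
    """Split sorted values into bucket_count buckets and sample each one."""
    if not 0 < bucket_count <= len(values):
        raise ValueError("bucket_count must be between 1 and len(values)")
    ordered = sorted(values)
    total = len(ordered)
    rng = random.Random(seed)
    sampled = []
    for k in range(bucket_count):
        # Integer boundaries give exactly bucket_count non-empty buckets.
        bucket = ordered[k * total // bucket_count:(k + 1) * total // bucket_count]
        # Never request more samples than the bucket contains.
        sampled.append(rng.sample(bucket, min(per_bucket_count, len(bucket))))
    return sampled


if __name__ == "__main__":
    print(stratified_sample(list(range(20)), bucket_count=4,
                            per_bucket_count=2, seed=7))
```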

Step 106: Further call the calculation logic contained in the smart contract call transaction to perform accurate calculation on the outlier data samples in the outlier data subset and approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combine the results of the accurate calculation and the approximate calculation as the approximate calculation result for the data set.

In this specification, after data sampling is completed for the non-outlier data samples in the non-outlier data subset, approximate calculation can be further performed on the sampled non-outlier data samples.

When performing approximate calculation on the sampled non-outlier data samples, either the calculation type specified by the calculation initiator or the default calculation type supported by the smart contract can be used; this is not particularly limited in this specification.

For example, in one embodiment shown, the sampling algorithm ID may also be included in the above-mentioned smart contract call transaction. The sampling algorithm ID may be specifically used to indicate the calculation type specified by the calculation initiator to perform approximate calculation on the above-mentioned data set.

In this case, when performing approximate calculation on the sampled non-outlier data samples, the sampling algorithm ID included in the smart contract call transaction can be obtained, and approximate calculation can then be performed on the sampled non-outlier data samples according to the calculation type indicated by the sampling algorithm ID.

Of course, if the smart contract call transaction does not include the sampling algorithm ID, that is, the calculation initiator does not specify a calculation type for the approximate calculation of the data set, approximate calculation can also be performed on the sampled non-outlier data samples based on the default calculation type supported by the smart contract.

The outlier data samples in the outlier data subset need not be sampled. Moreover, since the results of approximate calculation on outlier data samples usually deviate from the approximate calculation result of the data set, accurate calculation rather than approximate calculation is performed on the outlier data samples in the outlier data subset.

When performing accurate calculation on the outlier data samples in the outlier data subset, either the calculation type specified by the calculation initiator or the default calculation type supported by the smart contract can be used; this is not particularly limited in this specification.

For example, when performing accurate calculation on the outlier data samples in the outlier data subset, the sampling algorithm ID included in the smart contract call transaction can be obtained, and accurate calculation can then be performed on the outlier data samples in the outlier data subset according to the calculation type indicated by the sampling algorithm ID. Of course, if the smart contract call transaction does not include the sampling algorithm ID, accurate calculation can also be performed on the outlier data samples in the outlier data subset based on the default calculation type supported by the smart contract.

It should be noted that the calculation types corresponding to the above approximate calculations and exact calculations are not particularly limited in this specification. For example, it may include summing, averaging, etc., which will not be listed one by one in this specification.
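Selecting a calculation type from an identifier, with a fallback to a default, can be sketched as a simple lookup. The parameter key `algorithm_id`, the identifier strings `sum` and `avg`, and the choice of summing as the default are hypothetical and serve only to illustrate the selection logic.

```python
DEFAULT_ALGORITHM_ID = "sum"  # hypothetical default calculation type

CALCULATIONS = {
    "sum": lambda xs: sum(xs),
    "avg": lambda xs: sum(xs) / len(xs),
}


def resolve_calculation(call_parameters):
    """Return the calculation indicated by the call, or the default one."""
    algorithm_id = call_parameters.get("algorithm_id", DEFAULT_ALGORITHM_ID)
    if algorithm_id not in CALCULATIONS:
        raise ValueError(f"unsupported calculation type: {algorithm_id}")
    return CALCULATIONS[algorithm_id]


if __name__ == "__main__":
    print(resolve_calculation({})([1, 2, 3]))
    print(resolve_calculation({"algorithm_id": "avg"})([2, 4]))
```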

After completing the precise calculation for the outlier data samples in the outlier data subset and the approximate calculation for the non-outlier data samples sampled in the non-outlier data subset, the results of the above precise calculation and the above approximate calculation can be combined, As the final approximate calculation result for the above data set.
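For a summing or averaging calculation, combining the two results can be sketched as below. The scaling estimator used for the non-outlier part (sample mean multiplied by the size of the non-outlier data subset) is an assumption chosen for illustration; this specification does not state how the approximate result is formed. Function and parameter names are hypothetical.

```python
def combined_sum(outliers, sampled, non_outlier_total):
    """Accurate sum over outliers plus a scaled estimate for non-outliers."""
    exact_part = sum(outliers)
    if sampled:
        # Assumed estimator: sample mean times size of the non-outlier subset.
        approximate_part = non_outlier_total * (sum(sampled) / len(sampled))
    else:
        approximate_part = 0.0
    return exact_part + approximate_part


def combined_mean(outliers, sampled, non_outlier_total):
    """Approximate mean of the whole data set from the combined sum."""
    count = len(outliers) + non_outlier_total
    return combined_sum(outliers, sampled, non_outlier_total) / count


if __name__ == "__main__":
    print(combined_sum([1000.0], [1.0, 3.0], non_outlier_total=10))
    print(combined_mean([1000.0], [1.0, 3.0], non_outlier_total=10))
```

Because the outliers enter the result exactly, a single extreme value cannot be missed or over-weighted by the sampling step.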

In one embodiment shown, the above-mentioned smart contract for performing approximate calculations may specifically be a privacy smart contract deployed in a trusted execution environment mounted on a blockchain node device.

In this scenario, the calculation parameters in the smart contract call transaction and the obtained data samples in the data set are usually encrypted in advance. Before calling the sampling logic contained in the smart contract to divide the data set into an outlier data subset and a non-outlier data subset, the blockchain node device can decrypt the calculation parameters and the obtained data samples in the data set, respectively, in the trusted execution environment.

For example, the trusted execution environment may be assigned an asymmetric key pair for encrypting and decrypting data; the private key of the asymmetric key pair is stored in the trusted execution environment, and the public key is published to the calculation initiator. The calculation parameters in the smart contract call transaction and the obtained data samples in the data set can be encrypted in advance based on the public key. Before calling the sampling logic contained in the smart contract to sample the data set corresponding to the data identifier, the blockchain node device can use the maintained private key to decrypt the calculation parameters and the obtained data samples in the data set, respectively, in the trusted execution environment.

In the above technical solution, when a smart contract is called to perform approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time consumed by the approximate calculation without sacrificing the accuracy of the approximate calculation result, thereby improving the computing efficiency of the approximate calculation on the data set.

For example, taking the case where business-related data sets are pre-stored on the blockchain, after the data sampling mechanism is introduced in the smart contract, the total time taken by the smart contract to perform calculation on a business-related data set can usually be expressed by the following formula:

In the above formula, n_g represents the number of data samples sampled from the data set, and N_g represents the total number of data samples in the data set. Since n_g is usually an order of magnitude smaller than N_g, the time consumed by calculating the data set through the smart contract is also reduced by an order of magnitude after the sampling mechanism is introduced. It can be seen that introducing a sampling mechanism in the smart contract can significantly shorten the time consumed by calculation on the data set and improve the efficiency of the approximate calculation on the data set.

Corresponding to the above method embodiments, this application also provides device embodiments.

The embodiments of the device in this specification can be applied to electronic equipment. The device embodiments may be implemented by software, or may be implemented by hardware or a combination of software and hardware. Taking software implementation as an example, as a logical device, it is formed by reading the corresponding computer program instructions in the non-volatile memory into the memory and running them through the processor of the electronic device where it is located.

At the hardware level, Figure 3 is a hardware structure diagram of the electronic device in which the device of this specification is located. In addition to the processor, memory, network interface, and non-volatile memory shown in Figure 3, the electronic device in which the device of the embodiment is located may also include other hardware according to its actual functions, which will not be described again.

FIG. 4 is a block diagram of a smart contract-based computing device according to an exemplary embodiment of this specification.

Please refer to Figure 4. The smart contract-based computing device 40 can be applied in the electronic device shown in Figure 3. A smart contract for performing approximate calculations is deployed on the blockchain. The device 40 includes:

The receiving module 401 receives a smart contract call transaction initiated by the calculation initiator for the smart contract; wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation; the calculation parameters include the data identifier of the data set participating in the approximate calculation;

The sampling module 402, in response to the smart contract call transaction, calls the sampling logic contained in the smart contract call transaction, divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;

The calculation module 403 further calls the calculation logic contained in the smart contract call transaction to perform accurate calculations on the outlier data samples in the outlier data subset, and on the non-outlier samples sampled from the non-outlier data subset. Approximate calculations are performed on the data samples, and the results of the exact calculations and the approximate calculations are combined to serve as approximate calculation results for the data set.

In this embodiment, the device 40 further includes:

The acquisition module 404 (not shown in Figure 4), before the sampling module 402 divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, obtains the data set corresponding to the data identifier stored on the blockchain; or obtains, through the oracle program corresponding to the smart contract, the data set corresponding to the data identifier from an off-chain database connected to the blockchain.

In this embodiment, the sampling module 402:

Perform outlier data calculations on the data samples in the data set corresponding to the data identifier to determine outlier data samples and non-outlier data samples contained in the data set;

The outlier data subset is created based on the outlier data samples, and a non-outlier data subset is created based on the non-outlier data samples.
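The division into an outlier data subset and a non-outlier data subset can be sketched as follows. The passage above does not fix the outlier rule, so the interquartile-range fence used here is only a stand-in assumption, and the function name and the multiplier 1.5 are hypothetical.

```python
import statistics


def split_outliers(values, fence=1.5):
    """Return (outlier subset, non-outlier subset) using an assumed IQR rule."""
    # Quartiles of the data; requires at least two data samples.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - fence * iqr, q3 + fence * iqr
    outliers = [v for v in values if v < low or v > high]
    non_outliers = [v for v in values if low <= v <= high]
    return outliers, non_outliers


if __name__ == "__main__":
    print(split_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]))
```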

In this embodiment, the calculation parameters include a confidence probability corresponding to the approximate calculation and a total error value corresponding to the approximate calculation; the confidence probability represents the accuracy of the approximate calculation; the smart contract maintains a mathematical relationship, derived based on Hoeffding's inequality, among three values: the confidence probability corresponding to the approximate calculation, the error value corresponding to the approximate calculation, and the sampling number corresponding to the data set participating in the approximate calculation;

The sampling module 402 further:

Before sampling the data samples in the non-outlier data subset, input the confidence probability corresponding to the approximate calculation and the error value corresponding to the approximate calculation into the mathematical relationship for calculation, to obtain the sampling number corresponding to the non-outlier data subset.

In this embodiment, the mathematical relationship is expressed using the following formula:

In the above formula, n_g represents the sampling number; b_g and a_g respectively represent the maximum value and the minimum value of the data samples in the data set; δ represents the confidence probability; ε_g represents the error value; and N_g represents the total number of data samples in the data set.

In this embodiment, the sampling module 402:

The non-outlier data samples in the non-outlier data subset are sampled according to the calculated sampling number.

In this embodiment, sampling for non-outlier data samples in the non-outlier data subset includes random sampling;

The sampling module 402 further:

Get the random number used for random sampling;

Randomly sample non-outlier data samples in the non-outlier data subset based on the random number to obtain non-outlier data samples corresponding to the calculated sampling number.

In this embodiment, sampling for non-outlier data samples in the non-outlier data subset includes stratified sampling;

The sampling module 402 further:

Use the optimization solution method to solve for the optimal number of data subsets to be divided when performing stratified sampling on the non-outlier data subset, and the optimal error value corresponding to each divided data subset;

Input the confidence probability corresponding to the approximate calculation and the optimal error value corresponding to each data subset into the mathematical relationship for calculation, to obtain the optimal sampling number corresponding to each data subset;

Divide the non-outlier data subset into several data subsets according to the optimal number, and sample the non-outlier data samples in each data subset according to the optimal sampling number.

In this embodiment, the constraints adopted by the optimization solution method include: the weighted average error value obtained by performing a weighted average calculation on the error values corresponding to the data subsets is the smallest and is not greater than the total error value;

The sampling module 402 further performs the following steps:

Step A: Adjust the value of the initialized i;

Step B: Divide the non-outlier data subset into several data subsets each containing i samples;

Step C: Use the confidence probability and the adjusted i value as calculation parameters, input them into the mathematical relationship for calculation to obtain the error value corresponding to each of the data subsets, and perform a weighted average calculation on the error values corresponding to the data subsets to obtain a weighted average error value;

Step D: Determine whether the weighted average error value is not greater than the total error value and is less than the weighted average error value calculated based on the i value before this adjustment; if not, re-execute Step A to Step D until the calculated weighted average error value satisfies the constraint, then stop the iteration and obtain the optimal i value for which the weighted average error value satisfies the constraint;

Step E: Determine, based on the optimal i value, the optimal number of data subsets to be divided when performing stratified sampling on the non-outlier data subset, and input the confidence probability and the optimal i value into the mathematical relationship for calculation, to obtain the optimal error value corresponding to each of the data subsets.

In this embodiment, the sampling module further:

Get the random number used for random sampling;

Randomly sample non-outlier data samples in each data subset based on the random number to obtain non-outlier data samples corresponding to the calculated optimal sampling number.

In this embodiment, the sampling module further performs any of the following:

Call the random function deployed on the blockchain to generate a random number for random sampling;

Generate random numbers in the trusted execution environment based on the random number seeds maintained in the trusted execution environment carried in the node device;

Obtain the target data parameter as the random number seed from the data parameters related to the data maintained by the smart contract, and generate random numbers for random sampling in the smart contract based on the acquired target data parameters;

Obtain the random numbers generated outside the chain for random sampling through the oracle program corresponding to the smart contract;

Through the oracle program corresponding to the smart contract, obtain a random number seed generated outside the chain, and generate random numbers for random sampling in the smart contract based on the obtained random number seed;

Obtain the random number seed generated outside the chain that is included in the calculation parameters, and generate random numbers for random sampling in the smart contract based on the random number seed.
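One of the options above uses a data parameter maintained by the smart contract as the random number seed. A minimal sketch of that idea in Python follows; hashing the parameter with SHA-256 and truncating to 64 bits is an assumption of this example, and the parameter value shown in the usage note is hypothetical.

```python
import hashlib
import random

def seed_from_parameter(parameter: bytes) -> int:
    """Derive a deterministic 64-bit seed from a contract-maintained data parameter."""
    digest = hashlib.sha256(parameter).digest()
    return int.from_bytes(digest[:8], "big")

def rng_from_parameter(parameter: bytes) -> random.Random:
    """Build a seeded generator so that all nodes derive the same random numbers."""
    return random.Random(seed_from_parameter(parameter))
```

For example, rng_from_parameter(b"data-set-001").random() yields the same value on every node that holds the same parameter. A seed derived from publicly visible data is predictable, so this variant trades unpredictability for simplicity.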

In this embodiment, the calculation parameters also include algorithm identifiers;

The calculation module 403:

Perform accurate calculations on outlier data samples in the outlier data subset according to the calculation type indicated by the algorithm identifier;

Perform approximate calculations on the non-outlier data samples sampled from the non-outlier data subset according to the calculation type indicated by the algorithm identifier.
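The way an algorithm identifier might select the calculation type, and the way the exact and approximate results might be combined, can be sketched in Python as follows. The identifiers "sum" and "avg", the function name, and the scaling of the sample mean by the non-outlier count are assumptions for illustration; the specification does not enumerate the supported calculation types here.

```python
def combine(outliers, sampled, non_outlier_total, algorithm):
    """Combine an exact result over outliers with an estimate from sampled non-outliers.

    outliers: all outlier values (computed exactly)
    sampled: the sampled non-outlier values (non-empty)
    non_outlier_total: size of the full non-outlier subset
    algorithm: hypothetical identifier, "sum" or "avg"
    """
    exact_sum = sum(outliers)
    # Estimate the non-outlier sum by scaling the sample mean to the subset size.
    approx_sum = sum(sampled) / len(sampled) * non_outlier_total
    if algorithm == "sum":
        return exact_sum + approx_sum
    if algorithm == "avg":
        return (exact_sum + approx_sum) / (len(outliers) + non_outlier_total)
    raise ValueError(f"unsupported algorithm identifier: {algorithm}")
```

Computing the outliers exactly keeps extreme values from distorting the estimate, while the sampled portion bounds the work done on the bulk of the data.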

In this embodiment, the smart contract is deployed in a trusted execution environment mounted on the node device; the calculation parameters and the data samples in the data set have been encrypted in advance;

The sampling module 402 further:

Before dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, decrypt the calculation parameters and the obtained data samples in the data set, respectively, in the trusted execution environment.

The systems, devices, modules or units described in the above embodiments may be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, a laptop, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet, a wearable device, or a combination of any of these devices.

In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

Memory may include non-permanent storage in computer-readable media, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.

Computer-readable media includes both persistent and non-persistent, removable and non-removable media that can implement information storage by any method or technology. Information may be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by computing devices. As defined herein, computer-readable media does not include transitory media, such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "includes," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that includes a list of elements includes not only those elements but also other elements that are not expressly listed or that are inherent to the process, method, article, or apparatus. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of additional identical elements in a process, method, article, or apparatus that includes the stated element.

The foregoing describes specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desired results. Additionally, the processes depicted in the figures do not necessarily require the specific order shown, or sequential order, to achieve desirable results. Multitasking and parallel processing are also possible or may be advantageous in certain implementations.

The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to limit the one or more embodiments of this specification. As used in one or more embodiments of this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. It will also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.

It should be understood that although one or more embodiments of this specification may use the terms first, second, third, etc. to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other. For example, without departing from the scope of one or more embodiments of this specification, the first information may also be called second information, and similarly, the second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when," "upon," or "in response to determining."

The above are only preferred embodiments of one or more embodiments of this specification, and are not intended to limit one or more embodiments of this specification. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of one or more embodiments of this specification shall be included in the scope of protection of one or more embodiments of this specification.

Claims

A calculation method based on smart contracts, applied to node devices in a blockchain where smart contracts for performing approximate calculations are deployed, and the method includes:

Receive a smart contract call transaction initiated by the calculation initiator for the smart contract; wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation; the calculation parameters include the data identifier of the data set participating in the approximate calculation;

In response to the smart contract call transaction, calling the sampling logic included in the smart contract call transaction, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples, and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset;

Further call the calculation logic contained in the smart contract call transaction to perform exact calculations on the outlier data samples in the outlier data subset, perform approximate calculations on the non-outlier data samples sampled from the non-outlier data subset, and combine the results of the exact calculation and the approximate calculation as an approximate calculation result for the data set.
The method of claim 1, wherein, before the data set corresponding to the data identifier is divided into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, the method further includes:

Obtain the data set corresponding to the data identifier stored on the blockchain; or,

Through the oracle program corresponding to the smart contract, the data set corresponding to the data identifier is obtained from the off-chain database connected to the blockchain.
According to the method of claim 2, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples includes:

Perform outlier data calculations on the data samples in the data set corresponding to the data identifier to determine outlier data samples and non-outlier data samples contained in the data set;

The outlier data subset is created based on the outlier data samples, and a non-outlier data subset is created based on the non-outlier data samples.
The method according to claim 3, wherein the calculation parameters include a confidence probability corresponding to the approximate calculation and a total error value corresponding to the approximate calculation; the confidence probability represents the accuracy of the approximate calculation; the smart contract maintains a mathematical relationship, derived based on Hoeffding's inequality, among the confidence probability corresponding to the approximate calculation, the error value corresponding to the approximate calculation, and the sampling number for the data set participating in the approximate calculation;

Before sampling the data samples in the non-outlier data subset, it also includes:

The confidence probability corresponding to the approximate calculation and the error value corresponding to the approximate calculation are input into the mathematical relationship for calculation, and the number of samples corresponding to the non-outlier data subset is obtained.
According to the method of claim 4, the mathematical relationship is expressed using the following formula:

[Formula not reproduced in the source text]

In the above formula, n_g represents the sampling number; b_g and a_g represent the maximum and minimum values of the data samples in the data set, respectively; δ represents the confidence probability; ε_g represents the error value; N_g represents the total number of data samples in the data set.
The method of claim 3, sampling non-outlier data samples in the non-outlier data subset includes:

The non-outlier data samples in the non-outlier data subset are sampled according to the calculated sampling number.
The method of claim 6, wherein sampling of non-outlier data samples in the non-outlier data subset includes random sampling;

Sampling the non-outlier data samples in the non-outlier data subset according to the calculated sampling number includes:

Get the random number used for random sampling;

Randomly sample non-outlier data samples in the non-outlier data subset based on the random number to obtain non-outlier data samples corresponding to the calculated sampling number.
The method of claim 6, wherein sampling of non-outlier data samples in the non-outlier data subset includes stratified sampling;

Sampling the non-outlier data samples in the non-outlier data subset according to the calculated sampling number includes:

Use the optimization solution method to find the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and the optimal error value corresponding to each divided data subset;

Input the confidence probability corresponding to the approximate calculation and the optimal error value corresponding to each data subset into the mathematical relationship to obtain the optimal sampling number corresponding to each data subset;

Divide the data set into several data subsets according to the optimal number, and sample the non-outlier data samples in each data subset according to the optimal sampling number.
The method according to claim 8, wherein the constraints adopted by the optimization solution method include: the weighted average error value obtained by performing a weighted average calculation on the error values corresponding to each data subset is the smallest, and is not greater than the total error value;

Using the optimization solution method to find the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and the optimal error value corresponding to each divided data subset, includes:

Step A: Adjust the initialized value of i;

Step B: Divide the non-outlier data subset into several data subsets, each containing i samples;

Step C: Input the confidence probability and the adjusted i value, as calculation parameters, into the mathematical relationship to obtain the error value corresponding to each of the data subsets, and compute a weighted average of these error values to obtain a weighted average error value;

Step D: Determine whether the weighted average error value is not greater than the total error value and is less than the weighted average error value calculated from the i value before this adjustment; if not, re-execute Steps A to D above, and stop iterating once the calculated weighted average error value satisfies the constraint condition, taking the i value at that point as the optimal i value;

Step E: Based on the optimal i value, determine the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and input the confidence probability and the optimal i value into the mathematical relationship to obtain the optimal error value corresponding to each of the data subsets.
The method according to claim 8, wherein sampling non-outlier data samples in each data subset according to the optimal sampling number includes:

Get the random number used for random sampling;

Randomly sample non-outlier data samples in each data subset based on the random number to obtain non-outlier data samples corresponding to the calculated optimal sampling number.
The method according to claim 7 or 10, said obtaining random numbers used for random sampling includes any of the following:

Call the random function deployed on the blockchain to generate a random number for random sampling;

Generate random numbers in the trusted execution environment based on the random number seeds maintained in the trusted execution environment carried in the node device;

Obtain the target data parameter as the random number seed from the data parameters related to the data maintained by the smart contract, and generate random numbers for random sampling in the smart contract based on the acquired target data parameters;

Obtain the random numbers generated outside the chain for random sampling through the oracle program corresponding to the smart contract;

Through the oracle program corresponding to the smart contract, obtain a random number seed generated outside the chain, and generate random numbers for random sampling in the smart contract based on the obtained random number seed;

Obtain the random number seed generated outside the chain that is included in the calculation parameters, and generate random numbers for random sampling in the smart contract based on the random number seed.
The method according to claim 1, the calculation parameters further include an algorithm identifier;

Perform precise calculations on outlier data samples in the outlier data subset, including:

Perform accurate calculations on outlier data samples in the outlier data subset according to the calculation type indicated by the algorithm identifier;

Performing approximate calculations on non-outlier data samples sampled from the non-outlier data subset includes:

According to the calculation type indicated by the algorithm identifier, approximate calculation is performed on the non-outlier data samples sampled from the non-outlier data subset.
According to the method of claim 1, the smart contract is deployed in a trusted execution environment mounted on the node device; the calculation parameters and data samples in the data set have been encrypted in advance;

Before dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples, and a non-outlier data subset composed of a number of non-outlier data samples, the method further includes:

The calculation parameters and the obtained data samples in the data set are decrypted respectively in the trusted execution environment.
A computing device based on smart contracts, applied to node equipment in a blockchain. Smart contracts for performing approximate calculations are deployed on the blockchain. The device includes:

A receiving module that receives a smart contract call transaction initiated by the calculation initiator for the smart contract; wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation; the calculation parameters include the data identifier of the data set participating in the approximate calculation;

A sampling module that, in response to the smart contract call transaction, calls the sampling logic included in the smart contract call transaction, divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;

A calculation module that further calls the calculation logic contained in the smart contract call transaction to perform exact calculations on the outlier data samples in the outlier data subset, performs approximate calculations on the non-outlier data samples sampled from the non-outlier data subset, and combines the results of the exact calculation and the approximate calculation as an approximate calculation result for the data set.
An electronic device including:

A processor;

A memory for storing instructions executable by the processor;

Wherein, the processor executes the steps of the method according to any one of claims 1-13 by running the executable instructions.
A computer-readable storage medium having computer instructions stored thereon, which when executed by a processor, implements the steps of the method according to any one of claims 1-13.