WO2023185052A1 - Smart contract-based calculation, update and read method and apparatus, and electronic device - Google Patents

Smart contract-based calculation, update and read method and apparatus, and electronic device

Info

Publication number
WO2023185052A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
calculation
outlier
outlier data
sampling
Prior art date
Application number
PCT/CN2022/135439
Other languages
French (fr)
Chinese (zh)
Inventor
周晨辉
闫莺
Original Assignee
蚂蚁区块链科技(上海)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 蚂蚁区块链科技(上海)有限公司
Publication of WO2023185052A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/17 Function evaluation by approximation methods, e.g. inter- or extrapolation, smoothing, least mean square method
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures

Definitions

  • One or more embodiments of this specification relate to the field of blockchain technology, and in particular to a computing device and electronic equipment based on smart contracts.
  • Blockchain is a new application model of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • In a blockchain, data blocks are combined in chronological order into a chained data structure, and cryptography guarantees that the resulting distributed ledger cannot be tampered with or forged. Owing to characteristics such as decentralization, tamper resistance, and autonomy, blockchain has received more and more attention and found more and more applications.
  • This specification proposes a calculation method based on smart contracts, which is applied to node devices in the blockchain. Smart contracts for performing approximate calculations are deployed on the blockchain. The method includes:
  • This specification also proposes a computing device based on smart contracts, which is applied to node equipment in the blockchain. Smart contracts for performing approximate calculations are deployed on the blockchain.
  • The device includes:
  • a receiving module, which receives a smart contract call transaction initiated by the calculation initiator for the smart contract; wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;
  • a sampling module, which, in response to the smart contract call transaction, calls the sampling logic contained in the smart contract, divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;
  • a calculation module, which further calls the calculation logic contained in the smart contract to perform exact calculation on the outlier data samples in the outlier data subset and approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combines the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.
  • Figure 1 is a flow chart of a smart contract-based calculation method provided by an exemplary embodiment.
  • Figure 2 is a flow chart of an optimization solution method provided by an exemplary embodiment.
  • Figure 3 is a schematic structural diagram of an electronic device provided by an exemplary embodiment.
  • Figure 4 is a block diagram of a smart contract-based computing device provided by an exemplary embodiment.
  • In other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification.
  • In some other embodiments, the methods may include more or fewer steps than described in this specification.
  • A single step described in this specification may be broken down into multiple steps for description in other embodiments, and multiple steps described in this specification may also be combined into a single step for description in other embodiments.
  • smart contracts deployed on the blockchain to interface with businesses can include, in addition to business logic related to the business, logic for calculating business data related to the business. This allows users to complete business-related calculations on the blockchain by calling the smart contract.
  • the total calculation time usually depends on the time it takes to perform I/O operations on each piece of data and the time it takes to calculate the above set of data in batches.
  • the total time taken by a smart contract to calculate a business-related data set can usually be expressed by the following formula:
  • where i represents the i-th piece of data in the above-mentioned data set;
  • IO_i represents the time taken by the I/O operation on the i-th piece of data;
  • Operation_i represents the time taken to perform the batch calculation on the i-th piece of data in the data set.
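The formula itself did not survive in this text. A plausible reconstruction from the symbol definitions above, assuming the per-item costs simply add up and writing T for the total time and N for the number of pieces of data (both symbols are assumptions, not taken from the source), is:

```latex
T = \sum_{i=1}^{N} \left( \mathrm{IO}_i + \mathrm{Operation}_i \right)
```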
  • the above smart contracts can also be deployed in the TEE (Trusted Execution Environment) provided on the blockchain node device.
  • the data in the above data collection usually needs to be encrypted and stored.
  • the total calculation time usually depends on the time it takes to perform I/O operations on each piece of data, the time it takes to decrypt each piece of data, and the time it takes to calculate the above set of data in batches.
  • the total time taken by a smart contract to calculate a business-related data set can usually be expressed by the following formula:
  • Operation i represents the time taken to decrypt the i-th piece of data in the data set.
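The TEE formula is also missing from this text. A hedged reconstruction, assuming the decryption cost is a third additive term per piece of data, could read as follows. The names T_TEE, N, and Decrypt_i are hypothetical and chosen here only to keep the decryption time distinct from the calculation time:

```latex
T_{\mathrm{TEE}} = \sum_{i=1}^{N} \left( \mathrm{IO}_i + \mathrm{Decrypt}_i + \mathrm{Operation}_i \right)
```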
  • this specification proposes a technical solution that introduces approximate calculation and data sampling mechanisms into smart contracts to improve the computational efficiency of calculating business-related data.
  • a smart contract for data calculation can be deployed on the blockchain, and the smart contract can contain approximate calculation logic for approximate calculation and sampling logic for data sampling.
  • the calculation initiator can initiate a smart contract call transaction to call the smart contract to perform approximate calculations on the data set participating in the calculation.
  • the smart contract call transaction may include calculation parameters corresponding to the approximate calculation; the calculation parameters may include the data identifier of the data set participating in the approximate calculation;
  • When a node device in the blockchain receives the smart contract call transaction initiated by the calculation initiator, it can respond to the smart contract call transaction by calling the sampling logic contained in the smart contract, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset. After the sampling is completed, the approximate calculation logic contained in the smart contract can be further called to perform exact calculation on the outlier data samples in the outlier data subset and approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and the results of the exact calculation and the approximate calculation are combined as the approximate calculation result for the data set.
  • Because the outlier data in the data set is not sampled and approximated but is instead calculated exactly without sampling, the impact of outlier data samples on the accuracy of the approximate calculation result can be avoided when the data set contains outliers, which ensures the accuracy of the approximate calculation for the data set to the greatest extent.
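The combination of exact and approximate results described above can be illustrated with a small off-chain sketch. It assumes the calculation is a SUM, that the outlier test is supplied by the caller, and that the sampled part is scaled up by the ratio of subset size to sample size; the function and parameter names are hypothetical and this is not the contract code of the specification.

```python
import random


def approximate_sum(values, is_outlier, sample_size, rng):
    """Estimate sum(values): exact on outliers, sampled on the rest (illustrative)."""
    outliers = [v for v in values if is_outlier(v)]
    non_outliers = [v for v in values if not is_outlier(v)]

    # Exact calculation over the outlier data subset (no sampling).
    exact_part = sum(outliers)
    if not non_outliers:
        return exact_part

    # Approximate calculation: sample the non-outlier subset and scale up.
    n = max(1, min(sample_size, len(non_outliers)))
    sample = rng.sample(non_outliers, n)
    approx_part = sum(sample) * len(non_outliers) / n

    # Combine both results as the approximate result for the whole data set.
    return exact_part + approx_part
```

Keeping the single extreme value out of the sample is what prevents it from being either missed or multiplied by the scale factor.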
  • Figure 1 is a flow chart of a smart contract-based calculation method provided by an exemplary embodiment.
  • the method is applied to node devices in the blockchain; wherein a smart contract for performing approximate calculations is deployed on the blockchain, and the method includes the following steps:
  • Step 102: Receive a smart contract call transaction for the smart contract initiated by the calculation initiator; wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;
  • the above-mentioned calculation initiator may specifically be a party with data calculation requirements.
  • the above calculation initiator may be a user with data calculation requirements.
  • the calculation initiator in a scenario based on smart contracts and business docking, can also be an off-chain business system with data calculation requirements.
  • the smart contract contains execution logic corresponding to the contract code. Specifically, it can include approximate calculation logic for approximate calculation and sampling logic for data sampling. In this way, the logic of approximate calculation of data and data sampling can be introduced into the smart contract.
  • The sampling method used for the above data sampling is not particularly limited in this specification; for example, random sampling, stratified sampling, etc. can be used.
  • the above-mentioned calculation initiator can call the above-mentioned smart contract to perform approximate calculations on the data set participating in the calculation by initiating a smart contract call transaction.
  • the above smart contract can be understood as contract code anchored to a contract account on the blockchain, and the user can register an external account on the blockchain, initiate a smart contract call transaction through the external account, and submit the smart contract call transaction to the connected blockchain node device to call the smart contract.
  • The smart contract call transaction may include the calculation parameters corresponding to the approximate calculation; the calculation parameters may include the data identifier of the data set participating in the approximate calculation.
  • When the above calculation initiator initiates the above smart contract call transaction, if the calculation initiator connects directly to a blockchain node, it can package a smart contract call transaction and submit it point-to-point to the connected blockchain node device. If the calculation initiator accesses the blockchain through the blockchain connection service provided by a BaaS (Blockchain as a Service) platform, it can generate a call request for the above smart contract and submit the call request to the BaaS platform, and the BaaS platform then packages a smart contract call transaction based on the call parameters carried in the call request and submits it to the blockchain node device.
  • the blockchain node device can receive the above-mentioned smart contract call transaction initiated by the above-mentioned calculation initiator and, on receiving it, can respond to the smart contract call transaction by calling the above-mentioned smart contract on the blockchain to perform approximate calculations on the data set.
  • Step 104: In response to the smart contract call transaction, call the sampling logic contained in the smart contract, divide the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset;
  • After the blockchain node device receives the above-mentioned smart contract call transaction initiated by the above-mentioned calculation initiator, it can respond to the smart contract call transaction by calling the sampling logic contained in the smart contract and sampling the data samples in the data set corresponding to the data identifier.
  • After receiving the above-mentioned smart contract call transaction initiated by the above-mentioned calculation initiator, the blockchain node device usually needs to work with the other blockchain nodes participating in the consensus to perform consensus processing on the smart contract call transaction and on its execution results, based on the consensus algorithm supported by the blockchain. Since this specification does not involve improving the consensus process of the blockchain, the process of consensus processing of the smart contract call transaction and its execution results will not be described in detail in this specification.
  • the blockchain node device may first obtain the data identifier contained in the above-mentioned smart contract call transaction and read the data set involved in the approximate calculation based on the data identifier.
  • the data set can be pre-certified on the above-mentioned blockchain.
  • a certificate deposit contract for data certificate can be deployed on the blockchain.
  • the calculation initiator can package a certificate deposit transaction and publish the data set that needs to participate in the calculation to the certificate deposit contract for certificate deposit.
  • the execution logic corresponding to the contract code contained in the above-mentioned smart contract may include data storage logic in addition to the above-mentioned approximate calculation logic and sampling logic. That is to say, in addition to being used for approximate calculations, the smart contract itself also has a data storage function. In this case, before calling the smart contract for calculation, the calculation initiator can also package a certificate deposit transaction and publish the data set that needs to participate in the calculation to the smart contract in advance for certificate deposit. Subsequently, the smart contract can read the deposited data set from its own contract storage space to perform approximate calculations.
  • the blockchain node device can obtain the data set corresponding to the data identifier stored on the blockchain based on the above-mentioned data identifier.
  • the data identifier may be a certificate hash returned by the blockchain node after the above-mentioned data set is successfully certificated on the blockchain.
  • the data set can also be pre-stored in an off-chain database connected to the above-mentioned blockchain.
  • the smart contract can obtain the data set corresponding to the data identifier from the above-mentioned off-chain database through its corresponding oracle program.
  • the above-mentioned oracle program can specifically be a centralized oracle program or a decentralized oracle program.
  • the oracle program may be an oracle service program deployed on a service device outside the chain.
  • the oracle program may be an oracle contract deployed on the blockchain that interfaces with the above-mentioned smart contract.
  • calculation parameters may specifically include various types of parameters shown in the following table:
  • Parameter type | Meaning
  • Data set ID | Identifies the data set participating in the approximate calculation
  • Calculation type ID | Indicates the type of approximate calculation to be performed
  • Error value | Indicates the tolerable calculation error of the approximate calculation
  • Confidence probability | Indicates the expected accuracy of the approximate calculation
  • Sampling algorithm ID | Indicates the specified sampling algorithm type
  • If the calculation parameters in the above-mentioned smart contract call transaction do not contain the calculation type ID, the above-mentioned smart contract is allowed to use the default calculation type to perform approximate calculations on the above-mentioned data set. If the calculation parameters do not contain an error value, the tolerable calculation error is 0. If the calculation parameters do not include the confidence probability, the confidence probability is 100% and the expected accuracy of the approximate calculation is 100%. In this case, the above-mentioned smart contract performs an exact calculation on the data set and no longer approximates.
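The defaulting rules above can be made concrete with a small sketch of how the calculation parameters might be modeled off-chain. All field names are hypothetical, and treating an error value of 0 as a request for exact calculation is an assumption of this sketch rather than a statement from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CalculationParams:
    """Illustrative model of the calculation parameters (names are hypothetical)."""
    data_set_id: str                              # data identifier of the data set
    calculation_type_id: Optional[str] = None     # None: contract uses its default type
    error_value: float = 0.0                      # missing error value means 0
    confidence_probability: float = 1.0           # missing confidence means 100%
    sampling_algorithm_id: Optional[str] = None   # None: sampling algorithm not specified

    def requires_exact_calculation(self) -> bool:
        # 100% confidence means exact calculation per the text above;
        # a zero tolerable error is treated the same way (assumption).
        return self.confidence_probability >= 1.0 or self.error_value == 0.0
```

A call that supplies only the data set ID therefore falls back to exact calculation, while supplying both an error value and a confidence probability below 100% enables sampling.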
  • When the blockchain node device calls the sampling logic contained in the above-mentioned smart contract to sample the data set corresponding to the above-mentioned data identifier, in order to avoid the impact of outlier data in the data set on the approximate calculation result, the data set can be divided into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, and then only the data samples in the non-outlier data subset are sampled.
  • the outlier data in the above data set can be calculated by the above smart contract.
  • the sampling logic included in the smart contract may further include logic for performing outlier calculations on the data set.
  • When the blockchain node device calls the sampling logic contained in the above-mentioned smart contract to sample the data set corresponding to the above-mentioned data identifier, it can execute the above-mentioned outlier calculation logic on the data samples in the data set to determine the outlier data samples and non-outlier data samples contained in the data set, and then create an outlier data subset from the determined outlier data samples and a non-outlier data subset from the determined non-outlier data samples.
  • the process of calculating outlier data for the data samples in the above-mentioned data set usually refers to the process of counting data samples in the above-mentioned data set that are obviously different from other data samples based on a certain statistical algorithm.
  • the method of statistical calculation is not particularly limited in this specification.
  • For example, the median of the values corresponding to the data samples in the above data set can be calculated, and the data samples in the data set whose values deviate significantly from the median can then be selected as outlier data samples.
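A minimal sketch of the median-based split follows. The specification does not say what counts as deviating "significantly"; this sketch assumes a cutoff of a fixed multiple of the median absolute deviation (MAD), and both the function name and the default multiple of 3 are hypothetical choices.

```python
from statistics import median


def split_outliers(values, threshold=3.0):
    """Split values into (outliers, non_outliers) around the median (illustrative)."""
    med = median(values)
    # Median absolute deviation: a robust measure of spread (assumed criterion).
    mad = median([abs(v - med) for v in values])
    cutoff = threshold * mad  # if mad is 0, any value differing from the median is an outlier

    outliers = [v for v in values if abs(v - med) > cutoff]
    non_outliers = [v for v in values if abs(v - med) <= cutoff]
    return outliers, non_outliers
```

The median is used instead of the mean because a few extreme values barely move it, so the same extreme values cannot hide themselves by shifting the reference point.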
  • the outlier data in the above data set can also be manually calibrated in advance.
  • the above-mentioned sampling logic contained in the above-mentioned smart contract may also specifically include logic for filtering outlier data on the above-mentioned data collection.
  • When the blockchain node device calls the sampling logic contained in the above-mentioned smart contract to sample the data set corresponding to the above-mentioned data identifier, it can execute the above-mentioned outlier filtering logic on the data samples in the data set to determine the outlier data samples and non-outlier data samples contained in the data set, and then create an outlier data subset from the identified outlier data samples and a non-outlier data subset from the identified non-outlier data samples.
  • the number of samples to be drawn from the data samples in the non-outlier data subset can be calculated first, and the data samples in the non-outlier data subset are then sampled according to the calculated sampling number.
  • Hoeffding's inequality is generally used to give an upper bound on the probability that a random variable deviates from its expected value.
  • In this specification, the above-mentioned sampling number can be used as the random variable, the error value of the above-mentioned approximate calculation can be used as the deviation from the expected value, and the confidence probability of the above-mentioned approximate calculation can be used as the upper limit of the above-mentioned probability. Therefore, Hoeffding's inequality can be used to describe the mathematical relationship between the sampling number, the error value of the approximate calculation, and the confidence probability of the approximate calculation.
  • Hoeffding's inequality can be used to derive the mathematical relationship between the above-mentioned sampling number, the error value of the above-mentioned approximate calculation, and the confidence probability of the above-mentioned approximate calculation.
  • When Hoeffding's inequality is used to describe the mathematical relationship between the above-mentioned sampling number, the error value of the above-mentioned approximate calculation and the confidence probability of the above-mentioned approximate calculation, it can be expressed as the following formula:
  • where H denotes Hoeffding's inequality;
  • n_g represents the number of samples;
  • b_g and a_g respectively represent the maximum value and minimum value of the data samples in the data set;
  • δ represents the confidence probability;
  • ε_g represents the error value corresponding to the above approximate calculation;
  • N_g represents the total number of data samples in the data set.
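The formula referred to above is not preserved in this text. For orientation only, the textbook two-sided Hoeffding bound for the mean of n_g independent samples bounded in [a_g, b_g] is shown below, with ε_g standing for the error value and δ for the confidence probability. Treating 1 − δ as the allowed failure probability is an assumption of this sketch, and because this form does not use the total count N_g, the formula in the specification may well differ (for example, by including a finite-population correction).

```latex
\Pr\left( \left| \bar{X} - \mathbb{E}[\bar{X}] \right| \ge \varepsilon_g \right)
  \le 2 \exp\left( -\frac{2 n_g \varepsilon_g^{2}}{(b_g - a_g)^{2}} \right)
\quad\Longrightarrow\quad
n_g \ge \frac{(b_g - a_g)^{2}}{2 \varepsilon_g^{2}} \ln\frac{2}{1 - \delta}
```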
  • the above-mentioned mathematical relationship can be maintained in advance.
  • When the blockchain node device calls the above-mentioned smart contract to calculate the number of samples required for sampling the data samples in the above-mentioned non-outlier data subset, it can obtain the confidence probability δ and the error value ε_g corresponding to the approximate calculation from the calculation parameters in the above-mentioned smart contract call transaction, and then input the obtained confidence probability δ and error value ε_g into the maintained mathematical relationship for calculation, to obtain the number of samples corresponding to the non-outlier data subset.
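As a rough illustration of turning a confidence probability and an error value into a sampling number, the sketch below inverts the standard two-sided Hoeffding bound and caps the result at the subset size. It is not the specification's own relationship (which is not preserved here); the function name, the interpretation of 1 − confidence as the failure probability, and the fallback to a full scan are all assumptions.

```python
import math


def hoeffding_sample_size(confidence, error, value_min, value_max, population):
    """Samples needed so the sample mean is within `error` with prob. >= `confidence`."""
    # Zero tolerated error or 100% confidence: fall back to using every sample.
    if error <= 0 or confidence >= 1:
        return population

    failure = 1.0 - confidence                 # allowed probability of exceeding the error
    value_range = value_max - value_min        # b - a in Hoeffding's inequality
    n = math.ceil(value_range ** 2 * math.log(2.0 / failure) / (2.0 * error ** 2))

    # Never ask for more samples than the subset holds, and at least one.
    return min(population, max(1, n))
```

For values in [0, 1], an error of 0.1 and a confidence of 95% require 185 samples regardless of how large the subset is, which is where the time saving over a full scan comes from.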
  • The sampling method used to sample data samples in the non-outlier data subset is not particularly limited in this specification; for example, random sampling, stratified sampling, etc. can be used.
  • random sampling and stratified sampling will be used as examples to describe in detail the sampling process for non-outlier data samples in non-outlier data subsets.
  • When the blockchain node device calls the above smart contract, it can, based on the calculated sampling number, randomly sample non-outlier data samples from the non-outlier data subset using random numbers.
  • the above-mentioned random numbers are specifically used to control the randomness of the non-outlier data samples sampled from the above-mentioned non-outlier data subset.
  • the obtained random numbers can be used to determine which non-outlier data samples need to be sampled from the above-mentioned non-outlier data subset. For example, a random number can represent the sample identifier of the non-outlier data sample to be sampled. During the random sampling process, the non-outlier data sample whose sample identifier equals the value of the random number is extracted from the non-outlier data subset to complete the data sampling.
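The idea of using random numbers as sample identifiers can be sketched as follows. The sketch assumes that a sample's identifier is simply its index in the non-outlier data subset and that duplicates are redrawn; the function name is hypothetical and the random source is passed in so that it can be replaced by whichever on-chain or off-chain generator is used.

```python
import random


def sample_by_random_ids(samples, sample_count, rng):
    """Draw `sample_count` distinct samples, using random numbers as sample identifiers."""
    count = min(sample_count, len(samples))
    chosen_ids = set()
    while len(chosen_ids) < count:
        # Each random number is interpreted as the identifier (index) of a sample.
        chosen_ids.add(rng.randrange(len(samples)))
    return [samples[i] for i in sorted(chosen_ids)]
```

Because every consensus node must reach the same result, the random source would in practice have to be deterministic across nodes, for example by deriving it from a shared seed.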
  • a random function for generating random numbers can be pre-deployed on the blockchain.
  • the above random function can be deployed on the blockchain as an independent smart contract, or deployed in the smart contract as the execution logic contained in the smart contract for approximate calculation.
  • a random number can be generated on the blockchain by calling the random function mentioned above.
  • the above-mentioned blockchain node device can be equipped with a Trusted Execution Environment (TEE).
  • a random number seed for generating random numbers can be maintained in advance.
  • random numbers can be generated based on the random seed in the trusted execution environment.
  • the target data parameters that can be used as random number seeds can also be obtained from the data parameters maintained by the above-mentioned smart contract used for approximate calculation, and random numbers can then be generated in the smart contract based on the obtained target data parameters.
  • the unique parameters such as the hash value of the historical block maintained by the above-mentioned smart contract and the generation timestamp of the historical block can be used as a random number seed to calculate the random number in the smart contract.
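A seed derived from a historical block's hash and timestamp could look like the following sketch. Hashing the two values together with SHA-256 and feeding the digest to Python's `random.Random` is an illustrative choice, not something the specification prescribes, and the function name is hypothetical.

```python
import hashlib
import random


def rng_from_block(block_hash_hex: str, block_timestamp: int) -> random.Random:
    """Build a deterministic random generator from block parameters (illustrative)."""
    # Concatenate the block hash bytes and the timestamp as the seed material.
    material = bytes.fromhex(block_hash_hex) + block_timestamp.to_bytes(8, "big")
    seed = int.from_bytes(hashlib.sha256(material).digest(), "big")
    return random.Random(seed)
```

Every node that sees the same historical block derives the same seed and therefore the same sample, which keeps execution results consistent across nodes. The trade-off is that such a seed is publicly known in advance.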
  • the random numbers described above can be generated off-chain.
  • the above-mentioned smart contract can also obtain the random number generated outside the chain through the oracle program corresponding to the smart contract.
  • a random number seed for further generating the above-mentioned random numbers can be generated outside the chain.
  • the above-mentioned smart contract can also obtain the random number seed generated outside the chain through the oracle program corresponding to the smart contract, and then generate the random number in the smart contract based on the obtained random number seed.
  • the random number seed generated outside the chain can also be carried as a calculation parameter in the smart contract call transaction.
  • the random number seed generated off-chain included in the smart contract call transaction can be obtained, and then a random number can be generated in the smart contract based on the obtained random number seed.
  • When stratified sampling is used to sample data samples in the non-outlier data subset, it is usually necessary to divide the non-outlier data subset into several buckets and then sample data from these buckets.
  • When the blockchain node device calls the above-mentioned smart contract to calculate the number of samples required for stratified sampling, it can obtain the confidence probability δ corresponding to the approximate calculation contained in the calculation parameters carried by the smart contract call transaction, and the error value ε_k corresponding to each bucket, and then input the obtained confidence probability δ and the error value ε_k of each bucket into the maintained mathematical relationship for calculation, to obtain the number of samples for each bucket of the non-outlier data subset.
  • The number of buckets K to be divided and the error value ε_k corresponding to each bucket can be specified by the calculation initiator and carried as calculation parameters in the above smart contract call transaction.
  • In this case, in addition to the total error ε_g specified by the calculation initiator for the approximate calculation of the above data set, the calculation parameters also need to carry the K error values ε_k corresponding to the buckets.
  • The number of buckets K to be divided and the error value ε_k corresponding to each bucket can also be determined by the above-mentioned smart contract on the chain.
  • That is, they can be optimal values obtained by the smart contract through independent calculation.
  • the number K of buckets to be divided and the error value ε_k corresponding to each bucket can be the optimal values solved by the above-mentioned smart contract using the optimization solution method.
  • When the blockchain node device calls the sampling logic contained in the above-mentioned smart contract to perform stratified sampling of the non-outlier data samples in the non-outlier data subset, it can first use the optimization solution method to solve for the optimal number K of buckets required for stratified sampling of the non-outlier data subset and the optimal error value ε_k corresponding to each bucket. It can then input the confidence probability δ corresponding to the approximate calculation, taken from the calculation parameters in the smart contract call transaction, together with the solved optimal error value ε_k of each bucket, into the maintained mathematical relationship for calculation, to obtain the optimal number of samples for each bucket of the non-outlier data subset.
  • stratified sampling can be performed on the data samples in the non-outlier data subset.
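Once a sampling number is known for each bucket, stratified sampling itself is straightforward. The sketch below assumes the buckets are already formed and that the quantity being approximated is a mean, estimated by weighting each bucket's sample mean by the bucket's share of the data; the function names are hypothetical.

```python
import random


def stratified_sample(buckets, samples_per_bucket, rng):
    """Draw the requested number of samples from each bucket (illustrative)."""
    picked = []
    for bucket, n_k in zip(buckets, samples_per_bucket):
        # Never request more samples than the bucket contains.
        picked.append(rng.sample(bucket, min(n_k, len(bucket))))
    return picked


def stratified_mean(buckets, picked):
    """Estimate the overall mean by weighting each bucket's sample mean by bucket size."""
    total = sum(len(b) for b in buckets)
    return sum(
        len(b) / total * (sum(s) / len(s))
        for b, s in zip(buckets, picked)
        if s  # skip buckets from which nothing was sampled
    )
```

Sampling within buckets of similar values narrows the value range each bucket contributes, which is why the same accuracy can often be reached with fewer samples than plain random sampling.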
  • The optimization solution method used by the above-mentioned smart contract is not particularly limited in this specification.
  • those skilled in the art can flexibly adopt different optimization solving algorithms based on actual needs. For example, in practical applications, commonly used optimization solving algorithms such as gradient descent can be used.
  • the optimization objectives may include finding the optimal number of divided buckets, finding the optimal error value corresponding to each bucket, and so on. In practical applications, constraints can then be set for the above optimization solution method based on these optimization objectives.
  • setting constraints for the above optimization solution method may specifically be:
  • a weighted average calculation is performed on the error values corresponding to each bucket, and the weighted average error value obtained is the smallest and no greater than the total error value corresponding to the approximate calculation for the above-mentioned non-outlier data subset.
  • ⁇ g represents the total error value corresponding to the approximate calculation for the above-mentioned non-outlier data subset.
  • N k represents the number of samples sampled from the k-th bucket.
  • N represents the total number of samples sampled from the above non-outlier data subset.
  • Figure 2 is a flow chart of an optimization solution method shown in this specification, including the following execution steps:
  • Step 201: Initialize the value i, where i represents the initially configured number of samples contained in each bucket.
  • The following steps are then executed iteratively:
  • Step 202: Adjust the initialized value of i.
  • The adjustment step for i can be set flexibly and is not specifically limited in this specification.
  • Step 203: Divide the non-outlier data subset into several buckets, each containing i samples.
  • Step 204: Use the confidence probability δ (i.e., the confidence probability included in the calculation parameters carried by the smart contract call transaction) and the adjusted value of i (i.e., the number of samples corresponding to each bucket) as calculation parameters, input them into the mathematical relationship to obtain the error value corresponding to each bucket, and compute a weighted average of these error values to obtain the weighted average error value.
  • After the first round of iteration is completed, Steps 202 to 204 are executed again to carry out the second round of iteration.
  • Step 205: Determine whether the weighted average error value is no greater than the total error value (that is, the error value corresponding to the approximate calculation contained in the calculation parameters carried by the smart contract call transaction) and is less than the weighted average error value calculated in the previous round of iteration (that is, the weighted average error value calculated from the value of i before this round's adjustment). If not, re-execute Steps 202 to 205 to carry out the next round of iteration, and repeat this process until the optimization algorithm converges and a weighted average error value satisfying the above constraints has been calculated, at which point the iteration stops.
  • Step 206: After the iteration stops, obtain the optimal value of i, i.e., the value at which the calculated weighted average error value satisfies the above constraints.
  • Step 207: Based on the optimal value of i, determine the optimal number of buckets into which the non-outlier data subset is to be divided for stratified sampling, and input the confidence probability and the optimal value of i into the mathematical relationship again to obtain the optimal error value corresponding to each bucket.
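
The iteration in Steps 201 to 207 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the specification's own implementation: the function names are hypothetical, the error relation is the textbook Hoeffding bound (the exact relationship maintained by the smart contract is not reproduced in this text), the confidence probability is treated as 1 minus the failure probability, buckets are formed over sorted values, and i is simply decreased by a fixed step until the weighted average error stops improving.

```python
import math


def hoeffding_error(lo, hi, n, confidence):
    """Assumed Hoeffding-style bound on the error of a mean of n samples in [lo, hi]."""
    failure = 1.0 - confidence  # probability that the bound is violated
    return (hi - lo) * math.sqrt(math.log(2.0 / failure) / (2.0 * n))


def weighted_avg_error(sorted_vals, i, confidence):
    """Steps 203-204: split into buckets of i samples and average their errors."""
    total = len(sorted_vals)
    buckets = [sorted_vals[s:s + i] for s in range(0, total, i)]
    return sum(
        len(b) / total * hoeffding_error(b[0], b[-1], len(b), confidence)
        for b in buckets
    )


def solve_bucket_size(values, confidence, eps_total, i_init, step=1):
    """Return (best_i, bucket_count, weighted_error), or None if infeasible."""
    sorted_vals = sorted(values)
    i = i_init                                   # Step 201: initialize i
    best_i, best_err = None, math.inf
    while 1 <= i <= len(sorted_vals):
        err = weighted_avg_error(sorted_vals, i, confidence)
        if err >= best_err:                      # Step 205: no improvement, stop
            break
        best_i, best_err = i, err
        i -= step                                # Step 202: adjust i (assumed direction)
    if best_i is None or best_err > eps_total:   # constraint on the total error value
        return None
    bucket_count = math.ceil(len(sorted_vals) / best_i)   # Step 207
    return best_i, bucket_count, best_err
```

A call such as `solve_bucket_size(data, 0.95, 10.0, 50)` returns the chosen bucket size, the resulting number of buckets, and the weighted average error, or `None` when no candidate meets the total error value. A real contract would likely use a different adjustment rule for i.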
  • After the smart contract has used the optimization solution method to solve for the optimal number of buckets to be divided and the optimal number of samples corresponding to each bucket, stratified sampling can be performed on the non-outlier data subset based on the optimal number of buckets and the optimal number of samples.
  • Specifically, the non-outlier data subset can first be divided into several buckets according to the optimal number; for example, assuming the optimal number is K, the non-outlier data subset is divided into K buckets. Then, the data samples in each bucket can be sampled according to the optimal sampling number.
  • the specific sampling method used to sample the data samples in each bucket according to the above-mentioned optimal sampling number is not particularly limited in this specification.
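
As one possible illustration of dividing the subset into K buckets and sampling each bucket, the sketch below uses hypothetical names throughout. Bucketing by sorted value, using the same sampling number for every bucket, and the default seed are all assumptions; the specification leaves the per-bucket sampling method open.

```python
import random


def stratified_sample(values, bucket_count, per_bucket, seed=0):
    """Split sorted values into bucket_count buckets and sample each one."""
    ordered = sorted(values)
    n = len(ordered)
    rng = random.Random(seed)  # hypothetical seed, fixed so the result is reproducible
    samples = []
    for k in range(bucket_count):
        # Integer arithmetic yields exactly bucket_count nearly equal slices.
        bucket = ordered[k * n // bucket_count:(k + 1) * n // bucket_count]
        take = min(per_bucket, len(bucket))  # never ask for more than the bucket holds
        samples.append(rng.sample(bucket, take))
    return samples
```

Each inner list holds the samples drawn from one bucket, so per-bucket statistics can be computed before they are combined.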
  • Step 106 Further call the calculation logic contained in the smart contract call transaction to perform accurate calculations on the outlier data samples in the outlier data subset, and perform accurate calculations on the non-outlier data sampled from the non-outlier data subset. An approximate calculation is performed on the sample, and the results of the exact calculation and the approximate calculation are combined as an approximate calculation result for the data set.
  • When approximate calculation is performed on the sampled non-outlier data samples, the calculation type specified by the calculation initiator can be used, or the default calculation type supported by the smart contract can be used; this is not specifically limited in this specification.
  • the sampling algorithm ID may also be included in the above-mentioned smart contract call transaction.
  • the sampling algorithm ID may be specifically used to indicate the calculation type specified by the calculation initiator to perform approximate calculation on the above-mentioned data set.
  • If the smart contract call transaction does not include the sampling algorithm ID, that is, the calculation initiator has not specified a calculation type for the approximate calculation on the data set, the approximate calculation on the sampled non-outlier data samples can instead be performed using the default calculation type supported by the smart contract.
  • The outlier data samples in the outlier data subset may be left unsampled.
  • Because the results of approximate calculations on outlier data samples usually distort the approximate calculation result for the data set, approximate calculation need not be performed on the outlier data samples in the outlier data subset; precise calculation is performed instead.
  • When precise calculation is performed on the outlier data samples, the calculation type specified by the calculation initiator can be used, or the default calculation type supported by the smart contract can be used; this is not specifically limited in this specification.
  • calculation types corresponding to the above approximate calculations and exact calculations are not particularly limited in this specification. For example, it may include summing, averaging, etc., which will not be listed one by one in this specification.
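
For the summing and averaging calculation types, merging the exact and approximate parts might look like the sketch below. The function names are hypothetical, and the merging rule (scaling the sample mean by the size of the non-outlier subset, then adding the exact outlier sum) is an assumption; the specification does not spell out how the two results are combined.

```python
def combined_sum(outliers, sampled, non_outlier_count):
    """Exact sum over outliers plus a scaled-up estimate for the non-outliers."""
    exact = sum(outliers)                      # precise calculation, no sampling
    if not sampled:
        return float(exact)
    # Estimate the non-outlier sum as (sample mean) x (size of the subset).
    approx = sum(sampled) / len(sampled) * non_outlier_count
    return exact + approx


def combined_mean(outliers, sampled, non_outlier_count):
    """Approximate mean of the whole data set from the combined sum."""
    total = len(outliers) + non_outlier_count
    return combined_sum(outliers, sampled, non_outlier_count) / total
```

For example, with one outlier of 1000 and a sample [1, 2, 3] drawn from 30 non-outliers, the combined sum is 1000 + 2 x 30 = 1060.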
  • the above-mentioned smart contract for performing approximate calculations may specifically be a privacy smart contract deployed in a trusted execution environment mounted on a blockchain node device.
  • the calculation parameters in the above-mentioned smart contract call transaction and the obtained data samples in the above-mentioned data collection are usually encrypted in advance.
  • Before calling the sampling logic contained in the smart contract to divide the data set into an outlier data subset and a non-outlier data subset, the blockchain node device can also decrypt, within the trusted execution environment, the calculation parameters and the obtained data samples in the data set.
  • For example, the trusted execution environment may be assigned an asymmetric key pair for encrypting and decrypting data; the private key of the pair is stored in the trusted execution environment, and the public key is published to the calculation initiator.
  • The calculation parameters in the smart contract call transaction and the obtained data samples in the data set can then be encrypted in advance based on this public key.
  • Before calling the sampling logic contained in the smart contract to randomly sample the data set corresponding to the data identifier, the blockchain node device can use the private key maintained in the trusted execution environment to decrypt the calculation parameters and the obtained data samples in the data set.
  • In the above technical solution, introducing a sampling mechanism for the data set into the smart contract reduces the time consumed by approximate calculation on the data set without sacrificing the accuracy of the approximate calculation result, and thereby improves the computing efficiency of the approximate calculation.
  • In this case, the total time taken by the smart contract to calculate the business-related data set can usually be expressed by the following formula:
  • n_g represents the number of data samples sampled from the data set.
  • N_g represents the total number of data samples in the data set. Since n_g is usually at least an order of magnitude smaller than N_g, introducing the sampling mechanism into the smart contract also reduces the time consumed by calculating the data set through the smart contract by about an order of magnitude. Introducing a sampling mechanism in smart contracts can therefore significantly shorten the calculation time for data sets and improve the efficiency of approximate calculations on them.
  • Moreover, because the outlier data in the data set is not sampled and approximated but is instead calculated exactly without sampling, the impact of outlier data samples on the accuracy of the approximate calculation result can be avoided when the data set contains outliers, which ensures the accuracy of the approximate calculation on the data set to the greatest extent.
  • this application also provides device embodiments.
  • the embodiments of the device in this specification can be applied to electronic equipment.
  • the device embodiments may be implemented by software, or may be implemented by hardware or a combination of software and hardware.
  • Taking software implementation as an example, the device, as a logical device, is formed by the processor of the electronic device where it is located reading the corresponding computer program instructions from non-volatile memory into memory and running them.
  • At the hardware level, Figure 3 is a hardware structure diagram of the electronic device where the device of this specification is located.
  • The electronic device in which the device of the embodiment is located may usually also include other hardware according to its actual functions, which will not be described again.
  • FIG. 4 is a block diagram of a smart contract-based computing device according to an exemplary embodiment of this specification.
  • the smart contract-based computing device 40 can be applied in the electronic device shown in Figure 3.
  • a smart contract for performing approximate calculations is deployed on the blockchain.
  • the device 40 includes:
  • the receiving module 401 receives a smart contract call transaction initiated by the calculation initiator for the smart contract, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;
  • the sampling module 402, in response to the smart contract call transaction, calls the sampling logic contained in the smart contract call transaction, divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;
  • the calculation module 403 further calls the calculation logic contained in the smart contract call transaction, performs accurate calculation on the outlier data samples in the outlier data subset, performs approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combines the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.
  • the device 40 further includes:
  • the acquisition module 404 (not shown in Figure 4), which, before the sampling module 402 divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, obtains the data set corresponding to the data identifier stored on the blockchain; or obtains the data set corresponding to the data identifier, through the oracle program corresponding to the smart contract, from an off-chain database connected to the blockchain.
  • the sampling module 402:
  • creates the outlier data subset based on the outlier data samples, and creates the non-outlier data subset based on the non-outlier data samples.
  • the calculation parameters include a confidence probability corresponding to the approximate calculation and a total error value corresponding to the approximate calculation; the confidence probability represents the accuracy of the approximate calculation; the smart contract maintains a mathematical relationship, derived from Hoeffding's inequality, that describes the relationship among three values: the confidence probability corresponding to the approximate calculation, the error value corresponding to the approximate calculation, and the number of samples corresponding to the data set participating in the approximate calculation;
  • the sampling module 402 further:
  • inputs the confidence probability corresponding to the approximate calculation and the error value corresponding to the approximate calculation into the mathematical relationship, and calculates the number of samples corresponding to the non-outlier data subset.
  • In the mathematical relationship, n_g represents the sampling number; b_g and a_g represent the maximum and minimum values of the data samples in the data set, respectively; δ represents the confidence probability; ε_g represents the error value; and N_g represents the total number of data samples in the data set.
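
To give a feel for how a Hoeffding-based relationship turns a confidence probability and an error value into a sampling number, the sketch below uses the textbook form n >= (b - a)^2 ln(2 / (1 - confidence)) / (2 eps^2), capped at the population size. This is an assumption for illustration only: the exact relationship maintained by the smart contract is not reproduced in this text and may differ (for instance, in how N_g enters the formula), and the function name is hypothetical.

```python
import math


def hoeffding_sample_size(a, b, confidence, eps, total):
    """Sampling number from the textbook Hoeffding bound, capped at the population.

    a, b       -- minimum and maximum of the data samples (a_g, b_g)
    confidence -- confidence probability, treated here as 1 - failure probability
    eps        -- tolerated error of the estimated mean (eps_g)
    total      -- total number of data samples (N_g)
    """
    failure = 1.0 - confidence
    n = math.ceil((b - a) ** 2 * math.log(2.0 / failure) / (2.0 * eps ** 2))
    return min(n, total)  # never sample more than the data set contains
```

With values in [0, 1], a confidence of 0.95 and an error of 0.1, this bound asks for 185 samples regardless of how large the data set is, which is what makes sampling attractive for large N_g.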
  • the sampling module 402:
  • samples the non-outlier data samples in the non-outlier data subset according to the calculated sampling number.
  • sampling for non-outlier data samples in the non-outlier data subset includes random sampling
  • the sampling module 402 further:
  • sampling for non-outlier data samples in the non-outlier data subset includes stratified sampling
  • the sampling module 402 further:
  • inputs the confidence probability corresponding to the approximate calculation and the optimal error value corresponding to each data subset into the mathematical relationship to obtain the optimal sampling number corresponding to each data subset;
  • divides the data set into several data subsets according to the optimal number, and samples the non-outlier data samples in each data subset according to the optimal sampling number.
  • the constraints adopted by the optimization solution method include: the weighted average error value obtained by performing a weighted average calculation on the error values corresponding to each data subset is the smallest and is not greater than the total error value;
  • the sampling module 402 further performs the following steps:
  • Step A: Adjust the initialized value of i;
  • Step B: Divide the non-outlier data subset into several data subsets, each containing i samples;
  • Step C: Use the confidence probability and the adjusted value of i as calculation parameters, input them into the mathematical relationship to obtain the error value corresponding to each of the data subsets, and compute a weighted average of these error values to obtain a weighted average error value;
  • Step D: Determine whether the weighted average error value is not greater than the total error value and is less than the weighted average error value calculated from the value of i before this adjustment; if not, re-execute Steps A to D, stop iterating once the calculated weighted average error value satisfies the constraint condition, and obtain the optimal value of i at which the weighted average error value satisfies the constraint condition;
  • Step E: Based on the optimal value of i, determine the optimal number of data subsets into which the non-outlier data subset is to be divided for stratified sampling, and input the confidence probability and the optimal value of i into the mathematical relationship to obtain the optimal error value corresponding to each of the data subsets.
  • sampling module further:
  • sampling module further performs any of the following:
  • obtains a target data parameter, to serve as the random number seed, from the data parameters related to the data maintained by the smart contract, and generates the random numbers used for random sampling in the smart contract based on the obtained target data parameter;
  • obtains the random number seed generated off-chain that is included in the calculation parameters, and generates the random numbers used for random sampling in the smart contract based on that random number seed.
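
The two seed sources described above both amount to seeding a deterministic generator, so that every node executing the contract draws the same sample. The sketch below illustrates this with hypothetical function names; hashing a data parameter with SHA-256 to derive the seed is an assumption, not something the specification prescribes.

```python
import hashlib
import random


def seed_from_parameter(param: bytes) -> int:
    """Derive an integer seed from a data parameter (assumed: SHA-256 of its bytes)."""
    return int.from_bytes(hashlib.sha256(param).digest(), "big")


def sample_with_seed(values, n, seed):
    """Draw n distinct samples using a generator seeded with the given value."""
    rng = random.Random(seed)  # same seed gives the same sequence on every node
    return rng.sample(list(values), n)
```

An off-chain seed carried in the calculation parameters can be passed to `sample_with_seed` directly, while an on-chain data parameter would first go through `seed_from_parameter`.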
  • calculation parameters also include algorithm identifiers
  • the calculation module 403:
  • Performing approximate calculations on non-outlier data samples sampled from the non-outlier data subset includes:
  • the smart contract is deployed in a trusted execution environment mounted on the node device; the calculation parameters and the data samples in the data set have been encrypted in advance;
  • the sampling module 402 further:
  • decrypts the calculation parameters and the obtained data samples in the data set, respectively, in the trusted execution environment.
  • A typical implementation device is a computer, which may take the form of a personal computer, a laptop, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet, a wearable device, or a combination of any of these devices.
  • a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-permanent storage in computer-readable media, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
  • Computer-readable media includes both persistent and non-persistent, removable and non-removable media that can store information by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD), magnetic tape cartridges, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by computing devices.
  • computer-readable media does not include transitory media, such as modulated data signals and carrier waves.
  • Although this specification may use the terms first, second, third, etc. to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other.
  • For example, first information may also be called second information, and similarly, second information may also be called first information.
  • Depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when", or "in response to determining".

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Accounting & Taxation (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Technology Law (AREA)
  • Algebra (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Development Economics (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A smart contract-based calculation method. A smart contract for performing approximate calculation is deployed on a blockchain. The method comprises: receiving a smart contract call transaction for a smart contract initiated by a calculation initiating party, the smart contract call transaction comprising calculation parameters corresponding to approximate calculation, and the calculation parameters comprising a data identifier of a data collection participating in the approximate calculation; in response to the smart contract call transaction, calling a sampling logic comprised in the smart contract call transaction, dividing the data collection corresponding to the data identifier into an outlier data subset consisting of a plurality of outlier data samples and a non-outlier data subset consisting of a plurality of non-outlier data samples, and performing sampling for the non-outlier data samples in the non-outlier data subset; and calling a calculation logic comprised in the smart contract call transaction, performing accurate calculation for the outlier data samples in the outlier data subset, performing approximate calculation on the non-outlier data samples obtained by sampling, and combining the results of accurate calculation and approximate calculation.

Description

Smart contract-based calculation, update and read method and apparatus, and electronic device
This application claims priority to the Chinese patent application filed with the China Patent Office on March 30, 2022, with application number 202210332050.4 and invention title "Smart contract-based calculation, update and read method and apparatus, and electronic device", the entire contents of which are incorporated herein by reference.
Technical field
One or more embodiments of this specification relate to the field of blockchain technology, and in particular to a smart contract-based computing device and electronic equipment.
Background
Blockchain is a new application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. In a blockchain system, data blocks are linked sequentially in chronological order into a chained data structure, forming a distributed ledger that is cryptographically guaranteed to be tamper-proof and unforgeable. Owing to characteristics such as decentralization, tamper-resistance of information, and autonomy, blockchain has received increasing attention and application.
Summary of the invention
This specification proposes a smart contract-based calculation method, applied to a node device in a blockchain, where a smart contract for performing approximate calculations is deployed on the blockchain. The method includes:
receiving a smart contract call transaction initiated by the calculation initiator for the smart contract, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;
in response to the smart contract call transaction, calling the sampling logic contained in the smart contract call transaction, dividing the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset;
further calling the calculation logic contained in the smart contract call transaction, performing precise calculation on the outlier data samples in the outlier data subset, performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combining the results of the precise calculation and the approximate calculation as the approximate calculation result for the data set.
This specification also proposes a smart contract-based computing device, applied to a node device in a blockchain, where a smart contract for performing approximate calculations is deployed on the blockchain. The device includes:
a receiving module, which receives a smart contract call transaction initiated by the calculation initiator for the smart contract, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include the data identifier of the data set participating in the approximate calculation;
a sampling module, which, in response to the smart contract call transaction, calls the sampling logic contained in the smart contract call transaction, divides the data set corresponding to the data identifier into an outlier data subset composed of a number of outlier data samples and a non-outlier data subset composed of a number of non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;
a calculation module, which further calls the calculation logic contained in the smart contract call transaction, performs precise calculation on the outlier data samples in the outlier data subset, performs approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combines the results of the precise calculation and the approximate calculation as the approximate calculation result for the data set.
In the above technical solution, when a smart contract is called to perform approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time consumed by the approximate calculation without sacrificing the accuracy of the approximate calculation result, and thereby improves the computing efficiency of the approximate calculation. Moreover, because the outlier data in the data set is not sampled and approximated but is instead calculated exactly without sampling, the impact of outlier data samples on the accuracy of the approximate calculation result can be avoided when the data set contains outliers, which ensures the accuracy of the approximate calculation on the data set to the greatest extent.
Description of drawings
Figure 1 is a flow chart of a smart contract-based calculation method provided by an exemplary embodiment;
Figure 2 is a flow chart of an optimization solution method provided by an exemplary embodiment;
Figure 3 is a schematic structural diagram of an electronic device provided by an exemplary embodiment;
Figure 4 is a block diagram of a smart contract-based computing device provided by an exemplary embodiment.
Detailed Description
Exemplary embodiments are described in detail here, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of one or more embodiments of this specification, as detailed in the appended claims.
It should be noted that, in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than those described in this specification. In addition, a single step described in this specification may be split into multiple steps for description in other embodiments, and multiple steps described in this specification may be combined into a single step for description in other embodiments.
With the continuous development of smart contract technology, when a smart contract is used to interface with a business, the smart contract gradually begins to take on part of the computing workload related to that business.
For example, in practical applications, a smart contract deployed on a blockchain to interface with a business may include, in addition to the business logic related to the business, logic for performing calculations on the business data related to the business, so that a user can complete business-related calculations on the blockchain by calling the smart contract.
When a smart contract is used to perform a calculation on a business-related data set, the total calculation time usually depends on the time taken to perform an I/O operation on each piece of data and the time taken to perform the batch calculation on that set of data.
For example, in practical applications, taking the case where the business-related data set has been deposited on the blockchain in advance, the total time taken by the smart contract to perform the calculation on the business-related data set can usually be expressed by the following formula:
Figure PCTCN2022135439-appb-000001
In the above formula, i denotes the i-th piece of data in the data set; IO_i denotes the time taken by the I/O operation on the i-th piece of data; and Operation_i denotes the time taken to perform the batch calculation on the i pieces of data in the data set.
It should be noted that, when a data set is deposited on the blockchain, it is usually stored record by record, in the form of key-value pairs, in the storage medium of the blockchain node device. Therefore, for a data set stored on the blockchain, the data can usually only be read record by record from the storage medium of the blockchain node device according to the key of each record.
In some application scenarios with high requirements on the privacy and security of data calculation, the smart contract may also be deployed in a TEE (Trusted Execution Environment) carried by the blockchain node device.
In this case, the data in the data set usually needs to be stored in encrypted form. When a smart contract is then used to perform a calculation on the business-related data set, the total calculation time usually depends on the time taken to perform an I/O operation on each piece of data, the time taken to decrypt each piece of data, and the time taken to perform the batch calculation on that set of data.
For example, in practical applications, taking the case where the business-related data set has been deposited on the blockchain in advance, the total time taken by the smart contract to perform the calculation on the business-related data set can usually be expressed by the following formula:
Figure PCTCN2022135439-appb-000002
In the above formula, Operation_i denotes the time taken to decrypt the i-th piece of data in the data set.
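The two cost descriptions above can be illustrated with a minimal sketch. The formula images are not reproduced here, so the sketch simply assumes that the total time is the sum of the per-record I/O times, the optional per-record decryption times, and one batch calculation time; the function name `total_time` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def total_time(io_times: Sequence[float],
               batch_compute_time: float,
               decrypt_times: Optional[Sequence[float]] = None) -> float:
    """Assumed additive cost model: per-record I/O, optional per-record
    decryption (TEE case), plus one batch calculation."""
    total = sum(io_times) + batch_compute_time
    if decrypt_times is not None:
        if len(decrypt_times) != len(io_times):
            raise ValueError("need one decryption time per record")
        total += sum(decrypt_times)
    return total
```

Under this assumed model, both the I/O term and the decryption term grow linearly with the number of records read, which is why reading fewer records through sampling shortens the calculation.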
As can be seen from the above, in a scenario where a smart contract is used to perform a calculation on a business-related data set, if the data set contains a large amount of data, performing the calculation on the data set through the smart contract to obtain an exact result is very time-consuming.
In practical applications, however, some business scenarios do not require an exact calculation result for the business-related data and can tolerate some loss of calculation accuracy.
For example, when calculating the average age of users, an exact result is not needed in most cases; it is usually sufficient to obtain a range for the average age through approximate calculation.
Based on this, this specification proposes a technical solution that introduces approximate calculation and data sampling mechanisms into a smart contract to improve the computational efficiency of calculations on business-related data.
In implementation, a smart contract for data calculation may be deployed on the blockchain, and the smart contract may contain approximate calculation logic for performing approximate calculations and sampling logic for performing data sampling. A calculation initiator may call the smart contract to perform an approximate calculation on the data set participating in the calculation by initiating a smart contract call transaction. The smart contract call transaction may include calculation parameters corresponding to the approximate calculation, and the calculation parameters may include a data identifier of the data set participating in the approximate calculation.
When a node device in the blockchain receives the smart contract call transaction initiated by the calculation initiator, it may, in response to the smart contract call transaction, call the sampling logic contained in the smart contract to divide the data set corresponding to the data identifier into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset. After the sampling is completed, the node device may further call the approximate calculation logic contained in the smart contract to perform an exact calculation on the outlier data samples in the outlier data subset, perform an approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and merge the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.
In the above technical solution, in a scenario where a smart contract is called to perform an approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time taken by the approximate calculation and improves its computational efficiency without sacrificing the accuracy of the approximate calculation result.
Moreover, during the approximate calculation, the outlier data in the data set is not sampled and then approximately calculated; instead, it is calculated exactly without sampling. Therefore, when the data set includes outlier data, the influence of the outlier data samples on the accuracy of the approximate calculation result is further avoided, and the accuracy of the approximate calculation on the data set is ensured to the greatest extent.
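The overall flow described above (split, sample only the non-outliers, calculate exactly on the outliers, then merge) can be sketched for a sum as follows. This is an illustrative sketch rather than the contract's actual logic: the function name `approximate_sum`, the caller-supplied `is_outlier` predicate, and the merge rule (scaling the sample mean by the size of the non-outlier subset) are all assumptions.

```python
import random
from typing import Callable, Sequence


def approximate_sum(values: Sequence[float],
                    is_outlier: Callable[[float], bool],
                    sample_size: int,
                    rng: random.Random) -> float:
    """Exact sum over outliers plus a scaled sample estimate over the rest."""
    outliers = [v for v in values if is_outlier(v)]
    inliers = [v for v in values if not is_outlier(v)]

    # Outliers are never sampled: they are summed exactly.
    exact_part = sum(outliers)
    if not inliers:
        return exact_part

    # Sample the non-outlier subset and scale the sample mean back up
    # (assumed estimator, not taken from the source).
    n = max(1, min(sample_size, len(inliers)))
    sample = rng.sample(inliers, n)
    approx_part = sum(sample) / n * len(inliers)

    # Merge the exact and approximate partial results.
    return exact_part + approx_part
```

Because a single extreme value is always counted exactly, it cannot distort the estimate depending on whether it happens to fall in the sample.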
Referring to Figure 1, Figure 1 is a flowchart of a smart contract-based calculation method provided by an exemplary embodiment. The method is applied to a node device in a blockchain, where a smart contract for performing approximate calculations is deployed on the blockchain. The method includes the following steps:
Step 102: receive a smart contract call transaction for the smart contract initiated by a calculation initiator, where the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include a data identifier of the data set participating in the approximate calculation;
The calculation initiator may be any party that has a data calculation requirement. In one example, the calculation initiator may be a user with a data calculation requirement. In another example, in a scenario where a smart contract interfaces with a business, the calculation initiator may be an off-chain business system with a data calculation requirement.
A smart contract for data calculation may be deployed on the blockchain. The execution logic corresponding to the contract code of the smart contract may include approximate calculation logic for performing approximate calculations and sampling logic for performing data sampling. In this way, logic for approximate calculation and data sampling is introduced into the smart contract.
It should be noted that the sampling method used for the data sampling is not particularly limited in this specification; for example, random sampling, stratified sampling, and so on may be used.
The calculation initiator may call the smart contract to perform an approximate calculation on the data set participating in the calculation by initiating a smart contract call transaction.
For example, suppose the calculation initiator is a user and the blockchain uses an account model. In this case, the smart contract can be understood as a contract account on the blockchain to which contract code is anchored. The user may register an external account on the blockchain, initiate a smart contract call transaction through the external account, and submit the smart contract call transaction to the blockchain node device to which the user is connected, thereby calling the smart contract.
It should be noted that the smart contract call transaction may include calculation parameters corresponding to the approximate calculation, and the calculation parameters may include a data identifier of the data set participating in the approximate calculation.
When initiating the smart contract call transaction, if the calculation initiator interfaces directly with a blockchain node, it may package a smart contract call transaction and submit it point-to-point to the connected blockchain node device. If the calculation initiator accesses the blockchain through a blockchain access service such as one provided by a BaaS (Blockchain as a Service) platform, it may generate a call request for the smart contract and submit the call request to the BaaS platform; the BaaS platform then packages a smart contract call transaction based on the call parameters carried in the call request and submits it to the blockchain node device.
The blockchain node device may receive the smart contract call transaction initiated by the calculation initiator and, upon receiving it, may respond to the smart contract call transaction by calling the smart contract on the blockchain to perform an approximate calculation on the data set.
Step 104: in response to the smart contract call transaction, call the sampling logic contained in the smart contract to divide the data set corresponding to the data identifier into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, and sample the non-outlier data samples in the non-outlier data subset;
After receiving the smart contract call transaction initiated by the calculation initiator, the blockchain node device may respond to the smart contract call transaction by calling the sampling logic contained in the smart contract to sample the data samples in the data set corresponding to the data identifier.
It should be noted that, after receiving the smart contract call transaction initiated by the calculation initiator, the blockchain node device usually also needs to perform consensus processing on the smart contract call transaction and its execution result, together with the other blockchain nodes participating in consensus, based on the consensus algorithm supported by the blockchain. Because this specification does not involve improvements to the consensus process of the blockchain, the consensus processing of the smart contract call transaction and its execution result is not described in detail in this specification.
In one illustrated embodiment, before calling the sampling logic contained in the smart contract to sample the data samples in the data set corresponding to the data identifier, the blockchain node device may first obtain the data identifier contained in the smart contract call transaction and read the data set participating in the approximate calculation based on the data identifier.
When the data set participating in the approximate calculation is read based on the data identifier, it may be read from the blockchain or from off-chain storage, which is not particularly limited in this specification.
In one implementation, the data set may be deposited on the blockchain in advance.
For example, a deposit contract for data deposit may also be deployed on the blockchain. Before calling the smart contract to perform the calculation, the calculation initiator may package a deposit transaction to publish the data set that needs to participate in the calculation to the deposit contract for deposit.
As another example, the execution logic corresponding to the contract code of the smart contract may include data deposit logic in addition to the approximate calculation logic and the sampling logic. That is, besides being used for approximate calculation, the smart contract itself also provides a data deposit function. In this case, before calling the smart contract to perform the calculation, the calculation initiator may likewise package a deposit transaction to publish the data set that needs to participate in the calculation to the smart contract in advance for deposit. The smart contract can subsequently read the deposited data set from its own contract storage space to perform the approximate calculation.
In this case, the blockchain node device may obtain, based on the data identifier, the data set deposited on the blockchain that corresponds to the data identifier. For example, the data identifier may be the deposit hash returned by the blockchain node after the data set has been successfully deposited on the blockchain.
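As an illustration of how a content hash can serve as a data identifier, the following sketch derives an identifier from a data set. The source does not specify how the deposit hash is computed; SHA-256 over a compact JSON encoding and the function name `deposit_hash` are assumptions made only for this example.

```python
import hashlib
import json
from typing import Sequence


def deposit_hash(records: Sequence[float]) -> str:
    """Hypothetical deposit hash: SHA-256 over a compact JSON encoding."""
    canonical = json.dumps(list(records), separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

The same records always yield the same identifier, so a calculation initiator could place the returned value in the calculation parameters to refer to the deposited data set.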
In another implementation, the data set may be stored in advance in an off-chain database that interfaces with the blockchain. In this case, the smart contract may obtain the data set corresponding to the data identifier from the off-chain database through its corresponding oracle program.
The oracle program may be a centralized oracle program or a decentralized oracle program. When the oracle program is centralized, it may be an oracle service program deployed on an off-chain service device. When the oracle program is decentralized, it may be an oracle contract deployed on the blockchain that interfaces with the smart contract. It should be noted that, because this specification does not involve improvements related to oracle programs, the specific process by which the smart contract obtains the data set corresponding to the data identifier from the off-chain database through its corresponding oracle program is not described in detail in this specification.
In addition to the data identifier of the data set mentioned above, the calculation parameters contained in the smart contract call transaction may, in practical applications, also include other parameters related to the approximate calculation.
In one illustrated embodiment, the calculation parameters may include the types of parameters shown in the following table:
Parameter type | Meaning
Data set ID | The data set participating in the approximate calculation
Calculation type ID | The type of approximate calculation to be performed
Error value | The tolerable calculation error of the approximate calculation
Confidence probability | The expected accuracy of the approximate calculation
Sampling algorithm ID | The specified type of sampling algorithm
It should be noted that, except for the data set ID, all parameters in the above table are optional.
For example, if the calculation parameters in the smart contract call transaction do not include a calculation type ID, the smart contract is allowed to use the default calculation type to perform the approximate calculation on the data set. If the calculation parameters do not include an error value, the tolerable calculation error is 0. If the calculation parameters do not include a confidence probability, the confidence probability is 100%, that is, the expected accuracy of the approximate calculation is 100%; in this case, the smart contract performs an exact calculation on the data set instead of an approximate calculation.
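The optional parameters and their defaults can be sketched as a small data structure. The class name `CalcParams`, the field names, and the `requires_exact` helper are hypothetical; only the default behaviors (error 0, confidence 100%, exact calculation when no confidence probability is given) come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CalcParams:
    """Hypothetical container for the calculation parameters in the table."""
    dataset_id: str                              # required
    calc_type_id: Optional[str] = None           # None: use the default type
    error: float = 0.0                           # absent: tolerable error is 0
    confidence: float = 1.0                      # absent: confidence is 100%
    sampling_algorithm_id: Optional[str] = None  # None: contract decides

    def requires_exact(self) -> bool:
        # A confidence probability of 100% means exact calculation.
        return self.confidence >= 1.0
```

For instance, `CalcParams("ds-1")` would request an exact calculation, whereas `CalcParams("ds-1", error=0.5, confidence=0.95)` would allow sampling.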
In one illustrated embodiment, when the blockchain node device calls the sampling logic contained in the smart contract to sample the data set corresponding to the data identifier, in order to prevent outlier data in the data set from affecting the final approximate calculation result, the data set may be divided into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, and only the data samples in the non-outlier data subset are then sampled.
The outlier data in the data set may be determined through calculation by the smart contract.
In one illustrated embodiment, the sampling logic contained in the smart contract may further include logic for performing an outlier calculation on the data set. In this case, when the blockchain node device calls the sampling logic contained in the smart contract to sample the data set corresponding to the data identifier, it may execute the outlier calculation logic on the data samples in the data set to determine the outlier data samples and the non-outlier data samples contained in the data set, then create the outlier data subset from the determined outlier data samples and create the non-outlier data subset from the determined non-outlier data samples.
The outlier calculation on the data samples in the data set usually refers to a process of identifying, based on a statistical algorithm, the data samples in the data set that differ markedly from the other data samples; the specific statistical calculation method is not particularly limited in this specification. In one example, the median of the values of the data samples in the data set may be calculated, and the data samples whose values deviate markedly from the median may then be selected as outlier data samples.
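A median-based split of this kind can be sketched as follows. The source only says that values deviating "markedly" from the median are outliers; measuring the deviation against the median absolute deviation (MAD) with a multiplier `k`, and the function name `split_by_median`, are assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(values: Sequence[float],
                    k: float = 3.0) -> Tuple[List[float], List[float]]:
    """Return (outliers, non_outliers) using distance from the median.

    A value is treated as an outlier when its distance from the median
    exceeds k times the median absolute deviation (assumed criterion).
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    threshold = k * mad
    outliers = [v for v in values if abs(v - med) > threshold]
    non_outliers = [v for v in values if abs(v - med) <= threshold]
    return outliers, non_outliers
```

With the values 1, 2, 3, 4, 5 and 100, the median is 3.5 and the MAD is 1.5, so only 100 lies beyond the threshold of 4.5 and ends up in the outlier subset.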
Of course, in practical applications, the outlier data in the data set may also be labeled manually in advance. In this case, the sampling logic contained in the smart contract may further include logic for screening outlier data in the data set. When the blockchain node device calls the sampling logic contained in the smart contract to sample the data set corresponding to the data identifier, it may execute the outlier screening logic on the data samples in the data set to determine the outlier data samples and the non-outlier data samples contained in the data set, then create the outlier data subset from the determined outlier data samples and create the non-outlier data subset from the determined non-outlier data samples.
In one illustrated embodiment, before the data samples in the non-outlier data subset are sampled, the number of samples to be drawn from the non-outlier data subset may first be calculated, and the data samples in the non-outlier data subset are then sampled according to the calculated number of samples.
In one illustrated embodiment, Hoeffding's inequality is generally used to describe an upper bound on the probability that a sum of random variables deviates from its expected value. In the approximate calculation scenario, the number of samples can serve as the random variable, the error value of the approximate calculation can serve as the deviation from the expected value, and the confidence probability of the approximate calculation can serve as the probability upper bound. Therefore, in this specification, Hoeffding's inequality can be used to describe the mathematical relationship among the number of samples, the error value of the approximate calculation, and the confidence probability of the approximate calculation. In other words, in the approximate calculation scenario, Hoeffding's inequality can be used to derive the mathematical relationship among these three quantities.
When Hoeffding's inequality is used to describe the mathematical relationship among the number of samples, the error value of the approximate calculation, and the confidence probability of the approximate calculation, it is expressed as the following formula:
Figure PCTCN2022135439-appb-000003
In the above formula, H denotes the mathematical identifier of Hoeffding's inequality, and n_g denotes the number of samples.
b_g and a_g denote the maximum value and the minimum value, respectively, of the data samples in the data set; δ denotes the confidence probability; ε_g denotes the error value corresponding to the approximate calculation; and N_g denotes the total number of data samples in the data set.
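For orientation, the textbook form of Hoeffding's inequality for the mean of independent bounded samples is given below in the symbols defined above. This is not a reconstruction of the formula in the figure: the figure's formula also involves N_g, which suggests a finite-population variant that is not reproduced here. Treating δ as the confidence probability, so that the allowed failure probability is 1 − δ, is an assumption.

```latex
% Standard Hoeffding bound for the mean \bar{X} of n_g independent samples,
% each taking values in [a_g, b_g]:
\[
  P\left( \left| \bar{X} - \mathbb{E}[\bar{X}] \right| \ge \varepsilon_g \right)
  \le 2 \exp\left( - \frac{2\, n_g\, \varepsilon_g^{2}}{(b_g - a_g)^{2}} \right)
\]
% Requiring the right-hand side to be at most 1 - \delta and solving for n_g:
\[
  n_g \ge \frac{(b_g - a_g)^{2}}{2\, \varepsilon_g^{2}} \,
          \ln\!\left( \frac{2}{1 - \delta} \right)
\]
```

In this form, the required number of samples grows with the square of the value range and shrinks with the square of the tolerated error, while depending only logarithmically on the allowed failure probability.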
The mathematical relationship among the number of samples, the error value of the approximate calculation, and the confidence probability of the approximate calculation, derived from the above formula, can be expressed by the following formula:
Figure PCTCN2022135439-appb-000004
The above mathematical relationship may be maintained in the smart contract in advance. When the blockchain node device calls the smart contract to calculate the number of samples required for sampling the data samples in the non-outlier data subset, it may obtain, from the calculation parameters in the smart contract call transaction, the confidence probability δ and the error value ε_g corresponding to the approximate calculation, and input the obtained confidence probability δ and error value ε_g into the maintained mathematical relationship to obtain the number of samples corresponding to the non-outlier data subset.
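A sample-size calculation of this kind can be sketched as follows. The sketch uses the textbook Hoeffding bound for independent bounded samples and simply caps the result at the population size; it is not the relationship shown in the figure, and the function name `hoeffding_sample_size` and its parameters are hypothetical.

```python
import math


def hoeffding_sample_size(error: float, confidence: float,
                          value_min: float, value_max: float,
                          population: int) -> int:
    """Samples needed so the sample mean is within `error` of the true mean
    with probability at least `confidence` (standard Hoeffding bound)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if error <= 0.0:
        raise ValueError("error must be positive")
    width = value_max - value_min
    # n >= (b - a)^2 * ln(2 / (1 - delta)) / (2 * eps^2)
    n = math.ceil(width ** 2 * math.log(2.0 / (1.0 - confidence))
                  / (2.0 * error ** 2))
    # Never ask for more samples than the subset contains.
    return max(1, min(n, population))
```

For values between 0 and 10, an error of 1.0, and a confidence of 0.95, the bound gives 185 samples regardless of how large the non-outlier subset is, which is where the saving over reading every record comes from.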
It should be noted that this specification does not particularly limit the sampling method used for the data samples in the non-outlier data subset; for example, random sampling, stratified sampling, and so on may be used.
The following embodiments take random sampling and stratified sampling as examples to describe in detail the process of sampling the non-outlier data samples in the non-outlier data subset.
In one illustrated embodiment, random sampling is used for the data samples in the non-outlier data subset. In this case, when the blockchain node device calls the smart contract to randomly sample the data set based on the calculated sampling number, it may first obtain a random number for random sampling, and then randomly sample the non-outlier data samples in the non-outlier data subset based on the obtained random number, so as to obtain data samples corresponding to the calculated sampling number.
The random number is used to control the randomness of the non-outlier data samples drawn from the non-outlier data subset. In practice, the obtained random number may be used to determine which non-outlier data samples are to be drawn from the non-outlier data subset. For example, the random number may represent the sample identifier of a non-outlier data sample to be sampled; during random sampling, the non-outlier data sample whose sample identifier equals the value of the random number is extracted from the non-outlier data subset to complete the data sampling.
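The idea of letting a random number pick a sample identifier can be sketched as follows. This is a minimal sketch, assuming sample identifiers are hashable values held in a list and that a seed is already available; the function name `random_sample_ids` is hypothetical and not part of the specification.

```python
import random


def random_sample_ids(sample_ids, count, seed):
    """Draw `count` distinct sample identifiers without replacement.

    A seeded generator is used so that every node replaying the same call
    with the same seed selects the same identifiers.
    """
    pool = list(sample_ids)
    if count > len(pool):
        raise ValueError("cannot draw more samples than the subset contains")
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < count:
        index = rng.randrange(len(pool))  # the random number selects an identifier
        chosen.append(pool.pop(index))    # remove it so it is not drawn twice
    return chosen
```

Seeding the generator explicitly keeps the selection reproducible, which matters when several nodes must arrive at the same sampled subset.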
It should be noted that this specification does not particularly limit how the random number is obtained; it may be generated on the blockchain or obtained from off-chain.
The following are several specific ways of obtaining the random number shown in this specification:
In one illustrated way, a random function for generating random numbers may be deployed on the blockchain in advance. For example, in practice the random function may be deployed on the blockchain as an independent smart contract, or deployed within the smart contract used for approximate calculation as part of its execution logic. In this case, the random number can be generated on the blockchain by calling the random function;
In another illustrated way, the blockchain node device may be equipped with a Trusted Execution Environment (TEE), in which a random number seed for generating random numbers may be maintained in advance. In this case, the random number can be generated in the trusted execution environment based on the random number seed.
In a third illustrated way, a target data parameter that can serve as a random number seed may be obtained from the data parameters related to the data maintained by the smart contract used for approximate calculation, and the random number may then be generated in the smart contract based on the obtained target data parameter. For example, unique parameters maintained by the smart contract, such as the hash value of a historical block or the generation timestamp of a historical block, may be used as the random number seed to calculate the random number in the smart contract.
In a fourth illustrated way, the random number may be generated off-chain. In this case, the smart contract may obtain the off-chain random number through the oracle program corresponding to the smart contract.
In a fifth illustrated way, a random number seed for further generating the random number may be generated off-chain. In this case, the smart contract may obtain the off-chain random number seed through the oracle program corresponding to the smart contract, and then generate the random number in the smart contract based on the obtained random number seed.
In a sixth illustrated way, the random number seed generated off-chain may be carried as a calculation parameter in the smart contract call transaction. In this case, the random number seed included in the smart contract call transaction can be obtained, and the random number can then be generated in the smart contract based on the obtained random number seed.
The above are several common ways of obtaining random numbers. It should be emphasized that, in practice, ways other than those listed above may obviously also be used to obtain random numbers; they are not enumerated one by one in this specification.
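The third way above (seeding from a historical block's hash value and generation timestamp) can be sketched as follows. This is an illustrative sketch only: the use of SHA-256 to mix the two parameters, the 8-byte timestamp encoding, and the function names `seed_from_block` and `random_numbers` are assumptions, not details given in the specification.

```python
import hashlib
import random


def seed_from_block(block_hash_hex: str, timestamp: int) -> int:
    """Derive an integer seed from a block hash and its generation timestamp."""
    material = bytes.fromhex(block_hash_hex) + timestamp.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(material).digest(), "big")


def random_numbers(seed: int, count: int, upper: int):
    """Generate `count` pseudo-random integers in [0, upper) from the seed."""
    rng = random.Random(seed)
    return [rng.randrange(upper) for _ in range(count)]
```

Because the seed is derived purely from on-chain parameters, every node that evaluates the same block data obtains the same sequence of random numbers.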
In one illustrated embodiment, stratified sampling is used for the data samples in the non-outlier data subset. Stratified sampling usually requires dividing the non-outlier data subset into several buckets and then sampling from each bucket separately. Therefore, when the blockchain node device calls the smart contract to calculate the sampling numbers required for stratified sampling, it may obtain, from the calculation parameters carried in the smart contract call transaction, the confidence probability δ corresponding to the approximate calculation and the error value ε_k corresponding to each bucket. It then inputs the obtained confidence probability δ and the error value ε_k of each bucket into the maintained mathematical relationship to obtain the sampling number corresponding to each bucket into which the non-outlier data subset is divided.
It should be noted that, when performing stratified sampling on the non-outlier data subset, the number K of buckets to be divided and the error value ε_k corresponding to each bucket may be specified by the calculation initiator and carried as calculation parameters in the smart contract call transaction. For example, in this case, in addition to the total error ε_g specified by the calculation initiator for the approximate calculation on the data set, the calculation parameters also need to carry K error values ε_k, one for each bucket.
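A minimal sketch of turning the K per-bucket error values into per-bucket sampling numbers is given below. It assumes the same illustrative relationship as a two-sided Hoeffding bound on values normalized to [0, 1], with δ read as the confidence level; the function name `bucket_sample_sizes` is hypothetical.

```python
import math


def bucket_sample_sizes(delta: float, bucket_errors):
    """Return one sampling number per bucket, given the confidence
    probability delta and the error value eps_k of each bucket.

    Assumption: n_k = ceil(ln(2 / (1 - delta)) / (2 * eps_k^2)).
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    log_term = math.log(2.0 / (1.0 - delta))
    return [math.ceil(log_term / (2.0 * eps_k ** 2)) for eps_k in bucket_errors]
```

Under these assumptions, a bucket with a looser error value needs fewer samples: `bucket_sample_sizes(0.95, [0.05, 0.1])` gives `[738, 185]`.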
Alternatively, when performing stratified sampling on the non-outlier data subset, the number K of buckets to be divided and the error value ε_k corresponding to each bucket may be optimal values calculated autonomously on-chain by the smart contract.
In one illustrated embodiment, the number K of buckets to be divided and the error value ε_k corresponding to each bucket may be optimal values solved by the smart contract using an optimization method.
In this case, when the blockchain node device calls the sampling logic contained in the smart contract to perform stratified sampling on the non-outlier data samples in the non-outlier data subset, it may first use the optimization method to solve for the optimal number K of buckets to be divided and the optimal error value ε_k corresponding to each bucket. It then inputs the confidence probability δ from the calculation parameters in the smart contract call transaction, together with the solved optimal error value ε_k of each bucket, into the maintained mathematical relationship to obtain the optimal sampling number corresponding to each bucket. The data samples in the non-outlier data subset can then be stratified-sampled based on the calculated optimal number K and the optimal sampling numbers.
It should be noted that this specification does not particularly limit the specific type of optimization method used by the smart contract. In practice, those skilled in the art may flexibly adopt different optimization algorithms based on actual needs; for example, a commonly used optimization algorithm such as gradient descent may be used.
An optimization method usually requires an explicit constraint, and in practice the constraint can usually be set based on the specific solution objective.
In the scenario of stratified sampling on the non-outlier data subset, the solution objectives may include solving for the optimal number of buckets, solving for the optimal error value corresponding to each bucket, and so on. In practice, the constraint for the optimization method can therefore be set based on these objectives.
In one illustrated embodiment, based on the above solution objectives, the constraint set for the optimization method may be:
The weighted average error value, obtained by a weighted average of the error values corresponding to the individual buckets, is minimal and is not greater than the total error value corresponding to the approximate calculation on the non-outlier data subset.
For example, the constraint can be expressed by the following formula:
Figure PCTCN2022135439-appb-000005
In the above formula, ε_g denotes the total error value corresponding to the approximate calculation on the non-outlier data subset, N_k denotes the number of samples drawn from the k-th bucket, and N denotes the total number of samples drawn from the non-outlier data subset.
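The formula itself is only available as an image in this text. One plausible way to write the constraint using only the symbols defined here (the weights N_k / N are an assumption inferred from the definitions of N_k and N, not a transcription of the original formula) is:

```latex
\min \; \sum_{k=1}^{K} \frac{N_k}{N}\,\varepsilon_k
\quad \text{subject to} \quad
\sum_{k=1}^{K} \frac{N_k}{N}\,\varepsilon_k \;\le\; \varepsilon_g ,
\qquad N = \sum_{k=1}^{K} N_k
```

Here ε_k is the error value of the k-th bucket and K is the number of buckets, as used elsewhere in this section.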
The specific algorithm flow for solving, under the above constraint, the optimal number of buckets required for stratified sampling and the optimal error value of each bucket is described below with reference to the drawings and a specific embodiment.
Referring to FIG. 2, FIG. 2 is a flowchart of an optimization method shown in this specification, which includes the following steps:
Step 201: initialize the value i, where i denotes the initially set number of samples contained in each bucket. Apart from step 201, the following steps are executed iteratively:
Step 202: adjust the value of i;
The size of the adjustment to i can be set flexibly and is not particularly limited in this specification.
Step 203: divide the non-outlier data subset into several buckets, each containing i samples;
Step 204: input the confidence probability δ (that is, the confidence probability δ included in the calculation parameters carried in the smart contract call transaction) and the adjusted value of i (that is, the number of samples corresponding to each bucket) as calculation parameters into the mathematical relationship to obtain the error value corresponding to each bucket, and calculate a weighted average of the error values of the buckets to obtain a weighted average error value;
It should be noted that in the first round of iteration, after step 204 is completed, steps 202 to 204 are executed again as the second round of iteration.
Step 205: determine whether the weighted average error value is not greater than the total error value (that is, the error value corresponding to the approximate calculation included in the calculation parameters carried in the smart contract call transaction) and is less than the weighted average error value calculated in the previous round of iteration (that is, the weighted average error value calculated from the value of i before this round's adjustment). If not, execute steps 202 to 205 again as the next round of iteration, and repeat this iterative process until the optimization algorithm converges; the iteration stops once a weighted average error value satisfying the above constraint has been calculated.
Step 206: after the iteration stops, obtain the optimal value of i, namely the value for which the calculated weighted average error value satisfies the above constraint;
Step 207: based on the optimal value of i, determine the optimal number of buckets into which the non-outlier data subset is to be divided for stratified sampling, and input the confidence probability and the optimal value of i into the mathematical relationship again to obtain the optimal error value corresponding to each bucket.
It should be noted that the above embodiment is only one specific way of setting a constraint for the optimization method based on the solution objectives described above. In practice, other forms of constraint may obviously also be set for the optimization method based on those objectives.
In this specification, after the smart contract has used the optimization method to solve for the optimal number of buckets to be divided and the optimal error value of each bucket, and has thereby obtained the optimal sampling number of each bucket, it can perform stratified sampling on the non-outlier data subset based on the optimal number and the optimal error values.
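Steps 201 to 207 can be sketched as a simple iterative search. This is a minimal sketch under several assumptions that the specification leaves open: the mathematical relationship is taken to be the inverse of a two-sided Hoeffding bound on values in [0, 1], δ is read as the confidence level, each bucket's error is weighted by its size, and i is adjusted by a fixed increment. The names `bucket_error` and `solve_bucket_size` are hypothetical.

```python
import math


def bucket_error(delta: float, n: int) -> float:
    """Error value for a bucket of n samples (assumed Hoeffding-based form)."""
    return math.sqrt(math.log(2.0 / (1.0 - delta)) / (2.0 * n))


def solve_bucket_size(total: int, delta: float, eps_g: float,
                      i_init: int = 10, step: int = 10):
    """Search for a bucket size i whose weighted average error meets eps_g.

    Returns (i, bucket_count, per_bucket_errors, weighted_average_error).
    """
    i = i_init                                    # step 201: initialize i
    previous = None                               # weighted error of the last round
    while i < total:
        i = min(i + step, total)                  # step 202: adjust i
        sizes = [i] * (total // i)                # step 203: buckets of i samples
        if total % i:
            sizes.append(total % i)               # leftover samples form a last bucket
        errors = [bucket_error(delta, n) for n in sizes]             # step 204
        weighted = sum(n * e for n, e in zip(sizes, errors)) / total
        # step 205: the first round only records its result and iterates again
        if previous is not None and weighted <= eps_g and weighted < previous:
            return i, len(sizes), errors, weighted  # steps 206 and 207
        previous = weighted
    raise ValueError("no bucket size satisfies the constraint")
```

The fixed-increment search stands in for whichever optimization algorithm is actually chosen (for example gradient descent); it only illustrates how the stopping test of step 205 ties the weighted average error to the total error value.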
In one illustrated embodiment, when the smart contract performs stratified sampling on the non-outlier data subset based on the optimal number and the optimal error values, it may first divide the non-outlier data subset into several buckets according to the optimal number; for example, if the optimal number is K, the non-outlier data subset is divided into K buckets. The data samples in each of the resulting buckets can then be sampled according to the optimal sampling number of that bucket.
This specification does not particularly limit the specific sampling method used to sample the data samples in each bucket according to the optimal sampling number.
For example, if random sampling is used to sample the data samples in each bucket according to the optimal sampling number, a random number for random sampling may first be obtained, and the non-outlier data samples in each bucket may then be randomly sampled based on the obtained random number to obtain non-outlier data samples corresponding to the calculated optimal sampling number. For the specific ways of obtaining the random number, reference may be made to the description of the preceding embodiments, which is not repeated here.
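The bucket-by-bucket sampling described above can be sketched as follows. This is an illustrative sketch: the specification does not say how samples are assigned to buckets, so ordering the values and cutting them into K nearly equal contiguous buckets is an assumption, as is the function name `stratified_sample`.

```python
import random


def stratified_sample(samples, bucket_count, per_bucket_counts, seed):
    """Split `samples` into `bucket_count` buckets and randomly sample each.

    `per_bucket_counts[k]` is the sampling number for bucket k. Returns a list
    of per-bucket sample lists.
    """
    if len(per_bucket_counts) != bucket_count:
        raise ValueError("one sampling number is required per bucket")
    ordered = sorted(samples)
    base, extra = divmod(len(ordered), bucket_count)
    buckets, start = [], 0
    for k in range(bucket_count):
        size = base + (1 if k < extra else 0)   # spread the remainder evenly
        buckets.append(ordered[start:start + size])
        start += size
    rng = random.Random(seed)                   # seeded for reproducibility
    return [rng.sample(bucket, min(count, len(bucket)))
            for bucket, count in zip(buckets, per_bucket_counts)]
```

A sampling number larger than the bucket itself is capped at the bucket size, so a small bucket is simply taken in full.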
Step 106: further call the calculation logic contained in the smart contract call transaction to perform exact calculation on the outlier data samples in the outlier data subset and approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and combine the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.
In this specification, after the data sampling of the non-outlier data samples in the non-outlier data subset is completed, approximate calculation can be performed on the sampled non-outlier data samples.
The approximate calculation on the sampled non-outlier data samples may use a calculation type specified by the calculation initiator, or a default calculation type supported by the smart contract; this is not particularly limited in this specification.
For example, in one illustrated embodiment, the smart contract call transaction may also include a sampling algorithm ID. The sampling algorithm ID may be used to indicate the calculation type, specified by the calculation initiator, of the approximate calculation to be performed on the data set.
In this case, when approximate calculation is performed on the sampled non-outlier data samples, the sampling algorithm ID included in the smart contract call transaction can be obtained, and approximate calculation can then be performed on the sampled non-outlier data samples according to the calculation type indicated by the sampling algorithm ID.
Of course, if the smart contract call transaction does not include the sampling algorithm ID, that is, the calculation initiator has not specified a calculation type for the approximate calculation on the data set, approximate calculation may be performed on the sampled non-outlier data samples based on the default calculation type supported by the smart contract.
For the outlier data subset, the outlier data samples need not be sampled. Moreover, because the result of an approximate calculation on outlier data samples usually skews the approximate calculation result for the data set as a whole, the outlier data samples in the outlier data subset may be subjected to exact calculation instead of approximate calculation.
The exact calculation on the outlier data samples in the outlier data subset may use a calculation type specified by the calculation initiator, or a default calculation type supported by the smart contract; this is not particularly limited in this specification.
For example, when exact calculation is performed on the outlier data in the outlier data subset, the sampling algorithm ID included in the smart contract call transaction can be obtained, and exact calculation can then be performed on the outlier data samples in the outlier data subset according to the calculation type indicated by the sampling algorithm ID. Of course, if the smart contract call transaction does not include the sampling algorithm ID, exact calculation may be performed on the outlier data in the outlier data subset based on the default calculation type supported by the smart contract.
It should be noted that this specification does not particularly limit the calculation types corresponding to the approximate calculation and the exact calculation. For example, they may include summation, averaging, and so on, which are not enumerated one by one in this specification.
After the exact calculation on the outlier data samples in the outlier data subset and the approximate calculation on the non-outlier data samples sampled from the non-outlier data subset are completed, the results of the exact calculation and the approximate calculation can be combined as the final approximate calculation result for the data set.
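For the summation calculation type, combining the two partial results can be sketched as follows. This is a sketch under an assumption the specification does not spell out: the sum over the sampled non-outlier data is scaled up by the ratio of the subset size to the sample size before being added to the exact sum over the outliers. The function name `approximate_sum` is hypothetical.

```python
def approximate_sum(outliers, sampled_non_outliers, non_outlier_total):
    """Combine an exact sum over outliers with an estimated sum over the
    non-outlier subset.

    `non_outlier_total` is the number of samples in the full non-outlier
    subset, of which `sampled_non_outliers` is the drawn sample.
    """
    exact_part = sum(outliers)                       # exact calculation
    if sampled_non_outliers:
        sample_mean = sum(sampled_non_outliers) / len(sampled_non_outliers)
        approx_part = sample_mean * non_outlier_total  # scale sample mean up
    else:
        approx_part = 0.0
    return exact_part + approx_part
```

For instance, with a single outlier of 1000 and a sample `[10, 12]` drawn from four non-outlier values, the estimate is 1000 plus 4 times the sample mean 11, that is 1044.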
In one illustrated embodiment, the smart contract used for approximate calculation may be a privacy smart contract deployed in a trusted execution environment mounted on the blockchain node device.
In this scenario, the calculation parameters in the smart contract call transaction and the data samples in the obtained data set have usually been encrypted in advance. Before the blockchain node device calls the sampling logic contained in the smart contract to divide the data set into an outlier data subset and a non-outlier data subset, it may decrypt the calculation parameters and the data samples in the obtained data set in the trusted execution environment.
For example, an asymmetric key pair for encrypting and decrypting data may be allocated to the trusted execution environment, with the private key stored in the trusted execution environment and the public key published to the calculation initiator. The calculation parameters in the smart contract call transaction and the data samples in the obtained data set can be encrypted in advance with the public key. Before the blockchain node device calls the sampling logic contained in the smart contract to randomly sample the obtained data set corresponding to the data identifier, it may use the maintained private key in the trusted execution environment to decrypt the calculation parameters and the data samples in the obtained data set.
In the above technical solution, in the scenario where a smart contract is called to perform approximate calculation on a data set, introducing a sampling mechanism for the data set into the smart contract reduces the time taken by the approximate calculation and improves its computational efficiency, without sacrificing the accuracy of the approximate calculation result.
For example, again taking the case in which a business-related data set has been stored on the blockchain in advance, after the data sampling mechanism is introduced into the smart contract, the total time taken by the smart contract to perform calculation on the business-related data set can usually be expressed by the following formula:
Figure PCTCN2022135439-appb-000006
In the above formula, n_g denotes the number of data samples sampled from the data set, and N_g denotes the total number of data samples in the data set. Because n_g is usually smaller than N_g by orders of magnitude, once the sampling mechanism is introduced into the smart contract, the time taken to perform calculation on the data set through the smart contract is likewise reduced by orders of magnitude. Introducing a sampling mechanism into the smart contract can therefore significantly shorten the time taken to perform calculation on the data set and improve the efficiency of the approximate calculation on the data set.
Moreover, during the approximate calculation on the data set, the outlier data in the data set is not sampled and then approximated; instead, it is calculated exactly without sampling. When the data set contains outlier data, this prevents the outlier data samples from degrading the accuracy of the approximate calculation result, so that the accuracy of the approximate calculation on the data set is ensured to the greatest extent.
Corresponding to the above method embodiments, this application also provides apparatus embodiments.
The apparatus embodiments of this specification can be applied to an electronic device. The apparatus embodiments may be implemented in software, in hardware, or in a combination of software and hardware. Taking a software implementation as an example, the apparatus in a logical sense is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from non-volatile storage into memory and running them.
At the hardware level, FIG. 3 shows a hardware structure diagram of the electronic device in which the apparatus of this specification is located. In addition to the processor, memory, network interface, and non-volatile storage shown in FIG. 3, the electronic device in which the apparatus of an embodiment is located may also include other hardware according to the actual functions of the electronic device, which is not described further here.
FIG. 4 is a block diagram of a smart contract-based calculation apparatus according to an exemplary embodiment of this specification.
Referring to FIG. 4, the smart contract-based calculation apparatus 40 can be applied to the electronic device shown in FIG. 3. A smart contract for performing approximate calculation is deployed on the blockchain, and the apparatus 40 includes:
接收模块401,接收计算发起方发起的针对所述智能合约的智能合约调用交易;其中,所述智能合约调用交易包括与所述近似计算对应的计算参数;所述计算参数包括参与近似计算的数据集合的数据标识;The receiving module 401 receives a smart contract call transaction initiated by the calculation initiator for the smart contract; wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation; the calculation parameters include data participating in the approximate calculation The data identifier of the collection;
采样模块402,响应于所述智能合约调用交易,调用所述智能合约调用交易包含的采样逻辑,将与所述数据标识对应的所述数据集合划分为由若干离群数据样本构成的离群数据子集,和由若干非离群数据样本构成的非离群数据子集,并针对所述非离群数据子集中的非离群数据样本进行采样;The sampling module 402, in response to the smart contract call transaction, calls the sampling logic contained in the smart contract call transaction, and divides the data set corresponding to the data identifier into outlier data composed of a number of outlier data samples. subset, and a non-outlier data subset composed of a number of non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset;
计算模块403,进一步调用所述智能合约调用交易包含的计算逻辑,针对所述离群数据子集中的离群数据样本进行精确计算,针对从所述非离群数据子集中采样得到的非离群数据样本进行近似计算,并合并所述精确计算和所述近似计算的结果,以作为针对所述数据集合的近似计算结果。The calculation module 403 further calls the calculation logic contained in the smart contract call transaction to perform accurate calculations on the outlier data samples in the outlier data subset, and on the non-outlier samples sampled from the non-outlier data subset. Approximate calculations are performed on the data samples, and the results of the exact calculations and the approximate calculations are combined to serve as approximate calculation results for the data set.
In this embodiment, the apparatus 40 further includes:
an acquisition module 404 (not shown in FIG. 4), which, before the sampling module 402 divides the data set corresponding to the data identifier into the outlier data subset composed of several outlier data samples and the non-outlier data subset composed of several non-outlier data samples, obtains the data set corresponding to the data identifier that is stored as evidence on the blockchain; or obtains, through an oracle program corresponding to the smart contract, the data set corresponding to the data identifier from an off-chain database connected to the blockchain.
In this embodiment, the sampling module 402:
performs outlier calculation on the data samples in the data set corresponding to the data identifier, to determine the outlier data samples and the non-outlier data samples contained in the data set;
creates the outlier data subset based on the outlier data samples, and creates the non-outlier data subset based on the non-outlier data samples.
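This passage does not fix a particular outlier calculation. As one possible illustration, the sketch below uses the common interquartile-range (IQR) rule; both the rule and the multiplier `k=1.5` are assumptions chosen for the example, and `split_outliers` is a hypothetical name.

```python
import statistics


def split_outliers(values, k=1.5):
    """Split values into (outliers, non_outliers) with the IQR rule.

    The IQR rule is an assumed stand-in for the unspecified outlier calculation.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < low or v > high]
    non_outliers = [v for v in values if low <= v <= high]
    return outliers, non_outliers
```

Any other detector (for example a z-score threshold) could be substituted, as long as it yields the two disjoint subsets described above.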
In this embodiment, the calculation parameters include a confidence probability corresponding to the approximate calculation and a total error value corresponding to the approximate calculation; the confidence probability characterizes the accuracy of the approximate calculation. The smart contract maintains a mathematical relationship, derived from Hoeffding's inequality, among three quantities: the confidence probability corresponding to the approximate calculation, the error value corresponding to the approximate calculation, and the sampling number corresponding to the data set participating in the approximate calculation.
The sampling module 402 further:
before sampling the data samples in the non-outlier data subset, inputs the confidence probability corresponding to the approximate calculation and the error value corresponding to the approximate calculation into the mathematical relationship for calculation, to obtain the sampling number corresponding to the non-outlier data subset.
In this embodiment, the mathematical relationship is expressed by the following formula:
Figure PCTCN2022135439-appb-000007
In the above formula, $n_g$ denotes the sampling number; $b_g$ and $a_g$ denote the maximum and minimum values, respectively, of the data samples in the data set; $\delta$ denotes the confidence probability; $\varepsilon_g$ denotes the error value; and $N_g$ denotes the total number of data samples in the data set.
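The formula itself is available here only as an image reference, so it is not reproduced. For orientation, the sketch below computes a sampling number from the textbook two-sided Hoeffding bound, $n \ge (b-a)^2 \ln\!\big(2/(1-\delta)\big) / (2\varepsilon^2)$, and caps it at the population size. This is an assumption for illustration and may differ from the relationship in the figure, which also involves $N_g$; the function name is hypothetical.

```python
import math


def hoeffding_sample_size(a, b, delta, eps, population):
    """Sampling number from the standard two-sided Hoeffding bound.

    a, b: minimum and maximum sample values; delta: confidence probability;
    eps: tolerated error of the mean; population: total number of samples.
    Assumed textbook form, not necessarily the formula in the figure.
    """
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    if eps <= 0:
        raise ValueError("eps must be positive")
    n = math.ceil((b - a) ** 2 * math.log(2 / (1 - delta)) / (2 * eps ** 2))
    return min(max(n, 1), population)  # never sample more than exists
```

Under this bound, halving the error value roughly quadruples the sampling number, while raising the confidence probability increases it only logarithmically.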
In this embodiment, the sampling module 402:
samples the non-outlier data samples in the non-outlier data subset according to the calculated sampling number.
In this embodiment, the sampling performed on the non-outlier data samples in the non-outlier data subset includes random sampling;
the sampling module 402 further:
obtains a random number used for random sampling;
randomly samples the non-outlier data samples in the non-outlier data subset based on the random number, to obtain a number of non-outlier data samples equal to the calculated sampling number.
In this embodiment, the sampling performed on the non-outlier data samples in the non-outlier data subset includes stratified sampling;
the sampling module 402 further:
uses an optimization solution method to solve for the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and for the optimal error value corresponding to each resulting data subset;
inputs the confidence probability corresponding to the approximate calculation and the optimal error value corresponding to each data subset into the mathematical relationship for calculation, to obtain the optimal sampling number corresponding to each data subset;
divides the data set into several data subsets according to the optimal number, and samples the non-outlier data samples in each data subset according to the corresponding optimal sampling number.
In this embodiment, the constraints adopted by the optimization solution method include: the weighted average error value, obtained by taking a weighted average of the error values corresponding to the data subsets, is minimal and does not exceed the total error value;
the sampling module 402 further performs the following steps:
Step A: adjust the value of the initialized i;
Step B: divide the non-outlier data subset into several data subsets, each containing i samples;
Step C: input the confidence probability and the adjusted value of i, as calculation parameters, into the mathematical relationship for calculation, to obtain the error value corresponding to each data subset, and take a weighted average of these error values to obtain the weighted average error value;
Step D: determine whether the weighted average error value is not greater than the total error value and is less than the weighted average error value calculated from the value of i before this adjustment; if not, repeat steps A to D, stop the iteration once the calculated weighted average error value satisfies the constraints, and take the value of i at which the constraints are satisfied as the optimal i value;
Step E: determine, based on the optimal i value, the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and input the confidence probability and the optimal i value into the mathematical relationship for calculation, to obtain the optimal error value corresponding to each data subset.
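Steps A to E can be sketched as follows, under several assumptions that go beyond the text: the data are sorted before being cut into strata of i samples, each stratum is sampled at a fixed fraction, the per-stratum error is the half-width of the textbook Hoeffding bound, and the search is an exhaustive scan over candidate values of i rather than an adjust-and-stop loop. All function and parameter names are hypothetical.

```python
import math


def stratum_error(values, n, delta):
    """Hoeffding half-width for the mean of n samples drawn from `values`."""
    spread = max(values) - min(values)
    return spread * math.sqrt(math.log(2 / (1 - delta)) / (2 * n))


def weighted_average_error(sorted_values, i, delta, sample_fraction):
    """Steps B and C: cut into strata of i samples, then average their errors."""
    total = len(sorted_values)
    error = 0.0
    for start in range(0, total, i):
        stratum = sorted_values[start:start + i]
        n = max(1, math.ceil(sample_fraction * len(stratum)))
        # Weight each stratum by its share of the non-outlier subset.
        error += len(stratum) / total * stratum_error(stratum, n, delta)
    return error


def choose_stratum_size(values, delta, total_error, sample_fraction, candidates):
    """Steps A and D: try each candidate i, keep the best one within the bound."""
    sorted_values = sorted(values)
    best_i, best_error = None, math.inf
    for i in candidates:
        err = weighted_average_error(sorted_values, i, delta, sample_fraction)
        if err <= total_error and err < best_error:
            best_i, best_error = i, err
    return best_i, best_error
```

In this sketch the optimal number of data subsets of Step E would be `math.ceil(len(values) / best_i)`; a result of `(None, inf)` indicates that no candidate satisfies the total error value.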
In this embodiment, the sampling module 402 further:
obtains a random number used for random sampling;
randomly samples the non-outlier data samples in each data subset based on the random number, to obtain a number of non-outlier data samples equal to the calculated optimal sampling number.
In this embodiment, the sampling module 402 further performs any one of the following:
calling a random function deployed on the blockchain to generate the random number used for random sampling;
generating the random number in a trusted execution environment carried by the node device, based on a random number seed maintained in the trusted execution environment;
obtaining, from data parameters related to the data maintained by the smart contract, a target data parameter to serve as the random number seed, and generating the random number used for random sampling in the smart contract based on the obtained target data parameter;
obtaining, through the oracle program corresponding to the smart contract, a random number generated off-chain for random sampling;
obtaining, through the oracle program corresponding to the smart contract, a random number seed generated off-chain, and generating the random number used for random sampling in the smart contract based on the obtained random number seed;
obtaining a random number seed that was generated off-chain and is included in the calculation parameters, and generating the random number used for random sampling in the smart contract based on that random number seed.
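Whichever source is used, every node executing the contract has to draw the same sample, so the random numbers must follow deterministically from a shared seed. The following minimal sketch illustrates the seed-based options under the assumption that the seed material is a set of strings (for example a data identifier and a block hash, both hypothetical here) hashed with SHA-256; it is not a cryptographically secure randomness scheme.

```python
import hashlib
import random


def seed_from_parameters(*params):
    """Derive an integer seed from string parameters shared by all nodes."""
    digest = hashlib.sha256("|".join(params).encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def deterministic_sample(items, n, seed):
    """Draw the same n items on every node that uses the same seed."""
    rng = random.Random(seed)
    return rng.sample(items, min(n, len(items)))
```

Two nodes that call `deterministic_sample` with identical inputs obtain identical samples, which is what allows their calculation results to agree.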
In this embodiment, the calculation parameters further include an algorithm identifier;
the calculation module 403:
performs exact calculation on the outlier data samples in the outlier data subset according to the calculation type indicated by the algorithm identifier;
performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset includes:
performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset according to the calculation type indicated by the algorithm identifier.
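How an algorithm identifier might select the calculation type can be sketched with a small dispatcher. The identifiers `"sum"` and `"avg"` and the function name are assumptions made for this example; the specification does not enumerate the supported calculation types in this passage.

```python
def calculate(algorithm_id, outliers, sample, non_outlier_count):
    """Merge an exact result on outliers with an estimate from the sample.

    algorithm_id: hypothetical identifier, "sum" or "avg".
    sample: non-outlier samples actually drawn.
    non_outlier_count: size of the whole non-outlier subset.
    """
    if non_outlier_count and not sample:
        raise ValueError("a non-empty sample is required")
    sample_mean = sum(sample) / len(sample) if sample else 0.0
    # Approximate part: sample mean scaled to the whole non-outlier subset.
    total = sum(outliers) + sample_mean * non_outlier_count
    count = len(outliers) + non_outlier_count
    if algorithm_id == "sum":
        return total
    if algorithm_id == "avg":
        if count == 0:
            raise ValueError("cannot average an empty data set")
        return total / count
    raise ValueError(f"unsupported algorithm identifier: {algorithm_id}")
```

Further calculation types would be added as extra branches, each defining how its exact and approximate parts are merged.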
In this embodiment, the smart contract is deployed in a trusted execution environment carried by the node device; the calculation parameters and the data samples in the data set have been encrypted in advance;
the sampling module 402 further:
before dividing the data set corresponding to the data identifier into the outlier data subset composed of several outlier data samples and the non-outlier data subset composed of several non-outlier data samples, decrypts, in the trusted execution environment, the calculation parameters and the data samples in the obtained data set, respectively.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
The terms used in one or more embodiments of this specification are for the purpose of describing particular embodiments only and are not intended to limit the one or more embodiments of this specification. The singular forms "a", "an", and "the" used in one or more embodiments of this specification and in the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in one or more embodiments of this specification to describe various kinds of information, the information should not be limited by these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
The above are merely preferred embodiments of one or more embodiments of this specification and are not intended to limit them. Any modification, equivalent replacement, or improvement made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of protection of one or more embodiments of this specification.

Claims (16)

  1. A smart contract-based calculation method, applied to a node device in a blockchain on which a smart contract for performing approximate calculation is deployed, the method comprising:
    receiving a smart contract call transaction initiated by a calculation initiator for the smart contract, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include a data identifier of the data set participating in the approximate calculation;
    in response to the smart contract call transaction, calling the sampling logic contained in the smart contract call transaction, dividing the data set corresponding to the data identifier into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, and sampling the non-outlier data samples in the non-outlier data subset;
    further calling the calculation logic contained in the smart contract call transaction, performing exact calculation on the outlier data samples in the outlier data subset, performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and merging the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.
  2. The method according to claim 1, further comprising, before dividing the data set corresponding to the data identifier into the outlier data subset composed of several outlier data samples and the non-outlier data subset composed of several non-outlier data samples:
    obtaining the data set corresponding to the data identifier that is stored as evidence on the blockchain; or
    obtaining, through an oracle program corresponding to the smart contract, the data set corresponding to the data identifier from an off-chain database connected to the blockchain.
  3. The method according to claim 2, wherein dividing the data set corresponding to the data identifier into the outlier data subset composed of several outlier data samples and the non-outlier data subset composed of several non-outlier data samples comprises:
    performing outlier calculation on the data samples in the data set corresponding to the data identifier, to determine the outlier data samples and the non-outlier data samples contained in the data set;
    creating the outlier data subset based on the outlier data samples, and creating the non-outlier data subset based on the non-outlier data samples.
  4. The method according to claim 3, wherein the calculation parameters include a confidence probability corresponding to the approximate calculation and a total error value corresponding to the approximate calculation; the confidence probability characterizes the accuracy of the approximate calculation; and the smart contract maintains a mathematical relationship, derived from Hoeffding's inequality, among three quantities: the confidence probability corresponding to the approximate calculation, the error value corresponding to the approximate calculation, and the sampling number corresponding to the data set participating in the approximate calculation;
    the method further comprising, before sampling the data samples in the non-outlier data subset:
    inputting the confidence probability corresponding to the approximate calculation and the error value corresponding to the approximate calculation into the mathematical relationship for calculation, to obtain the sampling number corresponding to the non-outlier data subset.
  5. The method according to claim 4, wherein the mathematical relationship is expressed by the following formula:
    Figure PCTCN2022135439-appb-100001
    wherein, in the above formula, $n_g$ denotes the sampling number; $b_g$ and $a_g$ denote the maximum and minimum values, respectively, of the data samples in the data set; $\delta$ denotes the confidence probability; $\varepsilon_g$ denotes the error value; and $N_g$ denotes the total number of data samples in the data set.
  6. The method according to claim 3, wherein sampling the non-outlier data samples in the non-outlier data subset comprises:
    sampling the non-outlier data samples in the non-outlier data subset according to the calculated sampling number.
  7. The method according to claim 6, wherein the sampling performed on the non-outlier data samples in the non-outlier data subset includes random sampling;
    and sampling the non-outlier data samples in the non-outlier data subset according to the calculated sampling number comprises:
    obtaining a random number used for random sampling;
    randomly sampling the non-outlier data samples in the non-outlier data subset based on the random number, to obtain a number of non-outlier data samples equal to the calculated sampling number.
  8. The method according to claim 6, wherein the sampling performed on the non-outlier data samples in the non-outlier data subset includes stratified sampling;
    and sampling the non-outlier data samples in the non-outlier data subset according to the calculated sampling number comprises:
    using an optimization solution method to solve for the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and for the optimal error value corresponding to each resulting data subset;
    inputting the confidence probability corresponding to the approximate calculation and the optimal error value corresponding to each data subset into the mathematical relationship for calculation, to obtain the optimal sampling number corresponding to each data subset;
    dividing the data set into several data subsets according to the optimal number, and sampling the non-outlier data samples in each data subset according to the corresponding optimal sampling number.
  9. The method according to claim 8, wherein the constraints adopted by the optimization solution method include: the weighted average error value, obtained by taking a weighted average of the error values corresponding to the data subsets, is minimal and does not exceed the total error value;
    and using the optimization solution method to solve for the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and for the optimal error value corresponding to each resulting data subset, comprises:
    Step A: adjusting the value of the initialized i;
    Step B: dividing the non-outlier data subset into several data subsets, each containing i samples;
    Step C: inputting the confidence probability and the adjusted value of i, as calculation parameters, into the mathematical relationship for calculation, to obtain the error value corresponding to each data subset, and taking a weighted average of these error values to obtain the weighted average error value;
    Step D: determining whether the weighted average error value is not greater than the total error value and is less than the weighted average error value calculated from the value of i before this adjustment; if not, repeating steps A to D, stopping the iteration once the calculated weighted average error value satisfies the constraints, and taking the value of i at which the constraints are satisfied as the optimal i value;
    Step E: determining, based on the optimal i value, the optimal number of data subsets into which the non-outlier data subset needs to be divided for stratified sampling, and inputting the confidence probability and the optimal i value into the mathematical relationship for calculation, to obtain the optimal error value corresponding to each data subset.
  10. The method according to claim 8, wherein sampling the non-outlier data samples in each data subset according to the optimal sampling number comprises:
    obtaining a random number used for random sampling;
    randomly sampling the non-outlier data samples in each data subset based on the random number, to obtain a number of non-outlier data samples equal to the calculated optimal sampling number.
  11. The method according to claim 7 or 10, wherein obtaining the random number used for random sampling comprises any one of the following:
    calling a random function deployed on the blockchain to generate the random number used for random sampling;
    generating the random number in a trusted execution environment carried by the node device, based on a random number seed maintained in the trusted execution environment;
    obtaining, from data parameters related to the data maintained by the smart contract, a target data parameter to serve as the random number seed, and generating the random number used for random sampling in the smart contract based on the obtained target data parameter;
    obtaining, through the oracle program corresponding to the smart contract, a random number generated off-chain for random sampling;
    obtaining, through the oracle program corresponding to the smart contract, a random number seed generated off-chain, and generating the random number used for random sampling in the smart contract based on the obtained random number seed;
    obtaining a random number seed that was generated off-chain and is included in the calculation parameters, and generating the random number used for random sampling in the smart contract based on that random number seed.
  12. The method according to claim 1, wherein the calculation parameters further include an algorithm identifier;
    performing exact calculation on the outlier data samples in the outlier data subset comprises:
    performing exact calculation on the outlier data samples in the outlier data subset according to the calculation type indicated by the algorithm identifier;
    and performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset comprises:
    performing approximate calculation on the non-outlier data samples sampled from the non-outlier data subset according to the calculation type indicated by the algorithm identifier.
  13. The method according to claim 1, wherein the smart contract is deployed in a trusted execution environment carried by the node device, and the calculation parameters and the data samples in the data set have been encrypted in advance;
    the method further comprising, before dividing the data set corresponding to the data identifier into the outlier data subset composed of several outlier data samples and the non-outlier data subset composed of several non-outlier data samples:
    decrypting, in the trusted execution environment, the calculation parameters and the data samples in the obtained data set, respectively.
  14. A smart contract-based computing apparatus, applied to a node device in a blockchain on which a smart contract for performing approximate calculation is deployed, the apparatus comprising:
    a receiving module, which receives a smart contract call transaction initiated by a calculation initiator for the smart contract, wherein the smart contract call transaction includes calculation parameters corresponding to the approximate calculation, and the calculation parameters include a data identifier of the data set participating in the approximate calculation;
    a sampling module, which, in response to the smart contract call transaction, calls the sampling logic contained in the smart contract call transaction, divides the data set corresponding to the data identifier into an outlier data subset composed of several outlier data samples and a non-outlier data subset composed of several non-outlier data samples, and samples the non-outlier data samples in the non-outlier data subset;
    a calculation module, which further calls the calculation logic contained in the smart contract call transaction, performs exact calculation on the outlier data samples in the outlier data subset, performs approximate calculation on the non-outlier data samples sampled from the non-outlier data subset, and merges the results of the exact calculation and the approximate calculation as the approximate calculation result for the data set.
  15. An electronic device, comprising:
    a processor; and
    a memory for storing processor-executable instructions;
    wherein the processor implements the steps of the method according to any one of claims 1-13 by running the executable instructions.
  16. A computer-readable storage medium having computer instructions stored thereon, wherein the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-13.
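The outlier-aware flow recited in the claims (partition the data set, compute exactly over the outliers, sample the non-outliers, then merge the two results) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The z-score outlier rule, the choice of a sum as the aggregate, and the names `split_outliers`, `approximate_sum`, `z_threshold`, and `sampling_rate` are all assumptions made for illustration; decryption inside a trusted execution environment and the on-chain transaction handling are omitted.

```python
import random
import statistics
from typing import List, Optional, Sequence, Tuple


def split_outliers(
    samples: Sequence[float], z_threshold: float = 3.0
) -> Tuple[List[float], List[float]]:
    """Partition samples into (outliers, non_outliers).

    Assumption: a sample is an outlier when its z-score exceeds z_threshold.
    The claims do not fix a particular outlier criterion.
    """
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    if stdev == 0:
        # All values are identical, so nothing can be an outlier.
        return [], list(samples)
    outliers: List[float] = []
    non_outliers: List[float] = []
    for x in samples:
        if abs(x - mean) / stdev > z_threshold:
            outliers.append(x)
        else:
            non_outliers.append(x)
    return outliers, non_outliers


def approximate_sum(
    samples: Sequence[float],
    sampling_rate: float = 0.1,
    z_threshold: float = 3.0,
    seed: Optional[int] = None,
) -> float:
    """Approximate sum: exact over outliers, sampled and scaled over the rest."""
    outliers, non_outliers = split_outliers(samples, z_threshold)

    # Exact calculation over the (typically small) outlier subset.
    exact_part = sum(outliers)
    if not non_outliers:
        return exact_part

    # Uniform sampling without replacement from the non-outlier subset.
    k = max(1, round(len(non_outliers) * sampling_rate))
    rng = random.Random(seed)
    picked = rng.sample(non_outliers, k)

    # Scale the sampled sum up to the size of the full non-outlier subset.
    approx_part = sum(picked) * len(non_outliers) / k

    # Merge the exact and approximate results.
    return exact_part + approx_part
```

Handling the outliers exactly keeps a few extreme values from dominating the sampling error, which is the motivation for splitting the data set before sampling. With 1000 samples equal to 10.0 and one sample equal to 1,000,000.0, the large value is computed exactly and only the uniform remainder is estimated from a 10% sample.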
PCT/CN2022/135439 2022-03-30 2022-11-30 Smart contract-based calculation, update and read method and apparatus, and electronic device WO2023185052A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210332050.4 2022-03-30
CN202210332050.4A CN114693450A (en) 2022-03-30 2022-03-30 Intelligent contract-based calculating, updating and reading method and device and electronic equipment

Publications (1)

Publication Number Publication Date
WO2023185052A1 (en) 2023-10-05

Family

ID=82140687

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135439 WO2023185052A1 (en) 2022-03-30 2022-11-30 Smart contract-based calculation, update and read method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN114693450A (en)
WO (1) WO2023185052A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114693450A (en) * 2022-03-30 2022-07-01 蚂蚁区块链科技(上海)有限公司 Intelligent contract-based calculating, updating and reading method and device and electronic equipment
CN114708096A (en) * 2022-03-30 2022-07-05 蚂蚁区块链科技(上海)有限公司 Intelligent contract-based calculation, updating and reading method and device and electronic equipment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102736896A (en) * 2011-03-29 2012-10-17 国际商业机器公司 Run-ahead approximated computations
US20180365686A1 (en) * 2017-06-19 2018-12-20 Hitachi, Ltd. Smart contract lifecycle management
CN108876610A (en) * 2018-05-31 2018-11-23 深圳市零度智控科技有限公司 Intelligent contract implementation method, user equipment, storage medium and device
CN108898390A (en) * 2018-06-27 2018-11-27 阿里巴巴集团控股有限公司 Intelligent contract call method and device, electronic equipment based on block chain
CN113779624A (en) * 2021-08-27 2021-12-10 浙江数秦科技有限公司 Private data sharing method based on intelligent contracts
CN114693450A (en) * 2022-03-30 2022-07-01 蚂蚁区块链科技(上海)有限公司 Intelligent contract-based calculating, updating and reading method and device and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KANG QI, WANG LEI, AN JING, WU QI-DI: "Approximate Dynamic Programming Based Parameter Optimization of Particle Swarm Systems", ACTA AUTOMATICA SINICA, KEXUE CHUBANSHE, BEIJING, CN, vol. 36, no. 08, 15 August 2010 (2010-08-15), pages 1171-1181, XP009549279, ISSN: 0254-4156 *
ZHOU SHI-GUO, HU LIANG-PING, LI CHANG-PING: "Sample Size Estimation based on Width of Confidence Interval", MILITARY MEDICAL SCIENCES, CN, no. 01, 28 February 2007 (2007-02-28), pages 66-68, 99, XP009549278, ISSN: 1674-9960 *

Also Published As

Publication number Publication date
CN114693450A (en) 2022-07-01

Similar Documents

Publication Publication Date Title
WO2023185052A1 (en) Smart contract-based calculation, update and read method and apparatus, and electronic device
WO2023185050A1 (en) Smart contract-based calculating, updating, and reading methods and apparatuses, and electronic device
WO2023185057A1 (en) Smart contract-based computing method and apparatus, and electronic device
TWI735820B (en) Asset management method and device, electronic equipment
US10789244B1 (en) Asset management system, method, apparatus, and electronic device
US11386405B2 (en) Dynamic blockchain transactional policy management
US10733176B2 (en) Detecting phantom items in distributed replicated database
US11823178B2 (en) Optimization of high volume transaction performance on a blockchain
WO2020082871A1 (en) Method, device and system for executing blockchain transactions in parallel
KR101959153B1 (en) System for efficient processing of transaction requests related to an account in a database
US10728020B2 (en) Efficient mining operations in blockchain environments with non-secure devices
TW201828220A (en) Service processing method and apparatus
Zhang et al. A weighted kernel possibilistic c‐means algorithm based on cloud computing for clustering big data
WO2020108050A1 (en) Data evidence preservation method and system based on multiple blockchain networks
EP3961453B1 (en) Method and apparatus for invoking smart contract, electronic device, and storage medium
WO2019147744A1 (en) Consistency and consensus management in decentralized and distributed systems
TWI706664B (en) Data storage method and system based on multiple blockchain networks
WO2020019799A1 (en) Object selection method and device and electronic device
WO2020108052A1 (en) Data reading method based on a plurality of block chain networks and system
US20220292387A1 (en) Byzantine-robust federated learning
US20210028924A1 (en) System and method for extendable cryptography in a distributed ledger
TW201903639A (en) Method and device for performing security verification based on biometrics
Keller et al. Balancing quality and efficiency in private clustering with affinity propagation
WO2023091203A1 (en) Generating cryptographic proof of a series of transactions
US12020242B2 (en) Fair transaction ordering in blockchains

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22934870

Country of ref document: EP

Kind code of ref document: A1