CN113536352A - Private data calling method based on block chain - Google Patents
Private data calling method based on block chain
- Publication number
- CN113536352A (application CN202110855544.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- storage
- execution server
- data line
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/64—Protecting data integrity, e.g. using checksums, certificates or signatures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Bioethics (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of block chains, in particular to a private data calling method based on a block chain, which comprises the following steps: establishing a data receiving node, distributing batch numbers and distributing data numbers for data lines; extracting a data hash set, and uploading the data hash set to a block chain for storage after associating batch numbers; the data receiving nodes dispersedly store the data lines in a plurality of storage nodes; submitting the data processing model to an execution server, and recovering the data line by the execution server after obtaining the authorization; the execution server substitutes the recovery data row into the data processing model to obtain a model result, and destroys the recovery data row; and feeding the hash value of the recovered data line, the model result and the hash value of the model result back to the data demand side, and paying the data source side according to the hash value and the model result. The substantial effects of the invention are as follows: the private data are hidden, and the security of the private data is ensured; when the data demand party uses the private data, the data demand party does not directly contact the private data, and the privacy of the private data is guaranteed.
Description
Technical Field
The invention relates to the technical field of block chains, in particular to a private data calling method based on a block chain.
Background
Information technology has penetrated every corner of daily life, and the daily activities of people and the production activities of enterprises generate and store massive amounts of data every day. With electronic management systems, a data-based society operates with very high efficiency, which brings great convenience to daily activities and reduces the cost of social operation. With the progress of big data technology and artificial intelligence technology in recent years, the value contained in data is being mined and applied anew, for example in big-data-based freight rate adjustment mechanisms, cultivation management, spam identification and telecom fraud avoidance. Artificial intelligence technology also plays an important role in fields such as machine translation, expert systems and unmanned driving. With the introduction of new theories, big data and artificial intelligence technologies are becoming more sophisticated and intelligent. At the same time, the demand for data keeps growing while acquiring enough data becomes more and more difficult. The data generated by different enterprises or organizations is heterogeneous, and market competition, privacy policies and similar factors severely limit the circulation and sharing of data, so that isolated data islands are formed, which seriously hinders the progress and development of the technology. Developing a technology that achieves both data privacy and data circulation is therefore an important issue in the field.
For example, Chinese patent CN111061713A, published on April 24, 2020, discloses a method, apparatus, device and storage medium for block chain data fusion, and belongs to the technical field of computer information. The method comprises the following steps: acquiring a data resource and a contract request, and standardizing the data resource; respectively acquiring corresponding contract algorithm resources and contract algorithm resources through a block chain according to the contract request; and processing the standardized data resources in parallel according to the contract algorithm resources and the contract algorithm resources. This technical scheme shares data governance results by standardizing the data resources, but the contract calculation is carried out on the block chain, so the privacy of the data is revealed. It is therefore not suitable for solving the problem that private data is difficult to share.
Disclosure of Invention
The technical problem to be solved by the invention is that private data is currently difficult to share. A private data calling method based on a block chain is provided, which makes data available for calling and use by multiple parties while ensuring data privacy and security, so that the value of the data is better realized.
In order to solve the technical problem, the technical scheme adopted by the invention is a private data calling method based on a block chain, comprising the following steps: establishing a data receiving node, which receives private data submitted in batches, allocates a batch number to each batch of private data lines, and allocates a data number to each data line; extracting the hash values of the data lines of the same batch and incorporating them into a data hash set, extracting the hash value of the data hash set, and uploading that hash value, associated with the batch number, to the block chain for storage, the data hash set itself being stored by the data receiving node; establishing a plurality of storage nodes, the data receiving node storing the data lines on the plurality of storage nodes in a dispersed manner, in association with the batch number and the data number; establishing an execution server connected with the plurality of storage nodes; submitting a data processing model to the execution server, a model number being allocated to the data processing model, and the execution server extracting the hash value of the data processing model and uploading that hash value, associated with the model number, to the block chain for storage; the data demand side submitting a calling request to the execution server, the calling request comprising the data numbers of the data lines to be called, and the execution server sending an authorization request to the data source side; after authorization is obtained, the execution server communicating with the plurality of storage nodes, recovering the data lines, and extracting the hash values of the recovered data lines; the execution server substituting the recovered data lines into the data processing model to obtain a model result, and destroying the recovered data lines; and extracting the hash value of the model result, and feeding the hash values of the recovered data lines, the model result and the hash value of the model result back to the data demand side, the data demand side paying the data source side according to the hash values after verifying the hash values of the recovered data lines.
Preferably, the execution server is provided with a virtual account on the block chain, and when the data demand side submits a calling request, a corresponding amount of tokens is transferred to the virtual account of the execution server; the execution server executes the data processing model and, when the model result is obtained, transfers a corresponding amount of tokens to the virtual account of the data source side according to the number of data lines substituted; the data source side holds a deposit on the virtual account of the execution server, and if the data demand side finds that the hash value of a recovered data line is not consistent with the data hash set, a corresponding amount of tokens is returned from the deposit to the data demand side.
Preferably, the data demand side submits a filtering hash set, which comprises a plurality of hash values; when the execution server obtains a recovered data line, it compares the hash value of the recovered data line with the filtering hash set; if the hash value of the recovered data line exists in the filtering hash set, the recovered data line is skipped and not charged; and if the hash value of the recovered data line does not exist in the filtering hash set, the recovered data line is substituted into the data processing model and charged.
Preferably, the method for the data receiving node to store the data lines on the plurality of storage nodes in a dispersed manner comprises the following steps: establishing a plurality of copies for each data line, the number of copies matching the number of storage nodes; generating a confusion value for each non-numerical field of the data line, the confusion value falling within the value range of the real values of the field, randomly allocating the real value of the field to a plurality of copies for storage, and filling the confusion value into the field of the copies that were not allocated the real value; generating a plurality of addends and a confusion value for each numerical field of the data line, wherein the sign of each addend is consistent with that of the real value, the number of addends is a preset value, and the number of addends is less than the number of storage nodes minus 3; allocating the real value and the addends of the numerical field to separate copies, and filling the copies that were not allocated a value with the confusion value; and distributing the plurality of copies to the plurality of storage nodes for storage. When the data line is restored, all the copies are read. Each non-numerical field then has a real value and a plurality of confusion values, and the real value can be distinguished according to the number of identical values. Each numerical field has a real value, a plurality of addends and at least 2 confusion values; the confusion values are easy to distinguish, and the remaining values can be combined into an addition equation from which the real value can also be distinguished, which completes the restoration of the data line.
Preferably, the confusion value has a preset functional relationship with the data line number.
Preferably, when the execution server executes the data processing model, the execution server determines whether the model contains a calculation formula that performs weighted summation on numerical fields; if so, the weight coefficients are sent to the plurality of storage nodes; each storage node retrieves its stored data line, multiplies the stored value of each corresponding numerical field by the corresponding weight coefficient, sums the products, encrypts the sum and sends it to the execution server; after receiving all the sums, the execution server adds them to obtain a total sum; the execution server calculates the confusion value according to the data line number and the preset function; the number of confusion values is deduced from the number of addends, and the confusion value is multiplied by the corresponding weight coefficient and by the number of confusion values to obtain a confusion sum; and the confusion sum is subtracted from the total sum and the result is divided by 2 to obtain the weighted sum of the real values.
Preferably, the fields of the data lines and the allowed lengths of the fields form a data structure. The storage node opens up a plurality of storage areas for each data structure on its storage medium, and each storage area opens up a plurality of storage blocks. The length of each storage block matches the maximum space occupied by the data structure, and each storage block also has an area dedicated to storing a batch number and a data number. When the storage node receives a data line, it extracts the data structure of the data line, finds the corresponding storage area, and stores the data line, the batch number and the data number into the first blank storage block; if the storage blocks of the storage area are full, a new storage area is opened up.
Preferably, the storage node establishes an exchange table for each storage area, the exchange table records a plurality of exchange pairs, and each exchange pair records two binary sequences. After a data line is stored, the storage node checks, in binary form, whether an aligned exchange pair exists between the data line and the data line stored in the previous storage block; if so, the fields in which the exchange pair is located are exchanged. When the execution server requests a data line, the storage node finds the storage block according to the data number and checks backward whether an aligned exchange pair exists. If one exists, the next storage block is taken as the reference and the backward check is repeated, checking only the field positions in which an exchange pair exists, until a checked position has no aligned exchange pair or the last storage block of the storage area is reached. A copy is made of all the storage blocks found to contain aligned exchange pairs. Starting from the last of these storage blocks, the fields in which the aligned exchange pairs are located are restored in sequence until the data line requested by the execution server is restored. The content of the storage block preceding the requested data line is then added to the copy and, taking the content of the restored data line as the reference, the storage node checks upward whether an aligned exchange pair exists with that preceding storage block; if so, the corresponding fields in the copy are exchanged. The data line obtained after the exchange holds the original data and is submitted to the execution server.
The substantial effects of the invention are as follows: the distributed storage node is established to perform distributed storage on the private data, so that the private data are hidden, and the security of the private data is ensured in the long-time storage process; by using the data processing model, the execution result is fed back to the data demanding party, so that the data demanding party does not directly contact with the private data when using the private data, and the privacy of the private data is ensured; because the data processing model can be directly operated based on the private data, the private data does not need to be fuzzified, and therefore the data processing model has higher accuracy.
Drawings
Fig. 1 is a schematic flowchart of a private data calling method according to an embodiment.
Fig. 2 is a schematic diagram of a charging transfer process according to an embodiment.
Fig. 3 is a schematic diagram illustrating a flow of using a filtered hash set according to an embodiment.
Fig. 4 is a schematic diagram of the dispersed storage process according to an embodiment.
Fig. 5 is a schematic flowchart of processing a weighted summation formula according to an embodiment.
Fig. 6 is a schematic diagram of data storage on a storage node according to an embodiment.
Wherein: 11. batch number, 12. data number, 13. storage area, 14. exchange table, 15. storage block, 16. field.
Detailed Description
The following provides a more detailed description of the present invention, with reference to the accompanying drawings.
The first embodiment is as follows:
Referring to fig. 1, a private data calling method based on a block chain includes the following steps:
Step A01) establishing a data receiving node, which receives private data submitted in batches, allocates a batch number 11 to each batch of private data lines, and allocates a data number 12 to each data line;
Step A02) extracting the hash values of the data lines of the same batch and incorporating them into a data hash set, extracting the hash value of the data hash set, and uploading that hash value, associated with the batch number 11, to the block chain for storage, the data hash set itself being stored by the data receiving node;
Step A03) establishing a plurality of storage nodes, the data receiving node storing the data lines on the plurality of storage nodes in a dispersed manner, in association with the batch number 11 and the data number 12. For example, a shopping mall provides its member consumption data for the current year, including detailed consumption records and member information, which involve a large amount of private information. This data is stored on a plurality of storage nodes in a dispersed manner, and each node stores only a fragment of the member information or a fragment of a consumption transaction. For example, the name of a purchased commodity is stored on a first storage node, the name of the purchaser on a second storage node, and the amount paid for the commodity on a third storage node. The data held by any single storage node therefore does not reveal specific private information, and privacy is disclosed only if all the storage nodes are compromised;
Step A04) establishing an execution server, wherein the execution server is connected with the plurality of storage nodes;
Step A05) submitting the data processing model to the execution server, a model number being allocated to the data processing model, and the execution server extracting the hash value of the data processing model and uploading that hash value, associated with the model number, to the block chain for storage;
Step A06) the data demand side submits a calling request to the execution server, the calling request includes the data numbers 12 of the data lines to be called, and the execution server sends an authorization request to the data source side;
Step A07) after authorization is obtained, the execution server communicates with the plurality of storage nodes, recovers the data lines, and extracts the hash values of the recovered data lines;
Step A08) the execution server substitutes the recovered data lines into the data processing model to obtain a model result, and destroys the recovered data lines. The data processing model is mainly a neural network model, and other data processing models formulated by the data demand side itself can also be executed conveniently. To help the data demand side formulate a data processing model, the data source side needs to disclose several pieces of example data and the value range of each field 16. The source, integrity and authenticity of each batch of data should also be described and made known to the data demand side to facilitate the transaction. This information is written and provided by the data source side. The data demand side may first formulate a verification model and submit it to the execution server for execution. For example, the verification model reads a certain numerical field 16 and outputs the maximum value, the minimum value and the null rate of the field 16, which are compared with the description disclosed by the data source side to determine whether the disclosed data range is real and whether the declared null rate is met. This reduces the risk borne by the data demand side when purchasing data;
Step A09) the hash value of the model result is extracted, the hash values of the recovered data lines, the model result and the hash value of the model result are fed back to the data demand side, and the data demand side pays the data source side according to the hash values after verifying the hash values of the recovered data lines.
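The hashing in steps A02), A07) and A09) can be illustrated with a minimal sketch. The patent does not name a hash function or a row encoding, so SHA-256, the `key=value` canonical encoding, and the names `row_hash`, `batch_digest` and `verify_recovered` are all assumptions made for illustration; uploading the record to a block chain is not shown.

```python
import hashlib

def row_hash(row: dict) -> str:
    # Assumed canonical encoding: sorted "key=value" pairs joined by "|".
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def batch_digest(batch_number: str, rows: list) -> tuple:
    # Data hash set: one hash per data line of the batch (kept by the receiving node).
    hash_set = sorted(row_hash(row) for row in rows)
    # Hash of the whole set, associated with the batch number (the record meant for the chain).
    set_hash = hashlib.sha256("".join(hash_set).encode("utf-8")).hexdigest()
    return hash_set, {"batch_number": batch_number, "set_hash": set_hash}

def verify_recovered(row: dict, hash_set: list) -> bool:
    # Check of a recovered data line against the data hash set.
    return row_hash(row) in hash_set

rows = [
    {"age": 33, "amount": "10000.00", "frequency": 16},
    {"age": 41, "amount": "2500.00", "frequency": 4},
]
hash_set, chain_record = batch_digest("B001", rows)
```

In this sketch a recovered data line that differs in any field from the submitted one fails `verify_recovered`, which is the condition that triggers the refund described for step B03).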
Referring to fig. 2, the embodiment further includes: step B01) the execution server sets up a virtual account on the block chain, and when the data demand side submits a calling request, a corresponding amount of tokens is transferred to the virtual account of the execution server; step B02) the execution server executes the data processing model and, when the model result is obtained, transfers a corresponding amount of tokens to the virtual account of the data source side according to the number of data lines substituted; step B03) the data source side holds a deposit on the virtual account of the execution server, and if the data demand side finds that the hash value of a recovered data line does not match the data hash set, a corresponding amount of tokens is returned from the deposit to the data demand side. This improves the trust of data demand sides and reduces transaction risk.
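The token flow of steps B01) to B03) can be sketched with an in-memory ledger. This is only an illustration: the account names, the per-line price `PRICE_PER_ROW`, and the return of unused prepayment are assumptions not stated in the text, and no real block chain API is involved.

```python
PRICE_PER_ROW = 2  # hypothetical price in tokens per data line

class Ledger:
    def __init__(self, balances: dict):
        self.balances = dict(balances)

    def transfer(self, src: str, dst: str, amount: int) -> None:
        if self.balances.get(src, 0) < amount:
            raise ValueError(f"insufficient tokens in {src}")
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount

def settle(ledger: Ledger, rows_requested: int, rows_used: int, rows_mismatched: int) -> None:
    # B01: prepayment into the execution server's virtual account.
    ledger.transfer("demander", "escrow", PRICE_PER_ROW * rows_requested)
    # B02: pay the data source for the data lines actually substituted.
    ledger.transfer("escrow", "source", PRICE_PER_ROW * rows_used)
    # Assumption: prepayment for data lines that were not used is returned.
    ledger.transfer("escrow", "demander", PRICE_PER_ROW * (rows_requested - rows_used))
    # B03: refund from the source's deposit for lines whose hash did not match.
    ledger.transfer("source_deposit", "demander", PRICE_PER_ROW * rows_mismatched)

ledger = Ledger({"demander": 100, "escrow": 0, "source": 0, "source_deposit": 50})
settle(ledger, rows_requested=10, rows_used=8, rows_mismatched=1)
```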
Referring to fig. 3, in order to avoid purchasing duplicate data, the embodiment further includes: step C01) the data demander submits a filtering hash set, wherein the filtering hash set comprises a plurality of hash values; step C02), when the execution server obtains the recovery data line, the hash value of the recovery data line is compared with the filtering hash set; step C03), if the hash value of the recovery data line exists in the filtering hash set, skipping the recovery data line and not charging; step C04), if the hash value of the recovery data line does not exist in the filtering hash set, substituting the recovery data line into the data processing model, and charging.
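Steps C01) to C04) amount to a set-membership test before each data line is charged. The sketch below assumes SHA-256 over a simple canonical encoding and uses a hypothetical `model` callable; neither is specified in the text.

```python
import hashlib

def row_hash(row: dict) -> str:
    # Assumed canonical encoding of a data line.
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def run_with_filter(recovered_rows: list, filter_hashes: set, model) -> tuple:
    results, charged = [], 0
    for row in recovered_rows:
        if row_hash(row) in filter_hashes:
            continue  # C03: already owned by the demander, skipped and not charged
        results.append(model(row))  # C04: substituted into the model
        charged += 1                # ... and charged
    return results, charged

rows = [{"age": 33}, {"age": 41}, {"age": 27}]
already_owned = {row_hash({"age": 41})}
results, charged = run_with_filter(rows, already_owned, model=lambda r: r["age"] * 2)
```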
Referring to fig. 4, the method for the data receiving node to store the data lines on the plurality of storage nodes in a dispersed manner includes:
Step D01) establishing a plurality of copies for each data line, the number of copies matching the number of storage nodes;
Step D02) generating a confusion value for each non-numerical field 16 of the data line, the confusion value falling within the value range of the real values of the field 16;
Step D03) randomly allocating the real value of the field 16 of the data line to a plurality of copies for storage, and filling the confusion value into the field 16 of the copies that were not allocated the real value;
Step D04) generating a plurality of addends and a confusion value for each numerical field 16 of the data line, wherein the sign of each addend is consistent with that of the real value, the number of addends is less than the number of storage nodes minus 3, and the number of addends is a preset value;
Step D05) allocating the real value and the addends of the numerical field 16 to separate copies, and filling the copies that were not allocated a value with the confusion value;
Step D06) distributing the copies to the storage nodes for storage;
Step D07) when the data line is restored, all the copies are read. Each non-numerical field 16 has a real value and several confusion values, and the real value can be distinguished according to the number of identical values. Each numerical field 16 has a real value, several addends and at least 2 confusion values; the confusion values are easy to distinguish, and the remaining values can be combined into an addition equation from which the real value can also be distinguished, which completes the restoration of the data line.
Suppose the real data comprises: consumer's age: 33; monthly average consumption amount: 10,000.00 yuan; monthly average consumption frequency: 16 times. The total number of storage nodes is 6, the confusion value generated for the consumer's age is 56, and the generated addends satisfy 33 = 12 + 8 + 13. The 6 copies are assigned the values 33, 12, 8, 13, 56 and 56 respectively, and are distributed to the 6 storage nodes for storage.
The confusion value has a preset functional relationship with the data line number. For example, the confusion value for the consumer's age is obtained as: confusion value = data line number mod 150. This effectively hides the real value, namely the consumer's age of 33, because the data stored in a single storage node does not directly reveal that 33 is the real value. Without obtaining the data of the other storage nodes, an attacker does not know which value is real.
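The splitting and recovery of a numerical field described above can be sketched as follows. This is an illustration under stated assumptions: values are integers, the confusion function is the `mod 150` example from the text, and recovery removes the known number of confusion values and halves the remaining sum (real value plus addends equals twice the real value). Because the worked example uses 3 addends with 6 storage nodes, the sketch only requires that at least 2 confusion values remain. The function names and the data line number 206 are hypothetical.

```python
import random

def confusion_value(data_number: int) -> int:
    # Example function from the text: data line number mod 150.
    return data_number % 150

def split_numeric(true_value: int, n_nodes: int, n_addends: int,
                  data_number: int, rng: random.Random) -> list:
    n_confusion = n_nodes - 1 - n_addends
    if n_confusion < 2:
        raise ValueError("at least 2 confusion values are required")
    magnitude = abs(true_value)
    if magnitude < n_addends:
        raise ValueError("value too small to split into non-zero integer addends")
    sign = -1 if true_value < 0 else 1
    # Random cut points give addends with the same sign that sum to the real value.
    cuts = sorted(rng.sample(range(1, magnitude), n_addends - 1))
    bounds = [0] + cuts + [magnitude]
    addends = [sign * (hi - lo) for lo, hi in zip(bounds, bounds[1:])]
    copies = [true_value] + addends + [confusion_value(data_number)] * n_confusion
    rng.shuffle(copies)  # one value per storage node, in random order
    return copies

def recover_numeric(values: list, data_number: int, n_addends: int) -> int:
    remaining = list(values)
    for _ in range(len(values) - 1 - n_addends):
        remaining.remove(confusion_value(data_number))
    total = sum(remaining)  # real value + addends = 2 * real value
    return total // 2

copies = split_numeric(33, n_nodes=6, n_addends=3, data_number=206, rng=random.Random(0))
```

With data line number 206 the confusion value is 56, matching the example, and `recover_numeric([33, 12, 8, 13, 56, 56], 206, 3)` yields the real age.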
Referring to fig. 5, when the execution server executes the data processing model, the following steps are performed:
Step E01) determining whether the model contains a calculation formula that performs weighted summation on numerical fields 16;
Step E02) if so, sending the weight coefficients to the plurality of storage nodes;
Step E03) each storage node retrieves its stored data line, multiplies the stored value of each corresponding numerical field 16 by the corresponding weight coefficient, sums the products, encrypts the sum and sends it to the execution server;
Step E04) after receiving all the sums, the execution server adds them to obtain a total sum;
Step E05) the execution server calculates the confusion value according to the data line number and the preset function;
Step E06) deducing the number of confusion values from the number of addends, and multiplying the confusion value by the corresponding weight coefficient and by the number of confusion values to obtain a confusion sum;
Step E07) subtracting the confusion sum from the total sum and dividing the result by 2 to obtain the weighted sum of the real values.
For example, in the neural network model, the input layer has three neurons corresponding to the consumer's age, the monthly average consumption amount and the monthly average consumption frequency, and the first layer has two neurons. One of these neurons is connected to the three neurons of the input layer, its excitation function is the sigmoid function, its weight coefficients are denoted $a_{11}$, $a_{12}$ and $a_{13}$, and its offset is denoted $b_1$. Its output equals $\mathrm{sigmoid}(x)$, where $x = a_{11} \cdot \text{age} + a_{12} \cdot \text{monthly average consumption amount} + a_{13} \cdot \text{monthly average consumption frequency} + b_1$.
The confusion value generated for the monthly average consumption amount is 56 and the generated addends satisfy 10,000.00 = 3,000.00 + 1,000.00 + 6,000.00, so the 6 copies are assigned the values 10,000.00, 3,000.00, 1,000.00, 6,000.00, 56.00 and 56.00. The confusion value generated for the monthly average consumption frequency is 56 and the generated addends satisfy 16 = 3 + 7 + 6, so the 6 copies are assigned the values 16, 3, 7, 6, 56 and 56.
After the values are shuffled and distributed to the 6 storage nodes, assume the data stored by the first storage node is 33, 6,000.00 and 56. The sum calculated by the first storage node is then $a_{11} \cdot 33 + a_{12} \cdot 6{,}000.00 + a_{13} \cdot 56$, and so on for the other nodes. The sums sent by all 6 storage nodes are added, and the result is:
$\text{sum} = a_{11}(33+12+8+13+56+56) + a_{12}(10{,}000.00+3{,}000.00+1{,}000.00+6{,}000.00+56.00+56.00) + a_{13}(16+3+7+6+56+56)$. Following steps E05) and E06), the confusion value 56 is calculated and the confusion sum is obtained as $2 \cdot 56 \cdot (a_{11}+a_{12}+a_{13})$. Subtracting the confusion sum from the sum gives twice the weighted sum of the real values, because each real value is counted once directly and once through its addends. Dividing the result by 2 and adding the offset $b_1$ gives the value $x$, which is substituted into $\mathrm{sigmoid}(x)$ to obtain the output of the neuron. The output of the neuron is thus calculated from $x$, the weighted sum of the numerical fields 16, without the execution server receiving the individual field values.
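The arithmetic of the worked example can be checked numerically. The weights `A` and offset `B1` below are hypothetical values chosen only for this check, and the encryption of the partial sums in step E03) is omitted.

```python
import math
import random

A = (0.5, 0.001, 0.25)  # hypothetical a11, a12, a13
B1 = -30.0              # hypothetical offset b1
COLUMNS = [
    [33, 12, 8, 13, 56, 56],                            # consumer's age
    [10000.0, 3000.0, 1000.0, 6000.0, 56.0, 56.0],      # monthly average amount
    [16, 3, 7, 6, 56, 56],                              # monthly average frequency
]
CONFUSION, N_CONFUSION = 56, 2

def node_partial_sums(columns, weights, rng):
    # Shuffle each field independently so every node holds one value per field.
    shuffled = []
    for column in columns:
        values = list(column)
        rng.shuffle(values)
        shuffled.append(values)
    n_nodes = len(columns[0])
    # E03: each node multiplies its stored values by the weights and sums them.
    return [sum(w * col[i] for w, col in zip(weights, shuffled)) for i in range(n_nodes)]

def true_weighted_sum(partials, weights, confusion, n_confusion):
    total = sum(partials)                                   # E04
    confusion_sum = n_confusion * confusion * sum(weights)  # E06
    return (total - confusion_sum) / 2                      # E07

partials = node_partial_sums(COLUMNS, A, random.Random(1))
x = true_weighted_sum(partials, A, CONFUSION, N_CONFUSION) + B1
output = 1.0 / (1.0 + math.exp(-x))  # sigmoid(x)
```

The shuffle order does not affect the result, since only the total over all nodes enters the calculation.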
The second embodiment is as follows:
Referring to fig. 6, in a private data calling method based on a block chain, the fields 16 of the data lines and the allowed lengths of the fields 16 form a data structure. The storage node opens up a plurality of storage areas 13 for each data structure on its storage medium, and each storage area 13 opens up a plurality of storage blocks 15. The length of each storage block 15 matches the maximum space occupied by the data structure, and each storage block 15 also has an area dedicated to storing a batch number 11 and a data number 12. When the storage node receives a data line, it extracts the data structure of the data line, finds the corresponding storage area 13, and stores the data line, the batch number 11 and the data number 12 into the first blank storage block 15; if the storage blocks 15 of the storage area 13 are full, a new storage area 13 is opened up.
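The block layout of this embodiment can be sketched with fixed-width byte blocks. The 4-byte widths for the batch number and data number, the zero-byte padding, and the class names are assumptions made for illustration; a data structure is represented as a tuple of (field name, allowed length) pairs.

```python
class StorageArea:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = []  # each block is a fixed-length bytes object

    def is_full(self) -> bool:
        return len(self.blocks) >= self.capacity

class StorageNode:
    def __init__(self, blocks_per_area: int = 2):
        self.blocks_per_area = blocks_per_area
        self.areas = {}  # data structure -> list of StorageArea

    def store(self, structure: tuple, batch_number: int, data_number: int, row: dict) -> tuple:
        areas = self.areas.setdefault(structure, [])
        if not areas or areas[-1].is_full():
            areas.append(StorageArea(self.blocks_per_area))  # open up a new storage area
        payload = b""
        for name, size in structure:
            value = row[name]
            if len(value) > size:
                raise ValueError(f"field {name} exceeds its allowed length")
            payload += value.ljust(size, b"\x00")  # pad to the allowed length
        # Dedicated header area: batch number and data number (4 bytes each, assumed).
        block = batch_number.to_bytes(4, "big") + data_number.to_bytes(4, "big") + payload
        areas[-1].blocks.append(block)
        return len(areas) - 1, len(areas[-1].blocks) - 1  # (area index, block index)

node = StorageNode(blocks_per_area=2)
layout = (("name", 8), ("amount", 4))
positions = [node.store(layout, 1, n, {"name": b"abc", "amount": b"12"}) for n in (1, 2, 3)]
```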
The storage node establishes an exchange table 14 for each storage area 13. The exchange table 14 records a plurality of exchange pairs, and each exchange pair records two binary sequences. After a data line is stored, the storage node checks, in binary form, whether the data line and the data line stored in the previous storage block 15 contain an aligned exchange pair; if so, the fields 16 in which the exchange pair is located are exchanged.
When the fusion server asks for a data line, the storage node finds the storage block 15 according to the data number 12 and checks whether an aligned exchange pair exists with the next storage block 15. If so, it takes the next storage block 15 as the reference and checks again against the storage block 15 after it, examining only the positions of the fields 16 in which an exchange pair was found. The check continues until a position without an aligned exchange pair is found or the last storage block 15 of the storage area 13 is reached. The storage node makes a copy of all the storage blocks 15 found to contain aligned exchange pairs and, starting from the last of these storage blocks 15, restores the fields 16 containing the aligned exchange pairs in sequence until the data line requested by the fusion server is recovered. It then adds to the copy the content of the storage block 15 preceding the requested data line and, taking the recovered data line as the reference, checks whether an aligned exchange pair exists with that preceding storage block 15; if so, the corresponding fields 16 are exchanged in the copy. The data line obtained after this exchange holds the original data and is submitted to the fusion server.
The fields 16 are exchanged because an aligned exchange pair exists between a data line and the previous data line; when the next data line is stored and an exchange pair appears for it as well, the fields 16 are exchanged again. An exchange pair consists of two random binary strings associated by an exchange relation, for example "010 … 10100" and "110 … 0110". The length of the binary strings is determined from the average binary length of the field 16 values in the data line, so that a field 16 has an appropriate probability of containing an aligned exchange pair. Compared with the first embodiment, the original data is scattered further within the storage nodes, which further improves the security of the private data.
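The pairwise check and exchange at the core of this embodiment can be sketched as follows. The sketch makes several assumptions that the text does not fix: fields are modelled as strings of '0' and '1', the two sequences of an exchange pair have equal length, and "aligned" is taken to mean that the two sequences occur at the same bit offset in the same field of the two data lines. The function names are hypothetical, and the walk over a chain of storage blocks during recovery is not shown.

```python
def has_aligned_pair(field_a, field_b, pair):
    """True if the two sequences of the pair occur at the same offset in both fields."""
    first, second = pair
    width = len(first)
    for offset in range(min(len(field_a), len(field_b)) - width + 1):
        window = (field_a[offset:offset + width], field_b[offset:offset + width])
        if window == (first, second) or window == (second, first):
            return True
    return False


def aligned_fields(row_a, row_b, exchange_table):
    """Indices of the fields in which some exchange pair is aligned."""
    return [index for index, (field_a, field_b) in enumerate(zip(row_a, row_b))
            if any(has_aligned_pair(field_a, field_b, pair) for pair in exchange_table)]


def exchange(row_a, row_b, exchange_table):
    """Return copies of both rows with every field holding an aligned pair swapped."""
    new_a, new_b = list(row_a), list(row_b)
    for index in aligned_fields(row_a, row_b, exchange_table):
        new_a[index], new_b[index] = row_b[index], row_a[index]
    return new_a, new_b


exchange_table = [("0101", "1100")]        # hypothetical exchange pair
previous_row = ["00010100", "11110000"]    # row already in the previous block
new_row = ["00110000", "10101010"]         # row being stored

stored_previous, stored_new = exchange(previous_row, new_row, exchange_table)
restored_previous, restored_new = exchange(stored_previous, stored_new, exchange_table)
```

Because a swapped field still shows the same aligned pair, with the two sequences in opposite rows, repeating the check on the stored blocks finds the same field and undoes the swap.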
The above-described embodiments are only preferred embodiments of the present invention, and are not intended to limit the present invention in any way, and other variations and modifications may be made without departing from the spirit of the invention as set forth in the claims.
Claims (8)
1. A private data calling method based on a block chain is characterized in that,
the method comprises the following steps:
establishing a data receiving node, which receives private data submitted in batches, assigns a batch number to each batch of private data lines, and assigns a data number to each data line;
extracting the hash value of each data line in the same batch and adding it to a data hash set, extracting the hash value of the data hash set, associating it with the batch number and uploading it to the block chain for storage, the data hash set itself being stored by the data receiving node;
establishing a plurality of storage nodes, the data receiving node storing the data lines on the plurality of storage nodes in a dispersed manner, in association with the batch number and the data number;
establishing an execution server, wherein the execution server is connected with the plurality of storage nodes;
submitting a data processing model to the execution server and assigning a model number to the data processing model, the execution server extracting the hash value of the data processing model and uploading it, in association with the model number, to the block chain for storage;
a data demand side submits a calling request to the execution server, the calling request comprising the data number of the data line to be called, and the execution server sends an authorization request to the data source side;
after obtaining the authorization, the execution server communicates with the plurality of storage nodes, recovers the data line, and extracts the hash value of the recovered data line;
the execution server substitutes the recovered data line into the data processing model to obtain a model result, and destroys the recovered data line;
and extracting the hash value of the model result, feeding the hash value of the recovered data line, the model result and the hash value of the model result back to the data demand side, and paying the data source side according to the hash value after the data demand side verifies the hash value of the recovered data line.
2. The method for calling private data based on block chain according to claim 1,
the execution server is provided with a virtual account on the block chain, and when the data demand side submits a calling request, it transfers a corresponding number of tokens to the virtual account of the execution server;
the execution server executes the data processing model and, when a model result is obtained, transfers a corresponding number of tokens to a virtual account of the data source side according to the number of data lines substituted;
the data source side holds a deposit on the virtual account of the execution server, and if the data demand side finds that the hash value of the recovered data line is not consistent with the data hash set, a corresponding number of tokens is returned from the deposit to the data demand side.
3. The method for calling private data based on block chain according to claim 1 or 2,
the data demand side submits a filtering hash set, wherein the filtering hash set comprises a plurality of hash values;
when the execution server obtains a recovered data line, it compares the hash value of the recovered data line with the filtering hash set;
if the hash value of the recovered data line exists in the filtering hash set, the recovered data line is skipped and no charge is made;
and if the hash value of the recovered data line does not exist in the filtering hash set, the recovered data line is substituted into the data processing model and a charge is made.
4. The method for calling private data based on block chain according to claim 1 or 2,
the method for the data receiving node to store the data lines in a plurality of storage nodes in a scattered manner comprises the following steps:
establishing a plurality of copies for each data line, wherein the number of the copies is matched with the number of the storage nodes;
generating a confusion value for each non-numerical field of the data line, wherein the confusion value falls within the range of real values of the field, the real value of the field is randomly assigned to a plurality of the copies for storage, and the confusion value is filled into the field of the copies that are not assigned the real value;
generating a plurality of addends and a confusion value for each numerical field of the data line, wherein the sign of the addends is consistent with that of the real value, the number of the addends is less than the number of the storage nodes minus 3, the number of the addends is a preset value, the real value and the addends of the numerical field are each assigned to one of the copies, and the copies that are not assigned a value are filled with the confusion value;
distributing the plurality of copies to the plurality of storage nodes for storage;
when the data line is recovered, all the copies are read; each non-numerical field has a real value and a plurality of confusion values, and the real value can be distinguished according to the number of identical values; each numerical field has a real value, a plurality of addends and at least 2 confusion values, the confusion values are easy to distinguish, and the remaining values can be combined into an addition equation from which the real value can also be distinguished, thereby completing the recovery of the data line.
5. The method for calling private data based on block chain according to claim 4,
the confusion value and the data line number have a preset functional relationship.
6. The method for calling private data based on block chain according to claim 5,
when the execution server executes the data processing model, judging whether a calculation formula for carrying out weighted summation on the numerical field exists or not;
if yes, the weight coefficients are sent to a plurality of storage nodes;
the plurality of storage nodes read the stored data lines, multiply the stored values of the corresponding numerical fields by the corresponding weight coefficients, sum the products, and encrypt the sum and send it to the execution server;
the execution server receives all the sums and adds them together to obtain a total sum;
the execution server calculates the confusion value according to the data line number and the preset function;
deducing the number of confusion values from the number of addends, and multiplying the confusion value by the corresponding weight coefficients and by the number of confusion values to obtain a confusion sum;
and subtracting the confusion sum from the total sum and dividing the difference by 2 to obtain the weighted sum of the real values.
7. The method for calling private data based on block chain according to claim 1 or 2,
the fields of the data line and the allowable lengths of the fields form a data structure; the storage node opens up a plurality of storage areas on its storage medium for each data structure, and each storage area is divided into a plurality of storage blocks; the length of each storage block matches the maximum space occupied by the data structure, and each storage block also has an area dedicated to storing the batch number and the data number; when the storage node receives a data line, it extracts the data structure of the data line, finds the corresponding storage area, and stores the data line, the batch number and the data number in the first blank storage block; and if the storage blocks of the storage area are full, a new storage area is opened up.
8. The method for calling private data based on block chain according to claim 7,
the storage node establishes an exchange table for each storage area, the exchange table records a plurality of exchange pairs, and each exchange pair records two binary sequences; after a data line is stored, the storage node checks, in binary form, whether the data line and the data line stored in the previous storage block contain an aligned exchange pair, and if so, exchanges the fields in which the exchange pair is located; when the execution server asks for a data line, the storage node finds the storage block according to the data number and checks whether an aligned exchange pair exists with the next storage block; if so, it takes the next storage block as the reference and checks again against the storage block after it, examining only the field positions in which an exchange pair was found, until a position without an aligned exchange pair is found or the last storage block of the storage area is reached; the storage node makes a copy of all the storage blocks found to contain aligned exchange pairs and, starting from the last of these storage blocks, restores the fields containing the aligned exchange pairs in sequence until the data line requested by the execution server is recovered; the storage node then adds to the copy the content of the storage block preceding the requested data line and, taking the recovered data line as the reference, checks whether an aligned exchange pair exists with that preceding storage block, and if so, exchanges the corresponding fields in the copy; the data line obtained after the exchange holds the original data and is submitted to the execution server.
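The hash commitments recited in claim 1 (a hash per data line, a data hash set per batch, and the hash of that set stored on the block chain with the batch number) can be illustrated with a minimal Python sketch. SHA-256, the serialization of the hash set as sorted hexadecimal digests joined by newlines, the sample rows, and the dictionary standing in for the block chain are assumptions made for illustration; the claims do not specify them.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def hash_of_set(hash_set) -> str:
    """Hash a set of hex digests in a canonical (sorted) order."""
    return sha256_hex("\n".join(sorted(hash_set)).encode("ascii"))


def commit_batch(rows):
    """Return (data hash set, hash of the data hash set) for one batch of rows."""
    hash_set = {sha256_hex(row) for row in rows}
    return hash_set, hash_of_set(hash_set)


def verify_row(recovered_row, hash_set, on_chain_hash):
    """Check the hash set against the chain record, then the row against the set."""
    return hash_of_set(hash_set) == on_chain_hash and sha256_hex(recovered_row) in hash_set


chain = {}  # stand-in for the block chain: batch number -> hash of the data hash set
rows = [b"33|10000.00|16", b"41|8200.00|9"]  # hypothetical serialized data lines
data_hash_set, set_hash = commit_batch(rows)
chain["batch-0001"] = set_hash

genuine = verify_row(rows[0], data_hash_set, chain["batch-0001"])            # True
tampered = verify_row(b"33|99999.00|16", data_hash_set, chain["batch-0001"])  # False
```

Sorting the digests before hashing makes the commitment independent of the order in which the data lines of a batch were received.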
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110855544.6A CN113536352A (en) | 2021-07-28 | 2021-07-28 | Private data calling method based on block chain |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113536352A (en) | 2021-10-22 |
Family
ID=78089396
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202110855544.6A CN113536352A (en) | Pending | 2021-07-28 | 2021-07-28 | Private data calling method based on block chain |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113536352A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114692209A (en) * | 2022-05-31 | 2022-07-01 | 蓝象智联(杭州)科技有限公司 | Graph federation method and system based on confusion technology |
CN114692209B (en) * | 2022-05-31 | 2022-09-20 | 蓝象智联(杭州)科技有限公司 | Graph federation method and system based on confusion technology |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20211022 |