CN114244853A - Big data sharing method and device and big data sharing system - Google Patents


Info

Publication number
CN114244853A
Authority
CN
China
Prior art keywords
data
node
index information
storage
block chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111434664.5A
Other languages
Chinese (zh)
Inventor
席嫣娜
张宏宇
高鑫
梁惠施
李伟
胡彩娥
冯楠
陈波
王健
杨铭
王舒
王思涵
周奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Sichuan Energy Internet Research Institute EIRI Tsinghua University
State Grid Beijing Electric Power Co Ltd
Economic and Technological Research Institute of State Grid Beijing Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Sichuan Energy Internet Research Institute EIRI Tsinghua University
State Grid Beijing Electric Power Co Ltd
Economic and Technological Research Institute of State Grid Beijing Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Sichuan Energy Internet Research Institute EIRI Tsinghua University, State Grid Beijing Electric Power Co Ltd, Economic and Technological Research Institute of State Grid Beijing Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202111434664.5A priority Critical patent/CN114244853A/en
Publication of CN114244853A publication Critical patent/CN114244853A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1095Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • G06F16/1824Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/64Protecting data integrity, e.g. using checksums, certificates or signatures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0442Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply asymmetric encryption, i.e. different keys for encryption and decryption
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/10Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/062Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00 applying encryption of the keys

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Algebra (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a big data sharing method, a big data sharing device, and a big data sharing system. The method comprises the following steps: receiving a data request sent by a data demander through a blockchain, wherein the data request comprises index information of data, and the index information comprises storage location information of the data; retrieving data in a local memory of a data provider according to the index information to obtain target data; and sending the target data to the data demander through the blockchain. The method addresses the problem that blockchains used for big data sharing in the prior art carry an excessive amount of redundant information.

Description

Big data sharing method and device and big data sharing system
Technical Field
The present application relates to the field of big data sharing technologies, and in particular, to a big data sharing method, a big data sharing device, a computer-readable storage medium, a processor, and a big data sharing system.
Background
With the advancement of the energy revolution, energy supply is shifting toward clean and distributed sources, the share of renewable energy in the energy system is gradually increasing, and the large-scale application of distributed energy supply is gradually forming a supply pattern in which centralized and distributed sources participate together. A blockchain-based new energy cloud platform can horizontally aggregate the industrial resources of suppliers, power grid enterprises, energy users, and the like; vertically serve users' business links such as station construction consulting, construction scheme evaluation, equipment purchasing, operation and maintenance, settlement and subsidies, and grid connection; put business demand data on chain across the whole process; realize trusted data sharing; break down data barriers; and improve multi-party collaboration efficiency.
At present, data sharing adopts secure multi-party computation, which allows multiple data owners to perform collaborative computation without trusting one another and to output the computation result, while no party can obtain any information other than the computation result. Data sharing applications on the blockchain can be divided into two modes: (1) The resource sharing service mode supports data sharing between two or more business departments, synchronizes the data of each node in real time through decentralization, and, through the transparency of the chain and smart contracts, realizes authorized and recorded sharing of internal data resources under the rules. (2) The business cooperation service mode uses the trigger-generated and automatically executed nature of smart contracts on the blockchain to realize, by condition judgment, safe and reliable circulation of online data among multiple departments and units.
Current blockchain technology has two main problems. On one hand, data scalability is insufficient, and insufficient throughput of shared transactions leads to low performance. For an energy blockchain, energy transaction varieties are diverse, transaction frequency is increasing, participants are concentrated, and data collaboration efficiency is low, so a scalable intermediate-layer data processing mode is urgently needed. On the other hand, energy big data is large in volume and widely distributed while blockchain storage space is limited, so distributed storage is used; however, as the data scale keeps expanding, storage node failures become more and more common. To address this, erasure coding is increasingly applied to blockchain storage. The most common prior-art schemes are the multi-copy mechanism and MDS codes: the multi-copy mechanism introduces redundancy simply but has very low storage efficiency, while traditional MDS codes have relatively low storage cost but higher repair cost and access delay.
The prior art has three main problems:
1) Excessive blockchain redundancy information: because every node in the blockchain network backs up all on-chain transaction information, growth in the data volume carried by transactions seriously affects the efficiency of the consensus mechanism and wastes storage and computing resources;
2) Weak scalability of data sources: because a centralized database can hardly meet and adapt to diverse data storage requirements, user-defined data sources are adopted, but their standards are not uniform and they are highly heterogeneous, so a scalable intermediate layer is urgently needed to connect the information. A distributed file system connects nodes through a computer network; a user can act as both client and server, and can upload and obtain files without attending to the specific storage location or the network, which makes it well suited to file sharing in such distributed scenarios;
3) Repair of data storage nodes: distributed storage suffers from unreliable nodes, and fault tolerance is improved by introducing redundant data based on error-correcting codes. Compared with traditional MDS codes, a random slice storage method for the energy blockchain is proposed in combination with a piggybacking framework, which reduces the transmission bandwidth and repair cost of data repair without changing the distribution and storage of the original system.
The above information disclosed in this background section is only for enhancement of understanding of the background of the technology described herein and, therefore, certain information may be included in the background that does not form the prior art that is already known in this country to a person of ordinary skill in the art.
Disclosure of Invention
The present application mainly aims to provide a big data sharing method, device, computer readable storage medium, processor and big data sharing system, so as to solve the problem in the prior art that the amount of redundant information of a block chain is too large for big data sharing.
According to an aspect of the embodiments of the present invention, there is provided a big data sharing method, including: receiving a data request sent by a data demander through a block chain, wherein the data request comprises index information of data, and the index information comprises storage position information of the data; retrieving data in a local memory of a data provider according to the index information to obtain target data; and sending the target data to the data demander through the block chain.
Optionally, retrieving data in a local storage of a data provider according to the index information to obtain target data, including: determining a storage node where a corresponding data block is located according to the storage location information, wherein data in the local storage comprises a plurality of data blocks, the data blocks are in one-to-one correspondence with the storage nodes, and the index information further comprises data block information of the target data and storage location information of the data block of the target data; and reading the data block stored by the storage node to obtain the target data.
Optionally, the storage nodes of the local storage include a data node and a check node, where the data node is configured to store the target data and the check node is configured to store check data corresponding to the target data, and before reading the data block stored in the storage node to obtain the target data, the method further includes: when the data node corresponding to the target data fails, calculating the target data from the check data; and when the check node corresponding to the target data fails, calculating the check data from the target data.
Optionally, when the data node fails or the check node fails, a Byzantine fault-tolerant algorithm is adopted to repair the data.
Optionally, the nodes of the blockchain communicate over a P2P network.
Optionally, the index information is encrypted and stored on the block chain.
According to another aspect of the embodiments of the present invention, there is provided a big data sharing apparatus, including: a receiving unit, configured to receive a data request sent by a data demander through a blockchain, wherein the data request comprises index information of data, and the index information comprises storage location information of the data; an obtaining unit, configured to retrieve data in a local memory of a data provider according to the index information to obtain target data; and a sending unit, configured to send the target data to the data demander through the blockchain.
According to still another aspect of embodiments of the present invention, there is provided a computer-readable storage medium including a stored program, wherein the program performs any one of the methods.
According to a further aspect of the embodiments of the present invention, there is provided a processor for executing a program, wherein the program executes to perform any one of the methods.
According to a further aspect of the embodiments of the present invention, there is provided a big data sharing system, including a blockchain, a local memory of a blockchain node, and a big data sharing apparatus, where the big data sharing apparatus includes a module for performing any one of the methods.
In the embodiment of the present invention, in the big data sharing method, first, a data request sent by a data demander through a blockchain is received, where the data request includes index information of data, and the index information includes storage location information of the data; then, data in a local memory of a data provider is retrieved according to the index information to obtain target data; and finally, the target data is sent to the data demander through the blockchain. In this method the data owner stores data randomly in off-chain local storage, and each independent data index is encrypted and then uploaded to a blockchain consensus node. When data needs to be shared in a transaction, the data demander sends a data request to the data owner through the blockchain; after the data request passes verification, the locally stored coded data is retrieved and decoded to obtain the complete data.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 shows a flow diagram of a big data sharing method according to an embodiment of the present application;
FIG. 2 shows a schematic diagram of blockchain local data storage according to an embodiment of the present application;
FIG. 3 shows a schematic diagram of a blockchain big data sharing model according to an embodiment of the present application;
FIG. 4 shows a schematic diagram of a data random access memory model according to an embodiment of the present application.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
It will be understood that when an element such as a layer, film, region, or substrate is referred to as being "on" another element, it can be directly on the other element or intervening elements may also be present. Also, in the specification and claims, when an element is described as being "connected" to another element, the element may be "directly connected" to the other element or "connected" to the other element through a third element.
As mentioned in the background, blockchains used for big data sharing in the prior art carry excessive redundant information. To solve this problem, an exemplary embodiment of the present application provides a big data sharing method, an apparatus, a computer-readable storage medium, a processor, and a big data sharing system.
According to an embodiment of the present application, a big data sharing method is provided.
Fig. 1 is a flowchart of a big data sharing method according to an embodiment of the present application. As shown in fig. 1, the method comprises the steps of:
step S101, receiving a data request sent by a data demander through a blockchain, wherein the data request comprises index information of data, and the index information comprises storage location information of the data;
step S102, retrieving data in a local memory of a data provider according to the index information to obtain target data;
step S103, sending the target data to the data demander through the block chain.
In the big data sharing method, first, a data request sent by a data demander through a blockchain is received, where the data request includes index information of data, and the index information includes storage location information of the data; then, data in a local memory of a data provider is retrieved according to the index information to obtain target data; and finally, the target data is sent to the data demander through the blockchain. In this method the data owner stores data randomly in off-chain local storage, and each piece of index information is encrypted and uploaded to the blockchain. When data needs to be shared in a transaction, the data demander sends a data request to the data owner through the blockchain; after the data request passes verification, the locally stored coded data is retrieved and decoded to obtain the complete data.
In an embodiment of the present application, retrieving data in a local storage of a data provider according to the index information to obtain target data includes: determining a storage node where a corresponding data block is located according to the storage location information, where data in the local storage includes multiple data blocks, the data blocks are in one-to-one correspondence with the storage nodes, and the index information further includes data block information of the target data and storage location information of the data block of the target data; and reading the data block stored by the storage node to obtain the target data.
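As an illustration of steps S101 to S103 and of the index-based retrieval described above, the following is a minimal sketch rather than the implementation of the application. The names `IndexInfo`, `DataRequest`, `LocalStorage`, and `handle_request` are hypothetical, storage nodes are modeled as an in-memory dictionary, and blockchain transport, encryption, and request verification are omitted.

```python
from dataclasses import dataclass


@dataclass
class IndexInfo:
    """Hypothetical index record: ordered block ids plus the node storing each block."""
    block_ids: tuple
    locations: dict  # block id -> storage node id (one block per node)


@dataclass
class DataRequest:
    demander: str
    index: IndexInfo


class LocalStorage:
    """Off-chain storage of the data provider, modeled as node id -> stored block."""

    def __init__(self):
        self.nodes = {}

    def put(self, node_id, block):
        self.nodes[node_id] = block

    def read(self, node_id):
        return self.nodes[node_id]


def handle_request(request, storage):
    """S101: accept the request; S102: retrieve the blocks; S103: return (demander, data)."""
    blocks = []
    for block_id in request.index.block_ids:
        node_id = request.index.locations[block_id]  # storage location taken from the index
        blocks.append(storage.read(node_id))
    return request.demander, b"".join(blocks)


storage = LocalStorage()
storage.put("node-7", b"meter-")
storage.put("node-2", b"readings")
index = IndexInfo(block_ids=("blk-0", "blk-1"),
                  locations={"blk-0": "node-7", "blk-1": "node-2"})
recipient, data = handle_request(DataRequest("demander-A", index), storage)
```

In this sketch the index alone decides which node is read, so the on-chain record stays small while the bulk data remains in the provider's local storage.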
Specifically, as shown in fig. 2, a data slice of a data block N is divided into a plurality of sub-blocks that form a plurality of sub-stripes, an encoding matrix B is set, and the encoding matrix is multiplied independently with each sub-stripe of the block matrix N to obtain a basic encoding matrix, that is,
Figure BDA0003381195360000051
where
Figure BDA0003381195360000052
A local check from the previous sub-stripe is piggybacked onto the check data of the next sub-stripe. With k data blocks and r generated check blocks, the data items covered by f_1(a) are divided into (r-1) groups for local checking, and the groups, each holding p data items, are piggybacked onto the second sub-stripe to form new checks, i.e.
Figure BDA0003381195360000053
where p = k/(r-1), rounded down when k is not divisible by (r-1), denotes the number of data items piggybacked per group. The data owner assigns random addresses to the sliced data X_1, X_2, …, X_{k+r}: the original data X_1, X_2, …, X_k are stored in off-chain data nodes, and the check data X_{k+1}, X_{k+2}, …, X_{k+r} are stored in off-chain check nodes. A hash function is used to generate indexes for the off-chain data on the different nodes, and each piece of index information contains the slice information, storage location information, and so on of the data. A random function generates the private key of the data provider, from which the public key is generated; a hash operation on the private key and the public key yields a hash value, a check code is added so that the hash value can be verified, and base58 encoding then yields the account address. The index information is encrypted with the public key of the data provider and stored on the blockchain, and the target data is obtained according to the index information.
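The account-address steps above (hash, check code, base58 encoding) can be sketched as follows. This is a sketch under assumptions, not the scheme the application specifies: SHA-256 is assumed as the hash, the check code is assumed to be the first four bytes of a double SHA-256 (as in Base58Check), and the key bytes are random placeholders rather than a real key pair, because the text does not name a key-generation algorithm. All function names are hypothetical.

```python
import hashlib
import secrets

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Base58-encode bytes; each leading zero byte becomes a leading '1'."""
    num = int.from_bytes(data, "big")
    out = ""
    while num > 0:
        num, rem = divmod(num, 58)
        out = B58_ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    """Inverse of b58encode."""
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + num.to_bytes((num.bit_length() + 7) // 8, "big")


def checksum(payload: bytes) -> bytes:
    """Assumed check code: first 4 bytes of SHA-256(SHA-256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def make_address(key_material: bytes) -> str:
    digest = hashlib.sha256(key_material).digest()  # hash value of the key material
    return b58encode(digest + checksum(digest))     # append check code, then base58


def address_is_valid(address: str) -> bool:
    raw = b58decode(address)
    return checksum(raw[:-4]) == raw[-4:]


private_key = secrets.token_bytes(32)  # placeholder for the provider's private key
public_key = secrets.token_bytes(33)   # placeholder; a real public key derives from the private key
address = make_address(private_key + public_key)
```

The check code lets any node reject a mistyped or corrupted address before using it, which is the role the text assigns to it.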
In an embodiment of the application, as shown in fig. 2, the storage nodes of the local storage include a data node and a check node, the data node is configured to store the target data, the check node is configured to store check data corresponding to the target data, and before reading the data block stored in the storage node to obtain the target data, the method further includes: when the data node corresponding to the target data fails, calculating the target data from the check data; and when the check node corresponding to the target data fails, calculating the check data from the target data.
Specifically, when data needs to be reconstructed, a request is submitted through a block chain to obtain enough coding slices, and a node can reconstruct block data through decoding operation. Taking the (14, 10) -MDS code as an example,
Figure BDA0003381195360000054
the data N is cut into 20 sub-blocks arranged in 2 sub-stripes (a, b), an encoding matrix B is set, and the encoding matrix is multiplied with each sub-stripe independently to obtain a matrix. With k = 10 and r = 4, p = k/(r-1) gives 3 data items piggybacked per group; since k is not divisible by (r-1), the remainder q = k mod (r-1) = 1 is added to the last group, which therefore piggybacks p + q = 4 items. The 10 data blocks constitute the complete original data without decoding. If check node X_11 fails, the 10 nodes X_1 to X_10 can be connected and, according to
Figure BDA0003381195360000061
f_1(a) and f_1(b) can be reconstructed. If data node X_1 fails, X_2 to X_11 are first connected and, by the MDS property, the 10 data sub-blocks {b_2, ..., b_10, f_1(b)} are downloaded; b_1 is then obtained as the product of the inverse of the coding matrix and the coded slices, i.e.
Figure BDA0003381195360000062
Then X_2, X_3 and X_12 are connected to download {a_2, a_3, f_2(b)+a_1+a_2+a_3}. f_2(b) can be obtained from equation 2, and a_1 is thereby obtained, which completes the original data. The total download is 11 blocks, whereas decoding directly with the MDS code requires downloading at least 20 blocks, so network bandwidth is saved significantly.
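The (14, 10) repair walk-through above can be sketched as follows. This is a sketch under assumptions: the coding matrix of the application is not reproduced in the text, so a Cauchy matrix over exact rationals (`fractions.Fraction`) stands in for it, whereas practical systems use finite-field arithmetic. The names `coeff`, `parity`, `groups`, `encode`, and `repair_node_1` are hypothetical. Check symbols 2 to 4 of the second sub-stripe carry the piggybacked groups of `a`, matching the download set {a2, a3, f2(b)+a1+a2+a3} named in the text.

```python
from fractions import Fraction

K, R = 10, 4  # (14, 10) code: 10 data nodes, 4 check nodes


def coeff(j, i):
    """Stand-in coding coefficient (Cauchy matrix entry); j = check 1..R, i = data 1..K."""
    return Fraction(1, i + j + K)


def parity(j, x):
    """f_j(x): j-th check symbol of sub-stripe x (a list of K values)."""
    return sum(coeff(j, i) * x[i - 1] for i in range(1, K + 1))


def groups(k=K, r=R):
    """Split indices 1..k into r-1 groups of p = k // (r-1); the last group also takes q = k mod (r-1)."""
    p, q = divmod(k, r - 1)
    sizes = [p] * (r - 2) + [p + q]
    out, start = [], 1
    for size in sizes:
        out.append(list(range(start, start + size)))
        start += size
    return out


def encode(a, b):
    """Return (first sub-stripe symbol, second sub-stripe symbol) for nodes X_1..X_14."""
    nodes = [(a[i], b[i]) for i in range(K)]
    nodes.append((parity(1, a), parity(1, b)))              # X_11 carries no piggyback
    for j, grp in zip(range(2, R + 1), groups()):
        piggy = sum(a[i - 1] for i in grp)
        nodes.append((parity(j, a), parity(j, b) + piggy))  # X_12..X_14
    return nodes


def repair_node_1(nodes):
    """Rebuild (a1, b1) of failed node X_1 from the downloads named in the text."""
    b_rest = [nodes[i][1] for i in range(1, K)]             # b2..b10 from X_2..X_10
    f1_b = nodes[K][1]                                      # f1(b) from X_11
    known = sum(coeff(1, i) * b_rest[i - 2] for i in range(2, K + 1))
    b1 = (f1_b - known) / coeff(1, 1)                       # solve f1(b) for b1
    b = [b1] + b_rest
    a2, a3 = nodes[1][0], nodes[2][0]                       # a2, a3 from X_2, X_3
    a1 = nodes[K + 1][1] - parity(2, b) - a2 - a3           # X_12 holds f2(b)+a1+a2+a3
    return a1, b1


a = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
b = [2, 7, 1, 8, 2, 8, 1, 8, 2, 8]
nodes = encode(a, b)
a1, b1 = repair_node_1(nodes)
```

Because only one check equation is needed to recover `b1` here, the sketch solves it directly instead of inverting a full coding matrix; the grouping sizes of 3, 3, and 4 follow from p = 3 and q = 1.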
In an embodiment of the present application, when the data node fails or the check node fails, data recovery is performed using a Byzantine fault-tolerant algorithm. Specifically, the practical Byzantine fault tolerance (PBFT) algorithm proceeds in three main stages: pre-prepare, prepare, and commit. PBFT can tolerate fewer than 1/3 of the nodes being faulty or malicious and reaches a final result relatively quickly, but it is suited to a relatively small number of nodes.
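The fault bound stated above can be written out numerically. The sketch assumes the standard PBFT sizing, in which n ≥ 3f + 1 nodes are needed to tolerate f faulty ones and a quorum is 2f + 1 nodes; the function names are hypothetical and no message exchange is modeled.

```python
def max_faulty(n: int) -> int:
    """Largest f that satisfies n >= 3f + 1."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Quorum size 2f + 1 for a network of n nodes."""
    return 2 * max_faulty(n) + 1


def committed(n: int, matching_commits: int) -> bool:
    """A request counts as committed once a quorum of matching commit messages exists."""
    return matching_commits >= quorum(n)
```

For example, a consortium of 4 consensus nodes tolerates 1 faulty node and needs 3 matching messages, while 7 nodes tolerate 2.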
In one embodiment of the present application, the nodes of the blockchain communicate via a P2P network. Specifically, a P2P network architecture is used, typically built on a gossip transport protocol, and the network layer may contain different protocols according to different needs. The interconnected computers have equal status: each can act as both a server and a workstation, sending requests and responding to them, so decentralization is achieved without a central link. When a data demander needs complete information, it sends a request to the data owner over the P2P network of the blockchain; the data owner receives the request, retrieves the locally stored data and responds, and the nodes holding the corresponding slices connect to the requesting node and transmit the data to the data demander.
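To illustrate how a request can reach the data owner without a central relay, the sketch below propagates a message over a small peer graph. It uses deterministic flooding to all neighbors as a simplified stand-in for randomized gossip; the topology, the peer names, and the `flood` function are hypothetical.

```python
from collections import deque


def flood(peers, origin):
    """Return {peer: hop count at which it first received the message}."""
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in peers[node]:
            if neighbor not in hops:  # forward only to peers that have not seen the message
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


peers = {
    "demander": ["p1", "p2"],
    "p1": ["demander", "owner"],
    "p2": ["demander", "p3"],
    "p3": ["p2", "owner"],
    "owner": ["p1", "p3"],
}
hops = flood(peers, "demander")
```

Every peer both receives and forwards, which mirrors the equal status of nodes described above; a gossip protocol would forward to a random subset of neighbors instead of all of them.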
In an embodiment of the present application, the index information is encrypted and stored on the blockchain. Specifically, the nodes of the blockchain may be divided into consensus nodes and common nodes. A consensus node is a node that participates in consensus in the consortium chain and stores all block data locally; a common node is a node in the consortium chain that performs the normal data sharing function. The index information is encrypted and then uploaded to a consensus node of the blockchain.
It should be noted that, as shown in fig. 3, the block chain big data sharing model architecture may be divided into a data layer, a block chain layer, a consensus layer, a contract layer, and an application layer. The data layer is based on erasure code encoding and decoding combined with a piggybacking framework, so that the distributed storage of data slices is optimized without changing the block chain links and functions, and data reconstruction and repair are also supported. The network layer is the basis of block chain information interaction: it carries the consensus process between nodes, provides transparent data transmission between two end systems, and uses the Byzantine fault-tolerant algorithm PBFT, further improved, as the consensus algorithm to provide high throughput and low delay. The contract layer includes various codes, algorithm mechanisms and the like, and encapsulates smart contracts. A smart contract is a digital protocol essentially composed of computer programs; when the conditions written in the source code are satisfied, each node automatically transacts according to the contract contents without intermediate operation. Relying on the immutability of the block chain, the smart contract ensures that all participants process transactions jointly, execute the code, and keep states and resources consistent.
After a data request is verified, the smart contract is executed: the script is locked according to the access constraints set by the node, the shared data are decrypted with the provided key, and the result is encrypted with the public key of the access node and output. After the transaction sharing time has elapsed, the data slice address information is changed randomly and the latest information is fed back to the data owner, so the data requester cannot view the data until authorization is obtained again. The application layer mainly implements, through a client, functions such as querying, sharing and modifying energy big data and adding new data to the chain. The client comprises an administrator client and a user client. The administrator client can add new energy data to the chain and can share data when authorized; the user client can query data when authorized.
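The time-limited authorization described for the contract layer can be sketched as follows. This is a minimal illustration under stated assumptions: the class `SharingContract`, its method names, the explicit `now` timestamps and the use of `secrets.token_hex` to pick a new random address are all hypothetical and are not taken from the embodiment; decryption and re-encryption of the shared data are omitted.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class SharingContract:
    """Toy stand-in for the smart contract guarding one data slice address."""
    slice_address: str
    grants: dict = field(default_factory=dict)  # requester -> expiry timestamp

    def authorize(self, requester, now, duration):
        # The data owner grants access for a limited sharing time.
        self.grants[requester] = now + duration

    def read_address(self, requester, now):
        expiry = self.grants.get(requester)
        if expiry is None:
            raise PermissionError("no authorization")
        if now >= expiry:
            # Sharing time has passed: revoke the grant and move the slice
            # to a new random address, so the old address is useless.
            del self.grants[requester]
            self.slice_address = secrets.token_hex(16)
            raise PermissionError("authorization expired")
        return self.slice_address


contract = SharingContract(slice_address="addr-0")
contract.authorize("demander-1", now=100, duration=60)
address_during_window = contract.read_address("demander-1", now=120)
```

Once the window has closed, any further call fails until `authorize` is invoked again, which mirrors the requirement that the requester must obtain authorization anew.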
The embodiment of the present application further provides a big data sharing device, and it should be noted that the big data sharing device according to the embodiment of the present application may be used to execute the method for big data sharing provided in the embodiment of the present application. The following describes a big data sharing apparatus provided in an embodiment of the present application.
Fig. 4 is a schematic diagram of a big data sharing device according to an embodiment of the present application. As shown in fig. 4, the apparatus includes:
a receiving unit 10, configured to receive a data request sent by a data demander through a block chain, where the data request includes index information of data, and the index information includes storage location information of the data;
an obtaining unit 20, configured to retrieve data in a local storage of a data provider according to the index information to obtain target data;
a sending unit 30, configured to send the target data to the data demander through the block chain.
In the big data sharing device, the receiving unit receives a data request sent by a data demander through a block chain, where the data request includes index information of data and the index information includes storage location information of the data; the obtaining unit retrieves data in a local memory of a data provider according to the index information to obtain target data; and the sending unit sends the target data to the data demander through the block chain. With this device, the data owner stores data at random addresses in off-chain local memory, and each piece of index information is encrypted and then uploaded to the block chain. When data need to be traded or shared, the data demander sends a data request to the data owner through the block chain; after the request passes verification, the locally stored coded data are retrieved and decoded to obtain the complete data.
In an embodiment of the present application, the obtaining unit includes a determining module and a reading module. The determining module is configured to determine, according to the storage location information, the storage node where the corresponding data block is located; the data in the local storage include a plurality of data blocks, the data blocks are in one-to-one correspondence with the storage nodes, and the index information further includes data block information of the target data and storage location information of the data blocks of the target data. The reading module is configured to read the data blocks stored by the storage nodes to obtain the target data.
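The division of the obtaining unit into a determining module and a reading module can be sketched as below. The names `IndexEntry`, `ObtainingUnit`, `determine` and `read`, and the in-memory dictionary standing in for the storage nodes, are assumptions made for illustration; the embodiment does not prescribe a data structure for the index information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    block_id: int   # data block information: position within the target data
    node_id: str    # storage location information: node holding the block


class ObtainingUnit:
    def __init__(self, storage_nodes):
        # storage_nodes: node_id -> bytes, one data block per storage node
        self.storage_nodes = storage_nodes

    def determine(self, index):
        """Determining module: map the index entries to storage nodes, in block order."""
        ordered = sorted(index, key=lambda entry: entry.block_id)
        return [entry.node_id for entry in ordered]

    def read(self, index):
        """Reading module: read each block and join them into the target data."""
        return b"".join(self.storage_nodes[node] for node in self.determine(index))


nodes = {"node-a": b"energy ", "node-b": b"big ", "node-c": b"data"}
index = [IndexEntry(2, "node-c"), IndexEntry(0, "node-a"), IndexEntry(1, "node-b")]
unit = ObtainingUnit(nodes)
target_data = unit.read(index)
```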
Specifically, as shown in fig. 2, the data slice of a data block N is divided into a plurality of sub-blocks that form a plurality of sub-stripes. An encoding matrix B is set, and the encoding matrix is multiplied independently with each sub-stripe of the block matrix N to obtain the basic encoding matrix, that is,
Figure BDA0003381195360000071
where
Figure BDA0003381195360000072
A local check of a check item in the previous sub-stripe is piggybacked onto the check data of the next sub-stripe. Taking the data blocks as k items and the generated check items as r items, f_1(a) is divided into (r-1) groups for local checking, and the groups of p data items are piggybacked into the second sub-stripe to form new checks, that is,
Figure BDA0003381195360000081
where p = k/(r-1) denotes the number of data items piggybacked per group. The data owner assigns random addresses to the sliced data X_1, X_2, …, X_{k+r}: the original data X_1, X_2, …, X_k are stored in off-chain data nodes, and the check data X_{k+1}, X_{k+2}, …, X_{k+r} are stored in off-chain check nodes. A hash function is used to generate indexes for the different off-chain nodes holding the data, and each piece of index information contains the slice information, storage location information and the like of the data. A private key of the data provider is generated with a random function, and a public key is then derived from it; a hash operation on the private key and the public key yields a hash value, a check code is appended so that the hash value can be verified, and base58 encoding yields the account address. The index information is encrypted with the public key of the data provider and then stored on the block chain, and the target data are obtained according to the index information.
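The last steps of the account address derivation (hash, check code, base58 encoding) can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 as the hash function and a 4-byte double-SHA-256 check code follow the common Base58Check convention and are not stated in the embodiment; key generation is omitted, and the key bytes passed in are a placeholder.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    number = int.from_bytes(data, "big")
    digits = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        digits = ALPHABET[remainder] + digits
    # Leading zero bytes are represented by leading '1' characters.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + digits


def base58_decode(text: str) -> bytes:
    number = 0
    for char in text:
        number = number * 58 + ALPHABET.index(char)
    leading_ones = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return b"\x00" * leading_ones + body


def check_code(payload: bytes) -> bytes:
    # Assumed check code: first 4 bytes of a double SHA-256 digest.
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def account_address(key_material: bytes) -> str:
    key_hash = hashlib.sha256(key_material).digest()
    return base58_encode(key_hash + check_code(key_hash))


def is_valid_address(address: str) -> bool:
    raw = base58_decode(address)
    payload, code = raw[:-4], raw[-4:]
    return check_code(payload) == code


# Placeholder key bytes; a real key pair would come from a signature scheme.
address = account_address(b"\x02" + bytes(32))
```

The check code lets a receiver detect a mistyped or corrupted address before using it to look up or encrypt index information.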
In an embodiment of the present application, as shown in fig. 2, the storage nodes of the local storage include data nodes and check nodes, where the data nodes are configured to store the target data and the check nodes are configured to store check data corresponding to the target data. The apparatus further includes a computing unit comprising a first computing module and a second computing module. The first computing module is configured to, when a data node corresponding to the target data fails, calculate the target data from the check data before the data blocks stored in the storage nodes are read to obtain the target data. The second computing module is configured to, when a check node corresponding to the target data fails, calculate the check data from the target data before the data blocks stored in the storage nodes are read to obtain the target data.
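The roles of the two computing modules can be illustrated with the simplest possible check data, a single XOR parity block. This is a deliberately swapped-in simplification, not the erasure code of the embodiment; the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_data_block(surviving_data, check_block):
    # First computing module: a data node failed, so the missing block is
    # computed from the check data and the surviving data blocks.
    return xor_blocks(surviving_data + [check_block])


def recompute_check_block(data_blocks):
    # Second computing module: a check node failed, so the check data is
    # recomputed from the data blocks.
    return xor_blocks(data_blocks)


data = [b"ab", b"cd", b"ef"]
check = recompute_check_block(data)
restored = recover_data_block([data[0], data[2]], check)
```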
Specifically, when data need to be reconstructed, a request is submitted through the block chain to obtain enough coded slices, and a node can then reconstruct the block data through a decoding operation. Taking the (14, 10)-MDS code as an example,
Figure BDA0003381195360000082
the data N is cut into 20 sub-blocks forming 2 sub-stripes (a, b). An encoding matrix B is set, and the encoding matrix is multiplied with each independent sub-stripe to obtain a matrix. From p = k/(r-1) with k = 10 and r = 4, the number of data items piggybacked in each group is 3; because k is not evenly divisible by (r-1), the remainder q = k mod (r-1) = 1 is added to the last group, which therefore piggybacks p + q = 4 items. The 10 data blocks can be assembled into the complete original data without decoding. If check node X_11 fails, the 10 nodes X_1 to X_10 can be connected and, according to
Figure BDA0003381195360000083
f_1(a) and f_1(b) can be reconstructed. If data node X_1 fails, the MDS property is used first: X_2 to X_11 are connected to download the 10 data sub-blocks {b_2, ..., b_10, f_1(b)}, and b_1 is obtained from the product of the inverse of the coding matrix and the coded slices, that is,
Figure BDA0003381195360000091
Then X_2, X_3 and X_12 are connected to download {a_2, a_3, f_2(b)+a_1+a_2+a_3}; f_2(b) can be obtained by equation 2, and a_1 is thus obtained, which completes the original data. The total download amount is 11 blocks, whereas decoding directly with the MDS code requires downloading at least 20 blocks of coded data, so network bandwidth is clearly saved.
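The grouping rule and the two-step repair of data node X1 described above can be traced with a small sketch. It is a toy model under loud assumptions: symbols are integers in the prime field GF(257), the check items use the weights (i+1)^(j-1), and b1 is recovered from the plain sum f1(b) rather than by a general matrix inverse. None of these choices is taken from the embodiment's encoding matrix B, and the toy code is not claimed to be MDS.

```python
P = 257  # small prime field for the toy code (assumption)


def piggyback_groups(k, r):
    """Split indices 0..k-1 into r-1 groups; the last group takes the remainder."""
    base, extra = divmod(k, r - 1)
    sizes = [base] * (r - 1)
    sizes[-1] += extra
    groups, start = [], 0
    for size in sizes:
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def parity(j, symbols):
    """Toy check item f_j: weighted sum with weights (i+1)^(j-1) mod P."""
    return sum(pow(i + 1, j - 1, P) * s for i, s in enumerate(symbols)) % P


def encode(a, b, r):
    """Return k + r nodes, each holding one symbol per sub-stripe (a, b)."""
    k = len(a)
    groups = piggyback_groups(k, r)
    nodes = [(a[i], b[i]) for i in range(k)]
    nodes.append((parity(1, a), parity(1, b)))  # first check node, no piggyback
    for j in range(2, r + 1):
        piggy = sum(a[i] for i in groups[j - 2]) % P
        nodes.append((parity(j, a), (parity(j, b) + piggy) % P))
    return nodes


def repair_first_data_node(nodes, k, r):
    # Step 1: recover b1 from b2..bk and f1(b), a plain sum in this toy code.
    b_rest = [nodes[i][1] for i in range(1, k)]
    b1 = (nodes[k][1] - sum(b_rest)) % P
    # Step 2: remove f2(b) and the other group members from the piggybacked symbol.
    group = piggyback_groups(k, r)[0]
    others = sum(nodes[i][0] for i in group if i != 0)
    a1 = (nodes[k + 1][1] - parity(2, [b1] + b_rest) - others) % P
    return a1, b1


a = list(range(1, 11))
b = list(range(11, 21))
nodes = encode(a, b, r=4)
repaired = repair_first_data_node(nodes, k=10, r=4)
```

For k = 10 and r = 4 the grouping yields sizes 3, 3 and 4, matching the p and p + q values in the example, and the repair only touches the second sub-stripe of the first check node plus the piggybacked symbol and the other members of the first group.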
In an embodiment of the present application, in a case where the data node fails or the check node fails, data recovery is performed by using a Byzantine fault-tolerant algorithm. Specifically, the process of the Byzantine fault-tolerant algorithm PBFT is mainly divided into three stages: pre-prepare, prepare and commit. The PBFT algorithm tolerates fewer than 1/3 of the nodes being faulty or malicious; it reaches a final result quickly, but is suited to a smaller number of nodes.
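The "fewer than 1/3" bound can be made concrete with a short calculation. The helper name `pbft_limits` is hypothetical, and the quorum size 2f + 1 is the standard PBFT value for n = 3f + 1 nodes; it is not stated in the embodiment.

```python
PHASES = ("pre-prepare", "prepare", "commit")


def pbft_limits(n):
    """Return (f, quorum) for a PBFT network of n nodes.

    f is the largest number of faulty nodes with 3f + 1 <= n;
    quorum is the number of matching messages a node waits for.
    """
    f = (n - 1) // 3
    quorum = 2 * f + 1
    return f, quorum


limits = {n: pbft_limits(n) for n in (4, 7, 10)}
```

A network of 4 nodes therefore tolerates one faulty node, 7 nodes tolerate two, and 10 nodes tolerate three.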
An embodiment of the present application further provides a big data sharing system, including a blockchain, a local storage of a blockchain node, and a big data sharing apparatus, where the big data sharing apparatus is configured to execute any one of the methods described above.
The big data sharing system includes a block chain, a local memory of a block chain node, and the big data sharing device. In the device, the receiving unit receives a data request sent by a data demander through the block chain, where the data request includes index information of data and the index information includes storage location information of the data; the obtaining unit retrieves data in a local memory of a data provider according to the index information to obtain target data; and the sending unit sends the target data to the data demander through the block chain. With this device, the data owner stores data at random addresses in off-chain local memory, and each piece of index information is encrypted and then uploaded to the block chain. When data need to be traded or shared, the data demander sends a data request to the data owner through the block chain; after the request passes verification, the locally stored coded data are retrieved and decoded to obtain the complete data.
The big data sharing device comprises a processor and a memory, the receiving unit, the acquiring unit, the sending unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and by adjusting the kernel parameters the prior-art problem that block-chain-based big data sharing carries an excessive amount of redundant information is addressed.
The memory may include, among computer-readable media, volatile memory such as random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
An embodiment of the present invention provides a computer-readable storage medium on which a program is stored, which when executed by a processor implements the above-described method.
The embodiment of the invention provides a processor, which is used for running a program, wherein the method is executed when the program runs.
The embodiment of the invention provides equipment, which comprises a processor, a memory and a program which is stored on the memory and can run on the processor, wherein when the processor executes the program, at least the following steps are realized:
step S101, receiving a data request sent by a data demander through a block chain, wherein the data request comprises index information of data, and the index information comprises storage position information of the data;
step S102, retrieving data in a local memory of a data provider according to the index information to obtain target data;
step S103, sending the target data to the data demander through the blockchain.
The device herein may be a server, a PC, a PAD, a mobile phone, etc.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with at least the following method steps:
step S101, receiving a data request sent by a data demander through a block chain, wherein the data request comprises index information of data, and the index information comprises storage position information of the data;
step S102, retrieving data in a local memory of a data provider according to the index information to obtain target data;
step S103, sending the target data to the data demander through the blockchain.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the above-described division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a computer-readable storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned computer-readable storage media comprise: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
From the above description, it can be seen that the above-described embodiments of the present application achieve the following technical effects:
1) In the big data sharing method, a data request sent by a data demander through a block chain is received, where the data request includes index information of data and the index information includes storage location information of the data; data in a local memory of a data provider are retrieved according to the index information to obtain target data; and the target data are sent to the data demander through the block chain. With this method, the data owner stores data at random addresses in off-chain local memory, and each piece of index information is encrypted and then uploaded to the block chain. When data need to be traded or shared, the data demander sends a data request to the data owner through the block chain; after the request passes verification, the locally stored coded data are retrieved and decoded to obtain the complete data.
2) In the big data sharing device, the receiving unit receives a data request sent by a data demander through a block chain, where the data request includes index information of data and the index information includes storage location information of the data; the obtaining unit retrieves data in a local memory of a data provider according to the index information to obtain target data; and the sending unit sends the target data to the data demander through the block chain. With this device, the data owner stores data at random addresses in off-chain local memory, and each piece of index information is encrypted and then uploaded to the block chain. When data need to be traded or shared, the data demander sends a data request to the data owner through the block chain; after the request passes verification, the locally stored coded data are retrieved and decoded to obtain the complete data.
3) The big data sharing system includes a block chain, a local memory of a block chain node, and the big data sharing device. In the device, the receiving unit receives a data request sent by a data demander through the block chain, where the data request includes index information of data and the index information includes storage location information of the data; the obtaining unit retrieves data in a local memory of a data provider according to the index information to obtain target data; and the sending unit sends the target data to the data demander through the block chain. With this device, the data owner stores data at random addresses in off-chain local memory, and each piece of index information is encrypted and then uploaded to the block chain. When data need to be traded or shared, the data demander sends a data request to the data owner through the block chain; after the request passes verification, the locally stored coded data are retrieved and decoded to obtain the complete data.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A big data sharing method is characterized by comprising the following steps:
receiving a data request sent by a data demander through a block chain, wherein the data request comprises index information of data, and the index information comprises storage position information of the data;
retrieving data in a local memory of a data provider according to the index information to obtain target data;
and sending the target data to the data demander through the block chain.
2. The method of claim 1, wherein retrieving data in a local storage of a data provider according to the index information to obtain target data comprises:
determining a storage node where a corresponding data block is located according to the storage location information, wherein data in the local storage comprises a plurality of data blocks, the data blocks are in one-to-one correspondence with the storage nodes, and the index information further comprises data block information of the target data and storage location information of the data block of the target data;
and reading the data block stored by the storage node to obtain the target data.
3. The method according to claim 2, wherein the storage nodes of the local storage include a data node and a check node, the data node is configured to store the target data, the check node is configured to store check data corresponding to the target data, and before reading the data block stored in the storage node to obtain the target data, the method further includes:
under the condition that a data node corresponding to the target data is invalid, calculating according to the check data to obtain the target data;
and under the condition that the check node corresponding to the target data fails, calculating to obtain the check data according to the target data.
4. The method according to claim 3, characterized in that in case of failure of the data node or failure of the check node, a Byzantine fault tolerance algorithm is used for data repair.
5. The method of claim 1, wherein the nodes of the blockchain communicate over a P2P network.
6. The method of claim 1, wherein the index information is stored on the blockchain after being encrypted.
7. A big data sharing apparatus, comprising:
a receiving unit, configured to receive a data request sent by a data demander through a block chain, wherein the data request comprises index information of data, and the index information comprises storage position information of the data;
the acquisition unit is used for retrieving data in a local memory of a data provider according to the index information to obtain target data;
and the sending unit is used for sending the target data to the data demander through the block chain.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein the program performs the method of any one of claims 1 to 6.
9. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of any of claims 1 to 6.
10. A big data sharing system comprising a blockchain, a local storage of a blockchain node, and a big data sharing apparatus, characterized in that the big data sharing apparatus comprises means for performing the method of any one of claims 1 to 6.
CN202111434664.5A 2021-11-29 2021-11-29 Big data sharing method and device and big data sharing system Pending CN114244853A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111434664.5A CN114244853A (en) 2021-11-29 2021-11-29 Big data sharing method and device and big data sharing system

Publications (1)

Publication Number Publication Date
CN114244853A true CN114244853A (en) 2022-03-25

Family

ID=80751879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111434664.5A Pending CN114244853A (en) 2021-11-29 2021-11-29 Big data sharing method and device and big data sharing system

Country Status (1)

Country Link
CN (1) CN114244853A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103703446A (en) * 2012-06-11 2014-04-02 北京大学深圳研究生院 Data reconstruction method and apparatus against byzantine failure in network storage, and method and apparatus for restoring failure data
CN109274717A (en) * 2018-08-22 2019-01-25 泰康保险集团股份有限公司 Shared storage method, device, medium and electronic equipment based on block chain
WO2019179539A2 (en) * 2019-07-11 2019-09-26 Alibaba Group Holding Limited Shared blockchain data storage
WO2020001108A1 (en) * 2018-06-29 2020-01-02 阿里巴巴集团控股有限公司 Block chain-based data processing method and device
CN111261250A (en) * 2020-01-19 2020-06-09 江苏恒宝智能系统技术有限公司 Medical data sharing method and device based on block chain technology, electronic equipment and storage medium
CN111291407A (en) * 2020-01-21 2020-06-16 江苏荣泽信息科技股份有限公司 Data sharing method based on block chain privacy protection
CN111339086A (en) * 2020-02-18 2020-06-26 腾讯科技(深圳)有限公司 Block processing method, and data query method and device based on block chain
CN111444042A (en) * 2020-03-24 2020-07-24 哈尔滨工程大学 Block chain data storage method based on erasure codes
US10771524B1 (en) * 2019-07-31 2020-09-08 Theta Labs, Inc. Methods and systems for a decentralized data streaming and delivery network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115314202A (en) * 2022-10-10 2022-11-08 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Data processing method based on secure multi-party computing, electronic equipment and storage medium
US11853449B1 (en) 2022-10-10 2023-12-26 Harbin Institute of Technology, (Shenzhen) (Shenzhen Int'l Technical Innovation Rearch Institute) Data processing method based on secure multi-party computation, electronic device, and storage medium
CN115865364A (en) * 2022-11-24 2023-03-28 杭州微毅科技有限公司 Block chain transaction security evaluation method and system
CN115865364B (en) * 2022-11-24 2023-11-17 杭州微毅科技有限公司 Block chain transaction security assessment method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination