CN113839989B - Multi-node data processing method - Google Patents

Publication number
CN113839989B
Authority
CN
China
Prior art keywords
data
node
block
data block
node device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110998845.4A
Other languages
Chinese (zh)
Other versions
CN113839989A (en)
Inventor
陶敬
贾健真
王嘉康
王晨旭
韩婷
王平辉
赵俊舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202110998845.4A
Publication of CN113839989A
Application granted
Publication of CN113839989B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/12 Applying verification of the received information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a multi-node data processing method that addresses the problem of synchronizing data reliably and in order in a complex network environment. The method comprises four parts: node initialization, joining a node to the blockchain network, data synchronization between nodes, and data verification and storage by the nodes. The node devices in the blockchain network synchronize and verify data and look up missing data, so that data is transmitted in order and reliably, its authenticity is ensured, and multi-active data is achieved. The method solves the problem of reliably synchronizing multi-machine data in a low-bandwidth wide area network environment, and provides a methodological basis and technical support for node communication and data synchronization in complex network environments.

Description

Multi-node data processing method
Technical Field
The invention belongs to the technical field of blockchain, and in particular relates to a multi-node data processing method.
Background
In a large-scale information acquisition and analysis system, the acquisition terminals and analysis terminals are often scattered across different locations and must communicate over a wide area network. To ensure reliable transmission, the conventional approach is to construct a secure communication channel or encrypt the data, and to transmit streaming data in a "client-server" mode. However, when the acquisition end or the transmission end goes down, or when the network environment at either end is unstable, the acquired information is at risk of being lost or arriving out of order. In addition, a server in the client-server mode is difficult to scale horizontally and carries a risk of data loss. In summary, it is difficult to achieve good data synchronization in a complex network environment with "client-server" data transmission.
Disclosure of Invention
To overcome the defects of the prior art and to solve the problem of synchronizing data reliably and in order in a complex network environment, the invention provides a multi-node data processing method. Each node device in the blockchain network synchronizes and verifies data and looks up missing data, so that data is transmitted in order and reliably, its authenticity is ensured, and multi-active data is achieved. The invention can be used to transmit and verify data in order and reliably, in particular multi-machine data in a complex network environment (such as low bandwidth or intermittent disconnection) and in a wide area network environment. It provides a methodological basis and technical support for the communication and data synchronization of node devices in such environments.
To achieve this purpose, the invention adopts the following technical scheme:
A multi-node data processing method comprises the following steps:
step 1, initializing the node device;
step 2, joining the initialized node device to the blockchain network;
step 3, the node device that has joined the blockchain network receives a plurality of data files input by a user, generates a data block for each data file in order of receiving time, linking the blocks into a file chain, and broadcasts each data block to the other node devices in the blockchain network so that data is synchronized among the node devices;
step 4, a node device that receives a broadcast data block verifies it; if the verification passes, the node device outputs the data file contained in the data block to the user and stores it, and if the verification fails, it discards the data block; for each data block that passes verification, the node device checks whether the preceding data block has been received, and if not, broadcasts a request for the preceding data block in the blockchain network.
The node device is a device that can access the Internet and run a node program. The node program implements at least the data synchronization and verification method and is accompanied by a modifiable configuration file, which contains at least:
(1) the IP address the node device uses to connect to the Internet and the port number it uses for communication;
(2) the IP address and port number of the node device's target node device in the blockchain network;
(3) the blockchain network password, namely the password a node device needs in order to join the blockchain network;
(4) the storage location of data files input by the user;
(5) the storage location of data files the node device synchronizes from other node devices.
The initialization of the node device comprises the following steps:
(1) reading the node device's own configuration file and parsing the configuration items in it;
(2) listening on the IP address the node device uses to connect to the Internet and on the port number used for communication;
(3) creating a data structure with two layers: the upper layer is an array that stores the IP addresses of all node devices in the blockchain network; the lower layer consists of several arrays of identical structure, one per node device, each recording the pull status of every data block in that node device's file chain, so that each array element takes only the value "pulled" or "not pulled".
In step 2, the node device to be joined needs to know the information of only one node device already in the blockchain network, namely the target node device, and does not need to know the information of all the other node devices. The joining method comprises:
after the node device to be joined starts and reads its configuration file, it creates an empty routing table and checks whether its own IP address and communication port number are the same as those of the target node device; if they are the same, the node device is considered to be the first node device of the blockchain network and performs no further operation; if they differ, it sends UDP data packet one to the target node device to request to join the blockchain network, the content of UDP data packet one being: the IP address and port number of the node device to be joined, the sequence number of the next data block it will generate, and the blockchain network password;
the target node device receives UDP data packet one and checks whether the blockchain network password is correct; if the password is wrong, it discards UDP data packet one, and if the password is correct, it returns UDP data packet two, the content of which is the serialized text of the target node device's routing table;
after receiving UDP data packet two, the node device to be joined deserializes the received routing table, adds the information of each node device in it to its own routing table, and sends UDP data packet three to each node device in the received routing table in turn, the content of UDP data packet three being: the IP address and port number of the node device to be joined and the blockchain network password;
each node device in the routing table that receives UDP data packet three checks whether the blockchain network password is correct; if it is correct, the node device adds the information of the node device to be joined to its own routing table, and if it is incorrect, it discards UDP data packet three.
In step 3, the node device generates a data block as follows:
the node device keeps the hash of the data block it last generated; after receiving a data file, it prepends this hash and its own IP address to the data file to obtain the data block corresponding to the data file, calculates the hash of the resulting block and uses it as the name of the new data block, records the sequence number of the new data block, and sets the entry for that sequence number under its own IP address to "pulled" in the data structure. The sequence number is the position of the new data block in the file chain of the node device that generated it: the first data block of the file chain has sequence number 0, the second has sequence number 1, and so on.
After generating a new data block, the node device stores it and broadcasts it to the other node devices in the blockchain network as follows:
the node device sends UDP data packet four to every other node device in its routing table, the content of UDP data packet four being: the blockchain network password, the new data block, and the hash value and sequence number of the new data block.
In step 4, a node device that receives a broadcast data block first checks the blockchain network password in UDP data packet four; if the password is correct, it parses out the data block, its hash value and its sequence number, and if the password is wrong, it discards UDP data packet four. After parsing, the node device calculates the hash of the data block; if the calculated hash value matches the hash value in UDP data packet four, the data block is considered reliable and the data file it contains is output to the user (stored, in the manner the user specified in the configuration file, on a storage medium including but not limited to a hard disk or a network drive); if the hash values do not match, the data block is discarded.
For a data block that passes verification, the node device extracts its sequence number and the IP address of the node device that generated it, and marks the entry for that sequence number under that IP address as "pulled". It then takes the preceding sequence number and checks whether the corresponding data block is marked "pulled" in its own data structure. If so, no further operation is performed; otherwise, it sends UDP data packet five to the other node devices in its routing table, the content of UDP data packet five being: the IP address of the node device to which the data block belongs, the sequence number of the preceding data block, the hash value corresponding to that sequence number, and the blockchain network password. A node device that receives UDP data packet five checks the blockchain network password; if the password is wrong, it discards UDP data packet five. If the password is correct, it checks in its own data structure whether the sequence number under that IP address is marked "pulled"; if it is, the node device compares the IP address in UDP data packet five with its own IP address: if they are the same, it retrieves the data block from the location configured for user-input data files and sends it, and if they differ, it retrieves the data block from the location configured for data files synchronized from other node devices and sends it.
Compared with the prior art, the invention has the following beneficial effects:
1. Based on an end-to-end architecture, all node devices in the blockchain network have equal status, node devices can be added and removed dynamically as required, data is backed up across node devices, and the risk of single-point device failure is avoided.
2. Based on blockchain technology, even if a node device fails to obtain some data blocks because of problems in the blockchain network, or loses some data blocks during a brief disconnection from the blockchain network, the missing data blocks can be traced back from the latest data block and pulled, so that data consistency is eventually reached once the blockchain network recovers.
3. Based on blockchain technology, the data content is stored as a file chain, that is, in a chain structure; even if a data packet is intercepted and tampered with, it will not pass verification by the node devices, which ensures the security, reliability and authenticity of all data.
4. After the data blocks are synchronized, a node device can retrieve the data files in order of the data block sequence numbers, which guarantees the ordering of the data.
Drawings
Fig. 1 is a schematic diagram of the overall work flow of the node device.
Fig. 2 is a schematic diagram of a multi-node device deployment.
Fig. 3 is a schematic diagram of a data block file structure.
Fig. 4 is a diagram of a completed array structure.
Fig. 5 is a schematic diagram of input and output of a node device.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
In the multi-node data processing method, the node devices build a blockchain network by means of routing tables and can be added and removed dynamically. Each node device in the blockchain network synchronizes and verifies data and looks up the data it is missing, so that data is transmitted in order and reliably, its authenticity is ensured, and multi-active data is achieved.
As shown in Fig. 1, the invention mainly comprises the following four steps:
Step 1, the node device is initialized.
A node device of the invention is a device that can access the Internet and run the node program. All node devices can connect to one another over the Internet, and they have equal status and identical functions in the blockchain network.
The node program implements at least the data synchronization and verification method and is accompanied by a configuration file that the user can modify. The content of the configuration file may be in JSON or XML format; the file itself is a text file whose suffix includes but is not limited to cfg, conf or txt. The configuration file contains at least:
(1) the IP address the node device uses to connect to the Internet and the port number it uses for communication;
(2) the IP address and port number of the node device's target node device in the blockchain network;
(3) the blockchain network password, namely the password a node device needs in order to join the blockchain network;
(4) the storage location of data files input by the user;
(5) the storage location of data files the node device synchronizes from other node devices.
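As an illustration of the five configuration items, the following is a minimal sketch of a JSON configuration file and a loader for it. The key names, the placeholder addresses and paths, and the `load_config` helper are all hypothetical; the source only specifies which items the file must contain, not how they are named.

```python
import json

# Hypothetical key names and placeholder values; the source only lists
# which items the configuration file must contain.
EXAMPLE_CONFIG = """
{
  "listen_ip": "192.0.2.1",
  "listen_port": 8080,
  "target_ip": "192.0.2.1",
  "target_port": 8080,
  "network_password": "example-password",
  "input_dir": "./uploaded",
  "synced_dir": "./synced"
}
"""

REQUIRED_KEYS = {
    "listen_ip", "listen_port",   # item (1)
    "target_ip", "target_port",   # item (2)
    "network_password",           # item (3)
    "input_dir",                  # item (4)
    "synced_dir",                 # item (5)
}


def load_config(text: str) -> dict:
    """Parse the JSON text and reject it if any required item is missing."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise ValueError(f"missing configuration items: {sorted(missing)}")
    return config


config = load_config(EXAMPLE_CONFIG)
```

Rejecting an incomplete file at load time is a design choice of this sketch, not something the source prescribes.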
Accordingly, the node device is initialized as follows:
(1) reading the node device's own configuration file and parsing the configuration items in it;
(2) listening on the IP address the node device uses to connect to the Internet and on the port number used for communication;
(3) creating a data structure with two layers: the upper layer is an array that stores the IP addresses of all node devices in the blockchain network; the lower layer consists of several arrays of identical structure, one per node device, each recording the pull status of every data block in that node device's file chain, so that each array element takes only the value "pulled" or "not pulled".
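The two-layer structure in item (3) can be sketched as follows. This is only one possible representation: it uses a dictionary keyed by IP address in place of the upper-layer array, and a growable list of Boolean flags per node device for the lower layer. The class and method names are hypothetical.

```python
class PullTracker:
    """Tracks, per generating node IP, which block sequence numbers are pulled."""

    def __init__(self):
        # Upper layer: one entry per node IP.
        # Lower layer: list of flags, index = sequence number,
        # True = "pulled", False = "not pulled".
        self._status = {}

    def known_ips(self):
        """Return the IP addresses seen so far (the upper layer)."""
        return list(self._status)

    def mark_pulled(self, ip: str, seq: int) -> None:
        flags = self._status.setdefault(ip, [])
        if seq >= len(flags):
            # Grow the array; blocks not yet seen default to "not pulled".
            flags.extend([False] * (seq + 1 - len(flags)))
        flags[seq] = True

    def is_pulled(self, ip: str, seq: int) -> bool:
        flags = self._status.get(ip, [])
        return 0 <= seq < len(flags) and flags[seq]


tracker = PullTracker()
tracker.mark_pulled("192.0.2.2", 2)
```

After the call above, sequence number 2 of node 192.0.2.2 is marked pulled while 0 and 1 remain not pulled, which is the situation in which a node would go looking for missing predecessors.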
Step 2, the initialized node device joins the blockchain network.
In the invention, a node device to be joined needs to know the information of only one node device already in the blockchain network, namely the target node device, and does not need to know the information of all the other node devices; any node device in the blockchain network may be specified as the target node device in the configuration file.
The specific joining method comprises the following steps:
1) the node device to be joined starts, reads its configuration file and creates an empty routing table, then checks whether its own IP address and communication port number are the same as those of the target node device; if they are the same, the node device is considered to be the first node device of the blockchain network and performs no further operation; if they differ, it sends UDP data packet one to the target node device to request to join the blockchain network, the content of UDP data packet one being: the IP address and port number of the node device to be joined, the sequence number of the next data block it will generate, and the blockchain network password;
2) the target node device receives UDP data packet one and checks whether the blockchain network password is correct; if the password is wrong, it discards UDP data packet one, and if the password is correct, it returns UDP data packet two, the content of which is the serialized text of the target node device's routing table;
3) after receiving UDP data packet two, the node device to be joined deserializes the received routing table, adds the information of each node device in it to its own routing table, and sends UDP data packet three to each node device in the received routing table in turn, the content of UDP data packet three being: the IP address and port number of the node device to be joined and the blockchain network password;
4) each node device in the routing table that receives UDP data packet three checks whether the blockchain network password is correct; if it is correct, the node device adds the information of the node device to be joined to its own routing table, and if it is incorrect, it discards UDP data packet three. The routing table is an array of two-tuples, each holding the IP address and port number of one of the other node devices in the blockchain network.
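The joining steps can be sketched in memory as below, with direct method calls standing in for the UDP exchange. Several points are assumptions of this sketch rather than statements of the source: the routing table is serialized as JSON, the target node device records the joining node device when it accepts UDP data packet one, and the joining node device records the target node device once it receives a reply. The class, field and function names are hypothetical.

```python
import json


class Node:
    def __init__(self, ip, port, password):
        self.addr = (ip, port)
        self.password = password
        self.routing_table = []  # (ip, port) two-tuples of the other nodes

    def add_peer(self, addr):
        if addr != self.addr and addr not in self.routing_table:
            self.routing_table.append(addr)

    def handle_packet_one(self, packet):
        """Target side: return packet two (serialized table) or None."""
        if packet["password"] != self.password:
            return None  # wrong password: discard
        reply = json.dumps(self.routing_table)
        self.add_peer((packet["ip"], packet["port"]))  # assumption, see lead-in
        return reply

    def handle_packet_three(self, packet):
        if packet["password"] == self.password:
            self.add_peer((packet["ip"], packet["port"]))


def join(new_node, target, network, next_seq=0):
    """Run steps 1) to 4); `network` maps addresses to nodes (stands in for UDP)."""
    if new_node.addr == target.addr:
        return True  # first node of the network: nothing to do
    ip, port = new_node.addr
    packet_one = {"ip": ip, "port": port, "next_seq": next_seq,
                  "password": new_node.password}
    reply = target.handle_packet_one(packet_one)
    if reply is None:
        return False
    new_node.add_peer(target.addr)  # assumption, see lead-in
    packet_three = {"ip": ip, "port": port, "password": new_node.password}
    for peer_ip, peer_port in json.loads(reply):
        new_node.add_peer((peer_ip, peer_port))
        peer = network.get((peer_ip, peer_port))
        if peer is not None:
            peer.handle_packet_three(packet_three)
    return True


n1, n2, n3 = (Node(f"192.0.2.{i}", 8080, "pw") for i in (1, 2, 3))
network = {n.addr: n for n in (n1, n2, n3)}
joined = [join(n1, n1, network), join(n2, n1, network), join(n3, n2, network)]
intruder = Node("192.0.2.9", 8080, "wrong")
intruder_joined = join(intruder, n1, network)
```

In this run node 3 joins through node 2 rather than node 1, yet every node ends up with the other two in its routing table, and the node device with the wrong password is rejected.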
Step 3, the node device that has joined the blockchain network receives a plurality of data files input by a user, generates a data block for each data file in order of receiving time, linking the blocks into a file chain, and broadcasts each data block to the other node devices in the blockchain network so that data is synchronized among the node devices.
Specifically, the user inputs a data file to the node device either by specifying a folder or file path, or directly through the blockchain network. The node device keeps the hash of the data block it last generated. After receiving a data file, it prepends this hash and its own IP address to the data file to obtain the data block corresponding to the data file, calculates the hash of the resulting block and uses it as the name of the new data block, records the sequence number of the new data block, and sets the entry for that sequence number under its own IP address to "pulled" in the data structure. The algorithms used to compute the hash include, but are not limited to, SHA-256 and MD5. The sequence number is the position of the new data block in the file chain of the node device that generated it: the first data block of the file chain has sequence number 0, the second has sequence number 1, and so on.
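Block generation can be sketched as follows, using SHA-256, one of the algorithms the text names. The byte layout of the header (hash, newline, IP address, newline, then the file content) is an assumption of this sketch, since the source does not fix one, and the class and method names are hypothetical. The all-zero placeholder for the first block follows the null hash the description mentions for sequence number 0.

```python
import hashlib

NULL_HASH = "0" * 64  # placeholder "previous hash" for the first block


class BlockProducer:
    def __init__(self, ip: str):
        self.ip = ip
        self.last_hash = NULL_HASH  # hash of the last generated block
        self.next_seq = 0           # sequence number of the next block

    def make_block(self, file_bytes: bytes):
        """Return (sequence number, block hash, block bytes) for one data file."""
        # Assumed layout: previous hash and IP prepended to the file content.
        header = f"{self.last_hash}\n{self.ip}\n".encode("ascii")
        block = header + file_bytes
        block_hash = hashlib.sha256(block).hexdigest()  # also the block's name
        seq = self.next_seq
        self.last_hash = block_hash
        self.next_seq += 1
        return seq, block_hash, block


producer = BlockProducer("192.0.2.1")
seq0, hash0, block0 = producer.make_block(b"first file")
seq1, hash1, block1 = producer.make_block(b"second file")
```

Because the second block begins with the hash of the first, changing any byte of the first block changes its hash and breaks the link, which is what lets receivers detect tampering and trace predecessors.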
After generating a new data block, the node device stores it; the storage location may be any accessible path or storage middleware, including but not limited to a disk path or a distributed storage system (such as IPFS). The node device then broadcasts the new data block to the other node devices in the blockchain network as follows:
the node device sends UDP data packet four to every other node device in its routing table, the content of UDP data packet four being: the blockchain network password, the new data block, and the hash value and sequence number of the new data block.
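One way to put the four fields of UDP data packet four on the wire is sketched below. The JSON envelope, the Base64 encoding of the block, the field names and the helper functions are all assumptions; the source lists the fields but not their encoding. The `broadcast` helper shows the send loop over the routing table and is defined but not called here. Note that a single UDP datagram has a bounded payload size, and the source does not say how larger data blocks are handled.

```python
import base64
import hashlib
import json
import socket


def encode_packet_four(password: str, block: bytes, block_hash: str, seq: int) -> bytes:
    """Serialize the four fields of packet four (hypothetical wire format)."""
    return json.dumps({
        "type": 4,
        "password": password,
        "block": base64.b64encode(block).decode("ascii"),
        "hash": block_hash,
        "seq": seq,
    }).encode("utf-8")


def decode_packet_four(payload: bytes) -> dict:
    """Inverse of encode_packet_four; the block is returned as raw bytes."""
    packet = json.loads(payload.decode("utf-8"))
    packet["block"] = base64.b64decode(packet["block"])
    return packet


def broadcast(payload: bytes, routing_table) -> None:
    """Send the same datagram to every (ip, port) entry in the routing table."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for ip, port in routing_table:
            sock.sendto(payload, (ip, port))


block = b"example block bytes"
payload = encode_packet_four("example-password", block,
                             hashlib.sha256(block).hexdigest(), 0)
decoded = decode_packet_four(payload)
```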
Step 4, a node device that receives a broadcast data block verifies it. If the verification passes, the node device outputs the data file contained in the data block to the user and stores it; the storage location may be any accessible path or storage middleware, including a disk path or a distributed storage system. If the verification fails, the data block is discarded. For each data block that passes verification, the node device checks whether the preceding data block has been received, and if not, broadcasts a request for it in the blockchain network.
Specifically, a node device that receives a broadcast data block first checks the blockchain network password in UDP data packet four; if the password is correct, it parses out the data block, its hash value and its sequence number, and if the password is wrong, it discards UDP data packet four. After parsing, the node device calculates the hash of the data block; if the calculated hash value matches the hash value in UDP data packet four, the data block is considered reliable and the data file it contains is output to the user (printed on the screen, or stored, in the manner the user specified in the configuration file, on a storage medium including but not limited to a hard disk or a network drive); if the hash values do not match, the data block is discarded. All node devices in the blockchain network should use the same hash algorithm.
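The two checks on UDP data packet four, password first and then hash, can be sketched as below. The packet is modelled as a dictionary with hypothetical field names, SHA-256 is assumed as the shared hash algorithm, and the function name is illustrative.

```python
import hashlib


def verify_packet_four(packet: dict, expected_password: str):
    """Return the block bytes if the packet passes both checks, else None."""
    if packet["password"] != expected_password:
        return None  # wrong password: discard the packet
    block = packet["block"]
    if hashlib.sha256(block).hexdigest() != packet["hash"]:
        return None  # hash mismatch: the block is not reliable, discard it
    return block


good_block = b"prev-hash\n192.0.2.1\nfile content"
good_packet = {
    "password": "example-password",
    "block": good_block,
    "hash": hashlib.sha256(good_block).hexdigest(),
    "seq": 0,
}
# Same declared hash, but the block bytes were altered in transit.
tampered_packet = dict(good_packet, block=good_block + b"!")

accepted = verify_packet_four(good_packet, "example-password")
rejected_tampered = verify_packet_four(tampered_packet, "example-password")
rejected_password = verify_packet_four(good_packet, "other-password")
```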
The invention adopts the following marking method:
a two-dimensional array or similar data structure is allocated with every value initially "not pulled", and the element at the index corresponding to a sequence number is modified when that data block is pulled. In code, the labels "pulled" and "not pulled" may be distinguished by means including but not limited to Boolean values or integers.
For a data block that passes verification, the node device extracts its sequence number and the IP address of the node device that generated it, and marks the entry for that sequence number under that IP address as "pulled". It then takes the preceding sequence number and checks whether the corresponding data block is marked "pulled" in its own data structure. If so, no further operation is performed; if it is still "not pulled", the node device sends UDP data packet five to the other node devices in its routing table, the content of UDP data packet five being: the IP address of the node device to which the data block belongs, the sequence number of the preceding data block, the hash value corresponding to that sequence number, and the blockchain network password.
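The predecessor check can be sketched as a function that marks the verified block and returns the contents of UDP data packet five when the preceding block is missing. For brevity this sketch keeps a set of pulled sequence numbers per IP address instead of an array of flags, and it assumes the predecessor's hash is read from the header of the block just verified. The field names and the function name are hypothetical.

```python
def on_verified_block(pulled: dict, owner_ip: str, seq: int,
                      prev_hash: str, password: str):
    """Mark (owner_ip, seq) as pulled; return a packet-five dict if the
    preceding block is still missing, otherwise None."""
    pulled.setdefault(owner_ip, set()).add(seq)
    if seq == 0:
        return None  # the first block of a file chain has no predecessor
    if seq - 1 in pulled[owner_ip]:
        return None  # predecessor already pulled: nothing more to do
    return {
        "type": 5,
        "owner_ip": owner_ip,   # node device the block belongs to
        "seq": seq - 1,         # sequence number of the preceding block
        "hash": prev_hash,      # hash of the preceding block
        "password": password,
    }


pulled = {}
# Block 3 arrives first; blocks 0 to 2 were missed during a disconnection.
request_for_2 = on_verified_block(pulled, "192.0.2.2", 3, "ab" * 32, "pw")
# Block 2 is recovered; its own predecessor is now requested in turn.
request_for_1 = on_verified_block(pulled, "192.0.2.2", 2, "cd" * 32, "pw")
# Block 4 arrives later; block 3 is already pulled, so no request is needed.
no_request = on_verified_block(pulled, "192.0.2.2", 4, "ef" * 32, "pw")
```

Each recovered block triggers a check of its own predecessor, so a gap of any length is closed one block at a time, walking backwards from the latest block.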
A node device that receives UDP data packet five checks the blockchain network password; if the password is wrong, it discards UDP data packet five. If the password is correct, it checks in its own data structure whether the sequence number under that IP address is marked "pulled". If it is, the node device compares the IP address in UDP data packet five with its own IP address: if they are the same, it retrieves the data block from the location configured for user-input data files and sends it; if they differ, it retrieves the data block from the location configured for data files synchronized from other node devices and sends it.
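The responder side can be sketched as below. It assumes data blocks are stored as files named by their hash, in line with the hash being used as the block's name, and it models the pulled markers as a set of sequence numbers per IP address. The function, field and directory names are hypothetical, and the demonstration uses a temporary directory.

```python
import tempfile
from pathlib import Path


def handle_packet_five(packet, own_ip, password, pulled, input_dir, synced_dir):
    """Return the requested block's bytes, or None if it cannot be served."""
    if packet["password"] != password:
        return None  # wrong password: discard the packet
    if packet["seq"] not in pulled.get(packet["owner_ip"], set()):
        return None  # this node has not pulled that block either
    # Blocks generated by this node are kept in the input location; blocks
    # synchronized from other nodes are kept in the synced location.
    base = input_dir if packet["owner_ip"] == own_ip else synced_dir
    path = Path(base) / packet["hash"]
    return path.read_bytes() if path.is_file() else None


with tempfile.TemporaryDirectory() as tmp:
    input_dir, synced_dir = Path(tmp) / "uploaded", Path(tmp) / "synced"
    input_dir.mkdir()
    synced_dir.mkdir()
    (input_dir / "aa11").write_bytes(b"own block")
    (synced_dir / "bb22").write_bytes(b"peer block")
    pulled = {"192.0.2.1": {0}, "192.0.2.2": {0}}

    def ask(owner_ip, block_hash, password="pw"):
        packet = {"owner_ip": owner_ip, "seq": 0,
                  "hash": block_hash, "password": password}
        return handle_packet_five(packet, "192.0.2.1", "pw",
                                  pulled, input_dir, synced_dir)

    own_reply = ask("192.0.2.1", "aa11")
    peer_reply = ask("192.0.2.2", "bb22")
    rejected = ask("192.0.2.2", "bb22", password="wrong")
```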
Based on the steps of the invention, a blockchain network of a plurality of node devices can be deployed. Here, three node devices are taken as an example, and the number of the node devices is not limited in actual deployment.
Assume that the IP addresses of the three node devices in the blockchain network are x.x.x.1, x.x.x.2, and x.x.x.3, and that each uses port number 8080. For convenience of description, the node device with IP address x.x.x.1 is hereinafter referred to as node 1, the node device with IP address x.x.x.2 as node 2, and the node device with IP address x.x.x.3 as node 3, as shown in fig. 2.
Node 1, node 2, and node 3 form the blockchain network and can synchronize data blocks with one another. The routing table of node 1 contains the information of nodes 2 and 3; the routing table of node 2 contains the information of nodes 1 and 3; the routing table of node 3 contains the information of nodes 1 and 2. Because all node devices in the network have equal status, the routing tables do not depend on the order in which the nodes joined.
For each node device in the blockchain network, its routing table contains the IP addresses and port numbers of all other node devices in the network. For example, in a blockchain network of five node devices, the routing table of each node device contains the information of the other four node devices, and so on.
The following describes the data blocks in the file chain defined by the present invention in detail with reference to fig. 3.
Each time a user inputs a data file, a corresponding data block is generated. Data blocks are also stored as files: each data block corresponds to one file, which is stored in the path designated by the user in the configuration file for storing uploaded files.
A data block consists of three parts: the hash of the previous data block, the IP address of the node device that generated the data block, and the data file input by the user. The first data block of the file chain (i.e., the data block with sequence number 0) has no preceding block, so its previous-block hash field is filled with a null hash (e.g., an all-zero string) that can be predefined by the user. When other node devices receive a data block with sequence number 0, they do not check for a previous block.
From the perspective of the file chain, every data block except the first contains the hash of the previous data block, which gives the chain structure. Modifying any data file on the file chain changes the hash of its data block, which ensures the tamper resistance and security of the data files.
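The chain structure described above can be illustrated with a minimal Python sketch. The newline-separated header layout, the choice of SHA-256 with hexadecimal encoding, and the names NULL_HASH, make_block, block_hash, build_chain and verify_chain are assumptions made for illustration only; the invention does not prescribe them.

```python
import hashlib

NULL_HASH = "0" * 64  # assumed all-zero placeholder used by block 0


def make_block(prev_hash: str, node_ip: str, data: bytes) -> bytes:
    """Prepend the previous hash and the generating node's IP to the data file.

    The newline-separated layout is an assumption made for this sketch.
    """
    return prev_hash.encode() + b"\n" + node_ip.encode() + b"\n" + data


def block_hash(block: bytes) -> str:
    """Hash the whole block (header plus data)."""
    return hashlib.sha256(block).hexdigest()


def build_chain(node_ip: str, files: list) -> list:
    """Return [(hash, block), ...] in sequence-number order for one node."""
    chain = []
    prev = NULL_HASH
    for data in files:
        block = make_block(prev, node_ip, data)
        prev = block_hash(block)
        chain.append((prev, block))
    return chain


def verify_chain(chain: list) -> bool:
    """Check each block's own hash and its link to the preceding block."""
    prev = NULL_HASH
    for expected_hash, block in chain:
        stored_prev = block.split(b"\n", 1)[0].decode()
        if stored_prev != prev or block_hash(block) != expected_hash:
            return False
        prev = expected_hash
    return True
```

In this sketch, altering the content of any block changes its hash, so the previous-hash field stored in the following block no longer matches and verify_chain returns False.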
The data structure for storing the pull status of the data block (hereinafter referred to as the completion array) defined in the present invention is described in detail with reference to fig. 4.
Each node device in the blockchain network allocates this data structure upon initialization. The completion array consists of two layers: the upper layer is an array storing the IP addresses of all node devices in the blockchain network; the lower layer consists of several arrays of identical structure, each storing the pull status of every data block in the file chain of the corresponding node device, so that each value in these arrays is either "pulled" or "not pulled".
The completion array may be implemented in a variety of ways, including but not limited to combinations of basic data structures such as arrays, hash tables, and bit sets.
The IP addresses stored in the upper layer of the completion array are those of all node devices in the blockchain network (including the node device's own). When the node device receives a data file uploaded by a user and generates a data block, or receives a data block sent by another node device, it updates the state of the corresponding sequence number under the corresponding IP address in the completion array to "pulled".
For each node device in the blockchain network, the completion array records the file chain states of all node devices, so that the node devices can achieve eventual consistency; that is, every node device ultimately holds all blocks of all node devices in the blockchain network.
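One possible realization of the completion array is sketched below in Python, using a hash table keyed by IP address for the upper layer and a list of Booleans for each lower-layer array. The class name CompletionArray, its method names, and the decision to let each list grow on demand are assumptions of this sketch, not details given by the invention.

```python
class CompletionArray:
    """Two-layer pull-status structure: IP address -> list of Booleans.

    True stands for "pulled" and False for "not pulled".
    """

    def __init__(self, ips):
        # Upper layer: one entry per node device, including this node itself.
        self._status = {ip: [] for ip in ips}

    def mark_pulled(self, ip, seq):
        """Mark block `seq` of node `ip` as pulled, growing the row if needed."""
        flags = self._status.setdefault(ip, [])
        if seq >= len(flags):
            flags.extend([False] * (seq + 1 - len(flags)))
        flags[seq] = True

    def is_pulled(self, ip, seq):
        """Return True only if block `seq` of node `ip` is marked as pulled."""
        flags = self._status.get(ip, [])
        return 0 <= seq < len(flags) and flags[seq]

    def missing_before(self, ip, seq):
        """List the sequence numbers below `seq` that are still not pulled."""
        return [i for i in range(seq) if not self.is_pulled(ip, i)]
```

A bit set per IP address would serve equally well; the list of Booleans is used here only because it keeps the sketch short.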
In summary, by joining the blockchain network, the node device defined by the present invention can safely and reliably upload data files input by a user and receive data files uploaded by other users. The input and output of the node device are shown schematically in fig. 5.
Based on the above technical scheme, the node device mainly comprises a node communication module, a data upload module, a data download module, a data checking module, and a data search module. The technical details of the modules are as follows:
First, node communication module
The node communication module listens on the port, maintains the routing table of the node device, and processes and sends UDP data packets. The module requires a specified configuration file, whose content includes:
1. the IP address used by the node device to connect to the Internet and the port number used for communication;
2. the IP address and port number of the node device's target node device in the blockchain network;
3. the blockchain network password, i.e., the password the node device needs to join the blockchain network;
4. the storage location for data files input by the user;
5. the storage location for data files that the node device synchronizes from other node devices.
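The five configuration items can be pictured as a small record. The following Python sketch is illustrative only: the class name NodeConfig, all field names, the example addresses (taken from the 192.0.2.0/24 documentation range), and the directory names are hypothetical, since the invention does not define a file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    local_ip: str          # item 1: IP address used to connect to the Internet
    local_port: int        # item 1: port number used for communication
    target_ip: str         # item 2: IP address of the target node device
    target_port: int       # item 2: port number of the target node device
    network_password: str  # item 3: password required to join the network
    upload_dir: str        # item 4: where user-input data files are stored
    sync_dir: str          # item 5: where synchronized data files are stored

    def is_first_node(self) -> bool:
        """A node whose target is itself is treated as the first node."""
        return (self.local_ip, self.local_port) == (self.target_ip, self.target_port)


# Placeholder values for illustration only.
example = NodeConfig(
    local_ip="192.0.2.10", local_port=8080,
    target_ip="192.0.2.1", target_port=8080,
    network_password="change-me",
    upload_dir="./uploaded_blocks",
    sync_dir="./synced_blocks",
)
```

The is_first_node comparison mirrors the rule stated in claim 4, under which a node device whose own address equals the target address is regarded as the first node device of the network.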
After the node device starts, it reads the configuration file and starts this module, which performs the following processing:
1. checking the IP address and opening the port for listening;
2. creating a new array as the routing table, in which two-tuples consisting of the IP address and port number of node devices are stored;
3. sending UDP (User Datagram Protocol) data packet one to the target node device given in the configuration file, the content of data packet one being:
a. the IP address and port number of the node device;
b. the sequence number of the next block to be generated;
c. the blockchain network password.
4. The node communication module of the target node device receives UDP data packet one sent by the node device and checks whether the blockchain network password is correct; if it is correct, the module returns UDP data packet two, and if it is wrong, the module discards the data packet. The content of the returned UDP data packet two is the serialized text of the routing table stored in the target node device.
5. After receiving UDP data packet two, the node device deserializes the routing table, adds the information of each node device in it to its own routing table, and sends UDP data packet three to those node devices one by one. The content of UDP data packet three is:
a. the IP address and port number of the node device;
b. the blockchain network password.
6. The node communication modules of the other node devices in the blockchain network receive UDP data packet three and check whether the blockchain network password is correct; if it is correct, they add the information of the joining node device to their own routing tables, and if it is wrong, they discard data packet three.
After the module completes this processing, the node device reaches the following state:
1. the node device has joined the blockchain network, has obtained the IP addresses and port numbers of the other node devices in the system, and has stored them in its routing table;
2. the node device listens on the designated port, waits for UDP data packets sent by other node devices, distinguishes packet types by their headers, and passes each packet to the corresponding module (such as the data download module, the data search module, or the node device itself) for processing.
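The joining procedure can be traced with a small in-memory simulation in Python. This is a sketch under stated assumptions: direct method calls stand in for real UDP transmission, JSON is chosen arbitrarily as the serialization format, and the names Node, on_packet_one, on_packet_three and join are hypothetical. Two further assumptions fill points the description leaves open: the target records the joining node when it accepts data packet one, and the joining node records the target from its own configuration.

```python
import json


class Node:
    """In-memory stand-in for a node device; method calls replace UDP packets."""

    def __init__(self, ip, port, password):
        self.addr = (ip, port)
        self.password = password
        self.routing_table = []  # (ip, port) pairs of the other node devices

    def _add_peer(self, addr):
        addr = tuple(addr)
        if addr != self.addr and addr not in self.routing_table:
            self.routing_table.append(addr)

    def on_packet_one(self, packet):
        """Target side: check the password and answer with data packet two."""
        if packet["password"] != self.password:
            return None  # wrong password: the packet is discarded
        reply = {"type": 2, "routing_table": json.dumps(self.routing_table)}
        # Assumption: the target also records the joining node at this point.
        self._add_peer((packet["ip"], packet["port"]))
        return reply

    def on_packet_three(self, packet):
        """Existing node side: record the joining node if the password matches."""
        if packet["password"] == self.password:
            self._add_peer((packet["ip"], packet["port"]))

    def join(self, target, network):
        """Joining side. `network` maps (ip, port) to Node and replaces the wire."""
        if target.addr == self.addr:
            return True  # first node of the network: nothing to do
        ip, port = self.addr
        packet_one = {"type": 1, "ip": ip, "port": port,
                      "next_seq": 0, "password": self.password}
        reply = target.on_packet_one(packet_one)
        if reply is None:
            return False
        self._add_peer(target.addr)  # assumption: known from the configuration
        packet_three = {"type": 3, "ip": ip, "port": port,
                        "password": self.password}
        for addr in json.loads(reply["routing_table"]):
            self._add_peer(addr)
            network[tuple(addr)].on_packet_three(packet_three)
        return True


nodes = [Node(f"x.x.x.{i}", 8080, "secret") for i in (1, 2, 3)]
network = {n.addr: n for n in nodes}
for n in nodes:
    n.join(nodes[0], network)  # every node uses node 1 as its target
```

After the loop, each of the three simulated nodes holds the addresses of the other two, which matches the three-node example given earlier in the description.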
Second, data upload module
The data upload module is started after the node device joins the blockchain network and receives the data files to be synchronized that the user inputs. After the module starts, its processing flow is as follows:
1. receiving, in a loop, data files input by the user in a specified manner, such as locating files by path or input through a port;
2. performing the following operations each time a data file is received:
a. reading the content of the data file;
b. adding the hash value of the previous block (if this block is the first block, an agreed initial value such as an all-zero string is used) and the IP address of the node device to the head of the file, the modified data file being the newly generated data block;
c. calculating the hash value of the data block; the calculation may use a public algorithm such as SHA-256 or MD5, and the resulting hash value may be encoded with an encoding such as Base58 or Base64 to improve readability for the user;
d. storing the block in the file system specified by the user in the configuration file for storing the received data block, such as a hard disk, a blockchain network hard disk, or a distributed storage system, naming the file with the hash value of the data block, and marking the data block as "pulled" in the completion array;
e. assembling UDP data packet four, whose content is the hash value of the data block, the sequence number of the data block, the content of the data block, and the blockchain network password;
f. delivering UDP data packet four to the node communication module, which broadcasts it to the whole blockchain network.
The data upload module runs continuously while the node device is running: it receives data files input by the user, generates data blocks, and then stores and broadcasts them.
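Steps a to e of the upload flow might look as follows in Python. This is a hedged sketch: the class name Uploader, the newline-separated header, the dictionary used as data packet four, and the 44-character all-zero initial value are assumptions. SHA-256 and Base64 are among the options the description names; the URL-safe Base64 variant is chosen here because the standard alphabet contains "/", which cannot appear in a file name.

```python
import base64
import hashlib
import os
import tempfile

NULL_HASH = "0" * 44  # assumed agreed initial value for the first block


class Uploader:
    def __init__(self, node_ip, upload_dir, password):
        self.node_ip = node_ip
        self.upload_dir = upload_dir
        self.password = password
        self.prev_hash = NULL_HASH
        self.pulled = []  # this node's own row of the completion array

    def upload(self, data: bytes) -> dict:
        # Step b: prepend the previous hash and this node's IP address.
        block = f"{self.prev_hash}\n{self.node_ip}\n".encode() + data
        # Step c: SHA-256, then URL-safe Base64 so the hash can name a file.
        digest = hashlib.sha256(block).digest()
        block_hash = base64.urlsafe_b64encode(digest).decode()
        # Step d: store the block under its hash and mark it as pulled.
        with open(os.path.join(self.upload_dir, block_hash), "wb") as fh:
            fh.write(block)
        seq = len(self.pulled)
        self.pulled.append(True)
        self.prev_hash = block_hash
        # Step e: assemble data packet four (the caller would broadcast it).
        return {"type": 4, "hash": block_hash, "seq": seq,
                "block": block, "password": self.password}


with tempfile.TemporaryDirectory() as tmp:
    up = Uploader("x.x.x.1", tmp, "secret")
    first = up.upload(b"hello")
    second = up.upload(b"world")
    saved = sorted(os.listdir(tmp))
```

In the usage lines at the end, the second block begins with the hash of the first, and the temporary directory contains one file per block, each named after its hash.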
Third, data download module
The data download module is started after the node device joins the blockchain network; it receives the blocks broadcast by other node devices and calls the data checking module to check them. After the module starts, its processing flow is as follows:
1. establishing and initializing the completion array for each node device entry in the routing table, with every item in the completion array marked as "not pulled";
2. waiting to receive UDP data packet four forwarded by the node communication module, and processing each data packet four as follows:
a. checking the blockchain network password in UDP data packet four; if the password is correct, continuing the flow, and if it is wrong, stopping the flow and discarding the data packet;
b. extracting the data block content and calling the data checking module to check the block; if the check passes, continuing the flow, and otherwise stopping the flow;
c. reading the IP address information in the data block and, in the completion array, setting the value at the position corresponding to the block's sequence number under that IP address to "pulled";
d. storing the data block content in the storage location specified by the user in the configuration file for synchronized data blocks, such as a hard disk, a blockchain network hard disk, or a distributed storage system, the file being named with the hash value of the data block;
e. passing the file content, sequence number, hash, and other information of the data block to the data search module;
f. outputting the data file of the data block to the user, for example by printing to the screen or by output through a port.
The data download module runs continuously while the node device is running: it receives data blocks sent by other node devices and, after verification, outputs the data files to the user.
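The order of steps a to f can be made concrete with a short Python sketch. Everything named here is hypothetical: the function handle_packet_four, the dictionary form of data packet four, the nested dictionary standing in for the completion array, the plain dictionary standing in for on-disk storage, and the newline-separated block header. The helper sha256_check is only a placeholder for the data checking module.

```python
import hashlib


def sha256_check(hash_value, block):
    """Placeholder for the data checking module (hex-encoded SHA-256 assumed)."""
    return hashlib.sha256(block).hexdigest() == hash_value


def handle_packet_four(packet, password, completion, store, check, notify_search):
    """Process one data packet four; return the user's data file or None."""
    # Step a: drop packets that carry the wrong network password.
    if packet["password"] != password:
        return None
    # Step b: hand the block to the checking routine.
    block = packet["block"]
    if not check(packet["hash"], block):
        return None
    # Assumed layout: previous hash, generating IP, then the data file.
    prev_hash, owner_ip, data = block.split(b"\n", 2)
    owner = owner_ip.decode()
    # Step c: mark this sequence number under the generating IP as pulled.
    completion.setdefault(owner, {})[packet["seq"]] = True
    # Step d: store the block under its hash (a dict replaces the disk here).
    store[packet["hash"]] = block
    # Step e: pass the block information on to the data search module.
    notify_search(owner, packet["seq"], prev_hash.decode())
    # Step f: return the data file so the caller can output it to the user.
    return data
```

Passing the check routine and the search callback as parameters keeps the sketch independent of how those modules are actually realized.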
Fourth, data checking module
The data checking module is started after the node device joins the blockchain network and checks the authenticity and accuracy of the data block content broadcast by other node devices. After starting, the module waits to be called by the data download module; the processing flow for each call is as follows:
1. reading hash value one of the data block to be checked and the content of the data block;
2. checking the structure of the data block to determine whether it includes the hash of the previous block and the IP address of the node device; if the structure is correct, continuing the flow, and if it is wrong, stopping the flow;
3. calculating hash value two of the data block using the same algorithm as the data upload module and comparing it with hash value one; if they are consistent, the check passes, and if they are inconsistent, the check fails.
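The check flow above can be sketched as a single Python function. The sketch assumes a block layout of three newline-separated parts (a 64-character lowercase hexadecimal previous hash, the generating IP address, and the data file) and hex-encoded SHA-256 as the shared algorithm; the function name check_block and these format details are assumptions, not requirements of the invention.

```python
import hashlib
import ipaddress

HEX_DIGITS = set(b"0123456789abcdef")


def check_block(hash_one: str, block: bytes) -> bool:
    """Return True if the block is well formed and matches hash value one."""
    # Structure check: the header must carry a previous hash and an IP address.
    parts = block.split(b"\n", 2)
    if len(parts) != 3:
        return False
    prev_hash, node_ip, _data = parts
    if len(prev_hash) != 64 or not set(prev_hash) <= HEX_DIGITS:
        return False
    try:
        # Raises ValueError for undecodable bytes or a malformed address.
        ipaddress.ip_address(node_ip.decode("ascii"))
    except ValueError:
        return False
    # Hash check: recompute hash value two and compare it with hash value one.
    hash_two = hashlib.sha256(block).hexdigest()
    return hash_two == hash_one
```

Because the hash covers the header as well as the data file, a block whose previous-hash field or IP address has been altered fails the comparison in the same way as one whose data file has been altered.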
Fifth, data search module
The data search module is started after the node device joins the blockchain network. It receives the data block information passed by the data download module, checks the integrity of the file chain, and requests missing data blocks from other nodes.
The specific processing flow of the module is as follows:
1. waiting to receive data block information passed by the data download module or the node communication module;
2. for each item of data block information passed by the data download module, performing the following operations:
a. checking, in the completion array, the value corresponding to the sequence number of the previous block (i.e., the sequence number of the current data block minus one) under the corresponding IP address; if the value is "not pulled", continuing the flow, and if it is "pulled", ending the flow. In particular, if the sequence number of the current data block is 0, the flow ends directly;
b. assembling UDP data packet five, whose content is: the IP address of the node device to which the previous data block belongs, the sequence number of the previous data block, the hash value corresponding to that sequence number (which can be obtained from the current data block), and the blockchain network password;
c. delivering UDP data packet five to the node communication module for broadcast to the whole blockchain network.
3. for each item of data block information passed by the node communication module, performing the following operations:
a. checking the blockchain network password; if the password is correct, continuing the flow, and if it is wrong, ending the flow;
b. extracting IP address one, the address of the node device to which the data block requested in UDP data packet five belongs, and comparing it with IP address two, the address of this node device; if the two are the same, executing flow c1, and if they differ, executing flow c2;
c1. searching for the data block, which was generated by this node device, in the storage path for uploaded files specified by the user in the configuration file, and locating the corresponding data block file by its hash value and sequence number;
c2. looking up the IP address in the completion array and checking whether the sequence number under it is marked as "pulled"; if so, locating the corresponding data block file, by sequence number and hash value, in the path specified by the user in the configuration file for storing received data blocks, and if not, ending the flow;
d. reading the data block file and assembling UDP data packet four, whose content is: the hash value of the data block, the sequence number of the data block, the IP address to which the data block belongs, the content of the data block, and the blockchain network password;
e. delivering UDP data packet four to the node communication module for broadcast to the whole blockchain network.
The data search module processes the information from the data download module and from the node communication module separately, and runs continuously while the node device is running.
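The two halves of the search flow, requesting a missing predecessor and answering such a request, can be sketched in Python as follows. The function names request_previous and answer_request, the dictionary packets, the nested dictionary used as the completion array, and the dictionaries standing in for the upload and synchronization directories are all assumptions of this sketch. The branch order follows flows c1 and c2 of the module description.

```python
def request_previous(completion, owner_ip, seq, prev_hash, password):
    """Flow 2: return data packet five if the predecessor is missing, else None."""
    if seq == 0:
        return None  # the first block has no predecessor
    if completion.get(owner_ip, {}).get(seq - 1, False):
        return None  # predecessor already pulled
    return {"type": 5, "ip": owner_ip, "seq": seq - 1,
            "hash": prev_hash, "password": password}


def answer_request(packet, my_ip, password, completion, upload_store, sync_store):
    """Flow 3: return data packet four for a requested block, or None."""
    if packet["password"] != password:
        return None
    if packet["ip"] == my_ip:
        # Flow c1: the block was generated here, so look among uploaded blocks.
        block = upload_store.get(packet["hash"])
    elif completion.get(packet["ip"], {}).get(packet["seq"], False):
        # Flow c2: the block was synchronized from another node device.
        block = sync_store.get(packet["hash"])
    else:
        return None
    if block is None:
        return None
    return {"type": 4, "hash": packet["hash"], "seq": packet["seq"],
            "ip": packet["ip"], "block": block, "password": password}
```

Since the reply is an ordinary data packet four, the requesting node can process it through the same download and check path as any other broadcast block, and the request for the next missing predecessor follows from that processing.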

Claims (8)

1. A multi-node data processing method, comprising the steps of:
step 1, initializing a node device;
step 2, adding the initialized node device to a blockchain network;
step 3, the node device that has joined the blockchain network receives a plurality of data files input by a user, generates a data block for each data file in turn, in order of receiving time, in the form of a file chain, and broadcasts the data blocks to the other node devices in the blockchain network so as to synchronize data among the node devices;
step 4, a node device that receives a broadcast data block checks each data block; if the check passes, it outputs the data block to the user and stores the data file contained in the data block, and if the check fails, it discards the data block; for a data block that passes the check, the node device checks whether the previous data block of that data block has been received, and if not, it broadcasts a request for the previous data block in the blockchain network.
2. The multi-node data processing method as claimed in claim 1, wherein the node device is a device capable of accessing the Internet and running a node program, the node program at least comprises the data synchronization and verification method and is accompanied by a modifiable configuration file, and the content of the configuration file at least comprises:
(1) the IP address used by the node device to connect to the Internet and the port number used for communication;
(2) the IP address and port number of the node device's target node device in the blockchain network;
(3) the blockchain network password, i.e., the password the node device needs to join the blockchain network;
(4) the storage location for data files input by the user;
(5) the storage location for data files that the node device synchronizes from other node devices.
3. The multi-node data processing method according to claim 2, wherein initialization of the node device is performed by:
(1) reading its own configuration file and identifying the configuration items in it;
(2) listening on the IP address used by the node device to connect to the Internet and on the port number used for communication;
(3) creating a new data structure divided into two layers, wherein the upper layer is an array storing the IP addresses of all node devices in the blockchain network, and the lower layer consists of several arrays of identical structure, each storing the pull status of every data block in the file chain of the corresponding node device, so that each value in these arrays is either "pulled" or "not pulled".
4. The multi-node data processing method according to claim 3, wherein in step 2, the node device to be added to the blockchain network can join the blockchain network by confirming the information of only one node device in the blockchain network, namely the target node device, without knowing the information of all other node devices in the blockchain network, the method comprising:
after the node device to be added to the blockchain network reads the configuration file and starts, it creates a new empty routing table and determines whether its own IP address and communication port number are the same as those of the target node device; if they are the same, the node device is regarded as the first node device of the blockchain network and no further operation is performed; if they differ, it sends UDP data packet one to the target node device to request to join the blockchain network, the content of UDP data packet one being: the IP address and port number of the node device to be added to the blockchain network, the sequence number of the next data block to be generated, and the blockchain network password;
the target node device receives UDP data packet one and checks whether the blockchain network password is correct; if the password is wrong, it discards UDP data packet one, and if the password is correct, it returns UDP data packet two, the content of which is: the serialized text of the routing table of the target node device;
after receiving UDP data packet two, the node device to be added to the blockchain network deserializes the received routing table, adds the information of each node device in the received routing table to its own routing table, and sends UDP data packet three to the node devices in the received routing table one by one, the content of UDP data packet three being: the IP address and port number of the node device to be added to the blockchain network, and the blockchain network password;
each node device in the received routing table receives UDP data packet three and checks whether the blockchain network password is correct; if it is correct, it adds the information of the node device to be added to the blockchain network to its own routing table, and if it is incorrect, it discards UDP data packet three.
5. The multi-node data processing method according to claim 1, wherein in step 3, the node device generates the data block by:
saving the hash of the most recently generated data block; after receiving the data file, splicing this hash and the IP address of the node device onto the head of the data file to obtain the data block corresponding to the data file; calculating the hash of the spliced file as the name of the new data block and recording the sequence number of the new data block; and then setting the corresponding sequence number under the node device's IP address to "pulled" in the data structure, wherein the sequence number is the position at which the new data block is added to the file chain of the node device, i.e., the first data block of the file chain has sequence number 0, the second data block has sequence number 1, and so on.
6. The multi-node data processing method of claim 5, wherein the node device saves the new data block after generating it, and broadcasts it to other node devices in the blockchain network as follows:
the node device sends a UDP data packet four to each other node device in the routing table, where the content of the UDP data packet four is: the blockchain network password, the new data block, and its hash value and sequence number.
7. The multi-node data processing method according to claim 6, wherein in step 4, the node device that receives the broadcast data block checks the blockchain network password in UDP data packet four; if the password is correct, it parses out the data block and its hash value and sequence number, and if the password is incorrect, it discards UDP data packet four; after parsing out the information of the data block, the node device calculates the hash value of the data block; if the calculated hash value is consistent with the hash value in UDP data packet four, the data block is considered reliable and the data file contained in the data block is output to the user, and if the calculated hash value is inconsistent with the hash value in UDP data packet four, the data block is discarded.
8. The multi-node data processing method of claim 7, wherein for a data block that passes the check, the node device extracts the sequence number of the data block and the IP address of the node device that generated the data block, and marks the data block corresponding to that sequence number under that IP address as "pulled"; it then takes the preceding sequence number and checks whether the corresponding data block is marked as "pulled" in the data structure of the current node device; if so, no further operation is performed, and otherwise it sends UDP data packet five to the other node devices in its routing table, the content of UDP data packet five being: the IP address of the node device to which the data block belongs, the sequence number of the previous data block, the hash value corresponding to the previous sequence number, and the blockchain network password; each of the other node devices receives UDP data packet five and checks the blockchain network password; if the password is correct, it checks its own data structure to determine whether the sequence number under that IP address is marked as "pulled", and if it is so marked, compares the IP address in UDP data packet five with its own IP address; if they are the same, it retrieves the data block from the storage location for input data files given in the configuration file and sends it, and if they differ, it retrieves the data block from the storage location for data files synchronized from other node devices given in the configuration file and sends it; and if the password is wrong, it discards UDP data packet five.
CN202110998845.4A 2021-08-28 2021-08-28 Multi-node data processing method Active CN113839989B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110998845.4A CN113839989B (en) 2021-08-28 2021-08-28 Multi-node data processing method

Publications (2)

Publication Number Publication Date
CN113839989A CN113839989A (en) 2021-12-24
CN113839989B true CN113839989B (en) 2022-08-05

Family

ID=78961429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110998845.4A Active CN113839989B (en) 2021-08-28 2021-08-28 Multi-node data processing method

Country Status (1)

Country Link
CN (1) CN113839989B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117614607B (en) * 2024-01-18 2024-04-12 深圳市海域达赫科技有限公司 Information security transmission system and method based on block chain

Also Published As

Publication number Publication date
CN113839989A (en) 2021-12-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant