CN112346908A - Data backup method in peer-to-peer distributed system - Google Patents
Data backup method in peer-to-peer distributed system
- Publication number
- CN112346908A
- Authority
- CN
- China
- Prior art keywords
- node
- data
- list
- backup
- nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1464—Management of the backup or restore process for networked environments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1469—Backup restoration techniques
Abstract
The invention provides a data backup method in a peer-to-peer distributed system, which is used for improving the data redundancy in the peer-to-peer distributed system and ensuring the data reliability in a severe environment. A user access node scans nodes in a system to obtain an online node list; the user access node requests the file storage status of each node, completes the acquisition of the current global file storage status and constructs a global storage list; the user access node completes the construction of the global data storage capacity table according to the file storage condition of each node and the global storage list; sequencing the nodes according to a priority rule set by a user to realize the construction of a candidate node pool; calculating the backup quantity of data according to the backup proportion specified by a user, and sequentially acquiring candidate nodes from the candidate node pool according to the quantity to realize the selection of the backup nodes; and the user data uploading node sends a data backup instruction and the data to be backed up to the candidate node, and the candidate node receives and stores the data to complete data backup.
Description
Technical Field
The invention relates to the field of computer data storage, in particular to a data backup method in a peer-to-peer distributed system.
Background
With the rapid development of computer and information technology, data storage technology is applied ever more widely, and the demand for data storage in different environments grows daily. Most current mainstream data storage schemes rely on a centralized system. In a centralized system, one or more host computers form a central node; data is stored centrally on that node, all service units of the system are deployed on it, and all functions of the system are processed there. Each terminal or client machine is responsible only for the input and output of data, while data storage and control are handled entirely by the central node. The chief characteristic of a centralized system is its simple deployment structure: because it is usually built on large nodes with excellent underlying performance, there is no need to consider how to deploy a service across multiple nodes or how the nodes should cooperate in a distributed manner. However, in scenarios that require data storage, such as military operations, emergency disaster relief, field exploration, and resource exploration and acquisition, a centralized system adapts poorly to the conditions of rapid deployment, poor infrastructure, high mobility, random access locations, and a high likelihood of equipment destruction.
A distributed system, as opposed to a centralized system, is a software system in which hardware or software components are distributed among different networked computers and communicate and coordinate with each other solely through message passing. Distributed systems allow a large number of applications to access data stored in local or remote databases; in this case, data distribution is achieved by a replication process. A standard distributed system has the characteristics of distribution, peering, and concurrency, without any specific business logic constraints. Owing to these characteristics, a distributed system suits scenarios for which a centralized system is unsuitable, such as military operations, emergency disaster relief, field exploration, and resource exploration and collection.
Distributed systems are divided into centrally managed distributed systems, whose nodes are not peers, and peer-to-peer distributed systems, whose nodes are peers. A centrally managed distributed system needs one or more core nodes to exercise global control over the whole system, so the access interfaces it provides externally are necessarily limited. Core nodes make up only a small proportion of all nodes, so the stability of a distributed system that depends on centralized management by core nodes is strongly affected by the stability of those core nodes. A peer-to-peer distributed system does not depend on core nodes; all of its nodes are equal in topology and identical in function. The overall stability of the system therefore does not depend on the stability of key nodes and is not limited by a small number of specific nodes.
With the advent of the big data era, data storage and backup become more and more important, and a distributed system likewise needs to back up data to ensure its redundancy and availability. A data backup method for a distributed system needs to be safe, reliable, simple, and convenient. The backup content must be complete and valid, and backup and restoration should not require complicated manual operations. The data storage and backup system is oriented to application databases, service systems and core servers, and implements functions such as data storage, data backup and recovery, system backup and recovery, and application backup and recovery.
In summary, in order to implement data backup in a peer-to-peer distributed system, a data backup method based on the peer-to-peer distributed system needs to be designed. In this method, the node through which the user uploads data scans the system to obtain an online node list and requests the file storage status of each online node. The data uploading node then constructs a global data storage capacity table from the file storage status of the online nodes and sorts the nodes in that table according to priorities set by the user, which completes the construction of a candidate node pool. Next, it calculates the number of backup nodes according to a backup proportion specified by the user and selects that many backup nodes from the candidate node pool in order. Finally, the data uploading node sends the data to be backed up to the backup nodes, and the backup nodes receive the data to complete the backup, ensuring the redundancy and availability of the data.
Disclosure of Invention
In view of this, the present invention provides a data backup method in a peer-to-peer distributed system, where the method includes:
a user access node scans a node list in a system to obtain an online node list;
the user access node requests the file storage status of each node in the network, completes the acquisition of the current global file storage status and constructs a global storage list;
the user access node completes the construction of the global data storage capacity table according to the file storage condition of each node and the global storage list;
sequencing the nodes according to a priority rule set by a user to realize the construction of a candidate node pool;
calculating the backup quantity of data according to the backup proportion specified by a user, and sequentially acquiring candidate nodes from the candidate node pool according to the quantity to realize the selection of the backup nodes;
and the user data uploading node sends a data backup instruction and the data to be backed up to the candidate node, and the candidate node receives and stores the data to complete data backup.
The method for acquiring the online node list comprises the following steps:
defining a user access node as AN, a node list in a distributed system as LN, an online node list as LON, a node as Ni and an online node as ONj; F(Ni) -> ONj is defined as the direct mapping relation between the nodes and the online nodes.
The user access node AN sequentially sends handshake data packets to each node Ni according to a pre-stored node list LN, determines the online nodes ONj according to the responses, and obtains the online node list LON according to F(Ni) -> ONj, where LON is a subset of LN.
The method for acquiring the storage condition of the global file comprises the following steps:
the file storage information of each online node ONj is defined as FONj, the list of the file storage information of all online nodes ONj, i.e., the global storage list, is defined as FLON, and an algorithm for acquiring the global storage list FLON from the node storage information FON is defined.
The user access node AN sends a file storage information request packet to each node ONj in the online node list LON in sequence, and each node that receives the request packet sends its file storage information FON to the requester; once the user access node has requested the file storage information of all online nodes, the global storage list is constructed according to the algorithm.
The construction method of the global storage capability table comprises the following steps:
defining a global storage capability table as SLON and the storage space of each online node as SONj; an algorithm for acquiring the global storage capability table SLON from the global storage list FLON is defined.
After the construction of FLON is completed, the global storage capability table SLON is acquired from the global storage list FLON according to the algorithm.
The construction method of the candidate node pool comprises the following steps:
the candidate node pool is defined as CNP, and the user-defined node priority rule is R.
The nodes are sorted according to the node priority rule R set by the user to construct the candidate node pool CNP.
The backup node selection comprises the following steps:
defining the backup ratio as η, the number of nodes in the candidate node pool as SCNP, and the candidate node list as CNL.
The number of candidate nodes is calculated according to the backup ratio, and a corresponding number of nodes is selected in order from the candidate node pool to construct the candidate node list CNL.
The method for sending the backup data comprises the following steps:
the user access node AN sequentially sends a cache command packet and a cache data packet to each node in the candidate node list CNL to realize data backup.
The invention has the following technical effects: the user's data uploading node scans the system for the online node list, requests the file storage status of the online nodes, constructs the global data storage capacity table, constructs the candidate node pool, calculates the number of backup nodes, selects the backup nodes, and sends the data to be backed up. These steps complete the backup of data in the peer-to-peer distributed system, ensure the redundancy and availability of that data, and allow the system to meet the requirements of different application scenarios.
Drawings
Fig. 1 is a flowchart of a data backup method in a peer-to-peer distributed system according to the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described in detail below with reference to the accompanying drawings.
The method uses the user access node to scan the system for the online node list, request the file storage status of the online nodes, construct the global data storage capacity table, construct the candidate node pool, calculate the number of backup nodes, select the backup nodes, and send the data to be backed up, thereby completing the backup of data in the peer-to-peer distributed system with redundancy and availability.
Example one
The embodiment of the invention provides a method for acquiring an online node list, which comprises the following steps:
the invention defines the user access node as AN, the node list in the distributed system as LN, the online node list as LON, a node as Ni, and an online node as ONj; F(Ni) -> ONj is defined as the direct mapping relation between the nodes and the online nodes.
The mapping function F is implemented as follows:
let the variable a be the number of elements currently in LON;
then, for an input node x ∈ LON, F is defined as:
F(x) = ON_a
that is, x becomes the online node with index a.
The online node list LON is thus obtained according to F, LON being a subset of LN.
Therefore, the online node list is obtained, and preparation is made for obtaining the global file storage condition in the next step.
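The scan of Example one can be illustrated with a minimal Python sketch. The description does not specify the handshake protocol or the node address format, so the `(host, port)` address, the `tcp_probe` helper (a plain TCP connect treated as a successful handshake), and all function names here are assumptions made for illustration only.

```python
import socket
from typing import Callable, List, Tuple

Node = Tuple[str, int]  # (host, port); the address format is an assumption


def tcp_probe(node: Node, timeout: float = 1.0) -> bool:
    """Hypothetical handshake: a successful TCP connect counts as 'online'."""
    try:
        with socket.create_connection(node, timeout=timeout):
            return True
    except OSError:
        return False


def scan_online_nodes(ln: List[Node],
                      probe: Callable[[Node], bool] = tcp_probe) -> List[Node]:
    """Build LON from LN by probing every node Ni in order."""
    lon: List[Node] = []
    for node in ln:
        if probe(node):
            # F(node) = ON_a, where a is the number of elements already in LON.
            lon.append(node)
    return lon
```

Passing the probe as a parameter keeps the scan independent of the transport, so a different handshake can be substituted without touching the loop.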
Example two
With reference to the content of the first embodiment, an embodiment of the present invention provides a method for obtaining the global file storage status, where:
the invention defines the file storage information of each online node ONj as FONj, the list of the file storage information of all online nodes ONj, namely the global storage list, as FLON, and defines an algorithm for acquiring the global storage list FLON from the online node list LON.
Algorithm 1: obtaining the global storage list FLON from the online node list LON
Input: online node list LON
Output: global storage list FLON
For each ONj in LON
  create FONj request Rj
  send Rj to ONj
End For
For each ONj in LON
  listen for answer Aj
End For
For each ONj in LON
  select FONj from Aj
  add FONj to FLON
End For
Return FLON
Therefore, the acquisition of the global file storage list is realized, and preparation is made for the construction of the next global storage capacity table.
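A possible Python rendering of Algorithm 1 is sketched below. The source does not define the request packet or the layout of the answer, so the `request_fon` callable (which stands in for sending Rj and receiving Aj) and the `used_bytes` field are hypothetical, as is the choice to treat FONj as the number of bytes already stored on the node.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def build_global_storage_list(lon: List[str],
                              request_fon: Callable[[str], dict]) -> Dict[str, int]:
    """Return FLON as {node id: FONj}; FONj is assumed to be bytes in use."""
    if not lon:
        return {}
    with ThreadPoolExecutor(max_workers=min(8, len(lon))) as pool:
        # First and second loops: send every request Rj, then collect answers Aj.
        # pool.map returns the answers in the same order as lon.
        answers = list(pool.map(request_fon, lon))
    flon: Dict[str, int] = {}
    # Third loop: select FONj from Aj and add it to FLON.
    for node_id, answer in zip(lon, answers):
        flon[node_id] = answer["used_bytes"]
    return flon
```

Issuing the requests through a thread pool mirrors the separation between the send loop and the listen loop, so one slow node does not delay the requests to the others.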
Example three
With reference to the content of the second embodiment, this embodiment provides a method for constructing the global storage capability table, and a specific method flow of this embodiment includes:
The global storage capability table is defined as SLON, the storage space of each online node ONi is SONi, and an algorithm for acquiring the global storage capability table SLON from the global storage list FLON is defined.
Algorithm 2: obtaining the global storage capability table SLON from the global storage list FLON
Input: global storage list FLON
Output: global storage capability table SLON
For each FONi in FLON
  Var RONi = SONi - FONi
  Add RONi to SLON
End For
Return SLON
Therefore, the construction of the global storage capacity table is obtained, and preparation is made for the construction of a candidate node pool in the next step.
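Algorithm 2 amounts to one subtraction per node, RONi = SONi - FONi. The sketch below assumes that both quantities are byte counts keyed by a node identifier and that the total storage space SONi of each node is already known to the access node; the source does not state how SONi is obtained, and the function name is hypothetical.

```python
from typing import Dict


def build_capability_table(son: Dict[str, int],
                           flon: Dict[str, int]) -> Dict[str, int]:
    """Return SLON as {node id: RONi}, the remaining space of each node."""
    slon: Dict[str, int] = {}
    for node_id, used in flon.items():
        # RONi = SONi - FONi: total space minus space already occupied.
        slon[node_id] = son[node_id] - used
    return slon
```

For example, a node with 100 units of storage space and 30 units in use is entered in SLON with a remaining capacity of 70.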
Example four
With reference to the content of the third embodiment, this embodiment provides a method for constructing a candidate node pool, and a specific method flow of this embodiment includes:
the invention defines a global storage capacity table SLON, a candidate node pool CNP and a node priority rule R defined by a user.
Algorithm 3: construction of the candidate node pool CNP
Input: global storage capability table SLON
Output: candidate node pool CNP
var flag = true
copy SLON to CNP
While flag == true
  flag = false
  For each adjacent pair (SONi, SONi+1) in CNP
    if R(SONi, SONi+1) == false
      swap(SONi, SONi+1)
      flag = true
    end if
  End For
Loop
Return CNP
Therefore, the construction of the candidate node pool is realized, and preparation is made for the selection of the next backup node.
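Algorithm 3 is a bubble sort driven by the user rule R. A minimal Python sketch follows; the `(node id, remaining capacity)` entry layout and the example rule `more_free_space_first` are assumptions, since the source leaves R entirely to the user. Note that R must return true for two entries of equal priority, otherwise the loop would keep swapping them and never terminate.

```python
from typing import Callable, List, Tuple

Entry = Tuple[str, int]  # (node id, remaining capacity); layout is an assumption


def build_candidate_pool(slon: List[Entry],
                         rule: Callable[[Entry, Entry], bool]) -> List[Entry]:
    """Bubble-sort a copy of SLON so that rule(a, b) holds for adjacent entries."""
    cnp = list(slon)  # copy SLON to CNP; the input table is left untouched
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(cnp) - 1):
            if not rule(cnp[i], cnp[i + 1]):
                cnp[i], cnp[i + 1] = cnp[i + 1], cnp[i]
                swapped = True
    return cnp


def more_free_space_first(a: Entry, b: Entry) -> bool:
    """Example rule R: a may precede b if it has at least as much free space."""
    return a[1] >= b[1]
```

In practice the same ordering could be produced with Python's built-in `sorted` and a key function; the explicit loop is kept here only to stay close to the algorithm as described.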
Example five
In combination with the content of the fourth embodiment, this embodiment provides a backup node selection method, and a specific method flow of this embodiment includes:
defining the backup ratio as η, the number of nodes in the candidate node pool as SCNP, and the candidate node list as CNL.
The number of candidate nodes is calculated according to the backup ratio, and a corresponding number of nodes is selected in order from the candidate node pool to construct the candidate node list CNL.
Algorithm 4: construction of the candidate node list CNL
Input: backup ratio η, candidate node pool CNP
Output: candidate node list CNL
For each ONi in CNP and (i < SCNP * η)
  Add ONi to CNL
End For
Return CNL
Thus, the construction of the backup node list is realized, and the backup node list is prepared for the next step.
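Algorithm 4 keeps the nodes whose index i satisfies i < SCNP * η. If i is taken to start at zero, which the source does not state, this selects ceil(SCNP * η) nodes. The sketch below makes that assumption and accepts η as a decimal string or a `Fraction` so that a product such as 10 * 0.3 is computed exactly rather than with binary floating-point rounding; the function name is hypothetical.

```python
import math
from fractions import Fraction
from typing import List, Union


def select_backup_nodes(cnp: List[str], eta: Union[str, Fraction]) -> List[str]:
    """Return CNL: the first ceil(SCNP * eta) nodes of the candidate pool."""
    ratio = Fraction(eta)  # e.g. Fraction("0.3") is exactly 3/10
    if not 0 <= ratio <= 1:
        raise ValueError("backup ratio must lie between 0 and 1")
    scnp = len(cnp)
    # Zero-based i < SCNP * eta is satisfied by exactly ceil(SCNP * eta) indices.
    count = math.ceil(scnp * ratio)
    return cnp[:count]
```

With a pool of ten nodes and η = 0.3, three nodes are selected; with three nodes and η = 0.5, the product 1.5 rounds up to two nodes under this reading.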
Example six
With reference to the content of the fifth embodiment, this embodiment provides a method for sending backup data, and a specific flow of the method of this embodiment includes:
the user access node AN first converts the data to be sent into a JSON-based data stream, and then uses a TCP connection to send a cache command packet and a cache data packet in sequence to each node in the backup node list CNL to realize data backup.
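The sending step can be sketched in Python as follows. The source states only that the data is converted to JSON and sent over TCP as a cache command packet followed by a cache data packet; the packet fields (`type`, `name`, `size`, `data`), the base64 encoding of the payload, and the 4-byte length prefix used for framing are all assumptions introduced for this illustration.

```python
import base64
import json
import socket
import struct
from typing import List, Tuple


def frame(message: dict) -> bytes:
    """Serialize a message as JSON, prefixed with its length (assumed framing)."""
    body = json.dumps(message).encode("utf-8")
    return struct.pack("!I", len(body)) + body


def build_packets(file_name: str, data: bytes) -> List[bytes]:
    """Return the cache command packet followed by the cache data packet."""
    command = {"type": "cache_command", "name": file_name, "size": len(data)}
    payload = {
        "type": "cache_data",
        "name": file_name,
        # JSON cannot carry raw bytes, so the payload is base64-encoded.
        "data": base64.b64encode(data).decode("ascii"),
    }
    return [frame(command), frame(payload)]


def send_backup(cnl: List[Tuple[str, int]], file_name: str, data: bytes,
                timeout: float = 5.0) -> None:
    """Send both packets, in order, to every node in the candidate node list."""
    packets = build_packets(file_name, data)
    for address in cnl:
        with socket.create_connection(address, timeout=timeout) as conn:
            for packet in packets:
                conn.sendall(packet)
```

A length prefix is one common way to delimit messages on a TCP stream, which has no message boundaries of its own; a receiving node would read four bytes, then that many bytes of JSON.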
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (7)
1. A method of data backup in a peer-to-peer distributed system, the method comprising:
a user access node scans nodes in a system to obtain an online node list;
the user access node requests the file storage status of each node in the network, completes the acquisition of the current global file storage status and constructs a global storage list;
the user access node completes the construction of the global data storage capacity table according to the file storage condition of each node and the global storage list;
sequencing the nodes according to a priority rule set by a user to realize the construction of a candidate node pool;
calculating the backup quantity of data according to the backup proportion specified by a user, and sequentially acquiring candidate nodes from the candidate node pool according to the quantity to realize the selection of the backup nodes;
and the user data uploading node sends a data backup instruction and the data to be backed up to the candidate node, and the candidate node receives and stores the data to complete data backup.
2. The method of claim 1, wherein the method for obtaining the online node list comprises:
defining a user access node as AN, a node list in a distributed system as LN, an online node list as LON, a node as Ni and an online node as ONj; defining F(Ni) -> ONj as the direct mapping relation between the nodes and the online nodes;
and the user access node AN sequentially sends handshake data packets to each node Ni according to a pre-stored node list LN, determines the online nodes ONj according to the responses, and obtains the online node list LON according to F(Ni) -> ONj, wherein LON is a subset of LN.
3. The method according to claim 1, wherein the method for obtaining the storage status of the global file comprises:
defining the file storage information of each online node ONj as FONj, defining the list of the file storage information of all online nodes ONj, namely the global storage list, as FLON, and defining an algorithm for acquiring the global storage list FLON from the node storage information FON;
the user access node AN sends a file storage information request packet to each node ONj in the online node list LON in sequence, each node receiving the request packet sends its file storage information FON to the requester, and once the user access node has requested the file storage information of all online nodes, the global storage list is constructed according to the algorithm.
4. The method according to claim 1, wherein the method for constructing the global storage capability table comprises:
defining a global storage capability table as SLON and the storage space of each online node as SONj, and defining an algorithm for acquiring the global storage capability table SLON from the global storage list FLON;
and after the construction of FLON is completed, acquiring the global storage capability table SLON from the global storage list FLON according to the algorithm.
5. The method of claim 1, wherein the method for constructing the candidate node pool comprises:
defining a candidate node pool as CNP, and defining a node priority rule defined by a user as R;
and sequencing the nodes according to a node priority rule R set by a user to realize the construction of a candidate node pool CNP.
6. The method of claim 1, wherein the backup node selection method comprises:
defining the backup ratio as η, the number of nodes in the candidate node pool as SCNP, and the candidate node list as CNL;
and calculating the number of candidate nodes according to the backup ratio, and sequentially selecting a corresponding number of nodes from the candidate node pool to construct the candidate node list CNL.
7. The method according to any one of claims 1 to 6, wherein the backup data sending method comprises:
and the user access node AN sequentially sends a cache command packet and a cache data packet according to the candidate node list CNL to realize data backup.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910736515.0A CN112346908A (en) | 2019-08-09 | 2019-08-09 | Data backup method in peer-to-peer distributed system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910736515.0A CN112346908A (en) | 2019-08-09 | 2019-08-09 | Data backup method in peer-to-peer distributed system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112346908A true CN112346908A (en) | 2021-02-09 |
Family
ID=74367052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910736515.0A Pending CN112346908A (en) | 2019-08-09 | 2019-08-09 | Data backup method in peer-to-peer distributed system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112346908A (en) |
-
2019
- 2019-08-09 CN CN201910736515.0A patent/CN112346908A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210209 |