CN112346908A - Data backup method in peer-to-peer distributed system - Google Patents

Publication number
CN112346908A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910736515.0A
Other languages
Chinese (zh)
Inventor
许长桥
杨树杰
郝昊
皮文超
赵楠
熊永平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Functional Intelligent Technology Research Institute Co ltd
Original Assignee
Nanjing Functional Intelligent Technology Research Institute Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques

Abstract

The invention provides a data backup method for a peer-to-peer distributed system, which increases data redundancy in the system and keeps data reliable in harsh environments. A user access node scans the nodes in the system to obtain an online node list. The user access node requests the file storage status of each node, thereby acquiring the current global file storage status, and constructs a global storage list. From the file storage status of each node and the global storage list, the user access node constructs a global storage capability table. The nodes are sorted according to a priority rule set by the user to construct a candidate node pool. The number of backups is calculated from the backup proportion specified by the user, and that number of candidate nodes is taken in order from the candidate node pool to select the backup nodes. The user data uploading node sends a data backup instruction and the data to be backed up to the candidate nodes, which receive and store the data to complete the backup.

Description

Data backup method in peer-to-peer distributed system
Technical Field
The invention relates to the field of computer data storage, in particular to a data backup method in a peer-to-peer distributed system.
Background
With the rapid development of computer and information technology, data storage technology is applied ever more widely, and the demand for data storage in different environments grows daily. Most current mainstream data storage schemes rely on a centralized system. In a centralized system, one or more host computers form a central node; data is stored centrally on that node, all service units of the system are deployed on it, and all system functions are processed there. Each terminal or client machine is responsible only for data entry and output, while data storage and control are handled entirely by the central node. The main characteristic of a centralized system is its simple deployment structure: because it is usually built on large nodes with excellent underlying performance, there is no need to consider how to deploy a service across multiple nodes or how nodes cooperate in a distributed manner. However, in scenarios that need data storage, such as military operations, emergency disaster relief, field exploration, and resource exploration and acquisition, a centralized system adapts poorly to rapid deployment, poor basic conditions, high mobility, random access locations, and a high risk of equipment destruction.
A distributed system, as opposed to a centralized system, is a software system whose hardware or software components are distributed across different networked computers and communicate and coordinate with each other solely through message passing. Distributed systems allow a large number of applications to access data stored in local or remote databases; in this case, data distribution is achieved by a replication process. A standard distributed system is distributed, peer-based, and concurrent, and is not bound by any specific business logic. These characteristics make distributed systems suitable for scenarios where centralized systems are not, such as military operations, emergency disaster relief, field exploration, and resource exploration and collection.
Distributed systems are divided into centrally managed distributed systems, whose nodes are not peers, and peer-to-peer distributed systems, whose nodes are peers. A centrally managed distributed system needs one or more core nodes to exercise global control over the whole system, and the access interfaces it exposes externally are necessarily limited. Core nodes make up only a small share of all nodes, so the stability of a system that depends on them is strongly affected by the stability of those few nodes. A peer-to-peer distributed system does not depend on core nodes; all nodes are equal in topology and identical in function. The overall stability of the system therefore does not depend on key nodes and is not limited by a small number of specific nodes.
With the advent of the big data era, data backup is increasingly important, and a distributed system also needs to back up data to ensure its redundancy and availability. A data backup method for a distributed system should be safe, reliable, simple, and convenient: the backed-up content must be complete and valid, and backup and restore should not require complicated manual operations. A data storage and backup system serves application databases, service systems, and core servers, and provides data storage, data backup and recovery, system backup and recovery, and application backup and recovery.
In summary, to implement data backup in a peer-to-peer distributed system, a data backup method based on the peer-to-peer distributed system needs to be designed. In this method, the node through which the user uploads data scans the system to obtain the online node list and requests the file storage status of each online node. The data uploading node then constructs a global storage capability table from the file storage status of the online nodes and sorts the nodes in that table according to priorities set by the user to construct a candidate node pool. Next, it calculates the number of backup nodes from the backup proportion specified by the user and selects backup nodes in order from the candidate node pool. Finally, the data uploading node sends the data to be backed up to the backup nodes, which receive it to complete the backup. This ensures the redundancy and availability of the data.
Disclosure of Invention
In view of this, the present invention provides a data backup method in a peer-to-peer distributed system, where the method includes:
a user access node scans the node list in the system to obtain an online node list;
the user access node requests the file storage status of each node in the network, thereby acquiring the current global file storage status, and constructs a global storage list;
the user access node constructs the global storage capability table from the file storage status of each node and the global storage list;
the nodes are sorted according to a priority rule set by the user to construct a candidate node pool;
the number of backups is calculated from the backup proportion specified by the user, and that number of candidate nodes is taken in order from the candidate node pool to select the backup nodes;
and the user data uploading node sends a data backup instruction and the data to be backed up to the candidate nodes, which receive and store the data to complete the backup.
The method for acquiring the online node list comprises the following steps:
defining the user access node as AN, the node list in the distributed system as LN, the online node list as LON, a node as Ni, and an online node as ONj; F(Ni) -> ONj is defined as the direct mapping between nodes and online nodes.
The user access node AN sends handshake packets to each node Ni in turn according to the pre-stored node list LN, determines the online nodes ONj from the responses, and obtains the online node list LON according to F(Ni) -> ONj, where LON is a subset of LN.
The method for acquiring the storage condition of the global file comprises the following steps:
the file storage information of each online node ONj is defined as FONj, and the list of file storage information of all online nodes ONj, i.e., the global storage list, is defined as FLON; an algorithm is defined for obtaining the global storage list FLON from the per-node storage information FONj.
The user access node AN sends a file storage information request packet to each node ONj in the online node list LON in turn, and each node that receives the request packet sends its file storage information FONj back to the requester. Once the user access node has requested the file storage information of all online nodes, it constructs the global storage list according to the algorithm.
The construction method of the global storage capability table comprises the following steps:
the global storage capability table is defined as SLON, and the storage space of each online node is defined as SONj. An algorithm is defined for obtaining the global storage capability table SLON from the global storage list FLON.
After FLON has been constructed, the global storage capability table SLON is obtained from it according to the algorithm.
The construction method of the candidate node pool comprises the following steps:
the candidate node pool is defined as CNP, and the user-defined node priority rule is defined as R.
The nodes are sorted according to the node priority rule R set by the user to construct the candidate node pool CNP.
The backup node selection comprises the following steps:
the backup ratio is defined as η, the number of nodes in the candidate node pool as SCNP, and the candidate node list as CNL.
The number of candidate nodes is calculated from the backup ratio, and that number of nodes is selected in order from the candidate node pool to construct the candidate node list CNL.
The method for sending the backup data comprises the following steps:
the user access node AN sends a cache command packet and then a cache data packet to the nodes in the candidate node list CNL in turn to complete the data backup.
The invention has the following technical effects: the node through which the user uploads data scans the system for the online node list, requests the file storage status of the online nodes, constructs the global storage capability table, constructs the candidate node pool, calculates the number of backup nodes, selects the backup nodes, and sends the data to be backed up. These steps complete the backup of data in the peer-to-peer distributed system, ensure the redundancy and availability of that data, and allow the system to meet the requirements of different application scenarios.
Drawings
Fig. 1 is a flowchart of a data backup method in a peer-to-peer distributed system according to the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described in detail below with reference to the accompanying drawings.
In the method, the user access node scans the system for the online node list, requests the file storage status of the online nodes, constructs the global storage capability table, constructs the candidate node pool, calculates the number of backup nodes, selects the backup nodes, and sends the data to be backed up. These steps complete the backup of data in the peer-to-peer distributed system and give that data redundancy and availability.
Example one
The embodiment of the invention provides a method for acquiring an online node list, which comprises the following steps:
the invention defines the user access node as AN, the node list in the distributed system as LN, the online node list as LON, a node as Ni, and an online node as ONj; F(Ni) -> ONj is defined as the direct mapping between nodes and online nodes.
The mapping function F is implemented as follows:
let the variable a be the number of elements currently in LON;
then F is defined for the input node x ∈ LON as follows:
F(x) = ON_a
The online node list LON is thus obtained according to F, and LON is a subset of LN.
The online node list is thereby obtained, in preparation for obtaining the global file storage status in the next step.
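The scan in this embodiment can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the handshake is abstracted as a caller-supplied `probe` function, and the names `scan_online_nodes` and `probe` are hypothetical. The patent does not specify the handshake packet format or any timeout.

```python
from typing import Callable, Dict, List


def scan_online_nodes(ln: List[str], probe: Callable[[str], bool]) -> Dict[int, str]:
    """Build the online node list LON from the pre-stored node list LN.

    `probe` stands in for the handshake exchange (an assumption): it returns
    True when the node answers. Each responding node x is stored under the
    index a = number of elements already in LON, mirroring F(x) = ON_a.
    """
    lon: Dict[int, str] = {}
    for node in ln:
        if probe(node):
            a = len(lon)   # current number of elements in LON
            lon[a] = node  # F(node) = ON_a
    return lon


# Example: nodes "n2" and "n4" do not answer the handshake.
reachable = {"n1", "n3", "n5"}
lon = scan_online_nodes(["n1", "n2", "n3", "n4", "n5"], lambda n: n in reachable)
print(lon)  # {0: 'n1', 1: 'n3', 2: 'n5'}
```

Because only nodes taken from LN can enter the result, LON is a subset of LN by construction.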
Example two
With reference to the first embodiment, this embodiment of the invention provides a method for obtaining the global file storage status, in which:
the invention defines the file storage information of each online node ONj as FONj and the list of file storage information of all online nodes ONj, i.e., the global storage list, as FLON, and defines an algorithm for obtaining the global storage list FLON from the online node list LON.
Algorithm 1: Obtaining the global storage list FLON from the online node list LON
Input: online node list LON
Output: global storage list FLON
For each ONj in LON
    create FONj request Rj
    send Rj to ONj
End For
For each ONj in LON
    listen for reply Aj
End For
For each ONj in LON
    select FONj from Aj
    add FONj to FLON
End For
Return FLON
Therefore, the acquisition of the global file storage list is realized, and preparation is made for the construction of the next global storage capacity table.
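Algorithm 1 can be sketched in Python as below. Several details are assumptions made for illustration: FONj is modelled as a mapping from file name to size in bytes, the request and reply exchange is abstracted as a caller-supplied `request_file_info` function, and the three separate loops of the algorithm are collapsed into one synchronous loop. The names `build_flon` and `request_file_info` are hypothetical.

```python
from typing import Callable, Dict, List

# FON_j is modelled as file name -> size in bytes (an assumption).
FileInfo = Dict[str, int]


def build_flon(lon: List[str],
               request_file_info: Callable[[str], FileInfo]) -> Dict[str, FileInfo]:
    """Collect FON_j from every online node ON_j into the global list FLON."""
    flon: Dict[str, FileInfo] = {}
    for node in lon:
        answer = request_file_info(node)  # send R_j and wait for the reply A_j
        flon[node] = answer               # select FON_j from A_j, add it to FLON
    return flon


# Stand-in for the network: a fixed table of what each node stores.
stored = {
    "n1": {"a.bin": 400, "b.bin": 100},
    "n3": {},
    "n5": {"c.bin": 900},
}
flon = build_flon(["n1", "n3", "n5"], lambda n: stored[n])
print(flon["n1"])  # {'a.bin': 400, 'b.bin': 100}
```

A real deployment would likely send all requests first and gather replies concurrently, as the separate loops in Algorithm 1 suggest; the synchronous version is used here only to keep the sketch short.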
Example three
With reference to the second embodiment, this embodiment provides a method for constructing the global storage capability table. The specific flow is as follows:
the global storage capability table is defined as SLON, and an algorithm is defined for obtaining SLON from the global storage list FLON.
Algorithm 2: Obtaining the global storage capability table SLON from the global storage list FLON
Input: global storage list FLON
Output: global storage capability table SLON
For each FONi in FLON
    Var RONi = SONi - FONi
    Add RONi to SLON
End For
Return SLON
The global storage capability table is thereby constructed, in preparation for constructing the candidate node pool in the next step.
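A Python sketch of Algorithm 2 follows. It assumes that SONi is the total storage space of node i in bytes, that FONi can be reduced to the number of bytes already used by summing file sizes, and that RONi is therefore the remaining free space. The text does not define RONi explicitly, so that reading, like the name `build_slon`, is an assumption.

```python
from typing import Dict


def build_slon(flon: Dict[str, Dict[str, int]], son: Dict[str, int]) -> Dict[str, int]:
    """Derive the global storage capability table SLON from FLON.

    flon: node -> {file name: size in bytes}   (FON_i, assumed shape)
    son:  node -> total storage space in bytes (SON_i)
    """
    slon: Dict[str, int] = {}
    for node, files in flon.items():
        used = sum(files.values())     # bytes already occupied on the node
        slon[node] = son[node] - used  # RON_i = SON_i - FON_i
    return slon


flon = {"n1": {"a.bin": 400, "b.bin": 100}, "n3": {}, "n5": {"c.bin": 900}}
son = {"n1": 1000, "n3": 1000, "n5": 2000}
slon = build_slon(flon, son)
print(slon)  # {'n1': 500, 'n3': 1000, 'n5': 1100}
```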
Example four
With reference to the content of the third embodiment, this embodiment provides a method for constructing a candidate node pool, and a specific method flow of this embodiment includes:
the invention defines a global storage capacity table SLON, a candidate node pool CNP and a node priority rule R defined by a user.
Algorithm 3: Construction of the candidate node pool CNP
Input: global storage capability table SLON
Output: candidate node pool CNP
Var flag = true
copy SLON to CNP
While flag == true
    flag = false
    For each adjacent pair (SONi, SONi+1) in CNP
        if R(SONi, SONi+1) == false
            swap(SONi, SONi+1)
            flag = true
        end if
    End For
End While
Return CNP
Therefore, the construction of the candidate node pool is realized, and preparation is made for the selection of the next backup node.
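Algorithm 3 is a bubble sort driven by the user rule R. The Python sketch below assumes each table entry is a (node id, remaining capacity) pair and that R(a, b) returns true when a may stay ahead of b; it swaps within the copy CNP so that SLON is left unchanged. The names `build_cnp` and `most_free_first` are hypothetical.

```python
from typing import Callable, List, Tuple

Entry = Tuple[str, int]  # (node id, remaining capacity in bytes), assumed shape


def build_cnp(slon: List[Entry], rule: Callable[[Entry, Entry], bool]) -> List[Entry]:
    """Bubble-sort a copy of SLON with the user priority rule R."""
    cnp = list(slon)  # copy SLON to CNP
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(cnp) - 1):         # every adjacent pair
            if not rule(cnp[i], cnp[i + 1]):  # R says the pair is out of order
                cnp[i], cnp[i + 1] = cnp[i + 1], cnp[i]
                swapped = True
    return cnp


def most_free_first(a: Entry, b: Entry) -> bool:
    """Example rule R: prefer nodes with more remaining capacity."""
    return a[1] >= b[1]


slon = [("n1", 500), ("n3", 1000), ("n5", 1100)]
cnp = build_cnp(slon, most_free_first)
print(cnp)  # [('n5', 1100), ('n3', 1000), ('n1', 500)]
```

The rule must return true for entries of equal priority (note the `>=`); a strict comparison would swap tied entries forever and the loop would not terminate.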
Example five
In combination with the content of the third embodiment, this embodiment provides a backup node selection method, and a specific method flow of this embodiment includes:
the backup ratio is defined as η, the number of nodes in the candidate node pool as SCNP, and the candidate node list as CNL.
The number of candidate nodes is calculated from the backup ratio, and that number of nodes is selected in order from the candidate node pool to construct the candidate node list CNL.
Algorithm 4: Construction of the candidate node list CNL
Input: backup ratio η, candidate node pool CNP
Output: candidate node list CNL
For each ONi in CNP while i < SCNP × η
    Add ONi to CNL
End For
Return CNL
Thus, the construction of the backup node list is realized, and the backup node list is prepared for the next step.
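The selection in Algorithm 4 can be sketched in Python as follows, assuming the index i starts at 0 (the text does not say). Under that assumption the condition i < SCNP × η selects the first ceil(SCNP × η) nodes of the pool. The function name `select_backup_nodes` is hypothetical.

```python
from typing import List


def select_backup_nodes(cnp: List[str], eta: float) -> List[str]:
    """Take nodes from the head of CNP while i < SCNP * eta (i is 0-based)."""
    scnp = len(cnp)  # number of nodes in the candidate node pool
    cnl: List[str] = []
    for i, node in enumerate(cnp):
        if not i < scnp * eta:
            break
        cnl.append(node)
    return cnl


cnp = ["n5", "n3", "n1", "n7", "n9"]
print(select_backup_nodes(cnp, 0.5))  # ['n5', 'n3', 'n1']
```

With five candidates and η = 0.5 the bound is 2.5, so indices 0, 1, and 2 qualify and three backup nodes are chosen; η = 1 selects the whole pool and η = 0 selects none.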
Example six
With reference to the content of the third embodiment, this embodiment provides a method for sending backup data, and a specific flow of the method of this embodiment includes:
the user access node AN first converts the data to be sent into a JSON-based data stream and then, over TCP connections, sends a cache command packet followed by a cache data packet to each node in the candidate node list CNL in turn to complete the data backup.
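The sending step can be sketched in Python as below. The patent states only that the data is converted to a JSON-based stream and sent over TCP, so everything else here is an assumption made for illustration: the packet fields (`type`, `name`, `size`, `data`), the 4-byte length prefix used for framing, the base64 encoding of the payload, and the function names.

```python
import base64
import json
import socket
import struct
from typing import List, Tuple


def frame(packet: dict) -> bytes:
    """Serialise a packet as JSON, prefixed with a 4-byte big-endian length."""
    body = json.dumps(packet).encode("utf-8")
    return struct.pack(">I", len(body)) + body


def unframe(buf: bytes) -> Tuple[dict, bytes]:
    """Decode one framed packet from buf; return it and the remaining bytes."""
    (length,) = struct.unpack(">I", buf[:4])
    body, rest = buf[4:4 + length], buf[4 + length:]
    return json.loads(body.decode("utf-8")), rest


def backup_packets(name: str, data: bytes) -> bytes:
    """Cache command packet followed by the cache data packet."""
    command = {"type": "cache_command", "name": name, "size": len(data)}
    payload = {"type": "cache_data", "name": name,
               "data": base64.b64encode(data).decode("ascii")}
    return frame(command) + frame(payload)


def send_backup(cnl: List[Tuple[str, int]], name: str, data: bytes) -> None:
    """Send both packets to every (host, port) in the candidate node list CNL."""
    stream = backup_packets(name, data)
    for host, port in cnl:
        with socket.create_connection((host, port), timeout=5) as conn:
            conn.sendall(stream)


# Round trip without a network: decode what a backup node would receive.
stream = backup_packets("report.txt", b"hello")
command, rest = unframe(stream)
payload, rest = unframe(rest)
print(command["type"], base64.b64decode(payload["data"]))  # cache_command b'hello'
```

The length prefix is one simple way to separate packets on a TCP byte stream, and base64 is used because JSON cannot carry raw bytes directly.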
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A method of data backup in a peer-to-peer distributed system, the method comprising:
a user access node scans the nodes in the system to obtain an online node list;
the user access node requests the file storage status of each node in the network, thereby acquiring the current global file storage status, and constructs a global storage list;
the user access node constructs the global storage capability table from the file storage status of each node and the global storage list;
the nodes are sorted according to a priority rule set by the user to construct a candidate node pool;
the number of backups is calculated from the backup proportion specified by the user, and that number of candidate nodes is taken in order from the candidate node pool to select the backup nodes;
and the user data uploading node sends a data backup instruction and the data to be backed up to the candidate nodes, which receive and store the data to complete the backup.
2. The method of claim 1, wherein the method for obtaining the online node list comprises:
defining the user access node as AN, the node list in the distributed system as LN, the online node list as LON, a node as Ni, and an online node as ONj; defining F(Ni) -> ONj as the direct mapping between nodes and online nodes;
and the user access node AN sends handshake packets to each node Ni in turn according to the pre-stored node list LN, determines the online nodes ONj from the responses, and obtains the online node list LON according to F(Ni) -> ONj, wherein LON is a subset of LN.
3. The method according to claim 1, wherein the method for obtaining the storage status of the global file comprises:
defining the file storage information of each online node ONj as FONj and the list of file storage information of all online nodes ONj, i.e., the global storage list, as FLON, and defining an algorithm for obtaining the global storage list FLON from the per-node storage information FONj;
the user access node AN sends a file storage information request packet to each node ONj in the online node list LON in turn, each node that receives the request packet sends its file storage information FONj back to the requester, and once the user access node has requested the file storage information of all online nodes, the global storage list is constructed according to the algorithm.
4. The method according to claim 1, wherein the method for constructing the global storage capability table comprises:
defining the global storage capability table as SLON and the storage space of each online node as SONj, and defining an algorithm for obtaining the global storage capability table SLON from the global storage list FLON;
and after FLON has been constructed, obtaining the global storage capability table SLON from the global storage list FLON according to the algorithm.
5. The method of claim 1, wherein the method for constructing the candidate node pool comprises:
defining the candidate node pool as CNP and the user-defined node priority rule as R;
and sorting the nodes according to the node priority rule R set by the user to construct the candidate node pool CNP.
6. The method of claim 1, wherein the backup node selection method comprises:
defining the backup ratio as η, the number of nodes in the candidate node pool as SCNP, and the candidate node list as CNL;
and calculating the number of candidate nodes from the backup ratio, and selecting that number of nodes in order from the candidate node pool to construct the candidate node list CNL.
7. The method according to any one of claims 1 to 6, wherein the backup data sending method comprises:
the user access node AN sends a cache command packet and then a cache data packet to the nodes in the candidate node list CNL in turn to complete the data backup.
CN201910736515.0A 2019-08-09 2019-08-09 Data backup method in peer-to-peer distributed system Pending CN112346908A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910736515.0A CN112346908A (en) 2019-08-09 2019-08-09 Data backup method in peer-to-peer distributed system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910736515.0A CN112346908A (en) 2019-08-09 2019-08-09 Data backup method in peer-to-peer distributed system

Publications (1)

Publication Number Publication Date
CN112346908A (en) 2021-02-09

Family

ID=74367052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910736515.0A Pending CN112346908A (en) 2019-08-09 2019-08-09 Data backup method in peer-to-peer distributed system

Country Status (1)

Country Link
CN (1) CN112346908A (en)

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210209