CN113568783A - Distributed data storage system, management method, device and storage medium - Google Patents


Publication number
CN113568783A
Authority
CN
China
Prior art keywords
node
backed
target block
block file
backup
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110884351.3A
Other languages
Chinese (zh)
Inventor
梁宁君
蒋聪聪
罗键
沈峻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Tika Technology Co ltd
Original Assignee
Shanghai Tika Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Tika Technology Co ltd
Priority to CN202110884351.3A
Publication of CN113568783A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks

Abstract

The application discloses a distributed data storage system, a management method, a device and a storage medium. The method comprises: monitoring each edge node for a fault event; determining, in response to a monitored fault event, a target block file to be backed up; scheduling, by the central node, at least one edge node as a backup node to perform a backup operation on the target block file to be backed up; and detecting, by the backup node based on the scheduling of the central node, the target edge node in which the target block file is stored, acquiring the target block file to be backed up from the target edge node, and then backing up the acquired target block file locally. With this scheme, the block files in the system are stored completely and securely, so that data acquisition is not affected when a small number of edge nodes fail, and the integrity and security of the data are ensured.

Description

Distributed data storage system, management method, device and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a distributed data storage system, a management method, an apparatus, and a storage medium.
Background
With the development of computer technology, security management technology has been widely used to manage various types of application systems. In various application environments, it is often necessary to perform backup operations on data in an application system so that the application system can be restored based on the backup data when a failure or other condition occurs in the application system. Since backup data is the basis for restoring an application system, how to manage the backup data of the application system in a more secure and reliable manner becomes a research hotspot.
Distributed data storage technology divides data into a plurality of data blocks (block files), makes multiple backups of them, and stores the backed-up data blocks (block files) in a distributed manner across a plurality of nodes of an application system. As a result, a single node may store a large number of data block (block file) backups. Once that node fails, the backups stored on it are lost, the data recovery device can no longer retrieve them from the node, and the original data stream cannot be reconstructed.
Disclosure of Invention
An object of the present application is to provide a distributed data storage system, a management method, an apparatus and a storage medium, which are used to solve the above problems in the prior art.
According to an aspect of the present application, an embodiment of the present application provides a distributed data storage management method, which is applied to a distributed data storage system, where the system includes at least one central node and a plurality of edge nodes connected to the at least one central node through a network, and the method includes: monitoring each of the edge nodes for a fault event; determining a target block file to be backed up in response to the monitored fault event; the central node schedules at least one edge node as a backup node to execute backup operation on the target block file to be backed up; and the backup node detects a target edge node in which the target block file is stored based on the scheduling of the central node, acquires the target block file to be backed up from the target edge node, and then backs up the acquired target block file to be backed up locally.
According to another aspect of the present application, an embodiment of the present application provides a distributed data storage management method, which is performed by a central node in a distributed data storage system, the system further includes a plurality of edge nodes connected to the central node via a network, and the method includes: monitoring each of the edge nodes for a fault event; determining a target block file to be backed up in response to the monitored fault event; and scheduling at least one edge node as a backup node to execute backup operation on the target block file to be backed up.
According to another aspect of the present application, an embodiment of the present application provides a distributed data storage management method, which is performed by an edge node in a distributed data storage system, where the system includes at least one central node and a plurality of edge nodes connected to the at least one central node through a network, and the method includes: in response to a received scheduling instruction from the central node, detecting a target edge node that stores the target block file to be backed up indicated in the scheduling instruction, acquiring the target block file to be backed up from the target edge node, and then backing up the acquired target block file locally.
According to yet another aspect of the present application, an embodiment of the present application provides a distributed data storage system, which includes at least one central node and a plurality of edge nodes connected to the at least one central node via a network, each edge node storing a plurality of block files; the central node monitors a fault event of each edge node, determines a target block file to be backed up in response to the monitored fault event, and then schedules at least one edge node as a backup node to perform backup operation on the target block file to be backed up; and each edge node serving as the backup node detects a target edge node in which the target block file is stored based on the scheduling of the central node, acquires the target block file to be backed up from the target edge node, and then locally backs up the acquired target block file to be backed up.
According to another aspect of the present application, an embodiment of the present application provides a distributed data storage management apparatus, the apparatus operating on a central node of a distributed data storage system, the system further including a plurality of edge nodes connected to the central node via a network, the apparatus including: a monitoring unit, a judging unit and a scheduling unit;
the monitoring unit is used for monitoring the fault event of each edge node;
the judging unit is used for responding to the monitored fault event to determine a target block file to be backed up;
and the scheduling unit is used for scheduling at least one edge node as a backup node to execute backup operation on the target block file to be backed up.
According to another aspect of the present application, an embodiment of the present application provides an apparatus for managing distributed data storage, the apparatus operating on edge nodes of a distributed data storage system, the system including at least one central node and a plurality of edge nodes connected to the at least one central node via a network, the apparatus including: a communication unit, a receiving unit and a processing unit;
the communication unit is used for reporting a fault event to the central node;
the receiving unit is used for receiving a scheduling instruction of the central node;
the processing unit is configured to detect, in response to the received scheduling instruction, a target edge node that stores the target block file to be backed up indicated in the scheduling instruction, acquire the target block file to be backed up from the target edge node, and then locally back up the acquired target block file.
According to yet another aspect of the present application, a storage medium is provided, where a computer program is stored, where the computer program can be loaded by a processor to execute the steps in the distributed data storage management method according to any of the foregoing embodiments.
In the distributed data storage system, the management method, the device and the storage medium provided by the application, each edge node is monitored for a fault event, and a target block file to be backed up is determined in response to a monitored fault event. The central node schedules at least one edge node as a backup node to perform a backup operation on the target block file to be backed up, and the backup node, based on the scheduling of the central node, detects the target edge node where the target block file is stored, acquires the target block file to be backed up from the target edge node, and then backs it up locally. This solves the prior-art problem that a block file stored in an edge node is lost once that node fails, and achieves complete and secure storage of the block files in the system, so that data acquisition is not affected when a small number of edge nodes fail.
Drawings
The technical solution and other advantages of the present application will become apparent from the detailed description of the embodiments of the present application with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of an architecture of a distributed data storage system according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a distributed data storage management method according to an embodiment of the present application;
FIG. 3 is a schematic sub-flow chart of step S200 shown in FIG. 2;
FIG. 4 is a schematic sub-flow chart of step S300 shown in FIG. 2;
FIG. 5 is a schematic flow chart of a step subsequent to step S400 shown in FIG. 2;
FIG. 6 is a flowchart illustrating a further distributed data storage management method according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating a further distributed data storage management method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a distributed data storage management apparatus according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a distributed data storage management apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first" and "second" are used herein for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
In the description of the present application, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may be a fixed connection, a removable connection, or an integral connection; it may be mechanical or electrical, or the elements may be in communication with each other; and it may be direct, indirect through intervening media, internal to two elements, or any other relationship between them. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.
The following disclosure provides many different embodiments or examples for implementing different features of the application. In order to simplify the disclosure of the present application, specific example components and arrangements are described below. Of course, they are merely examples and are not intended to limit the present application. Moreover, the present application may repeat reference numerals and/or letters in the various examples, such repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Fig. 1 is a schematic structural diagram of a distributed data storage system according to an embodiment of the present application. Referring to fig. 1, the system includes a plurality of edge nodes. Optionally, in any embodiment of the present application, an edge node may be an intelligent terminal or a cloud server. The edge nodes store the block files to be stored and are communicatively coupled to each other, and the distributed data storage system stores multiple backups of each block file as needed, so as to ensure continuity of data access. To manage the plurality of edge nodes, the distributed data storage system further includes at least one central node, which is connected to the plurality of edge nodes through a network and may be a general-purpose server or a controller. The central node performs centralized management and maintenance of the block file backups in the edge nodes; for example, it monitors the online state and operating state of each edge node and generates an abnormality alarm when an abnormality occurs, so that the block file backups stored in the abnormal edge node can be maintained in time.
In a distributed data storage system, several edge nodes may go offline simultaneously due to factors such as edge node stability, the current network state, and user actions. If block files (data blocks) happen to be stored on the offline nodes, their data cannot be acquired while those nodes are offline. Each block file therefore needs to remain stable and available, and multiple backups of the same block file need to be stored on different edge nodes, so that when the application system fails or encounters another problem, the original data stream in the application system can be restored based on the backup data. Since backup data is the basis for restoring an application system, the backup data in the application system needs to be managed in a secure and reliable manner.
Illustratively, in the present embodiment, assume that the entire distributed data storage system has 100 edge nodes (N1, ..., N100), 2 central nodes (S1, S2), and 5000 block files (B1, ..., B5000), where the 100 edge nodes and the 2 central nodes are connected via a network and each edge node holds about 250 block files. Each block file is backed up 5-7 times, and the 5-7 backups are stored on 5-7 randomly chosen edge nodes, one backup per node. In addition, the threshold for the number of backups of a block file in the distributed data storage system can be calculated dynamically according to the corresponding service type, the requirements, and the stability. If N1 and N50 then fail and cannot be reached (so that 500 block file backups are lost), the central node detects the failure event and determines that, for 300 of the 500 affected block files, the number of remaining backups is lower than the preset threshold, so that these block files are at risk of being lost. These 300 block files therefore need to be backed up again, to prevent system data from becoming unrecoverable because the number of backups of some block files has fallen below the preset threshold.
Fig. 2 is a schematic flow chart of a distributed data storage management method provided according to an embodiment of the present application, referring to fig. 2, where the method is applied to a distributed data storage system, where the system includes at least one central node and a plurality of edge nodes network-connected to the at least one central node, and the method includes: step S100, monitoring the fault event of each edge node; step S200, in response to the monitored fault event, determining a target block file to be backed up; step S300, the central node dispatches at least one edge node as a backup node to execute backup operation on the target block file to be backed up; step S400, the backup node detects a target edge node stored with the target block file based on the dispatching of the central node, acquires the target block file to be backed up from the target edge node, and then locally backs up the acquired target block file to be backed up.
In step S100, a central node may monitor the online state and operating state of each edge node in the entire distributed data storage system in order to manage the plurality of edge nodes. For example, the central node may monitor the operating state of each edge node by polling, or each edge node may send a communication request to the central node at regular intervals in a manner similar to "heartbeat detection", with the central node monitoring the online and operating state of each edge node based on these communication requests.
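The heartbeat-style monitoring described for step S100 can be sketched as follows. This is a minimal illustration only: the class name `HeartbeatMonitor`, the 30-second timeout, and the use of plain floating-point timestamps are assumptions made for the example and do not appear in the application.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical central-node helper that tracks edge-node heartbeats."""

    timeout_s: float = 30.0  # assumed timeout, not specified in the source
    last_seen: dict = field(default_factory=dict)  # node id -> last heartbeat time

    def record_heartbeat(self, node_id: str, now: float) -> None:
        """Store the time at which an edge node last contacted the central node."""
        self.last_seen[node_id] = now

    def failed_nodes(self, now: float) -> list:
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node_id
            for node_id, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=30.0)
monitor.record_heartbeat("N1", now=0.0)
monitor.record_heartbeat("N2", now=0.0)
monitor.record_heartbeat("N2", now=40.0)  # N2 keeps reporting, N1 goes silent
failed = monitor.failed_nodes(now=45.0)
```

In this sketch a missed heartbeat is treated as a fault event; a polling-based variant would instead have the central node query each edge node and record the replies.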
In step S200, a target block file to be backed up is determined in response to the monitored failure event. In this embodiment, when the central node detects that an edge node has a failure event, for example a power failure or a network failure, the central node receives a backup request reported by an edge node or actively generates a backup request, which may be directed to any one of the edge nodes, and at the same time determines the target block file to be backed up and which edge nodes in the distributed data storage system have spare storage space.
In step S300, the central node issues a scheduling instruction to at least one edge node in the system, which may respond to the data storage request, and instructs the at least one edge node to perform a backup operation on the target block file to be backed up as a backup node.
In step S400, based on the scheduling of the central node, the backup node detects the target edge node that stores the target block file, obtains the target block file to be backed up from the target edge node, and then backs up the obtained target block file locally.
In this embodiment, in step S400, the target edge node holding the target block file to be backed up may be found by the ID (i.e., identifier) of each target block file to be backed up. Illustratively, each block file may be hashed to generate a corresponding hash value, and the hash value may be used as the identifier of the block file, because it acts as a characteristic value that uniquely identifies the block file.
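A hash-derived block identifier of this kind can be sketched with Python's standard `hashlib` module. The application does not name a hash function, so the choice of SHA-256 and the helper name `block_id` are assumptions made for illustration.

```python
import hashlib


def block_id(data: bytes) -> str:
    """Return a hex digest used as the block file's identifier.

    SHA-256 is an assumed choice; the source only says the block is hashed.
    """
    return hashlib.sha256(data).hexdigest()


id_a = block_id(b"example block contents")
id_b = block_id(b"example block contents")
id_c = block_id(b"different block contents")
```

Identical contents yield the same identifier, so any node holding a backup of the block can be matched by ID, while different contents yield different identifiers with overwhelming probability.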
The technical scheme provided by the embodiment of the application can be applied to the technical field with large file storage requirements, such as video file storage, database file storage and the like, solves the problem that in the prior art, once an edge node fails, backup of block files stored in the node is lost, and realizes the functions of integrity and safe storage of the backup of the block files in the system, so that data acquisition is not influenced under the condition that a small number of edge nodes fail, and the safety, the read-write performance and the effectiveness of data storage are ensured.
Illustratively, the integrity of the backup data is important to the distributed data storage system. To achieve this, in the distributed data storage system, data checking is required to be performed periodically, for example, using a global information table stored in the central node to determine a block file stored in each of the edge nodes.
Fig. 3 is a schematic sub-flow diagram of step S200 shown in fig. 2. Referring to fig. 3, in step S200, determining a target block file to be backed up in response to a monitored failure event includes: step S210, determining the lost block files according to each edge node involved in the fault event; step S220, based on the determined lost block files, the central node determines the number of backups of each of these block files currently remaining in the system; step S230, determining whether the number is lower than a preset threshold; and if so, executing step S240, determining the block files whose number is lower than the preset threshold as the target block files to be backed up.
Specifically, the central node may dynamically calculate the threshold for the number of backups of each block file according to the corresponding service type, the requirements, and the stability. When the central node finds that the number of backups of any block file is lower than the preset threshold, it allocates a new edge node to perform a backup task and takes that block file as a target block file to be backed up. For example, the central node may hold global information about the block files stored in each edge node in a global information table stored in the central node. The lost block files are determined according to each edge node involved in the failure event, and the central node may look up, based on the global information table, the number of backups currently remaining for each block file and the category information of the block files whose number of backups is lower than the preset threshold, and determine those block files as the target block files to be backed up.
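Steps S210 to S240 can be sketched as a lookup over the global information table. The table layout used here (block ID mapped to the set of edge nodes holding a backup), the function name `find_target_blocks`, and the small threshold are assumptions for illustration; the application does not specify the table's structure.

```python
def find_target_blocks(global_table, failed_nodes, threshold):
    """Return {block_id: remaining holders} for blocks that need a new backup.

    global_table maps a block ID to the set of node IDs holding a backup
    (an assumed layout for the global information table).
    """
    failed = set(failed_nodes)
    targets = {}
    for block, holders in global_table.items():
        if not holders & failed:
            continue  # S210: this block lost no backup in the fault event
        remaining = holders - failed  # S220: backups still available
        if len(remaining) < threshold:  # S230: compare with the threshold
            targets[block] = remaining  # S240: mark as target block file
    return targets


global_table = {
    "B1": {"N1", "N2", "N3"},
    "B2": {"N1", "N50", "N4"},
    "B3": {"N2", "N3", "N4"},
}
targets = find_target_blocks(global_table, failed_nodes=["N1", "N50"], threshold=2)
```

Here only B2 drops below the assumed threshold of two backups: B1 still has two holders, and B3 was not affected by the failure of N1 and N50.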
Fig. 4 is a sub-flowchart of step S300 shown in fig. 2, and referring to fig. 4, in step S300, the step of the central node scheduling at least one edge node to perform a backup operation on the target block file to be backed up includes: step S310, the central node determines and selects at least one edge node which currently has an expected storage space in the system as the backup node based on a global information table or a polling mechanism; step S320, the central node determines the target block files which need to be backed up by each backup node based on the load condition of each backup node; step S330, the central node sends a scheduling instruction to each backup node to instruct each backup node to perform a backup operation on the target block file to be backed up.
Specifically, in step S310, the central node determines and selects, based on a global information table, at least one edge node that currently has the expected storage space in the system as the backup node. The global information table records the allocated storage space and the used storage space of each current edge node, and the free storage space of each edge node can be obtained by calculating the difference between the two, so as to determine which edge nodes still have the expected storage space and can act as backup nodes responding to the data storage request. In a specific implementation, edge nodes with larger free storage space may be preferentially selected as backup nodes. In some application scenarios, if multiple edge nodes can respond to the data storage request, a candidate edge node list may be formed, and one or more optimal edge nodes may then be screened out of the candidate list, according to free storage space or stability, as backup nodes to store the block file backups. In addition, the central node may send a data storage request to the edge nodes by polling; for example, based on a geographical proximity principle, it may ask the edge nodes around each edge node involved in the failure event whether they have the expected storage space and can act as backup nodes responding to the data storage request, so as to back up and store the target block file to be backed up.
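The free-space-based selection in step S310 can be sketched as below. The record fields (`allocated_bytes`, `used_bytes`, `online`), the function name `select_backup_nodes`, and the byte values are hypothetical; the source only states that free space is the difference between a node's recorded total storage space and its used storage space, and that nodes with more free space may be preferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRecord:
    """One assumed row of the global information table."""

    node_id: str
    allocated_bytes: int  # total storage space recorded for the node
    used_bytes: int
    online: bool = True

    @property
    def free_bytes(self) -> int:
        return self.allocated_bytes - self.used_bytes


def select_backup_nodes(records, required_bytes, max_nodes):
    """Pick online nodes with enough free space, largest free space first."""
    candidates = [
        r for r in records if r.online and r.free_bytes >= required_bytes
    ]
    candidates.sort(key=lambda r: r.free_bytes, reverse=True)
    return [r.node_id for r in candidates[:max_nodes]]


records = [
    NodeRecord("N1", 100, 0, online=False),  # failed node, never selected
    NodeRecord("N2", 100, 40),               # 60 free
    NodeRecord("N3", 100, 90),               # 10 free, below the requirement
    NodeRecord("N4", 200, 50),               # 150 free
]
chosen = select_backup_nodes(records, required_bytes=20, max_nodes=2)
```

A stability score could be added to the sort key in the same way if candidates are also ranked by stability.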
In step S320, the central node determines, based on the load condition of each backup node, the target block files that each backup node needs to back up. For example, in this embodiment, the central node obtains the free storage space of each edge node by calculating the difference between the allocated storage space and the used storage space of that edge node as recorded in the global information table, and may then determine, according to the free storage space of each selected backup node, the size and number of the target block files to be backed up that it allocates to that backup node. Illustratively, if it is determined that the number of backups of 300 block files is below the threshold so that they may be lost, the central node schedules the remaining edge nodes with free storage space to perform backup operations. According to the free storage space of each selected backup node, the central node instructs edge node N2 to store 5 block files, edge node N3 to store 10 block files, edge node N4 to store 7 block files, ..., and edge node N100 to store 4 block files, until all 300 affected block files have been backed up.
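One possible way to distribute target block files according to free storage space is a simple greedy assignment, sketched below. The greedy rule (largest block first, always to the node with the most remaining space) is an assumption chosen for illustration; the application does not prescribe a particular allocation algorithm, and the name `assign_blocks` is hypothetical.

```python
def assign_blocks(block_sizes, free_space):
    """Greedily assign each block to the node with the most remaining space.

    block_sizes maps block ID -> size; free_space maps node ID -> free space.
    Returns {node_id: [block ids]}. Raises ValueError if a block does not fit.
    """
    remaining = dict(free_space)
    plan = {node: [] for node in free_space}
    # Place large blocks first so they are not stranded by earlier small ones.
    for block, size in sorted(
        block_sizes.items(), key=lambda kv: kv[1], reverse=True
    ):
        node = max(remaining, key=lambda n: (remaining[n], n))
        if remaining[node] < size:
            raise ValueError(f"no backup node has room for {block}")
        plan[node].append(block)
        remaining[node] -= size
    return plan


plan = assign_blocks(
    block_sizes={"B1": 40, "B2": 30, "B3": 20},
    free_space={"N2": 50, "N3": 100},
)
```

In this run N3 receives B1 and B2 while it has the most free space, and B3 goes to N2 once N3's remaining space has dropped below N2's.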
In step S330, the central node sends a scheduling instruction to each backup node to instruct each backup node to perform a backup operation on the target block file to be backed up; for example, in this embodiment, the central node may instruct each backup node to perform a backup operation according to the requirement of the size and the number of the target block files to be backed up that have been previously allocated to each backup node, and then each backup node obtains the target block file that needs to be backed up from the target edge node that stores the corresponding target block file.
Further, in step S400, the step of the backup node probing the target edge node storing the target block file based on the schedule of the central node includes: each backup node detects and determines a target edge node storing a target block file to be backed up by the backup node in a mode of information interaction with the peripheral edge nodes of the backup node; and/or each backup node acquires the address information of the target edge node which stores the target block file to be backed up by the backup node from the central node.
The method for detecting the position of the target edge node stored with the target block file comprises the following steps: each backup node detects a target edge node which stores a target block file required by the backup node in a mode of information interaction with the peripheral edge nodes of the backup node; and/or the central node detects a target edge node which stores the target block file required by the backup node according to the global information table. In a specific application scenario, if each backup node does not find a target edge node storing a desired target block file after performing information interaction with its peripheral edge nodes, it may then send a request to the central node to obtain address information of the target edge node storing the desired target block file, and the central node sends the address information of the target edge node to the backup node in response to the request, or, of course, the central node may send the address information of the target edge node storing the target block file required by the backup node to the backup node while allocating and executing a target block file backup task.
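The two lookup paths described above (asking peripheral edge nodes first, then falling back to the central node's global information table) can be sketched as follows. The function name `locate_target_node`, the in-memory dictionaries standing in for network queries, and the rule of returning the lexicographically smallest holder are assumptions for illustration.

```python
def locate_target_node(block, peer_inventories, central_table):
    """Return the ID of a node holding `block`, or None if no holder is known.

    peer_inventories: {peer node id: set of block ids} learned from neighbours.
    central_table:    {block id: set of node ids} held by the central node.
    """
    # First try the peripheral edge nodes the backup node can talk to directly.
    for peer, blocks in peer_inventories.items():
        if block in blocks:
            return peer
    # Otherwise request the address information from the central node.
    holders = central_table.get(block, set())
    return min(holders) if holders else None


peers = {"N3": {"B7"}, "N4": {"B9"}}
central = {"B7": {"N3"}, "B12": {"N5", "N7"}}

from_peer = locate_target_node("B7", peers, central)
from_central = locate_target_node("B12", peers, central)
unknown = locate_target_node("B99", peers, central)
```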
Further, in step S400, the step of acquiring the target block file to be backed up from the target edge node, and then backing up the acquired target block file to be backed up locally includes: each backup node sends a file acquisition request to a corresponding target edge node so as to acquire a target block file which needs to be backed up by the backup node from the corresponding target edge node.
Further, in step S400, the step of acquiring the target block file to be backed up from the target edge node, and then backing up the acquired target block file to be backed up locally further includes: each target edge node responds to the received file acquisition request and sends a copy of the target block file indicated in the file acquisition request to the backup node initiating the file acquisition request.
Optionally, based on the obtained address information of the target edge node storing the required target block file, each backup node sends a file acquisition request to that target edge node, so that the target block file to be backed up is backed up from the target edge node to the backup node. For example, in this embodiment, a copy of the target block file to be backed up may be made on the target edge node corresponding to the backup node, and that copy is stored on the backup node; this ensures that the original data of the target block file is not damaged and avoids introducing security risks into the distributed data storage system. In addition, in this embodiment, a data communication channel is established between the target edge node and the backup node, and the copy of the target block file is transmitted from the target edge node to the backup node over this channel. For example, the data communication channel may be established by means of a software-defined network, and in this embodiment it may be an Internet-based data communication channel or a data communication channel of a wireless network operator.
Illustratively, following the above, when edge node N2 receives a scheduling instruction from the central node, it probes for the target edge nodes holding the target block files to be backed up indicated by the scheduling instruction. If, for example, the required target block files are found on edge nodes N5 and N7, edge node N2 sends a request for the required target block files to edge nodes N5 and N7 respectively, and then saves the corresponding copies of the target block files on edge node N2.
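The file acquisition request and the copy-based response can be sketched as follows. This is only an in-memory illustration under stated assumptions: the class `EdgeNode` and its methods `handle_file_request` and `back_up_from` are hypothetical names, and a direct method call stands in for the data communication channel.

```python
from typing import Dict, Optional


class EdgeNode:
    """Hypothetical in-memory stand-in for an edge node and its block files."""

    def __init__(self, node_id: str, blocks: Optional[Dict[str, bytes]] = None) -> None:
        self.node_id = node_id
        self.blocks: Dict[str, bytes] = dict(blocks or {})

    def handle_file_request(self, block_id: str) -> bytes:
        # Target edge node side: answer with a copy; the original stays in place.
        return bytes(self.blocks[block_id])

    def back_up_from(self, target: "EdgeNode", block_id: str) -> None:
        # Backup node side: request the block file and store the copy locally.
        self.blocks[block_id] = target.handle_file_request(block_id)


# The N2 / N5 / N7 example from the description.
n5 = EdgeNode("N5", {"blk-a": b"alpha"})
n7 = EdgeNode("N7", {"blk-b": b"beta"})
n2 = EdgeNode("N2")
for target, block_id in ((n5, "blk-a"), (n7, "blk-b")):
    n2.back_up_from(target, block_id)
```

After the loop, N2 holds copies of both block files while N5 and N7 keep their originals unchanged.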
Fig. 5 is a schematic flow chart of steps subsequent to step S400 shown in fig. 2, and referring to fig. 1 to 5, after step S400 is executed, the method may further include the following steps: step S501, after the backup node locally backs up the acquired target block file to be backed up, the backup node sends a notification that the backup operation is completed to the central node; step S502, in response to the received notification that the backup operation is completed, the central node updates the backed-up quantity value of the corresponding target block file to be backed-up.
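Steps S501 and S502 can be sketched as a small bookkeeping routine on the central node. The class name `CentralNode`, the `holders` mapping, and the method `on_backup_completed` are illustrative assumptions; the application only states that the backed-up quantity value is updated on notification. Tracking the set of holders, rather than a bare counter, is a design choice of this sketch so that a repeated notification does not inflate the count.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class CentralNode:
    """Hypothetical bookkeeping for the backed-up quantity of each block file."""

    def __init__(self) -> None:
        # block id -> ids of edge nodes currently holding a backup of it
        self.holders: DefaultDict[str, Set[str]] = defaultdict(set)

    def on_backup_completed(self, node_id: str, block_id: str) -> int:
        # S502: update the backed-up quantity value for the block file.
        self.holders[block_id].add(node_id)
        return len(self.holders[block_id])


central = CentralNode()
central.holders["blk-a"].add("N5")               # N5 already holds blk-a
count = central.on_backup_completed("N2", "blk-a")  # S501: N2 reports completion
```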
As described above, the distributed data storage system contains a plurality of edge nodes, of which only some act as backup nodes, and the target edge nodes storing the target block files required by those backup nodes are likewise only some of the edge nodes. The edge nodes can therefore be classified into those that have participated in the backup processing (as backup nodes or target edge nodes) and those that have not, and the edge nodes that have participated can be marked accordingly. When the next edge node fault event occurs, the marked edge nodes are given a lower priority for selection as backup nodes, which improves the operating efficiency of the distributed data storage system.
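One possible way to express this priority rule is a sort in which marked nodes come last. The function name `order_backup_candidates` and the use of a set of marked node ids are assumptions of this sketch; the application does not specify how the marking is stored or how ties are broken (here, by node id).

```python
from typing import Iterable, List, Set


def order_backup_candidates(candidates: Iterable[str], participated: Set[str]) -> List[str]:
    """Order candidates so nodes marked as having participated are considered last."""
    # False sorts before True, so unmarked nodes come first; ties break by node id.
    return sorted(candidates, key=lambda node_id: (node_id in participated, node_id))


ordered = order_backup_candidates(["N2", "N3", "N8", "N5"], {"N2", "N5"})
```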
Fig. 6 is a schematic flow chart of another distributed data storage management method provided according to an embodiment of the present application, and referring to fig. 6, an embodiment of the present application provides a distributed data storage management method, where the method is performed by a central node in a distributed data storage system, the system further includes a plurality of edge nodes connected to the central node through a network, and the method includes: step S610, monitoring the fault event of each edge node; step S620, in response to the monitored fault event, determining a target block file to be backed up; step S630, at least one of the edge nodes is scheduled to serve as a backup node to perform a backup operation on the target block file to be backed up.
In step S610, the central node may monitor the online and operating status of each edge node in the entire distributed data storage system in order to manage the plurality of edge nodes. For example, the central node may monitor the operating status of each edge node by polling, or, in a manner similar to "heartbeat detection", each edge node may send a communication request to the central node at regular intervals, and the central node monitors the online status of each edge node based on these communication requests.
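The heartbeat-style monitoring can be sketched as below. The class `HeartbeatMonitor`, its `timeout_s` parameter, and the rule that a node is treated as failed once no communication request has arrived within the timeout are assumptions of this sketch; the application does not fix an interval or a failure criterion. The clock is injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Hypothetical central-node helper that tracks the last heartbeat per edge node."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen: Dict[str, float] = {}

    def record_heartbeat(self, node_id: str) -> None:
        # Called whenever an edge node's periodic communication request arrives.
        self.last_seen[node_id] = self.clock()

    def failed_nodes(self) -> List[str]:
        # Nodes whose last heartbeat is older than the timeout are reported as failed.
        now = self.clock()
        return sorted(n for n, seen in self.last_seen.items() if now - seen > self.timeout_s)
```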
In step S620, a target block file to be backed up is determined in response to the monitored fault event. In this embodiment, the central node determines the lost block files according to the edge nodes involved in the monitored fault event; based on the determined lost block files, the central node judges whether the number of backups of each block file currently remaining in the system is lower than a preset threshold, and if so, takes the block files whose backup number is lower than the preset threshold as the target block files to be backed up.
Specifically, the central node may dynamically calculate the required threshold for the number of backups of each block file according to the corresponding service type and the associated demand and stability. When the central node finds that the number of backups of any block file is lower than the preset threshold, it allocates new edge nodes to perform a backup task and takes that block file as a target block file to be backed up. For example, the central node may hold global information about the block files stored on each edge node in a global information table stored in the central node. The lost block files are determined according to the edge nodes involved in the fault event, and, based on the global information table, the central node may look up the number of backups currently remaining for each block file and the category information of the block files below the preset threshold, and determine the block files below the preset threshold as the target block files to be backed up.
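The threshold check can be illustrated with the following sketch. The function `find_target_blocks`, the shape of the global information table (block id mapped to the set of holding node ids), the per-block `thresholds` mapping, and the returned "missing copies" count are all assumptions made for illustration; the application does not prescribe a data layout.

```python
from typing import Dict, Iterable, Set


def find_target_blocks(
    global_table: Dict[str, Set[str]],   # hypothetical: block id -> node ids holding a backup
    failed_nodes: Iterable[str],
    thresholds: Dict[str, int],          # hypothetical per-block backup-count thresholds
    default_threshold: int = 2,
) -> Dict[str, int]:
    """Return block id -> number of additional backups needed to reach its threshold."""
    failed = set(failed_nodes)
    targets: Dict[str, int] = {}
    for block_id, holders in global_table.items():
        if not holders & failed:
            continue  # no backup of this block file was lost in the fault event
        remaining = len(holders - failed)
        required = thresholds.get(block_id, default_threshold)
        if remaining < required:
            targets[block_id] = required - remaining
    return targets
```

Note that a block file whose remaining count is zero still appears in the result, but no live target edge node exists to copy it from; handling that case is outside this sketch.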
In step S630, scheduling at least one of the edge nodes as a backup node to perform a backup operation on the target block file to be backed up, including: the central node determines and selects at least one edge node which currently has expected storage space in the system as the backup node based on a global information table or based on a polling mechanism; the central node determines a target block file which needs to be backed up by each backup node based on the load condition of each backup node; and the central node sends a scheduling instruction to each backup node to instruct each backup node to execute backup operation on the target block file which needs to be backed up.
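The three parts of step S630 (selecting nodes with sufficient storage space, assigning block files by load, and issuing scheduling instructions) can be sketched as follows. `NodeInfo`, `schedule_backups`, the use of a simple task count as the "load condition", and the rule of skipping nodes that already hold the block file are assumptions of this sketch rather than details given in the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class NodeInfo:
    """Hypothetical per-node entry in the central node's global information table."""
    free_space: int
    load: int
    blocks: Set[str] = field(default_factory=set)


def schedule_backups(targets: Dict[str, int], nodes: Dict[str, NodeInfo]) -> Dict[str, List[str]]:
    """Map backup node id -> block ids it is instructed to back up.

    targets maps block id -> size; node bookkeeping is updated in place.
    """
    instructions: Dict[str, List[str]] = {}
    for block_id, size in sorted(targets.items()):
        # Candidates have enough free space and do not already hold the block file.
        candidates = [
            (info.load, node_id)
            for node_id, info in nodes.items()
            if info.free_space >= size and block_id not in info.blocks
        ]
        if not candidates:
            continue  # no suitable backup node at the moment
        _, chosen = min(candidates)  # least-loaded node, ties broken by node id
        nodes[chosen].free_space -= size
        nodes[chosen].load += 1
        instructions.setdefault(chosen, []).append(block_id)
    return instructions
```

Each entry of the returned mapping corresponds to one scheduling instruction that the central node would send to a backup node.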
In this embodiment, the central node stores a global information table that indexes the block file backups held by each edge node. In a specific application scenario, a unique feature value is generated for each block file in the global information table, for example by applying secure hash processing to the block file and using the resulting hash value as the unique feature value. The hash value may serve as digest information for indexing the block files stored on each edge node, and the address information of the target edge node storing a target block file can be looked up from the digest information of that target block file, so that when a backup node requests from the central node the address information of the target edge node storing the target block file it needs to back up, the central node can provide that address information accurately.
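A digest-keyed index of this kind can be sketched with Python's standard `hashlib` module. The application only speaks of "secure hash processing"; the choice of SHA-256, the class name `GlobalInfoTable`, and the example addresses are assumptions of this sketch.

```python
import hashlib
from typing import Dict, List, Set


def block_digest(data: bytes) -> str:
    # SHA-256 is assumed here; the application does not name a specific hash function.
    return hashlib.sha256(data).hexdigest()


class GlobalInfoTable:
    """Hypothetical index: block file digest -> addresses of edge nodes storing it."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = {}

    def register(self, data: bytes, node_address: str) -> str:
        digest = block_digest(data)
        self._index.setdefault(digest, set()).add(node_address)
        return digest

    def lookup(self, digest: str) -> List[str]:
        # Address information the central node can hand to a backup node.
        return sorted(self._index.get(digest, ()))


table = GlobalInfoTable()
digest = table.register(b"block contents", "10.0.0.5")
table.register(b"block contents", "10.0.0.7")
addresses = table.lookup(digest)
```

Because identical block contents yield the same digest, both registrations land under one key, and the lookup returns every edge node address known to hold that block file.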
The distributed data storage management method provided by this embodiment of the application can determine, based on the block files lost on the edge nodes involved in a fault event, which block files are target block files to be backed up, and the central node schedules at least one edge node in the system as a backup node to perform the backup operation on those target block files. As a result, data acquisition is not affected when a small number of edge nodes fail, and the security and integrity of the data are ensured.
Other aspects of the distributed data storage management method provided by this embodiment, which are executed by the central node, are the same as or similar to those of the distributed data storage management method described above, and are not described herein again.
Fig. 7 is a schematic flow chart of another distributed data storage management method provided according to an embodiment of the present application. Referring to fig. 7, an embodiment of the present application provides a distributed data storage management method, where the method is performed by an edge node in a distributed data storage system, and the system includes at least one central node and a plurality of edge nodes connected to the at least one central node through a network. The method comprises the following steps: step S710, responding to a scheduling instruction received from the central node; step S720, detecting a target edge node in which the target block file to be backed up indicated in the scheduling instruction is stored; step S730, acquiring the target block file to be backed up from the target edge node, and then backing up the acquired target block file locally.
The distributed data storage management method provided by this embodiment of the application can, when an edge node has a fault event, use edge nodes with spare storage space in the system to back up and store target block files that might otherwise be lost, so that data acquisition is not affected when a small number of edge nodes fail, and the security and integrity of the data are ensured.
Other aspects of the distributed data storage management method provided by this embodiment, which are executed by the edge node, are the same as or similar to those of the distributed data storage management method described above, and are not described herein again.
According to yet another aspect of the present application, an embodiment of the present application provides a distributed data storage system, which includes at least one central node and a plurality of edge nodes connected to the at least one central node via a network, each edge node storing a plurality of block files; the central node monitors a fault event of each edge node, determines a target block file to be backed up in response to the monitored fault event, and then schedules at least one edge node as a backup node to perform backup operation on the target block file to be backed up; and each edge node serving as the backup node detects a target edge node in which the target block file is stored based on the scheduling of the central node, acquires the target block file to be backed up from the target edge node, and then locally backs up the acquired target block file to be backed up.
The distributed data storage system provided by this embodiment enables the central node, when an edge node fails, to schedule edge nodes with spare storage space to perform a backup operation on target block files that might otherwise be lost, so that data acquisition is not affected when a small number of edge nodes fail, and the security and integrity of the data are ensured.
Other aspects of the distributed data storage management system provided in this embodiment are the same as or similar to those of the distributed data storage management method described above, and are not described herein again.
Fig. 8 is a schematic diagram of a distributed data storage management apparatus according to an embodiment of the present application. Referring to fig. 8, an embodiment of the present application provides a distributed data storage management apparatus, where the apparatus operates on a central node of a distributed data storage system, the system further includes a plurality of edge nodes connected to the central node through a network, and the apparatus includes: a monitoring unit 801, a judging unit 802, and a scheduling unit 803.
The monitoring unit 801 is configured to monitor a fault event of each edge node;
the judging unit 802 is configured to determine a target block file to be backed up in response to the monitored failure event;
the scheduling unit 803 is configured to schedule at least one edge node as a backup node to perform a backup operation on the target block file to be backed up.
Optionally, the apparatus further includes a screening unit, configured to screen, from the edge nodes that respond to the scheduling requirement, at least one edge node as a backup node, which performs the backup operation by acquiring the target block file to be backed up and then backing up the acquired target block file locally.
The distributed data storage management apparatus provided by this embodiment enables the central node, when an edge node fails, to schedule other edge nodes that can respond to data storage requests to back up and store target block files that might otherwise be lost, so that data acquisition is not affected when a small number of edge nodes fail, and the security and integrity of the data are ensured.
Other aspects of the management apparatus for distributed data storage according to this embodiment are the same as or similar to the management method for distributed data storage described above, and are not described herein again.
Fig. 9 is a schematic diagram of a management apparatus for distributed data storage according to an embodiment of the present application, and as shown in fig. 9, an embodiment of the present application provides a management apparatus for distributed data storage, where the apparatus is applied to a plurality of edge nodes in a distributed data storage system, and the system further includes at least one central node; the device comprises: a communication unit 901, a receiving unit 902 and a processing unit 903.
The communication unit 901 is configured to report a fault event to the central node;
the receiving unit 902 is configured to receive a scheduling instruction of the central node;
the processing unit 903 is configured to detect, in response to the received scheduling instruction, a target edge node where a target block file to be backed up indicated in the scheduling instruction is stored, acquire the target block file to be backed up from the target edge node, and then locally back up the acquired target block file to be backed up.
The distributed data storage management apparatus provided by this embodiment enables the remaining edge nodes, when an edge node has a fault event, to detect under the scheduling instruction of the central node the edge node storing the target block file to be backed up and to perform the backup operation on that target block file, so that data acquisition is not affected when a small number of edge nodes fail, and the security and integrity of the data are ensured.
Other aspects of the distributed data storage management apparatus proposed in this embodiment are the same as or similar to those of the distributed data storage management method described above, and are not described herein again.
Furthermore, the present application also provides a storage medium storing a computer program that can be loaded by a processor to perform the steps of any of the distributed data storage management methods described above.
Illustratively, the storage medium may be any one of the following: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The distributed data storage system, management method, device, and storage medium provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the above embodiments is intended only to help in understanding the technical solution and core idea of the present application. Those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not depart from the spirit and scope of the present disclosure as defined by the appended claims.

Claims (13)

1. A distributed data storage management method applied to a distributed data storage system including at least one central node and a plurality of edge nodes network-connected to the at least one central node, the method comprising:
monitoring each of the edge nodes for a fault event;
determining a target block file to be backed up in response to the monitored fault event;
the central node schedules at least one edge node as a backup node to execute backup operation on the target block file to be backed up;
and the backup node detects a target edge node in which the target block file is stored based on the scheduling of the central node, acquires the target block file to be backed up from the target edge node, and then backs up the acquired target block file to be backed up locally.
2. The method of claim 1, wherein the step of determining a target block file to be backed up in response to the monitored failure event comprises:
determining a lost block file according to each edge node involved by the fault event;
based on the determined lost block files, the central node judges whether the backup quantity of each currently remaining block file in the system is lower than a preset threshold value, if so, the block file lower than the preset threshold value is determined as the target block file to be backed up.
3. The method of claim 2, wherein the step of the central node scheduling at least one of the edge nodes to perform the backup operation on the target block file to be backed up comprises:
the central node determines and selects at least one edge node which currently has expected storage space in the system as the backup node based on a global information table or based on a polling mechanism;
the central node determines a target block file which needs to be backed up by each backup node based on the load condition of each backup node;
and the central node sends a scheduling instruction to each backup node to instruct each backup node to execute backup operation on the target block file which needs to be backed up.
4. The method of claim 3, wherein the step of the backup node probing the target edge node holding the target block file based on the schedule of the central node comprises:
each backup node detects and determines a target edge node storing a target block file to be backed up by the backup node in a mode of information interaction with the peripheral edge nodes of the backup node; and/or
each backup node acquires, from the central node, the address information of the target edge node which stores the target block file to be backed up by the backup node.
5. The method according to claim 4, wherein the step of obtaining the target block files to be backed up from the target edge node and then backing up the obtained target block files to be backed up locally comprises:
each backup node sends a file acquisition request to a corresponding target edge node so as to acquire a target block file which needs to be backed up by the backup node from the corresponding target edge node.
6. The method according to claim 5, wherein the step of obtaining the target block file to be backed up from the target edge node and then backing up the obtained target block file to be backed up locally further comprises:
each target edge node responds to the received file acquisition request and sends a copy of the target block file indicated in the file acquisition request to the backup node initiating the file acquisition request.
7. The method according to any one of claims 1-6, further comprising:
after the backup node locally backs up the acquired target block file to be backed up, the backup node sends a notification that the backup operation is completed to the central node;
and in response to the received notification that the backup operation is completed, the central node updates the backup quantity value of the corresponding target block file to be backed up.
8. A distributed data storage management method, the method being performed by a central node in a distributed data storage system, the system further comprising a plurality of edge nodes networked with the central node, the method comprising:
monitoring each of the edge nodes for a fault event;
determining a target block file to be backed up in response to the monitored fault event;
and scheduling at least one edge node as a backup node to execute backup operation on the target block file to be backed up.
9. A distributed data storage management method, the method being performed by an edge node in a distributed data storage system, the system comprising at least one central node and a plurality of edge nodes networked to the at least one central node, the method comprising:
in response to a scheduling instruction received from the central node, detecting a target edge node in which the target block file to be backed up indicated in the scheduling instruction is stored, acquiring the target block file to be backed up from the target edge node, and then backing up the acquired target block file to be backed up locally.
10. A distributed data storage system, characterized in that the distributed data storage system comprises at least one central node and a plurality of edge nodes network-connected to the at least one central node, each edge node storing a plurality of block files;
the central node monitors a fault event of each edge node, determines a target block file to be backed up in response to the monitored fault event, and then schedules at least one edge node as a backup node to perform backup operation on the target block file to be backed up;
and each edge node serving as the backup node detects a target edge node in which the target block file is stored based on the scheduling of the central node, acquires the target block file to be backed up from the target edge node, and then locally backs up the acquired target block file to be backed up.
11. A distributed data storage management apparatus, wherein the apparatus operates on a central node of a distributed data storage system, wherein the system further comprises a plurality of edge nodes networked to the central node, wherein the apparatus comprises:
the monitoring unit is used for monitoring the fault event of each edge node;
the judging unit is used for responding to the monitored fault event to determine a target block file to be backed up;
and the scheduling unit is used for scheduling at least one edge node as a backup node to execute backup operation on the target block file to be backed up.
12. A distributed data storage management apparatus, wherein the apparatus operates on edge nodes of a distributed data storage system, the system comprising at least one central node and a plurality of edge nodes networked to the at least one central node, the apparatus comprising:
a communication unit for reporting a fault event to the central node;
a receiving unit, configured to receive a scheduling instruction of the central node;
and a processing unit, configured to detect, in response to the received scheduling instruction, a target edge node in which the target block file to be backed up indicated in the scheduling instruction is stored, acquire the target block file to be backed up from the target edge node, and then back up the acquired target block file to be backed up locally.
13. A storage medium, characterized in that it stores a computer program that can be loaded by a processor to perform the distributed data storage management method according to any one of claims 1 to 9.
CN202110884351.3A 2021-08-03 2021-08-03 Distributed data storage system, management method, device and storage medium Pending CN113568783A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110884351.3A CN113568783A (en) 2021-08-03 2021-08-03 Distributed data storage system, management method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110884351.3A CN113568783A (en) 2021-08-03 2021-08-03 Distributed data storage system, management method, device and storage medium

Publications (1)

Publication Number Publication Date
CN113568783A true CN113568783A (en) 2021-10-29

Family

ID=78170056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110884351.3A Pending CN113568783A (en) 2021-08-03 2021-08-03 Distributed data storage system, management method, device and storage medium

Country Status (1)

Country Link
CN (1) CN113568783A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114168083A (en) * 2021-12-10 2022-03-11 四川爱联科技股份有限公司 Data storage system and method and electronic equipment
WO2023093079A1 (en) * 2021-11-26 2023-06-01 浪潮通信信息系统有限公司 Consistency check method and apparatus for distributed edge cloud edge nodes

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023093079A1 (en) * 2021-11-26 2023-06-01 浪潮通信信息系统有限公司 Consistency check method and apparatus for distributed edge cloud edge nodes
CN114168083A (en) * 2021-12-10 2022-03-11 四川爱联科技股份有限公司 Data storage system and method and electronic equipment
CN114168083B (en) * 2021-12-10 2023-08-08 四川爱联科技股份有限公司 Data storage system, method and electronic equipment

Similar Documents

Publication Publication Date Title
US7480644B2 (en) Systems methods, and software for distributed loading of databases
CN113568783A (en) Distributed data storage system, management method, device and storage medium
CN111818159B (en) Management method, device, equipment and storage medium of data processing node
CN110535692B (en) Fault processing method and device, computer equipment, storage medium and storage system
US20090198385A1 (en) Storage medium for storing power consumption monitor program, power consumption monitor apparatus and power consumption monitor method
US20090259741A1 (en) Grid Computing Implementation
CN103354503A (en) Cloud storage system capable of automatically detecting and replacing failure nodes and method thereof
CN114064374A (en) Fault detection method and system based on distributed block storage
CN112783792A (en) Fault detection method and device of distributed database system and electronic equipment
CN111611057A (en) Distributed retry method, device, electronic equipment and storage medium
CN114118991A (en) Third-party system monitoring system, method, device, equipment and storage medium
CN110515757B (en) Information processing method, device, server and medium of distributed storage system
CN110545197B (en) Node state monitoring method and device
CN113765687A (en) Fault alarm method, device, equipment and storage medium of server
CN112416731B (en) Stability monitoring method and device applied to block chain system
CN115145782A (en) Server switching method, mooseFS system and storage medium
CN114328033A (en) Method and device for keeping service configuration consistency of high-availability equipment group
JP2022052504A (en) Bmc, server system, device stabilization determination method, and program
JP2009086758A (en) Computer system and system management program
CN115473802B (en) Node management method, system, equipment and storage medium
CN116662040B (en) Message distribution method and device, electronic equipment and storage medium
CN110830281B (en) Hot standby method and system based on mesh network structure
US20230222846A1 (en) Task managing system for testing-configuring vehicles based on a task order and method thereof
CN111865670B (en) Warehouse network rapid recovery method and warehouse network rapid recovery server
CN115834603A (en) Data synchronization method and device, storage medium and processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination