CN113076298A - Distributed small file storage system - Google Patents
- Publication number
- CN113076298A (application CN202110404012.0A)
- Authority
- CN
- China
- Prior art keywords
- file
- node
- masternode
- datanode
- directory tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
Abstract
The invention discloses a distributed small file storage system comprising a MasterNode node and a plurality of DataNode nodes deployed in a Master-Slave architecture. The MasterNode node operates on and manages a file directory tree and manages the DataNode nodes; each DataNode node stores the file data recorded in the file directory tree. The file directory tree is stored in a redis database cluster, and the MasterNode node obtains the file directory tree from the redis database cluster whenever it operates on the tree. The invention addresses the problem that large numbers of small files cannot be stored efficiently.
Description
Technical Field
The invention belongs to the technical field of data storage, and particularly relates to a distributed small file storage system.
Background
The invention rests on two developments. First, the digital transformation of enterprises is accelerating and creates a demand for mass data storage. Second, distributed file systems have developed rapidly, particularly alongside big data technology; distributed file storage is now widely used in enterprises, and the technology is relatively mature.
At present, distributed file storage plays an important role in data storage, data backup, data mining, machine learning and similar applications, and the functions of distributed file storage systems keep growing as the technology improves. The basic function of such a system is file storage: it provides interfaces through which users store files on a server, together with storage and backup functions, so that the server can conveniently hold files of many kinds.
The wide application of distributed file storage systems is also an important background of the invention. A distributed file storage system is built on file reading, file writing and file management: it stores files on a server, that is, writes them to the server's disk; it lets users download and view files from the server, that is, reads them from the server's disk; and it manages the file directories of the whole file system.
The distributed file storage systems in wide use today are mainly FastDFS, written in C, and Hadoop, written in Java.
FastDFS lacks a backup notification mechanism: once a copy has been written to one storage node, a failure of that source storage node while it synchronizes to the other backup storage nodes may lose user data, which is unacceptable for a file storage system. FastDFS also lacks an automatic recovery mechanism, and its data recovery is inefficient.
Hadoop was designed for big data storage. Although it offers high reliability, scalability and fault tolerance, its architecture makes it unsuitable for low-latency data access. In addition, Hadoop manages the file directory tree in memory, which creates a memory bottleneck: massive numbers of small files occupy a large amount of memory, so they cannot be stored efficiently.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a distributed small file storage system that solves the problem that a large number of small files cannot be stored efficiently.
To solve this technical problem, the invention adopts the following technical scheme: the distributed small file storage system comprises a MasterNode node and a plurality of DataNode nodes deployed in a Master-Slave architecture;
the MasterNode node is used for operating and managing a file directory tree and managing a DataNode node; the DataNode node is used for storing file data recorded in the file directory tree; the file directory tree is stored in a redis database cluster, and when the MasterNode node operates the file directory tree, the file directory tree is obtained from the redis database cluster.
The distributed small file storage system also comprises a SecondaryMasterNode node, which maintains the same file directory tree as the MasterNode node; operations of the MasterNode node on the file directory tree are synchronized to the SecondaryMasterNode node; after the SecondaryMasterNode node has applied the synchronized operations to the file directory tree, the file directory tree is synchronized to the redis database cluster.
In the distributed small file storage system, operations of the MasterNode node on the file directory tree generate an editslog file; at intervals of T, the SecondaryMasterNode node backs up the file directory tree and synchronizes the resulting fsimage file to the MasterNode node, and the MasterNode node clears the editslog entries recorded before the fsimage file was generated.
In the distributed small file storage system, the operation of the MasterNode node on the file directory tree includes data adding operation, data deleting operation, data querying operation and/or data modifying operation on the file directory tree.
In the distributed small file storage system, the plurality of DataNode nodes communicate with each other through the gRPC protocol.
In the distributed small file storage system, each DataNode node sends a heartbeat packet to the MasterNode node through the gRPC protocol to report its own state.
The distributed small file storage system also comprises a client side, wherein the client side is used for a user to access the MasterNode node to operate the file directory tree, and upload files to the DataNode node and/or read files from the DataNode node.
In the distributed small file storage system, the client uploads the file to the DataNode node, and the method includes the following steps:
Step 1, the client sends a file upload request to the MasterNode node;
Step 2, the MasterNode node queries the file directory tree and determines whether the ID of the file to be uploaded is already recorded in it; if it is, the MasterNode node asks the client whether to overwrite the file and, if the client confirms, proceeds to the next step; if the ID is not recorded, it proceeds directly to the next step;
Step 3, the MasterNode node queries the DataNode list information and returns the location of the DataNode node that is closest in network distance and able to accept the upload; the network distance is the communication distance between the client and the DataNode node;
Step 4, the client sends a pipeline request to the returned DataNode node and, once the pipeline is established, streams the file to the DataNode node through a SocketStream;
Step 5, on receiving the data, the DataNode node writes it to a file as an IO stream and synchronizes it to a backup DataNode node as a SocketStream stream;
Step 6, after the data has been written, the DataNode node reports the successful write to the MasterNode node and the client.
In the distributed small file storage system, the client reads files from the DataNode node, and the method comprises the following steps:
Step a, the client sends the name of the file to be read to the MasterNode node;
Step b, the MasterNode node looks up the location of the file by its file name;
Step c, the MasterNode node returns the file location to the client according to network distance; the network distance is the communication distance between the client and the DataNode node;
Step d, the client reads the file from the DataNode node at the returned location.
Compared with the prior art, the invention has the following advantages: the file directory tree is stored in the redis database cluster, so that the problem of memory bottleneck is solved.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a block diagram of the system of the present invention.
FIG. 2 is a schematic diagram of a file uploading process according to the present invention.
FIG. 3 is a diagram illustrating a process of reading a file according to the present invention.
Detailed Description
As shown in fig. 1, the distributed small file storage system includes a MasterNode node and a plurality of DataNode nodes deployed by a Master-Slave architecture;
the MasterNode node is used for operating and managing a file directory tree and managing a DataNode node; the DataNode node is used for storing file data recorded in the file directory tree; the file directory tree is stored in a redis database cluster, and when the MasterNode node operates the file directory tree, the file directory tree is obtained from the redis database cluster.
It should be noted that using a redis database cluster as the metadata center for the file directory tree greatly relieves the memory shortage and removes the memory bottleneck found in Hadoop, so that large numbers of small files can be stored.
The distributed small file storage system also comprises a SecondaryMasterNode node, which is an assistant to the MasterNode node: because the MasterNode node carries a heavy workload, another node is needed to help it back up the file directory. The SecondaryMasterNode node maintains the same file directory tree as the MasterNode node; operations of the MasterNode node on the file directory tree are synchronized to the SecondaryMasterNode node; after the SecondaryMasterNode node has applied the synchronized operations to the file directory tree, the file directory tree is synchronized to the redis database cluster.
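As an illustration only, the idea of keeping the file directory tree in a key-value store instead of in the MasterNode node's memory can be sketched as follows. The patent does not specify a key layout, so the one-hash-per-directory scheme, the `DictStore` stand-in (an in-memory substitute loosely modeled on hash-style get/set operations, not a real redis client) and all class and method names here are assumptions.

```python
class DictStore:
    """In-memory stand-in for a hash-style key-value store (hypothetical)."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hdel(self, key, field):
        self._hashes.get(key, {}).pop(field, None)

    def hkeys(self, key):
        return sorted(self._hashes.get(key, {}))


class DirectoryTree:
    """One hash per directory: child name -> 'dir' or 'file@<datanode>'."""

    def __init__(self, store):
        self.store = store

    @staticmethod
    def _key(path):
        return "dir:" + path

    def mkdir(self, parent, name):
        self.store.hset(self._key(parent), name, "dir")

    def add_file(self, parent, name, datanode):
        self.store.hset(self._key(parent), name, "file@" + datanode)

    def remove(self, parent, name):
        self.store.hdel(self._key(parent), name)

    def locate(self, parent, name):
        """Return the DataNode holding the file, or None if it is not a file."""
        value = self.store.hget(self._key(parent), name)
        if value is None or not value.startswith("file@"):
            return None
        return value[len("file@"):]

    def listdir(self, path):
        return self.store.hkeys(self._key(path))


tree = DirectoryTree(DictStore())
tree.mkdir("/", "images")
tree.add_file("/images", "a.png", "datanode-1")
```

Because every lookup touches only the hash of one directory, the MasterNode node in this sketch never has to hold the whole tree in its own memory, which is the property the description attributes to the redis-backed design.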
The distributed small file storage system also comprises a client side, wherein the client side is used for enabling a user to access the MasterNode node to operate the file directory tree, and uploading files to the DataNode node and/or reading files from the DataNode node.
It should be noted that the DFSClient on the client is the module through which the system of the present invention provides an interface to the user; with the DFSClient module the user can perform operations such as creating, viewing and deleting directories and uploading, downloading and deleting files. The module can also be used as a dependency by other systems, so that they can access the cluster through a dedicated API and operate on the system.
In this embodiment, each operation of the MasterNode node on the file directory tree is recorded in an editslog file. At intervals of T, where T can be set according to actual requirements, the SecondaryMasterNode node backs up the file directory tree and synchronizes the resulting fsimage file to the MasterNode node, which then clears the editslog entries recorded before the fsimage file was generated.
It should be noted that the MasterNode node is the core module of the system of the present invention. It provides services to the outside, manages the file directory tree, manages the operation log, manages the DataNode nodes, and so on. Every request from a client is sent to the MasterNode node first, and the MasterNode node responds according to the request type. The most important function of the MasterNode node is managing the operation log: every operation a client performs on the file system is recorded in the editslog file, so that after a failure the MasterNode node can be restarted and the operations replayed from the editslog file to rebuild the complete file directory tree without losing data. Because the editslog is written continuously, the file keeps growing; if nothing is done, the editslog file that the MasterNode node has to read becomes too large and performance degrades. The editslog file can therefore be stored in segments, so that the system only needs to read a small segment at a time, which greatly improves efficiency. Furthermore, replaying the editslog takes a long time, and as the operation log grows, recovery after a MasterNode outage takes too long. To address this, the SecondaryMasterNode node also provides an fsimage function: at intervals it writes a backup of the file directory tree held in the redis database to an fsimage file, synchronizes the fsimage file to the MasterNode node, and the editslog entries before that point in time are cleared. When the MasterNode node recovers from an outage, it only needs to read the fsimage file and replay the remaining operations from the editslog file to obtain the complete file directory, which greatly shortens the recovery time after a MasterNode outage.
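The interplay of the editslog and the fsimage described above can be sketched as a small in-memory model. This is a sketch under stated assumptions rather than the patented implementation: the transaction IDs, the flat `path -> location` dictionary standing in for the file directory tree, and the names `MasterState`, `checkpoint` and `recover` are all hypothetical.

```python
import copy


def _apply(tree, op, path, value):
    """Apply one logged operation to a directory-tree dictionary."""
    if op == "put":
        tree[path] = value
    elif op == "delete":
        tree.pop(path, None)
    else:
        raise ValueError("unknown operation: " + op)


class MasterState:
    def __init__(self):
        self.tree = {}       # path -> DataNode location (stand-in for the tree)
        self.editslog = []   # entries of (txid, op, path, value)
        self.next_txid = 1   # assumption: operations carry increasing IDs

    def apply(self, op, path, value=None):
        """Record the operation in the editslog, then apply it to the tree."""
        self.editslog.append((self.next_txid, op, path, value))
        self.next_txid += 1
        _apply(self.tree, op, path, value)


def checkpoint(master):
    """Snapshot the tree as an fsimage and clear the log entries it covers."""
    last_txid = master.next_txid - 1
    fsimage = {"last_txid": last_txid, "tree": copy.deepcopy(master.tree)}
    master.editslog = [e for e in master.editslog if e[0] > last_txid]
    return fsimage


def recover(fsimage, editslog):
    """Rebuild the tree from the fsimage plus the operations logged after it."""
    tree = copy.deepcopy(fsimage["tree"])
    for txid, op, path, value in editslog:
        if txid > fsimage["last_txid"]:
            _apply(tree, op, path, value)
    return tree


master = MasterState()
master.apply("put", "/a.txt", "datanode-1")
image = checkpoint(master)            # editslog is now empty
master.apply("delete", "/a.txt")      # only this entry must be replayed
rebuilt = recover(image, master.editslog)
```

After a checkpoint, recovery replays only the entries logged since the snapshot, so recovery time is bounded by the checkpoint interval T rather than by the full operation history.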
In addition, the operation of the MasterNode node on the file directory tree includes data adding operation, data deleting operation, data querying operation and/or data modifying operation on the file directory tree.
As shown in FIG. 1, the plurality of DataNode nodes communicate with each other through the gRPC protocol. Each DataNode node sends a heartbeat packet to the MasterNode node through the gRPC protocol to report its own state. Keeping all communication on the gRPC protocol helps the whole cluster operate efficiently and with low latency.
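The bookkeeping a MasterNode node might do with incoming heartbeats can be sketched as below. The gRPC transport is deliberately left out; the 30-second timeout, the status strings and the `HeartbeatRegistry` name are assumptions made for illustration and do not come from the patent.

```python
class HeartbeatRegistry:
    """Tracks the last heartbeat received from each DataNode node."""

    def __init__(self, timeout_seconds=30.0):
        self.timeout = timeout_seconds   # hypothetical liveness window
        self.last_seen = {}              # node id -> (timestamp, status)

    def report(self, node_id, status, now):
        """Called when a heartbeat packet arrives from a DataNode node."""
        self.last_seen[node_id] = (now, status)

    def live_nodes(self, now):
        """Nodes whose most recent heartbeat is within the timeout window."""
        return sorted(
            node for node, (seen, _status) in self.last_seen.items()
            if now - seen <= self.timeout
        )

    def status_of(self, node_id):
        entry = self.last_seen.get(node_id)
        return entry[1] if entry is not None else None


registry = HeartbeatRegistry(timeout_seconds=30.0)
registry.report("datanode-1", "ok", now=100.0)
registry.report("datanode-2", "disk-full", now=60.0)
alive = registry.live_nodes(now=120.0)   # datanode-2 has timed out by now
```

Passing `now` explicitly keeps the sketch deterministic; a real node would more likely read a monotonic clock.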
As shown in fig. 2, the uploading of the file to the DataNode node by the client includes the following steps:
Step 1, the client sends a file upload request to the MasterNode node;
Step 2, the MasterNode node queries the file directory tree and determines whether the ID of the file to be uploaded is already recorded in it; if it is, the MasterNode node asks the client whether to overwrite the file and, if the client confirms, proceeds to the next step; if the ID is not recorded, it proceeds directly to the next step;
Step 3, the MasterNode node queries the DataNode list information and returns the location of the DataNode node that is closest in network distance and able to accept the upload; the network distance is the communication distance between the client and the DataNode node;
Step 4, the client sends a pipeline request to the returned DataNode node and, once the pipeline is established, streams the file to the DataNode node through a SocketStream;
Step 5, on receiving the data, the DataNode node writes it to a file as an IO stream and synchronizes it to a backup DataNode node as a SocketStream stream;
Step 6, after the data has been written, the DataNode node reports the successful write to the MasterNode node and the client.
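The six upload steps can be traced with a small in-memory simulation. Everything here is a simplification for illustration: sockets and streams are replaced by a generator of byte chunks, a single client is assumed so that network distance is just a number stored on each node, and refusing the upload when the client declines to overwrite is an assumption, since the text does not say what happens in that case. All names are hypothetical.

```python
class DataNode:
    def __init__(self, name, distance, available=True, backup=None):
        self.name = name
        self.distance = distance    # network distance to the (single) client
        self.available = available  # whether the node can accept uploads
        self.backup = backup        # optional backup DataNode
        self.files = {}

    def receive(self, file_id, chunks):
        """Step 5: consume the streamed chunks, store them, copy to backup."""
        data = b"".join(chunks)
        self.files[file_id] = data
        if self.backup is not None:
            self.backup.files[file_id] = data
        return True                 # step 6: report the successful write


class MasterNode:
    def __init__(self, datanodes):
        self.datanodes = list(datanodes)
        self.tree = {}              # file ID -> DataNode name

    def request_upload(self, file_id, overwrite):
        """Steps 2 and 3: overwrite check, then pick the nearest usable node."""
        if file_id in self.tree and not overwrite:
            return None             # assumption: a declined overwrite aborts
        candidates = [d for d in self.datanodes if d.available]
        if not candidates:
            return None
        return min(candidates, key=lambda d: d.distance)

    def record(self, file_id, node):
        self.tree[file_id] = node.name


def upload(master, file_id, data, overwrite=False, chunk_size=4):
    node = master.request_upload(file_id, overwrite)          # steps 1 to 3
    if node is None:
        return False
    chunks = (data[i:i + chunk_size]                           # step 4
              for i in range(0, len(data), chunk_size))
    if node.receive(file_id, chunks):                          # steps 5 and 6
        master.record(file_id, node)
        return True
    return False
```

A call such as `upload(master, "f1", b"hello world")` returns `True` once the nearest available DataNode has stored the bytes and its backup holds the same copy.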
As shown in fig. 3, the client reads the file from the DataNode node, and includes the following steps:
Step a, the client sends the name of the file to be read to the MasterNode node;
Step b, the MasterNode node looks up the location of the file by its file name;
Step c, the MasterNode node returns the file location to the client according to network distance; the network distance is the communication distance between the client and the DataNode node;
Step d, the client reads the file from the DataNode node at the returned location.
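Steps a to d can be sketched in the same spirit. The sketch assumes that a file may be held by several DataNode nodes (the source and its backup) and that the MasterNode node knows a numeric distance from the client to each; both the data layout and the function names are hypothetical.

```python
def locate(replicas_by_name, distances, file_name):
    """Steps b and c: find the replicas of a file and return the nearest one."""
    replicas = replicas_by_name.get(file_name)
    if not replicas:
        return None
    return min(replicas, key=lambda node: distances[node])


def read_file(replicas_by_name, distances, storage, file_name):
    """Steps a to d from the client's point of view."""
    node = locate(replicas_by_name, distances, file_name)
    if node is None:
        raise FileNotFoundError(file_name)
    return storage[node][file_name]   # step d: read from the returned location


# Hypothetical cluster state: one file stored on two DataNode nodes.
replicas_by_name = {"report.txt": ["datanode-1", "datanode-2"]}
distances = {"datanode-1": 7, "datanode-2": 2}
storage = {
    "datanode-1": {"report.txt": b"quarterly numbers"},
    "datanode-2": {"report.txt": b"quarterly numbers"},
}
content = read_file(replicas_by_name, distances, storage, "report.txt")
```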
It should be noted that the system meets the need for distributed storage of massive numbers of small files and responds quickly. The system exposes a general-purpose API: a cluster can be accessed and operated simply by importing DFSClient and applying a small amount of configuration. Because the system is designed for small files, it takes full account of the bottlenecks of small-file storage and of high availability and scalability; introducing the SecondaryMasterNode node solves the problem of backing up the MasterNode node's file directory and supports the high availability and fault tolerance of the whole cluster. The distributed file storage system places low demands on computer hardware: a cluster built from a number of inexpensive servers can provide efficient and reliable file storage, and it can be expanded easily. In theory, storage capacity can keep growing as long as DataNode nodes continue to be added.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and all simple modifications, changes and equivalent structural changes made to the above embodiment according to the technical spirit of the present invention still fall within the protection scope of the technical solution of the present invention.
Claims (10)
1. A distributed small file storage system, characterized in that: the system comprises a MasterNode node and a plurality of DataNode nodes deployed in a Master-Slave architecture;
the MasterNode node is used for operating and managing a file directory tree and managing a DataNode node; the DataNode node is used for storing file data recorded in the file directory tree; the file directory tree is stored in a redis database cluster, and when the MasterNode node operates the file directory tree, the file directory tree is obtained from the redis database cluster.
2. The distributed small file storage system of claim 1, wherein: the system also comprises a SecondaryMasterNode node, which maintains the same file directory tree as the MasterNode node; operations of the MasterNode node on the file directory tree are synchronized to the SecondaryMasterNode node; after the SecondaryMasterNode node has applied the synchronized operations to the file directory tree, the file directory tree is synchronized to the redis database cluster.
3. The distributed small file storage system of claim 2, wherein: operations of the MasterNode node on the file directory tree generate an editslog file; at intervals of T, the SecondaryMasterNode node backs up the file directory tree and synchronizes the resulting fsimage file to the MasterNode node, and the MasterNode node clears the editslog entries recorded before the fsimage file was generated.
4. The distributed small file storage system of claim 3, wherein: the operation of the MasterNode node on the file directory tree comprises a data adding operation, a data deleting operation, a data querying operation and/or a data modifying operation on the file directory tree.
5. The distributed small file storage system of claim 1, 2 or 3, wherein: the plurality of DataNode nodes communicate with each other through the gRPC protocol.
6. The distributed small file storage system of claim 1, 2 or 3, wherein: each DataNode node sends a heartbeat packet to the MasterNode node through the gRPC protocol to report its own state.
7. The distributed small file storage system of claim 1, 2 or 3, wherein: the system also comprises a client, the client being used by a user to access the MasterNode node to operate on the file directory tree, and to upload files to the DataNode node and/or read files from the DataNode node.
8. The distributed small file storage system of claim 7, wherein: the uploading of a file to the DataNode node by the client comprises the following steps:
Step 1, the client sends a file upload request to the MasterNode node;
Step 2, the MasterNode node queries the file directory tree and determines whether the ID of the file to be uploaded is already recorded in it; if it is, the MasterNode node asks the client whether to overwrite the file and, if the client confirms, proceeds to the next step; if the ID is not recorded, it proceeds directly to the next step;
Step 3, the MasterNode node queries the DataNode list information and returns the location of the DataNode node that is closest in network distance and able to accept the upload; the network distance is the communication distance between the client and the DataNode node;
Step 4, the client sends a pipeline request to the returned DataNode node and, once the pipeline is established, streams the file to the DataNode node through a SocketStream.
9. The distributed small file storage system of claim 8, further comprising:
Step 5, on receiving the data, the DataNode node writes it to a file as an IO stream and synchronizes it to a backup DataNode node as a SocketStream stream;
Step 6, after the data has been written, the DataNode node reports the successful write to the MasterNode node and the client.
10. The distributed small file storage system of claim 7, wherein: the reading of a file from the DataNode node by the client comprises the following steps:
Step a, the client sends the name of the file to be read to the MasterNode node;
Step b, the MasterNode node looks up the location of the file by its file name;
Step c, the MasterNode node returns the file location to the client according to network distance; the network distance is the communication distance between the client and the DataNode node;
Step d, the client reads the file from the DataNode node at the returned location.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110404012.0A CN113076298A (en) | 2021-04-15 | 2021-04-15 | Distributed small file storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110404012.0A CN113076298A (en) | 2021-04-15 | 2021-04-15 | Distributed small file storage system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113076298A (en) | 2021-07-06 |
Family
ID=76617776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110404012.0A Pending CN113076298A (en) | 2021-04-15 | 2021-04-15 | Distributed small file storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113076298A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115499426A (en) * | 2022-07-29 | 2022-12-20 | 天翼云科技有限公司 | Method, device, equipment and medium for transmitting mass small files |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7966293B1 (en) * | 2004-03-09 | 2011-06-21 | Netapp, Inc. | System and method for indexing a backup using persistent consistency point images |
CN103853612A (en) * | 2012-12-04 | 2014-06-11 | 中山大学深圳研究院 | Method for reading data based on digital family content under distributed storage |
CN111399760A (en) * | 2019-11-19 | 2020-07-10 | 杭州海康威视系统技术有限公司 | NAS cluster metadata processing method and device, NAS gateway and medium |
CN111427841A (en) * | 2020-02-26 | 2020-07-17 | 平安科技(深圳)有限公司 | Data management method and device, computer equipment and storage medium |
CN112416889A (en) * | 2020-10-27 | 2021-02-26 | 中科曙光南京研究院有限公司 | Distributed storage system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111723160B (en) | Multi-source heterogeneous incremental data synchronization method and system | |
JP7271670B2 (en) | Data replication method, device, computer equipment and computer program | |
WO2019154394A1 (en) | Distributed database cluster system, data synchronization method and storage medium | |
US11468015B2 (en) | Storage and synchronization of metadata in a distributed storage system | |
CN101809558B (en) | System and method for remote asynchronous data replication | |
US7653668B1 (en) | Fault tolerant multi-stage data replication with relaxed coherency guarantees | |
US6823474B2 (en) | Method and system for providing cluster replicated checkpoint services | |
US9547706B2 (en) | Using colocation hints to facilitate accessing a distributed data storage system | |
US20190163765A1 (en) | Continuous data management system and operating method thereof | |
US20070143286A1 (en) | File management method in file system and metadata server therefor | |
CN103138912B (en) | Method of data synchronization and system | |
CN111078667B (en) | Data migration method and related device | |
US20120278429A1 (en) | Cluster system, synchronization controlling method, server, and synchronization controlling program | |
CN101808127A (en) | Data backup method, system and server | |
CN107818111B (en) | Method for caching file data, server and terminal | |
CN103902405A (en) | Quasi-continuity data replication method and device | |
CN106873902B (en) | File storage system, data scheduling method and data node | |
CN113010496A (en) | Data migration method, device, equipment and storage medium | |
CN113076298A (en) | Distributed small file storage system | |
CN115563221A (en) | Data synchronization method, storage system, device and storage medium | |
CN111143366B (en) | High-efficiency storage method for massive large object data | |
WO2021208401A1 (en) | Continuous data protection system and method for modern applications | |
CN111522688B (en) | Data backup method and device for distributed system | |
CN108874592B (en) | Data cold standby method and system for Log-structured storage engine | |
CN112667698A (en) | MongoDB data synchronization method based on converged media platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||