CN113076298A

CN113076298A - Distributed small file storage system

Info

Publication number: CN113076298A
Application number: CN202110404012.0A
Authority: CN
Inventors: 许士松; 朱坤奎
Original assignee: Shanghai Zhuo Steel Chain Technology Co ltd
Current assignee: Shanghai Zhuo Steel Chain Technology Co ltd
Priority date: 2021-04-15
Filing date: 2021-04-15
Publication date: 2021-07-06

Abstract

The invention discloses a distributed small file storage system comprising a MasterNode node and a plurality of DataNode nodes deployed in a Master-Slave architecture. The MasterNode node operates on and manages a file directory tree and manages the DataNode nodes; the DataNode nodes store the file data recorded in the file directory tree. The file directory tree is stored in a redis database cluster, and the MasterNode node obtains the file directory tree from the redis database cluster whenever it operates on the tree. The invention addresses the problem that large numbers of small files cannot be stored efficiently.

Description

Distributed small file storage system

Technical Field

The invention belongs to the technical field of data storage, and particularly relates to a distributed small file storage system.

Background

The invention is motivated by two developments. First, enterprise digital transformation is accelerating and creates a demand for mass data storage. Second, distributed file systems have developed rapidly, particularly alongside big data technology, so that distributed file storage is now widely used in enterprises and the technology is relatively mature.

At present, distributed file storage is widely used in enterprises and plays an important role in data storage, data backup, data mining, machine learning, and the like, and the functions of distributed file storage systems keep expanding as the technology improves. The basic function of a distributed file storage system is file storage: it provides various interfaces through which users store files on a server, together with storage and backup functions, so that the server can conveniently hold files of many kinds.

The wide application of distributed file storage systems is also an important background of the invention. A distributed file storage system is built on file reading, file writing, and file management: it can store a file on a server (write it to the server's disk), download and view a file from the server (read it from the server's disk), and manage the file directories of the whole file system.

Two distributed file storage systems are in wide use at present: FastDFS, developed in C, and Hadoop, developed in Java.

FastDFS lacks a backup notification mechanism: once a copy has been written successfully to one storage node, a failure of that source node while the data is still being synchronized to the other backup storage nodes may lose user data, which is unacceptable for a file storage system. In addition, FastDFS lacks an automatic recovery mechanism, and its data recovery is inefficient.

Hadoop is designed for large-scale data storage. Although it offers high reliability, scalability, and fault tolerance, its architecture makes it unsuitable for low-latency data access. Moreover, Hadoop manages the file directory tree in memory, which creates a memory bottleneck: massive numbers of small files occupy a large amount of memory, so they cannot be stored efficiently.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a distributed small file storage system that addresses the problem that large numbers of small files cannot be stored efficiently.

To solve this technical problem, the invention adopts the following technical solution: the distributed small file storage system comprises a MasterNode node and a plurality of DataNode nodes deployed in a Master-Slave architecture;

the MasterNode node is used for operating and managing a file directory tree and managing a DataNode node; the DataNode node is used for storing file data recorded in the file directory tree; the file directory tree is stored in a redis database cluster, and when the MasterNode node operates the file directory tree, the file directory tree is obtained from the redis database cluster.

The distributed small file storage system further comprises a SecondaryMasterNode node, which maintains the same file directory tree as the MasterNode node; operations of the MasterNode node on the file directory tree are synchronized to the SecondaryMasterNode node; after the SecondaryMasterNode node has applied the synchronized operations to the file directory tree, the file directory tree is synchronized to the redis database cluster.

In the distributed small file storage system, operations of the MasterNode node on the file directory tree generate editslog files; at every time interval T, the SecondaryMasterNode node backs up the file directory tree, the fsimage file obtained from the backup is synchronized to the MasterNode node, and the MasterNode node clears the editslog files written before the fsimage file was generated.

In the distributed small file storage system, the operation of the MasterNode node on the file directory tree includes data adding operation, data deleting operation, data querying operation and/or data modifying operation on the file directory tree.

In the distributed small file storage system, a plurality of DataNode nodes are communicated with each other through a gRPC protocol.

In the distributed small file storage system, each DataNode node sends a heartbeat packet to the MasterNode through the gRPC protocol to report the self state.

The distributed small file storage system also comprises a client side, wherein the client side is used for a user to access the MasterNode node to operate the file directory tree, and upload files to the DataNode node and/or read files from the DataNode node.

In the distributed small file storage system, the client uploads a file to a DataNode node through the following steps:

Step 1, the client sends a file upload request to the MasterNode node;

Step 2, the MasterNode node queries the file directory tree and determines whether the ID of the file to be uploaded is already recorded there; if it is, the MasterNode node asks the client whether the file should be overwritten, and the process continues to the next step when the client confirms; if it is not, the process continues directly to the next step;

Step 3, the MasterNode node queries the DataNode list information and returns the location of the DataNode node that is closest in network distance and able to accept the upload; the network distance is the communication distance between the client and the DataNode node;

Step 4, the client establishes a pipeline with the returned DataNode node and, once the pipeline is established, uploads the file to the DataNode node as streaming data through a SocketStream;

Step 5, when the DataNode node receives the data, it writes the data to a file as an IO stream and synchronizes the data to a backup DataNode node as a SocketStream stream;

Step 6, after the data has been written, the DataNode node reports the successful file write to the MasterNode node and the client.
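
The MasterNode side of the upload steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `DataNodeInfo`, `MasterNode`, `handle_upload_request`, and `on_write_success` are hypothetical, the network distance is reduced to a precomputed integer, and the file directory tree is reduced to a set of file IDs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataNodeInfo:
    """Hypothetical record the MasterNode keeps for each DataNode."""
    address: str
    distance: int            # assumed network distance to the requesting client
    can_upload: bool = True  # assumed flag: node is able to accept uploads


@dataclass
class MasterNode:
    """Toy MasterNode: the directory tree is reduced to a set of file IDs."""
    file_ids: set = field(default_factory=set)
    datanodes: list = field(default_factory=list)

    def handle_upload_request(self, file_id: str, overwrite: bool = False) -> Optional[str]:
        # Step 2: an already recorded ID is accepted only if the client confirms overwrite.
        if file_id in self.file_ids and not overwrite:
            return None
        # Step 3: return the closest DataNode that can currently accept uploads.
        candidates = [d for d in self.datanodes if d.can_upload]
        if not candidates:
            return None
        return min(candidates, key=lambda d: d.distance).address

    def on_write_success(self, file_id: str) -> None:
        # Step 6: record the file once the DataNode reports a successful write.
        self.file_ids.add(file_id)


master = MasterNode(datanodes=[
    DataNodeInfo("dn-1:9000", distance=4),
    DataNodeInfo("dn-2:9000", distance=1, can_upload=False),
    DataNodeInfo("dn-3:9000", distance=2),
])
target = master.handle_upload_request("report.pdf")  # "dn-3:9000"
master.on_write_success("report.pdf")
```

In this sketch `dn-2` is the nearest node but is skipped because it cannot accept uploads, so the client is directed to `dn-3`. Steps 4 and 5 (streaming and replication to the backup DataNode node) are left out.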

In the distributed small file storage system, the client reads a file from a DataNode node through the following steps:

Step a, the client sends the name of the file to be read to the MasterNode node;

Step b, the MasterNode node looks up the location of the file according to the file name;

Step c, the MasterNode node returns the location of the file to the client according to the network distance; the network distance is the communication distance between the client and the DataNode node;

Step d, the client reads the file from the DataNode node at the returned location.
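
Steps b and c above amount to a lookup followed by a nearest-replica choice. The sketch below illustrates this under assumptions: the location table layout, the function name `locate_for_read`, and the integer distances are hypothetical and are not specified in the source.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical location table: file name -> list of (DataNode address, network distance).
FileLocations = Dict[str, List[Tuple[str, int]]]


def locate_for_read(locations: FileLocations, file_name: str) -> Optional[str]:
    """Return the address of the closest DataNode holding file_name, or None."""
    replicas = locations.get(file_name)
    if not replicas:
        return None  # the file is not recorded in the directory tree
    # Step c: choose the replica with the smallest network distance to the client.
    address, _distance = min(replicas, key=lambda replica: replica[1])
    return address


locations = {"a.txt": [("dn-1:9000", 3), ("dn-2:9000", 1)]}
nearest = locate_for_read(locations, "a.txt")  # "dn-2:9000"
```

The client would then open a connection to the returned address and read the file (step d), which is outside the scope of this sketch.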

Compared with the prior art, the invention has the following advantages: the file directory tree is stored in the redis database cluster, so that the problem of memory bottleneck is solved.

The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.

Drawings

FIG. 1 is a block diagram of the system of the present invention.

Fig. 2 is a schematic diagram of a file uploading process according to the present invention.

FIG. 3 is a diagram illustrating a process of reading a file according to the present invention.

Detailed Description

As shown in fig. 1, the distributed small file storage system includes a MasterNode node and a plurality of DataNode nodes deployed by a Master-Slave architecture;

It should be noted that using a redis database cluster as a metadata center to store the file directory tree greatly alleviates the shortage of memory and thereby addresses the memory bottleneck found in Hadoop, so that large numbers of small files can be stored.

The distributed small file storage system further comprises a SecondaryMasterNode node, which is an assistant node of the MasterNode node: because the MasterNode node carries a heavy workload, another node is needed to help it back up the file directory. The SecondaryMasterNode node maintains the same file directory tree as the MasterNode node; operations of the MasterNode node on the file directory tree are synchronized to the SecondaryMasterNode node; after the SecondaryMasterNode node has applied the synchronized operations to the file directory tree, the file directory tree is synchronized to the redis database cluster.
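
One way to picture a directory tree held in a key-value store rather than in process memory is sketched below. Everything here is an assumption made for illustration: the source does not describe a key schema, so the `dir:<path>` keys, the JSON metadata, and the class names are hypothetical, and `KeyValueStore` is an in-memory stand-in rather than a real redis client.

```python
import json
from typing import Dict, List, Optional


class KeyValueStore:
    """In-memory stand-in for the redis cluster (hash-like operations only)."""

    def __init__(self) -> None:
        self._hashes: Dict[str, Dict[str, str]] = {}

    def hset(self, key: str, field: str, value: str) -> None:
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key: str, field: str) -> Optional[str]:
        return self._hashes.get(key, {}).get(field)

    def hkeys(self, key: str) -> List[str]:
        return sorted(self._hashes.get(key, {}))


class DirectoryTree:
    """Directory tree kept in the store: one hash per directory (assumed schema)."""

    def __init__(self, store: KeyValueStore) -> None:
        self.store = store

    def add_file(self, directory: str, name: str, datanodes: List[str]) -> None:
        # Key "dir:<path>", field <file name>, value JSON metadata with replica locations.
        self.store.hset(f"dir:{directory}", name, json.dumps({"datanodes": datanodes}))

    def lookup(self, directory: str, name: str) -> Optional[List[str]]:
        raw = self.store.hget(f"dir:{directory}", name)
        return json.loads(raw)["datanodes"] if raw is not None else None

    def list_dir(self, directory: str) -> List[str]:
        return self.store.hkeys(f"dir:{directory}")


tree = DirectoryTree(KeyValueStore())
tree.add_file("/invoices", "2021-04.pdf", ["dn-1:9000", "dn-3:9000"])
tree.add_file("/invoices", "2021-03.pdf", ["dn-2:9000"])
```

Because each lookup touches only the entry it needs, the MasterNode process does not have to hold the whole tree in its own memory, which is the property the paragraph above relies on.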

The distributed small file storage system also comprises a client side, wherein the client side is used for enabling a user to access the MasterNode node to operate the file directory tree, and uploading files to the DataNode node and/or reading files from the DataNode node.

It should be noted that the DFSClient on the client side is the module that provides an interface for the user in the system of the present invention. Through the DFSClient module the user can create, view, and delete directories and upload and download files. Other systems may also depend on this module and access the cluster through its API to perform operations on the system.

In this embodiment, operations of the MasterNode node on the file directory tree generate an editslog file. At every time interval T, where T can be set according to actual requirements, the SecondaryMasterNode node backs up the file directory tree, the fsimage file obtained from the backup is synchronized to the MasterNode node, and the MasterNode node clears the editslog files written before the fsimage file was generated.

It should be noted that the MasterNode node is the core module of the system of the present invention. It is mainly used for providing services externally, managing the file directory tree, managing the operation log, managing the DataNode nodes, and the like. All requests from the client are sent to the MasterNode node first, and the MasterNode node responds differently according to the request type.

The most important function of the MasterNode node is managing the operation log. All operations of a client on the file system are recorded in the editslog file, so that when the MasterNode node fails, it can be restarted and the operations replayed from the editslog file to obtain a complete file directory tree without losing data. Because the editslog is written continuously, the file keeps growing; if no measures are taken, the editslog file that the MasterNode node has to read becomes too large and performance degrades. The editslog file can therefore be stored in segments, so that the system only needs to read a small segment at a time, which greatly improves efficiency.

Furthermore, replaying editslog data takes a long time, so as the operation log grows, recovery after a MasterNode failure takes too long. To address this, the SecondaryMasterNode node also provides an fsimage function: at intervals it writes a backup of the file directory tree in the redis database to an fsimage file and synchronizes the fsimage file to the MasterNode node, and the editslog entries written before that point in time are cleared. When the MasterNode node recovers from a failure, it only needs to read the fsimage file and replay the remaining operations from the editslog file to obtain the complete file directory tree, which greatly shortens the recovery time after MasterNode downtime.
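
The interplay of editslog and fsimage described above can be sketched as a checkpoint-and-replay routine. This is a minimal illustration under assumptions: the edit tuple format, the sequence numbers, and the function names `apply_edit`, `checkpoint`, and `recover` are hypothetical, and the directory tree is reduced to a flat dictionary.

```python
from typing import Dict, List, Tuple

# An edit is (sequence number, operation, path, value); this format is an assumption.
Edit = Tuple[int, str, str, str]


def apply_edit(tree: Dict[str, str], edit: Edit) -> None:
    """Apply one logged operation to the (flattened) directory tree."""
    _seq, op, path, value = edit
    if op == "put":
        tree[path] = value
    elif op == "delete":
        tree.pop(path, None)


def checkpoint(tree: Dict[str, str], edits: List[Edit],
               image_seq: int = 0) -> Tuple[Dict[str, str], int, List[Edit]]:
    """Fold the edits into a new image; return (fsimage, last sequence, cleared log)."""
    image = dict(tree)
    for edit in edits:
        apply_edit(image, edit)
    last_seq = edits[-1][0] if edits else image_seq
    return image, last_seq, []  # edits covered by the image are cleared


def recover(fsimage: Dict[str, str], image_seq: int, edits: List[Edit]) -> Dict[str, str]:
    """Rebuild the tree from the image plus the edits logged after the image."""
    tree = dict(fsimage)
    for edit in edits:
        if edit[0] > image_seq:
            apply_edit(tree, edit)
    return tree


log: List[Edit] = [(1, "put", "/a", "dn-1"), (2, "put", "/b", "dn-2")]
fsimage, image_seq, log = checkpoint({}, log)  # periodic backup, log cleared
log.append((3, "delete", "/a", ""))            # operation logged after the backup
restored = recover(fsimage, image_seq, log)    # {"/b": "dn-2"}
```

Recovery replays only the single edit logged after the checkpoint instead of the full history, which is why the checkpoint shortens recovery time.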

In addition, the operation of the MasterNode node on the file directory tree includes data adding operation, data deleting operation, data querying operation and/or data modifying operation on the file directory tree.

As shown in fig. 1, the plurality of DataNode nodes communicate with each other through the gRPC protocol. Each DataNode node sends a heartbeat packet to the MasterNode node through the gRPC protocol to report its own state. Maintaining communication over the gRPC protocol keeps the whole cluster efficient and low in latency.
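
The bookkeeping a MasterNode might do with incoming heartbeats can be sketched as follows. The gRPC transport is deliberately omitted, and the timeout-based liveness rule, the class name `HeartbeatMonitor`, and its methods are assumptions for illustration; the source only states that heartbeats report each DataNode node's state.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat per DataNode; the gRPC transport is omitted."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds  # assumed liveness threshold
        self._last_seen: Dict[str, float] = {}

    def record(self, node_id: str, now: Optional[float] = None) -> None:
        # Called whenever a heartbeat packet arrives from node_id.
        self._last_seen[node_id] = time.monotonic() if now is None else now

    def alive_nodes(self, now: Optional[float] = None) -> List[str]:
        # A node counts as alive if its last heartbeat is within the timeout.
        current = time.monotonic() if now is None else now
        return sorted(
            node for node, seen in self._last_seen.items()
            if current - seen <= self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=10.0)
monitor.record("dn-1", now=100.0)
monitor.record("dn-2", now=105.0)
alive = monitor.alive_nodes(now=112.0)  # ["dn-2"]
```

The explicit `now` argument exists only to make the example deterministic; in a running service the monotonic clock default would be used.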

As shown in fig. 2, the uploading of the file to the DataNode node by the client includes the following steps:

step 1, a client sends a file uploading request to a MasterNode node;

As shown in fig. 3, the client reads the file from the DataNode node, and includes the following steps:

It should be noted that the system meets the need for distributed storage of massive numbers of small files and responds quickly. The system exposes a general-purpose API, so a cluster can be accessed and operated simply by introducing DFSClient and applying a simple configuration. Because the system is designed for small files, it takes into account the bottleneck, high-availability, and scalability problems of small file storage; by introducing the SecondaryMasterNode node, it solves the problem of backing up the file directory of the MasterNode node and helps ensure the high availability and fault tolerance of the whole cluster. The system places low demands on computer hardware: a cluster formed from several inexpensive servers can provide an efficient and reliable file storage service and can be expanded easily. In theory, storage capacity can keep growing as long as DataNode nodes continue to be added.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and all simple modifications, changes and equivalent structural changes made to the above embodiment according to the technical spirit of the present invention still fall within the protection scope of the technical solution of the present invention.

Claims

1. A distributed small file storage system, characterized in that: the system comprises a MasterNode node and a plurality of DataNode nodes deployed in a Master-Slave architecture;

2. The distributed small file storage system of claim 1, wherein: the system further comprises a SecondaryMasterNode node, which maintains the same file directory tree as the MasterNode node; operations of the MasterNode node on the file directory tree are synchronized to the SecondaryMasterNode node; after the SecondaryMasterNode node has applied the synchronized operations to the file directory tree, the file directory tree is synchronized to the redis database cluster.

3. The distributed small file storage system of claim 2, wherein: operations of the MasterNode node on the file directory tree generate editslog files; at every time interval T, the SecondaryMasterNode node backs up the file directory tree, the fsimage file obtained from the backup is synchronized to the MasterNode node, and the MasterNode node clears the editslog files written before the fsimage file was generated.

4. The distributed small file storage system of claim 3, wherein: the operation of the MasterNode node on the file directory tree comprises a data adding operation, a data deleting operation, a data querying operation and/or a data modifying operation on the file directory tree.

5. The distributed small file storage system of claim 1, 2 or 3, wherein: the plurality of DataNode nodes communicate with each other through the gRPC protocol.

6. The distributed small file storage system of claim 1, 2 or 3, wherein: each DataNode node sends a heartbeat packet to the MasterNode node through the gRPC protocol to report its own state.

7. The distributed small file storage system of claim 1, 2 or 3, wherein: the system further comprises a client, which is used by a user to access the MasterNode node to operate on the file directory tree, and to upload files to the DataNode nodes and/or read files from the DataNode nodes.

8. The distributed small file storage system of claim 7, wherein: the method for uploading a file to the DataNode node by the client comprises the following steps:

step 1, a client sends a file uploading request to a MasterNode node;

9. The distributed small file storage system of claim 8, wherein: further comprising:

10. The distributed small file storage system of claim 7, wherein: the method for reading a file from the DataNode node by the client comprises the following steps: