CN103207866A

CN103207866A - File storing method and system based on partitioning strategies

Info

Publication number: CN103207866A
Application number: CN201210012391XA
Authority: CN
Inventors: 王劲林; 黄垂碧; 王玲芳; 陈君
Original assignee: Institute of Acoustics CAS
Current assignee: Zhengzhou Xinrand Network Technology Co ltd
Priority date: 2012-01-16
Filing date: 2012-01-16
Publication date: 2013-07-17
Anticipated expiration: 2032-01-16
Also published as: CN103207866B

Abstract

The invention provides a file storage method and system based on a blocking strategy. In the method, data blocks are stored in a Past subsystem, and the method specifically comprises the following steps: 101, a user file to be stored is divided into a plurality of data blocks, and metadata that maintains information about the file, its blocks, and file access is created; and 102, all data blocks and the metadata are each stored as files in a client-side optimized Past subsystem, wherein the optimized Past subsystem is a Past subsystem to which deletion and update operations have been added, the data blocks are of equal size, and the metadata structure specifically comprises the file name, the name of each block of the file, the file size, the time of the last modification of the file, and the time of the last access to the file. The method and system inherit the high scalability, high availability, and high security of the Past file storage system, and in addition avoid unnecessary bandwidth waste and increased response time when large files are modified and read.

Description

File storage method and system based on blocking strategy

Technical Field

The invention relates to the technical field of distributed storage, in particular to distributed storage based on a structured P2P technology and application thereof, and specifically relates to a file storage method and system based on a partitioning strategy.

Background

With the global coverage and popularization of 3G technology, the emergence of operating systems such as iOS and Android has allowed mobile intelligent terminals to expand rapidly in the market, and applications and research based on P2P technology are attracting attention again.

Among the many applications based on P2P technology today, P2P-based social networking is undoubtedly a research focus. Most P2P-based social networking applications are built on Past. Past is an open-source application that implements redundant file storage on top of the structured P2P technology Pastry. A file stored in Past is replicated in full K times, and the replicas are distributed to the K nodes nearest to the file ID for storage. When a read request for a file is received, Past uses the Pastry routing protocol to send the request toward the root node corresponding to the file ID; during routing, as soon as the request reaches one of the K nodes, that node returns the file directly.
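As a rough illustration of the replica placement described above, the following Python sketch picks the K nodes whose IDs are closest to a file ID on a circular identifier space. The 16-bit ID space, the constant `ID_BITS`, and the function names are assumptions made for illustration only; they are not taken from the Past or Pastry source code.

```python
ID_BITS = 16                      # assumed small ID space, for readability only
ID_SPACE = 1 << ID_BITS


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two IDs on a circular ID space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def replica_nodes(file_id: int, node_ids: list[int], k: int) -> list[int]:
    """Return the k node IDs closest to file_id (ties broken by node ID)."""
    return sorted(node_ids, key=lambda n: (ring_distance(n, file_id), n))[:k]


nodes = [100, 900, 20000, 40000, 65500]
print(replica_nodes(50, nodes, 3))   # [100, 65500, 900]
```

Node 65500 is selected because the ID space wraps around: its distance to file ID 50 is 86, which is smaller than the distance of 850 for node 900.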

However, the basic unit of storage in Past is the file, and a stored file cannot be modified, updated, or deleted. This means that a user who needs to modify only part of a large file has to read the whole file, modify it, and then store it in Past as another version; likewise, a user who needs to read only part of a large file still has to read the entire file from the system. Both cases waste valuable bandwidth and increase the response time of the system. In addition, without a deletion operation the user cannot immediately release the space and file name occupied by files that are no longer needed, which constrains upper-layer applications and makes application development more difficult.

Disclosure of Invention

The invention aims to provide a file storage method and system based on a blocking strategy, in order to overcome the bandwidth waste and increased response time that arise when large files are stored, both of which are caused by the existing Past system using the file as its storage unit and providing no update or delete operations.

To achieve the above object, the present invention provides a file storage method based on a blocking policy, in which data blocks are stored in a Past subsystem, and the method specifically includes:

step 101) dividing a user file to be stored into a plurality of data blocks, and at the same time creating metadata that maintains information about the file, its blocks, and file access;

step 102) storing all data blocks and metadata as files in a client-side optimized Past subsystem;

the optimized Past subsystem is the Past subsystem added with deleting and updating operations.

In the above technical solution, the data blocks have equal sizes; the metadata structure specifically includes: the file name, the name of each block of the file, the file size, the time of the last modification of the file, and the time of the last access to the file.
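The blocking step and the metadata structure can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class name `FileMetadata`, the function `split_into_blocks`, and the block naming scheme `<file name>.blk<index>` are hypothetical, since the text does not specify how blocks are named. As in the embodiment described later, the last block keeps its actual size when the file length is not a multiple of the block size.

```python
import time
from dataclasses import dataclass


@dataclass
class FileMetadata:
    """The five metadata fields listed in the text (field names are assumed)."""
    file_name: str
    block_names: list[str]
    file_size: int
    last_modified: float
    last_accessed: float


def split_into_blocks(file_name: str, data: bytes, block_size: int):
    """Split data into fixed-size blocks; the last block may be shorter."""
    blocks = {}
    for index, start in enumerate(range(0, len(data), block_size)):
        # Hypothetical naming scheme: "<file name>.blk<index>"
        blocks[f"{file_name}.blk{index}"] = data[start:start + block_size]
    now = time.time()
    meta = FileMetadata(file_name, list(blocks), len(data), now, now)
    return blocks, meta


blocks, meta = split_into_blocks("report.bin", b"x" * 10, block_size=4)
print(meta.block_names)                    # ['report.bin.blk0', 'report.bin.blk1', 'report.bin.blk2']
print([len(b) for b in blocks.values()])   # [4, 4, 2]
```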

In the above technical solution, after step 102), the method further includes reading and updating a file data block, which comprises the following steps:

step 201) reading metadata of a file;

step 202) analyzing the metadata to obtain the block name of the corresponding data block;

step 203) reading/updating the corresponding data block according to the block name of the acquired data block.

The invention also provides a file storage system based on a blocking strategy, which stores data blocks in a Past subsystem. The file storage system comprises:

a file blocking unit for dividing the file to be stored into blocks;

a metadata generation unit for establishing metadata for retrieval based on the blocks obtained by the file blocking unit;

an auxiliary encapsulation unit for encapsulating all the blocks and the metadata as files in the Past subsystem; and

an optimized Past subsystem for storing the files encapsulated by the auxiliary encapsulation unit, the optimized Past subsystem being an improvement on the Past subsystem to which file deletion and update functions have been added.

In the above technical solution, the system further comprises an auxiliary subsystem for reading or updating the file data blocks stored by the optimized Past subsystem. The auxiliary subsystem comprises:

a reading module for reading the metadata of the file;

an analysis module for analyzing the metadata to obtain the block name of the corresponding data block; and

an operation execution module for reading or updating the corresponding data block according to the obtained block name.

The method and the system not only inherit the high scalability, availability, and security of the Past file storage system, but also avoid unnecessary bandwidth waste and increased response time when large files are modified and read.

The technical scheme adopted by the invention for solving the technical problems is as follows:

1. Deletion and update operations are added to the Past system. They provide the support the system later needs to update and delete file blocks, and they enable the system to better serve upper-layer applications.

2. The file is divided into multiple data blocks of equal size for storage, and metadata is created that maintains information about the file, its blocks, and file access. File blocking and the introduction of metadata allow the system to give applications data-block-level access to large files, with the data block as the storage unit.

3. In the Past system with deletion and update operations added, both data blocks and metadata are stored as files. That is, the modified Past system is encapsulated and used as the underlying data block storage system of the invention.

4. When a file data block is read, the metadata of the file is read from the underlying modified Past system, the metadata is analyzed to obtain the block name of the corresponding data block, and finally the corresponding data block is read from the Past system by using the block name.

5. When a file data block is updated, the metadata of the file is read from the underlying modified Past system, the metadata is analyzed to obtain the block name of the corresponding data block, and finally the corresponding data block in the Past system is updated by using the block name.

After the technical scheme is adopted, the invention has the following advantages:

1. While inheriting the high reliability, availability, and security of Past distributed storage, the method and system of the invention shorten the response time of operations and avoid a great deal of useless bandwidth consumption when part of a large file is updated or read.

2. The data block is used as the storage unit. The user can define the size of this unit, that is, the granularity of the data block, according to application requirements, and data blocks can be updated and deleted.
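To make the effect of block granularity concrete, the short Python calculation below estimates how many bytes are fetched when a small byte range is served from whole blocks, for several block sizes. The 1 GiB file size, the 4096-byte request, and the candidate block sizes are illustrative assumptions and not figures from the invention; the cost of fetching the metadata is ignored.

```python
import math


def bytes_transferred(offset: int, length: int, block_size: int) -> int:
    """Bytes fetched when a byte range is served from whole blocks."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return (last - first + 1) * block_size


file_size = 1 << 30                      # assumed 1 GiB file
for block_size in (64 << 10, 1 << 20, 16 << 20):
    n_blocks = math.ceil(file_size / block_size)
    fetched = bytes_transferred(offset=5_000_000, length=4096, block_size=block_size)
    print(block_size, n_blocks, fetched)
```

Smaller blocks reduce the data transferred per partial read but increase the number of block names the metadata has to record, which is the trade-off the user-defined granularity exposes.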

Drawings

FIG. 1 is a flow chart of how a node of the modified Past system processes an insert message;

FIG. 2 is a flow chart of the update operation executed by the client in the present invention;

FIG. 3 is a flow chart of the delete operation executed by the client in the present invention;

FIG. 4 is a timing diagram of request processing in an embodiment of the system of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described with reference to the accompanying drawings and specific embodiments.

Fig. 1 is a flow chart of how a node of the modified Past system processes an insert message in the present invention. The specific flow includes:

101) After receiving the insert message, the Past system node extracts the file to be inserted from the message.

102) The node queries whether the local file system already stores a file with the same ID as the file to be inserted. If so, go to 103; otherwise, go to 104.

103) The node compares the version of the inserted file with the version of the locally stored file with the same ID and judges whether the version of the inserted file is the latest. If it is, go to 104; otherwise, go to 105.

104) The Past system node stores the inserted file in the local file system.

105) The processing of the insert message ends.
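The insert-message flow above can be sketched in Python as follows. The `StoredFile` class, the `handle_insert` function, and the use of a dictionary as the node's local file system are hypothetical stand-ins. The sketch also assumes that "latest" means a strictly greater version number, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    file_id: str
    version: int
    content: bytes


def handle_insert(local_store: dict[str, StoredFile], incoming: StoredFile) -> bool:
    """Store `incoming` unless a same-ID file with an equal or newer version exists."""
    existing = local_store.get(incoming.file_id)                         # step 102
    if existing is not None and incoming.version <= existing.version:    # step 103
        return False                                                     # step 105: nothing stored
    local_store[incoming.file_id] = incoming                             # step 104
    return True


store = {}
print(handle_insert(store, StoredFile("f1", 1, b"old")))   # True
print(handle_insert(store, StoredFile("f1", 1, b"dup")))   # False
print(handle_insert(store, StoredFile("f1", 2, b"new")))   # True
```

Because a node overwrites a file only when a newer version arrives, update and delete can both be expressed as inserts with an incremented version number.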

With the modified insert-message processing flow of the Past system node shown in Fig. 1, the invention can further add deletion and update operations. Fig. 2 is a flow chart of the update operation executed by the client of the Past system in the present invention. The specific process includes:

201) The client receives a file update request from the user; the request contains the file ID and the binary file content.

202) The client reads the file from the Past system by using the ID of the file to be updated and judges whether the file is already stored in Past. If it is, go to 203; otherwise, go to 204.

203) The version of the updated file is set to the version of the read file plus 1.

204) The updated file is stored in the Past system.

205) The client update operation flow ends.
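A minimal Python sketch of the client update flow follows. `InMemoryPast` is a hypothetical in-memory stand-in for the modified Past system and does not reflect a real Past API; the initial version number 0 for a file that was never stored is also an assumption.

```python
class InMemoryPast:
    """Hypothetical stand-in for the modified Past system (not a real API)."""

    def __init__(self):
        self._files = {}                      # file_id -> (version, content)

    def lookup(self, file_id):
        return self._files.get(file_id)       # None if the file was never stored

    def insert(self, file_id, version, content):
        current = self._files.get(file_id)
        # Nodes keep a file only if it is newer than what they already hold.
        if current is None or version > current[0]:
            self._files[file_id] = (version, content)


def update_file(past, file_id, content):
    existing = past.lookup(file_id)                        # step 202
    version = existing[0] + 1 if existing else 0           # step 203 (initial version 0 is assumed)
    past.insert(file_id, version, content)                 # step 204
    return version


past = InMemoryPast()
update_file(past, "doc", b"v0")
update_file(past, "doc", b"v1")
print(past.lookup("doc"))      # (1, b'v1')
```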

Fig. 3 is a flow chart of the delete operation executed by the client of the Past system in the present invention. The specific process includes:

301) The client receives a file deletion request from the user; the request contains the file ID.

302) The client reads the file from the Past system by using the ID of the file to be deleted and judges whether the file is stored in Past. If it is, go to 303; otherwise, go to 304.

303) An empty file is created whose version number is the version of the read file plus 1.

304) The file created in 303 is stored in the Past system.

305) The client delete operation flow ends.
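The delete flow can be sketched in Python as shown below: deletion is expressed as storing an empty file with an incremented version. The dictionary standing in for the Past system, the function names, and the convention that a reader treats empty content as "deleted" are assumptions for illustration. The sketch also simply returns when the file was never stored, which is one possible reading of the flow.

```python
def delete_file(store: dict, file_id: str) -> bool:
    """Delete by overwriting with an empty file; store maps file_id -> (version, content)."""
    existing = store.get(file_id)              # step 302
    if existing is None:
        return False                           # assumed: nothing to delete
    version, _ = existing
    store[file_id] = (version + 1, b"")        # steps 303-304: empty file, version + 1
    return True


def read_file(store: dict, file_id: str):
    entry = store.get(file_id)
    # Assumed convention: empty content marks a deleted file.
    if entry is None or entry[1] == b"":
        return None
    return entry[1]


store = {"note": (3, b"hello")}
print(delete_file(store, "note"))      # True
print(store["note"])                   # (4, b'')
print(read_file(store, "note"))        # None
print(delete_file(store, "missing"))   # False
```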

The data structure of the metadata in the invention comprises the file name, the name of each block of the file, the file size, the last modification time of the file, and the last access time of the file. Recording this information prepares for the subsequent file blocking and block-related operations, which are described as follows:

Fig. 4 is a timing diagram of processing file operation requests according to an embodiment of the system of the present invention. The specific operation steps include:

Processing a file write request:

501) The user sends a file write request to the client of the system; the request contains the ID of the file and the binary content of the file.

502) The client divides the file into blocks according to the configured block size; if the last block is smaller than the configured block size, its actual size is used.

503) The client sends a storage request for each block to the modified Past system.

504) The Past system returns the storage results of the blocks.

505) The client creates metadata for the file according to the data structure of Fig. 4 and requests that it be stored in Past.

506) The Past system returns the storage result of the metadata.

507) The client returns the result of the file write to the user according to the storage results of the blocks and the metadata.
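The write sequence above can be sketched end to end in Python. A plain dictionary stands in for the modified Past system, the metadata is stored under the file ID itself (consistent with the later step in which metadata is read by file ID), and the JSON encoding and the `<file ID>.blk<index>` block names are assumptions made for illustration.

```python
import json
import time


def write_file(past: dict, file_id: str, data: bytes, block_size: int) -> bool:
    """Store blocks, then metadata; past is a dict standing in for Past (name -> bytes)."""
    block_names = []
    for index, start in enumerate(range(0, len(data), block_size)):    # step 502
        name = f"{file_id}.blk{index}"          # hypothetical block naming
        past[name] = data[start:start + block_size]                    # steps 503-504
        block_names.append(name)
    now = time.time()
    metadata = {                                                        # step 505
        "file_name": file_id,
        "block_names": block_names,
        "file_size": len(data),
        "last_modified": now,
        "last_accessed": now,
    }
    past[file_id] = json.dumps(metadata).encode()                       # steps 505-506
    return True                                                         # step 507


past = {}
write_file(past, "movie", b"0123456789", block_size=4)
print(sorted(past))          # ['movie', 'movie.blk0', 'movie.blk1', 'movie.blk2']
print(past["movie.blk2"])    # b'89'
```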

Processing a request to read or update a file data block:

508) The user sends a request to read or update a file data block to the client of the system. The request contains the ID of the file, the offset of the data block, and the size of the data block; an update request also contains the corresponding binary content.

509) The client requests the corresponding file metadata from Past according to the file ID.

510) The Past system returns the metadata.

511) The client analyzes the file metadata to obtain the name of the corresponding data block.

512) The client reads or updates the corresponding data block in the Past system according to the data block name.

513) The Past system returns the data block that was read or the result of the update.

514) The client returns the result of the request processing.
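The read and update sequence can be sketched in Python as follows, mapping the requested offset and size to block names through the metadata. A dictionary stands in for the Past system, the JSON metadata layout and all function names are assumptions, and the sketch assumes the requested range is non-empty and lies entirely within the file.

```python
import json


def locate_blocks(metadata: dict, offset: int, size: int, block_size: int):
    """Names of the blocks that cover bytes [offset, offset + size)."""
    first = offset // block_size
    last = (offset + size - 1) // block_size
    return metadata["block_names"][first:last + 1]


def read_range(past, file_id, offset, size, block_size):
    metadata = json.loads(past[file_id])                          # steps 509-510
    names = locate_blocks(metadata, offset, size, block_size)     # step 511
    data = b"".join(past[n] for n in names)                       # steps 512-513
    start = offset % block_size
    return data[start:start + size]                               # step 514


def update_range(past, file_id, offset, new_bytes, block_size):
    metadata = json.loads(past[file_id])
    names = locate_blocks(metadata, offset, len(new_bytes), block_size)
    buf = bytearray(b"".join(past[n] for n in names))
    start = offset % block_size
    buf[start:start + len(new_bytes)] = new_bytes
    for i, name in enumerate(names):                              # rewrite only the touched blocks
        past[name] = bytes(buf[i * block_size:(i + 1) * block_size])


past = {
    "f": json.dumps({"block_names": ["f.blk0", "f.blk1", "f.blk2"]}).encode(),
    "f.blk0": b"abcd", "f.blk1": b"efgh", "f.blk2": b"ij",
}
print(read_range(past, "f", 3, 3, 4))      # b'def'
update_range(past, "f", 3, b"XYZ", 4)
print(read_range(past, "f", 0, 10, 4))     # b'abcXYZghij'
```

Only the blocks that overlap the requested range are transferred; the third block is never touched by the three-byte read or update in this example.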

In summary, the present invention provides a data block storage method and system based on Past, which include: a Past system to which delete and update operations have been added; storing the file in blocks, establishing metadata, and retrieving the data through the metadata; and storing data blocks and metadata as files in the Past system with deletion and update operations added. While inheriting the high scalability, availability, and security of the Past file storage system, the method improves the efficiency of Past when reading and modifying large files. The invention can be applied to any application built on Past to improve the efficiency of reading and modifying large files, and it avoids additional useless data transmission and bandwidth consumption.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

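1. A file storage method based on a blocking strategy, wherein data blocks are stored in a Past subsystem, and the method specifically comprises the following steps:

step 101) dividing a user file to be stored into a plurality of data blocks, and at the same time creating metadata that maintains information about the file, its blocks, and file access;

step 102) storing all data blocks and the metadata as files in a client-side optimized Past subsystem;

wherein the optimized Past subsystem is a Past subsystem to which deletion and update operations have been added.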

2. The file storage method based on the block partitioning policy as claimed in claim 1, wherein said data blocks are equal in size.

3. The file storage method based on the blocking policy of claim 1, wherein the metadata structure specifically comprises: file name, file individual block name, file size, time of last modification of the file, and time of last access of the file.

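4. The file storage method based on a blocking policy according to claim 1, wherein after step 102) the method further comprises reading and updating a file data block;

wherein the reading and updating of the file data block comprises the following steps:

step 201) reading metadata of a file;

step 202) analyzing the metadata to obtain the block name of the corresponding data block;

step 203) reading/updating the corresponding data block according to the obtained block name.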

5. A file storage system based on a blocking policy, the storage system storing data blocks in a Past subsystem, wherein the file storage system comprises:

the file blocking unit is used for blocking the file to be stored;

a metadata generation unit for establishing metadata for retrieval based on the blocks acquired by the file blocking unit;

the auxiliary packaging unit is used for packaging all the blocks and the metadata into files in the Past subsystem; and

the optimized Past subsystem is used for storing the file encapsulated by the auxiliary encapsulation unit and is an improvement based on the Past subsystem;

the optimized Past subsystem is a Past subsystem added with file deleting and updating functions.

6. The file storage system according to claim 5, wherein said data blocks are of equal size.

7. The file storage system based on the blocking policy of claim 5, wherein the metadata structure specifically comprises: file name, file individual block name, file size, time of last modification of the file, and time of last access of the file.

8. The file storage system based on a blocking policy of claim 5, wherein said system further comprises: an auxiliary subsystem for reading or updating the file data blocks stored by the optimized Past subsystem;

wherein,

the auxiliary subsystem for reading and updating the file data block comprises:

the reading module is used for reading the metadata of the file;

the analysis module is used for analyzing the metadata to obtain the block name of the corresponding data block; and

the operation execution module is used for reading/updating the corresponding data block according to the block name of the acquired data block.