WO2018233331A1 - File storage method and system and computer storage medium - Google Patents

File storage method and system and computer storage medium

Info

Publication number
WO2018233331A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
data block
file system
object storage
storage
Prior art date
Application number
PCT/CN2018/079683
Other languages
French (fr)
Chinese (zh)
Inventor
江汛洋
葛利亚
王静
李道兵
许式伟
Original Assignee
上海七牛信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 上海七牛信息技术有限公司
Publication of WO2018233331A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data

Definitions

  • The present invention relates to the field of storage technologies, and more particularly to a file storage method, a system, and a computer storage medium.
  • Object storage is low-cost, easy to distribute for concurrent access, and supports mass storage, but it does not support random writes. Much software still needs out-of-order reads and writes from its storage system, and a file system can support them.
  • The technical problem to be solved by the present invention is to provide a file storage method, system, and computer storage medium that support out-of-order reads and writes while retaining the advantages of object storage.
  • A file storage method, comprising:
  • at the file system layer, dividing a file into a plurality of data blocks and performing out-of-order write operations on the data blocks;
  • synchronizing the data blocks to an object storage queue; if the data of a data block changes, adding a task to the object storage queue, and executing the tasks in the object storage queue cyclically according to a first preset period;
  • at the object storage layer, splicing the plurality of data blocks into a file in a preset order according to an operation instruction;
  • setting a cyclic recycling task in the file system layer; the cyclic recycling task includes: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and satisfy a preset condition, deleting those data blocks, and marking their addresses as object storage.
  • Further, the method includes setting a cyclic retransmission task in the file system layer; the cyclic retransmission task includes: acquiring, according to a second preset period, the data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks from those data blocks, and adding the synchronization tasks to the object storage queue.
  • Further, performing the out-of-order write operation on a data block includes:
  • if the data block of the file is in the file system layer, writing to it directly; if the data block of the file is in the object storage, reading the data block from the object storage, storing it in the file system, and then overwriting it.
  • Further, splicing the plurality of data blocks into a file in the preset order according to the operation instruction includes:
  • if a file data block is in the object storage and has not expired, using it to splice the file;
  • if a file data block has been overwritten at the file system layer, re-uploading the file data block to the object storage;
  • if a file data block is in the object storage but has expired, reading the data block from the object storage by offset, downloading it to the file system, and re-uploading it to form a data block.
  • Further, when an out-of-order read of a file is performed in the file system layer and the file is in the file system, the data blocks are read directly from the file system to form the file.
  • A file storage system, comprising:
  • a processing module, configured to divide a file into a plurality of data blocks at the file system layer and to perform out-of-order write operations on the data blocks;
  • a synchronization module, configured to synchronize the data blocks to the object storage queue, to add a task to the object storage queue if the data of a data block changes, and to execute the tasks in the object storage queue cyclically according to the first preset period;
  • a splicing module, configured to splice the plurality of data blocks into a file in a preset order at the object storage layer according to an operation instruction;
  • a recycler, configured to execute a cyclic recycling task in the file system layer; the cyclic recycling task includes: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and satisfy a preset condition, deleting those data blocks, and marking their addresses as object storage.
  • Further, the processing module is also configured to set a cyclic retransmission task in the file system layer; the cyclic retransmission task includes: acquiring, according to the second preset period, the data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks from those data blocks, and adding the synchronization tasks to the object storage queue.
  • Further, the processing module is also configured to write directly if the data block of the file is in the file system layer, and, if the data block of the file is in the object storage, to read the data block from the object storage, store it in the file system, and then overwrite it.
  • Further, the splicing module is also configured to use a file data block for splicing the file if the block is in the object storage and has not expired;
  • the splicing module is also configured to re-upload a file data block to the object storage if the block has been overwritten at the file system layer;
  • the splicing module is also configured, if a file data block is in the object storage but has expired, to read the data block from the object storage by offset, download it to the file system, and re-upload it to form a data block.
  • Further, the processing module is also configured so that, when an out-of-order read of a file is performed in the file system layer and the file is in the file system, the data blocks are read directly from the file system to form the file.
  • A computer storage medium storing a program, the program performing the steps of any of the above.
  • In the present invention, the file is divided into a plurality of data blocks at the file system layer, and out-of-order writes are performed on the data blocks. The data blocks are then synchronized to the object storage queue: if the data of a data block changes, a task is added to the object storage queue, and the tasks in the queue are executed cyclically according to the first preset period. At the object storage layer, the data blocks are spliced into a file in a preset order according to an operation instruction, and a cyclic recycling task is set in the file system layer.
  • The cyclic recycling task includes: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and satisfy the preset condition, deleting those data blocks, and marking their addresses as object storage.
  • This tiered storage method, which combines a file system with object storage, supports out-of-order reads and writes while keeping the advantages of object storage: low cost, easy distribution for concurrent access, and support for mass storage.
  • The cyclic recycling task allows the file system layer to maintain a large number of files and transfer them to the object storage in a reasonable manner.
  • FIG. 1 is a flowchart of a file storage method according to an embodiment of the present invention.
  • FIG. 2 is a block diagram of a file storage system according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a file system layer writing process according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a file storage system according to an embodiment of the present invention.
  • A file storage method includes steps S110 to S140, as follows:
  • S110: At the file system layer, the file is divided into a plurality of data blocks, and out-of-order write operations are performed on the data blocks.
  • The file is split into a plurality of data blocks; the size of each data block may be set by the system or specified by the user.
  • The file system layer supports out-of-order writes to a file: a single write to the file is split into writes to multiple data blocks.
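The splitting of one file write into per-block writes can be illustrated with a minimal Python sketch. The function name `split_write` and the 4 MiB default block size are assumptions made for illustration; the source only says that the block size is set by the system or the user.

```python
from typing import Iterator, Tuple

# Hypothetical default; the source leaves the block size to the system or user.
BLOCK_SIZE = 4 * 1024 * 1024


def split_write(offset: int, length: int,
                block_size: int = BLOCK_SIZE) -> Iterator[Tuple[int, int, int]]:
    """Yield (block_index, offset_in_block, length_in_block) for one file write."""
    end = offset + length
    while offset < end:
        index = offset // block_size      # which data block this byte range hits
        within = offset % block_size      # where the write starts inside that block
        n = min(block_size - within, end - offset)
        yield index, within, n
        offset += n
```

With an 8-byte block size, a 10-byte write at file offset 6 becomes a 2-byte write at the end of block 0 followed by a full 8-byte write to block 1.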
  • S120: The data blocks are synchronized to the object storage queue. If the data of a data block changes, a task is added to the object storage queue, and the tasks in the queue are executed cyclically according to the first preset period.
  • In the object storage layer, multiple files can be spliced in order into one large file. Because files in the object storage are stored sequentially, reading part of the data at an offset is easy to support.
  • The object storage layer supports uploading in chunks and merging the chunks into a single file, but it does not support out-of-order writes.
  • Data in the object storage layer is mainly used for archiving and distribution. The file system layer maintains a queue Q for synchronizing data blocks to the object storage.
  • After data is written or modified, a task is added to the synchronization queue and duplicate tasks are merged; the queued tasks are executed according to the first preset period.
  • The first preset period may be set automatically by the system or set by the user.
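A synchronization queue that merges duplicate tasks and is drained once per period might look like the following sketch. The class `SyncQueue`, the `upload` callback, and `run_periodically` are hypothetical names; the source does not describe how the queue is implemented.

```python
import time
from collections import OrderedDict
from typing import Callable, Hashable, List


class SyncQueue:
    """Queue of block-upload tasks; repeated tasks for one block are merged."""

    def __init__(self, upload: Callable[[Hashable], None]) -> None:
        self._pending: "OrderedDict[Hashable, None]" = OrderedDict()
        self._upload = upload  # assumed callback that pushes one block to object storage

    def add(self, block_id: Hashable) -> None:
        # Keys are unique, so adding the same block twice keeps a single task.
        self._pending[block_id] = None

    def run_once(self) -> List[Hashable]:
        """Execute every queued task in insertion order and return what was uploaded."""
        done = []
        while self._pending:
            block_id, _ = self._pending.popitem(last=False)
            self._upload(block_id)
            done.append(block_id)
        return done


def run_periodically(queue: SyncQueue, period_seconds: float, rounds: int) -> None:
    """Drain the queue once per period (the 'first preset period' in the text)."""
    for _ in range(rounds):
        queue.run_once()
        time.sleep(period_seconds)
```

Merging by block identifier means a block modified many times within one period is uploaded only once when the queue is next drained.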
  • The operation instruction may be an instruction to splice a plurality of data blocks into a file.
  • A task that splices the data blocks into a file is then triggered in the object storage.
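As a rough illustration of splicing in a preset order, the sketch below joins blocks by ascending index and derives the block count from the file size. The in-memory `blocks` mapping stands in for the object storage, and `splice` is a hypothetical name, not an API of any real object store.

```python
from typing import Dict


def splice(blocks: Dict[int, bytes], file_size: int, block_size: int) -> bytes:
    """Join data blocks in index order into one file of exactly file_size bytes."""
    count = -(-file_size // block_size)  # ceiling division: blocks needed for the file
    missing = [i for i in range(count) if i not in blocks]
    if missing:
        raise KeyError(f"blocks not yet in object storage: {missing}")
    # Trim the result in case the last block holds bytes past the end of the file.
    return b"".join(blocks[i] for i in range(count))[:file_size]
```

Because the spliced object is stored in order, a reader can later fetch any byte range of it by offset.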
  • S140: A cyclic recycling task is set in the file system layer; the cyclic recycling task includes: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and satisfy a preset condition, deleting those data blocks, and marking their addresses as object storage.
  • The file system layer runs a cyclic recycling task that periodically reclaims, according to the policy, data already synchronized to the object storage; when that data is later read through the file system layer, it is pulled from the object storage.
  • Specifically, a cyclic recycling task is started in the file system layer. According to the preset policy, it obtains the files in the file system that have been synchronized to the object storage and meet the user-set conditions, deletes them, and marks their addresses as object storage.
  • The preset policy may be user-specified, for example based on modification date and frequency of use.
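One pass of such a recycling task could be sketched as follows. The idle-time policy, the `Block` record, and the `"fs"`/`"object"` location markers are assumptions chosen for illustration; the source only says the policy may depend on modification date and frequency of use.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Block:
    data: Optional[bytes]   # local copy; None once reclaimed
    synced: bool            # True if already synchronized to object storage
    last_used: float        # timestamp of last access (hypothetical policy input)
    location: str = "fs"    # "fs" or "object": where the block's address points


def recycle(blocks: Dict[str, Block], now: float, max_idle: float) -> List[str]:
    """Reclaim synced local blocks that have been idle for at least max_idle seconds."""
    reclaimed = []
    for block_id, block in blocks.items():
        if block.location == "fs" and block.synced and now - block.last_used >= max_idle:
            block.data = None            # delete the local copy
            block.location = "object"    # mark the address as object storage
            reclaimed.append(block_id)
    return reclaimed
```

Blocks that are not yet synchronized are never reclaimed, so no data exists only in a deleted local copy.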
  • The tiered storage method, which combines the file system with object storage, supports out-of-order reads and writes while keeping the advantages of object storage: low cost, easy distribution for concurrent access, and support for mass storage.
  • The result is a storage structure in which out-of-order writes land on the file system while reads are served mainly from the object storage.
  • The cyclic recycling task allows the file system layer to maintain a large number of files and transfer them to the object storage in a reasonable manner.
  • The file storage method further includes: setting a cyclic retransmission task in the file system layer; the cyclic retransmission task includes: acquiring, according to the second preset period, the data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks from those data blocks, and adding the synchronization tasks to the object storage queue.
  • Specifically, a cyclic retransmission task is started in the file system layer. It periodically obtains the files in the file system that have not been synchronized to the object storage, generates synchronization tasks from them, and adds the tasks to the synchronization queue Q.
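One pass of the retransmission scan can be sketched as below. The `synced` mapping and the `enqueue` callback are hypothetical stand-ins for the file system's block metadata and for the synchronization queue Q.

```python
from typing import Callable, Dict


def retransmit_scan(synced: Dict[str, bool], enqueue: Callable[[str], None]) -> int:
    """Enqueue a sync task for every block not yet in object storage; return the count."""
    count = 0
    for block_id, is_synced in synced.items():
        if not is_synced:
            enqueue(block_id)  # generate a synchronization task for this block
            count += 1
    return count
```

Running this scan once per second preset period gives blocks whose earlier upload was lost or failed another chance to reach the object storage.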
  • The out-of-order write operation on a data block includes:
  • if the data block of the file is in the file system layer, writing to it directly; if the data block of the file is in the object storage, reading the data block from the object storage, storing it in the file system, and then overwriting it.
  • Splicing a plurality of data blocks into a file in a preset order according to an operation instruction includes:
  • if a file data block is in the object storage and has not expired, using it to splice the file; if a file data block has been overwritten at the file system layer, re-uploading the file data block to the object storage;
  • if a file data block is in the object storage but has expired, reading the data block from the object storage by offset, downloading it to the file system, and re-uploading it to form a data block.
  • At this point the number of file data blocks is determined from the file size, and the data blocks are checked one by one to see whether each is in the object storage and still valid. As needed, a block is read from disk, re-downloaded from the object storage, or read directly from memory and re-uploaded. There are four cases required for splicing file data blocks:
  • Case 1: The file data block is already in the object storage and has not expired. It can be reused directly for splicing the file, that is, read from disk.
  • Case 2: The file data block has been overwritten in the file system layer and is then re-uploaded to the object storage. It is re-downloaded from the object storage. Version variables can also be used. Specifically, each file data block has two version variables. One is the file content version, which is zero when the data block is created and is incremented on each subsequent update. The other is the data block upload version; after each upload of the data block, the upload version number is set to the file content version number. In this case, the file content version and the upload version are compared, and the block is retransmitted if they are inconsistent.
  • Case 3: The file data block is already in the object storage but has expired. It is downloaded from the object storage by offset to the file system and then re-uploaded to form a block.
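The two version variables described in Case 2 can be sketched as a small record. The class name `BlockVersion` is hypothetical, and starting `upload_version` at `None` (so that a never-uploaded block counts as inconsistent) is an assumption; the source does not state the initial upload version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockVersion:
    content_version: int = 0              # zero at creation, incremented on every update
    upload_version: Optional[int] = None  # assumption: None until the first upload

    def update(self) -> None:
        """Record a modification of the block's content."""
        self.content_version += 1

    def mark_uploaded(self) -> None:
        """After an upload, the upload version is set to the content version."""
        self.upload_version = self.content_version

    def needs_upload(self) -> bool:
        """The block must be retransmitted when the two versions disagree."""
        return self.upload_version != self.content_version
```

Comparing two integers avoids comparing block contents: any write after the last upload leaves the versions unequal until the block is uploaded again.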
  • The file storage method further includes: when an out-of-order read of a file is performed in the file system layer and the file is in the file system, the data blocks are read directly from the file system to form the file.
  • Specifically, during an out-of-order read in the file system layer, a file that is in the file system is read directly from the file system, and a certain number of data blocks are read consecutively, according to the user configuration, to reduce the number of network requests.
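One way to read several consecutive blocks per request is to compute a single byte range covering them. The sketch below assumes the range is then fetched from the object storage by offset in one request; that pairing, and the name `readahead_range`, are illustrative assumptions rather than details stated in the source.

```python
from typing import Tuple


def readahead_range(block_index: int, readahead_blocks: int,
                    block_size: int, file_size: int) -> Tuple[int, int]:
    """Return the [start, end) byte range covering readahead_blocks consecutive blocks."""
    if readahead_blocks < 1:
        raise ValueError("readahead_blocks must be at least 1")
    start = block_index * block_size
    # Never read past the end of the file.
    end = min(start + readahead_blocks * block_size, file_size)
    return start, end
```

Fetching one larger range instead of one request per block trades some possibly unused bytes for fewer network round trips.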
  • File A is divided into several data blocks (file A block 1, file A block 2, and so on), and file B is likewise divided into several data blocks (file B block 1, file B block 2, and so on).
  • The data blocks are then added to the synchronization queue Q by the file writing module, or by the cyclic retransmission module.
  • The data blocks in the synchronization queue Q are stored in the object storage and combined into file A and file B.
  • The file system can read data out of order from the object storage and store the file data blocks in the file system layer.
  • A policy-based file recycling module can reclaim the data blocks in the file system.
  • The file system layer writing process includes writing data to the file system layer at random positions.
  • The content of a file write is split into writes to multiple data blocks.
  • If the write targets an existing data block, it is an overwrite; if it targets a new data block, it is an append write;
  • For an overwrite, if the data block is in the object storage layer, its data is read from the object storage layer and stored in the file system layer, and then the data is written to the file system layer disk. If the data block is in the file system layer, the data is written to the file system layer disk directly.
  • For an append write, the data is written to the file system layer disk.
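The write path above can be sketched with two dictionaries standing in for the file system layer (`local`) and the object storage layer (`remote`). Both mappings and the function name `write_block` are hypothetical, and zero-filling a gap before the write offset is an assumption of this sketch.

```python
from typing import Dict


def write_block(local: Dict[str, bytearray], remote: Dict[str, bytes],
                block_id: str, offset: int, data: bytes) -> None:
    """Write data into one block, pulling the block from object storage first if needed."""
    if block_id not in local:
        # Block lives only in object storage (or is new): copy it down, or start empty.
        local[block_id] = bytearray(remote.get(block_id, b""))
    buf = local[block_id]
    end = offset + len(data)
    if end > len(buf):
        # Append case: grow the block, zero-filling any gap before the write offset.
        buf.extend(b"\x00" * (end - len(buf)))
    buf[offset:end] = data  # overwrite in the local (file system layer) copy
```

The object storage copy is never modified in place; the changed block would be re-uploaded later through the synchronization queue.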
  • A file storage system 200 includes a processing module 210, a synchronization module 220, a splicing module 230, and a recycler 240.
  • The processing module 210 is configured to divide a file into a plurality of data blocks at the file system layer and to perform out-of-order write operations on the data blocks.
  • Specifically, the processing module 210 splits the file into a plurality of data blocks; the size of each data block may be set by the system or specified by the user.
  • The file system layer supports out-of-order writes to a file: a single write to the file is split into writes to multiple data blocks.
  • The synchronization module 220 is configured to synchronize the data blocks to the object storage queue. If the data of a data block changes, a task is added to the object storage queue, and the tasks in the queue are executed cyclically according to the first preset period.
  • The synchronization module 220 maintains, in the file system layer, a queue Q for synchronizing data blocks to the object storage. After data is written or modified, a task is added to the synchronization queue and duplicate tasks are merged; the queued tasks are executed according to the first preset period.
  • The first preset period may be set automatically by the system or set by the user.
  • The splicing module 230 is configured to splice a plurality of data blocks into a file in a preset order at the object storage layer according to an operation instruction.
  • The operation instruction may be an instruction to splice a plurality of data blocks into a file.
  • The splicing module 230 triggers a task in the object storage that splices the data blocks into a file.
  • The recycler 240 is configured to execute a cyclic recycling task in the file system layer; the cyclic recycling task includes: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and satisfy a preset condition, deleting those data blocks, and marking their addresses as object storage.
  • The file system layer runs the recycler 240.
  • The recycler 240 periodically reclaims, according to the policy, data already synchronized to the object storage; when that data is later read through the file system layer, it is pulled from the object storage. With the recycler, the file system layer can maintain a reasonable transfer of compressed files to the object storage.
  • Specifically, a cyclic recycling task is started in the file system layer. According to the preset policy, it obtains the files in the file system that have been synchronized to the object storage and meet the user-set conditions, deletes them, and marks their addresses as object storage.
  • The preset policy may be user-specified, for example based on modification date and frequency of use.
  • The processing module is further configured to set a cyclic retransmission task in the file system layer; the cyclic retransmission task includes: acquiring, according to the second preset period, the data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks from those data blocks, and adding the synchronization tasks to the object storage queue.
  • The processing module is further configured to write directly if the data block of the file is in the file system layer, and, if the data block of the file is in the object storage, to read the data block from the object storage, store it in the file system, and then overwrite it.
  • The splicing module is further configured to use a file data block for splicing the file if the block is in the object storage and has not expired; to re-upload a file data block to the object storage if the block has been overwritten at the file system layer; and, if a file data block is in the object storage but has expired, to read the data block from the object storage by offset, download it to the file system, and re-upload it to form a data block.
  • At this point the number of file data blocks is determined from the file size, and the data blocks are checked one by one to see whether each is in the object storage and still valid. As needed, a block is read from disk, re-downloaded from the object storage, or read directly from memory and re-uploaded. There are four cases required for splicing file data blocks:
  • Case 1: The file data block is already in the object storage and has not expired. It can be reused directly for splicing the file, that is, read from disk.
  • Case 2: The file data block has been overwritten in the file system layer and is then re-uploaded to the object storage. It is re-downloaded from the object storage. Version variables can also be used. Specifically, each file data block has two version variables. One is the file content version, which is zero when the data block is created and is incremented on each subsequent update. The other is the data block upload version; after each upload of the data block, the upload version number is set to the file content version number. In this case, the file content version and the upload version are compared, and the block is retransmitted if they are inconsistent.
  • Case 3: The file data block is already in the object storage but has expired. It is downloaded from the object storage by offset to the file system and then re-uploaded to form a block.
  • The file size in the local file system is updated, the data block at the truncate size boundary is updated, and the file content version is incremented.
  • The truncated file is then correctly reflected when the file splicing stage is triggered in the object storage.
  • The processing module is further configured so that, when an out-of-order read of a file is performed in the file system layer and the file is in the file system, the data blocks are read directly from the file system to form the file.
  • Specifically, during an out-of-order read in the file system layer, a file that is in the file system is read directly from the file system, and a certain number of data blocks are read consecutively, according to the user configuration, to reduce the number of network requests.
  • Another preferred embodiment of the present invention is a computer storage medium storing a program, the program performing the steps of any of the above.

Abstract

A file storage method and system, and a computer storage medium. The method comprises: in a file system layer, blocking a file to form multiple data blocks and performing an out-of-order writing operation on the data blocks (S110); synchronizing the data blocks into the queue of an object storage; if data of the data blocks changes, adding a task into the queue of the object storage; periodically executing the tasks in the queue of the object storage according to a first preset period (S120); in an object storage layer, splicing, according to an operational instruction, the multiple data blocks in a preset sequence into a file (S130); and setting a cycle recycling task in the file system layer, the cycle recycling task comprising: according to a preset strategy, recycling data blocks in the file system synchronized into the object storage and satisfying a preset condition, deleting the data blocks and marking the address of the data blocks to be object storage (S140). The described hierarchical storage method merging the file system and the object storage not only can support out-of-order reading and writing, but also has the advantage of the object storage; i.e., the method has low cost, can easily distribute concurrent accesses and can support mass storage.

Description

File storage method, system and computer storage medium
Technical field
The present invention relates to the field of storage technologies, and more particularly to a file storage method, a system, and a computer storage medium.
Background
Object storage is low-cost, easy to distribute for concurrent access, and supports mass storage, but it does not support random writes. Much software still needs out-of-order reads and writes from its storage system, and a file system can support them.
Summary of the invention
The technical problem to be solved by the present invention is to provide a file storage method, system, and computer storage medium that support out-of-order reads and writes while retaining the advantages of object storage.
The object of the present invention is achieved by the following technical solutions:
A file storage method, comprising:
at the file system layer, dividing a file into a plurality of data blocks and performing out-of-order write operations on the data blocks;
synchronizing the data blocks to an object storage queue; if the data of a data block changes, adding a task to the object storage queue, and executing the tasks in the object storage queue cyclically according to a first preset period;
at the object storage layer, splicing the plurality of data blocks into a file in a preset order according to an operation instruction;
setting a cyclic recycling task in the file system layer; the cyclic recycling task includes: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and satisfy a preset condition, deleting those data blocks, and marking their addresses as object storage.
Further, the method also includes:
setting a cyclic retransmission task in the file system layer;
the cyclic retransmission task includes: acquiring, according to a second preset period, the data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks from those data blocks, and adding the synchronization tasks to the object storage queue.
Further, performing the out-of-order write operation on a data block includes:
if the data block of the file is in the file system layer, writing to it directly;
if the data block of the file is in the object storage, reading the data block from the object storage, storing it in the file system, and then overwriting it.
Further, splicing the plurality of data blocks into a file in the preset order according to the operation instruction includes:
if a file data block is in the object storage and has not expired, using it to splice the file;
if a file data block has been overwritten at the file system layer, re-uploading the file data block to the object storage;
if a file data block is in the object storage but has expired, reading the data block from the object storage by offset, downloading it to the file system, and re-uploading it to form a data block.
Further, the method also includes:
when an out-of-order read of a file is performed in the file system layer and the file is in the file system, reading the data blocks directly from the file system to form the file.
A file storage system, comprising:
a processing module, configured to divide a file into a plurality of data blocks at the file system layer and to perform out-of-order write operations on the data blocks;
a synchronization module, configured to synchronize the data blocks to the object storage queue, to add a task to the object storage queue if the data of a data block changes, and to execute the tasks in the object storage queue cyclically according to the first preset period;
a splicing module, configured to splice the plurality of data blocks into a file in a preset order at the object storage layer according to an operation instruction;
a recycler, configured to execute a cyclic recycling task in the file system layer; the cyclic recycling task includes: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and satisfy a preset condition, deleting those data blocks, and marking their addresses as object storage.
Further, the processing module is also configured to set a cyclic retransmission task in the file system layer; the cyclic retransmission task includes: acquiring, according to the second preset period, the data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks from those data blocks, and adding the synchronization tasks to the object storage queue.
Further, the processing module is also configured to write directly if the data block of the file is in the file system layer, and, if the data block of the file is in the object storage, to read the data block from the object storage, store it in the file system, and then overwrite it.
Further, the splicing module is also configured to use a file data block for splicing the file if the block is in the object storage and has not expired;
the splicing module is also configured to re-upload a file data block to the object storage if the block has been overwritten at the file system layer;
the splicing module is also configured, if a file data block is in the object storage but has expired, to read the data block from the object storage by offset, download it to the file system, and re-upload it to form a data block.
Further, the processing module is also configured so that, when an out-of-order read of a file is performed in the file system layer and the file is in the file system, the data blocks are read directly from the file system to form the file.
A computer storage medium storing a program, the program performing the steps of any of the above.
In the present invention, at the file system layer, a file is divided into a plurality of data blocks and out-of-order write operations are performed on the data blocks; the data blocks are then synchronized to an object storage queue, and if the data of a data block changes, a task is added to the object storage queue, the tasks in the queue being executed cyclically at a first preset period; at the object storage layer, the data blocks are spliced into a file in a preset order according to an operation instruction; and a cyclic reclamation task is set in the file system layer. The cyclic reclamation task comprises: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and that meet a preset condition, deleting those data blocks, and marking their addresses as object storage. This tiered storage method, which combines a file system with object storage, supports out-of-order reads and writes while retaining the advantages of object storage, namely low cost, easy distribution, concurrent access, and support for mass storage. The cyclic reclamation task allows the large number of files maintained at the file system layer to be transferred to the object storage in a reasonable manner.
Brief Description of the Drawings
FIG. 1 is a flowchart of a file storage method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a file storage system according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of the file system layer write process according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a file storage system according to an embodiment of the present invention.
Detailed Description
Before the exemplary embodiments are discussed in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the drawings.
It should also be noted that, in some alternative implementations, the functions or actions mentioned may occur in an order different from that indicated in the drawings. For example, depending on the functions or actions involved, two figures shown in succession may in fact be executed substantially concurrently, or may sometimes be executed in the reverse order.
The invention is further described below with reference to the drawings and preferred embodiments.
As shown in FIG. 1, a file storage method includes steps S110 to S140, as follows:
S110: At the file system layer, divide the file into a plurality of data blocks, and perform out-of-order write operations on the data blocks.
Specifically, at the file system layer, the file is split into a plurality of data blocks; the size of each data block may be set by the system or specified by the user. The file system layer supports out-of-order (random) writes to a file, and a single write to the file is split into writes to multiple data blocks.
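By way of non-limiting illustration, the splitting of a single file write into per-block writes described above might be sketched in Python as follows. The function name `split_write`, the tuple layout, and the default block size are assumptions made for this sketch; the original text does not specify them.

```python
from typing import List, Tuple

# Assumed default; the text only says the size is set by the system or the user.
BLOCK_SIZE = 4 * 1024 * 1024


def split_write(offset: int, data: bytes,
                block_size: int = BLOCK_SIZE) -> List[Tuple[int, int, bytes]]:
    """Split one file write into (block_index, offset_in_block, payload) writes."""
    writes = []
    pos = 0
    while pos < len(data):
        # Locate the block that holds the current absolute file offset.
        index, in_block = divmod(offset + pos, block_size)
        # Take only as many bytes as fit before the block boundary.
        take = min(block_size - in_block, len(data) - pos)
        writes.append((index, in_block, data[pos:pos + take]))
        pos += take
    return writes
```

For example, with an 8-byte block size, a 10-byte write at file offset 6 becomes a 2-byte write at the end of block 0 followed by an 8-byte write that fills block 1.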
S120: Synchronize the data blocks to an object storage queue; if the data of a data block changes, add a task to the object storage queue, and execute the tasks in the object storage queue cyclically at a first preset period. At the object storage layer, multiple files can be spliced in order into one large file. Because files in the object storage are stored sequentially, reading part of the data at a given offset is easily supported. The object storage layer supports uploading in blocks and splicing blocks into a single file, but does not support out-of-order writes. Data stored in the object layer is mainly used for archiving and distribution. The file system layer maintains a queue Q for synchronizing data blocks to the object storage; after data is written or modified, a task is added to the synchronization queue and duplicate tasks are merged, and the queued tasks are executed at the first preset period. The length of the first preset period may be set automatically by the system or set by the user.
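As a minimal sketch of the synchronization queue Q described above, the following Python class merges duplicate tasks for the same data block and runs one cycle of queued uploads on request. The class name `SyncQueue`, the `upload` callback, and the choice to keep failed tasks queued are assumptions of this sketch rather than details given in the original.

```python
from collections import OrderedDict
from typing import Callable, Hashable, List


class SyncQueue:
    """Pending block uploads; duplicate tasks for the same block are merged."""

    def __init__(self) -> None:
        # An ordered mapping keeps insertion order and makes membership checks cheap.
        self._pending: "OrderedDict[Hashable, None]" = OrderedDict()

    def add(self, block_id: Hashable) -> None:
        # A block that is already queued is not queued a second time.
        self._pending.setdefault(block_id, None)

    def run_once(self, upload: Callable[[Hashable], bool]) -> List[Hashable]:
        """Run one cycle: try every queued task and keep the failed ones queued."""
        done = []
        for block_id in list(self._pending):
            if upload(block_id):
                del self._pending[block_id]
                done.append(block_id)
        return done

    def __len__(self) -> int:
        return len(self._pending)
```

A caller would invoke `run_once` once per first preset period, for example from a timer; the timer itself is omitted here.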
S130: At the object storage layer, splice the plurality of data blocks into a file in a preset order according to an operation instruction.
The operation instruction may be an instruction to splice a plurality of data blocks into a file. When the user triggers a close or sync operation on the file at the file system layer, a task that splices the data blocks into a file is triggered in the object storage.
S140: Set a cyclic reclamation task in the file system layer; the cyclic reclamation task comprises: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and that meet a preset condition, deleting those data blocks, and marking the addresses of those data blocks as object storage.
The file system layer runs a cyclic reclamation task that reclaims, periodically and according to a policy, data that has already been synchronized to the object storage; when that data is later read through the file system layer, it is pulled back from the object storage. With the cyclic reclamation task, the large number of files maintained at the file system layer can be kept compact and transferred to the object storage in a reasonable manner. A cyclic reclamation task is started in the file system layer; according to the preset policy, it obtains the files in the file system that have been synchronized to the object storage and that meet the conditions set by the user, deletes those files, and marks their addresses as object storage. The preset policy may be a user-specified policy based on, for example, modification date and frequency of use.
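One pass of the cyclic reclamation task described above could look roughly like the following Python sketch. The `Block` fields, the `"fs"` and `"object"` location markers, and the age-based example policy `older_than` are illustrative assumptions; the original only states that the policy may depend on factors such as modification date and frequency of use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Block:
    data: Optional[bytes]   # local copy; None once the block has been reclaimed
    synced: bool            # True when the object storage holds the current content
    last_modified: float    # seconds since the epoch
    location: str = "fs"    # "fs" (file system) or "object" (object storage)


def reclaim(blocks: Dict[str, Block],
            policy: Callable[[Block], bool]) -> List[str]:
    """Run one pass of the reclamation task and return the reclaimed block ids."""
    reclaimed = []
    for block_id, block in blocks.items():
        # Only blocks that are already synchronized may be removed locally.
        if block.location == "fs" and block.synced and policy(block):
            block.data = None           # delete the local copy
            block.location = "object"   # mark the block address as object storage
            reclaimed.append(block_id)
    return reclaimed


def older_than(max_age: float, now: float) -> Callable[[Block], bool]:
    """Example policy: the block has not been modified for max_age seconds."""
    return lambda block: now - block.last_modified >= max_age
```

Passing the policy as a callable keeps the reclamation loop independent of whichever user-specified condition is in force.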
This embodiment combines a file system and object storage into a tiered storage method that supports out-of-order reads and writes while retaining the advantages of object storage, namely low cost, easy distribution, concurrent access, and support for mass storage. In the storage structure of this embodiment, out-of-order writes land on the file system, while reads land mainly on the object storage. The cyclic reclamation task allows the large number of files maintained at the file system layer to be transferred to the object storage in a reasonable manner.
Optionally, the file storage method further includes: setting a cyclic retransmission task in the file system layer; the cyclic retransmission task comprises: acquiring, at a second preset period, data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks according to those data blocks, and adding the synchronization tasks to the object storage queue.
A cyclic retransmission task is started in the file system layer; it periodically obtains the files in the file system that have not been synchronized to the object storage, generates synchronization tasks from those files, and adds the tasks to the synchronization queue Q.
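A single pass of the cyclic retransmission task described above might be sketched as follows. The function name `retransmit_scan`, the `synced` mapping, and the use of a plain set to stand in for the synchronization queue Q are assumptions made only for illustration.

```python
from typing import Dict, List, Set


def retransmit_scan(synced: Dict[str, bool], queue: Set[str]) -> List[str]:
    """Run one pass of the retransmission task.

    `synced` maps a block id to whether the block is already in object storage;
    `queue` stands in for the synchronization queue Q.
    """
    added = []
    for block_id, is_synced in synced.items():
        # Enqueue every unsynchronized block that is not already waiting.
        if not is_synced and block_id not in queue:
            queue.add(block_id)
            added.append(block_id)
    return added
```

The pass would be repeated once per second preset period; blocks whose earlier upload failed are picked up again because they are still marked as unsynchronized.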
Specifically, performing the out-of-order write operation on a data block includes:
if the data block of the file is in the file system layer, writing directly;
if the data block of the file is in the object storage, reading the data block from the object storage, storing it in the file system, and then performing the overwrite.
In the file system layer, when an out-of-order read or write is performed, the write is performed directly if the file is already in the file system layer; if the file is in the object storage, the corresponding data block of the file is read from the object storage and stored in the file system, and the overwrite is then performed.
Specifically, splicing the plurality of data blocks into a file in a preset order according to the operation instruction includes:
if a file data block is in the object storage and has not expired, using it to splice the file;
if a file data block has been overwritten at the file system layer, re-uploading the file data block to the object storage;
if a file data block is in the object storage but has expired, reading the data block from the object storage by offset, downloading it to the file system, and re-uploading it to form a data block.
When the end user triggers splicing of the file, the number of file data blocks is determined from the size of the file at that moment, and each data block is checked in turn to determine whether it is already in the object storage and still valid; as needed, data blocks are then read from disk, re-downloaded from the object storage, or read directly from memory, and re-uploaded. The data blocks needed for splicing the file fall into four cases:
Case 1: The file data block is already in the object storage, has not expired, and can still be used for splicing the file; it is reused, that is, read from disk.
Case 2: The file data block has been overwritten at the file system layer; it is re-uploaded to the object storage and re-downloaded from the object storage. Version variables may also be used. Specifically, each file data block has two version variables. One is the file content version, which is zero when the file data block is created and is incremented on every subsequent update. The other is the file data block upload version, which is set to the file content version number after each upload of the data block. In this case, the file content version is compared with the upload version, and the block is re-uploaded if they differ.
Case 3: The file data block is already in the object storage but has expired; it is read from the object storage by offset, downloaded to the file system, and re-uploaded to form a block.
Case 4: The data block has never been uploaded; an upload is triggered. If splicing the file fails at any point, the splicing is aborted and is triggered again later by the retransmission module.
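The four cases above, together with the two version variables, can be illustrated with a short Python sketch that decides what to do with each data block before splicing. The names `Action`, `BlockState`, `plan`, and `block_count`, the use of `None` for a block that has never been uploaded, and the order in which the cases are tested are assumptions of this sketch; the original does not prescribe a data layout or an evaluation order.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    REUSE = "reuse"                      # case 1
    REUPLOAD = "re-upload"               # case 2
    REFRESH = "download and re-upload"   # case 3
    UPLOAD = "upload"                    # case 4


@dataclass
class BlockState:
    content_version: int = 0               # 0 at creation, +1 on every update
    upload_version: Optional[int] = None   # None until the first upload
    expired: bool = False                  # uploaded copy can no longer be spliced


def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed for a file of the given size (ceiling division)."""
    return -(-file_size // block_size)


def plan(block: BlockState) -> Action:
    """Choose the splice-time action for one data block."""
    if block.upload_version is None:
        return Action.UPLOAD
    if block.upload_version != block.content_version:
        # The block was overwritten after its last upload.
        return Action.REUPLOAD
    if block.expired:
        return Action.REFRESH
    return Action.REUSE
```

Comparing the two version numbers avoids having to compare block contents in order to detect an overwrite that has not yet been uploaded.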
When a truncate operation is performed on a file, the file size in the local file system is updated, the data block at the boundary of the truncate size is updated, and the file content version of that block is incremented; only in this way is the truncated file correctly reflected when file splicing is triggered in the object storage.
Optionally, the file storage method further includes: in the file system layer, when an out-of-order read of a file is performed and the file is in the file system, reading the data blocks directly from the file system to form the file.
In the file system layer, when an out-of-order read is performed, the data is read directly from the file system if the file is in the file system; a certain number of consecutive data blocks are read at a time, according to the user configuration, to reduce the number of network requests.
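The read path described above might be sketched as follows, on the assumption that the batched reading of consecutive blocks applies when a block has to be fetched from the object storage. The function `read_block`, the `cache` dictionary standing in for the file system layer, and the `fetch_range` callback are hypothetical names introduced for this sketch.

```python
from typing import Callable, Dict, List


def read_block(index: int,
               cache: Dict[int, bytes],
               fetch_range: Callable[[int, int], List[bytes]],
               total_blocks: int,
               read_ahead: int = 4) -> bytes:
    """Return block `index`, fetching up to `read_ahead` consecutive blocks on a miss."""
    if not 0 <= index < total_blocks:
        raise IndexError("block index out of range")
    if index not in cache:
        # One request covers several consecutive blocks to reduce request count.
        count = min(read_ahead, total_blocks - index)
        for i, data in enumerate(fetch_range(index, count)):
            # Never replace a block that is already held locally.
            cache.setdefault(index + i, data)
    return cache[index]
```

Here `read_ahead` plays the role of the user-configured number of consecutive blocks, and a later read of a neighbouring block is served locally without another request.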
As shown in FIG. 2, file A is divided into several data blocks (file A block 1, file A block 2, and so on), and file B is likewise divided into several data blocks (file B block 1 and file B block 2).
The data blocks are then added to the synchronization queue Q either by the file write module or by the cyclic retransmission module.
The data blocks in the synchronization queue Q are stored in the object storage and assembled into file A and file B.
The file system can read data out of order directly from the object storage and store the file data blocks in the file system layer.
The policy-based file reclamation module can reclaim data blocks in the file system.
As shown in FIG. 3, the file system layer write process includes: writing data at random positions to the file system layer.
The content written to the file is split into writes directed at multiple data blocks.
If a write to a data block overwrites existing data, an overwrite is performed; if a write to a data block is an append into a new data block, an append is performed.
For an overwrite, it is determined whether the corresponding data block is in the file system layer. If it is not in the file system layer, the data of the block is read from the object storage layer and stored in the file system layer, and the data is then written to the file system layer disk. If it is in the file system layer, the data is written to the file system layer disk directly.
For an append, the data is written to the file system layer disk.
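The write flow of FIG. 3 can be summarized in a small Python sketch. The dictionaries `local` and `remote` stand in for the file system layer disk and the object storage layer respectively, and the function name `write_block`, its return values, and the zero-filling of gaps are assumptions introduced for illustration.

```python
from typing import Dict


def write_block(local: Dict[int, bytearray],
                remote: Dict[int, bytes],
                index: int, offset: int, payload: bytes) -> str:
    """Apply one block-level write and report which path was taken."""
    if index in local:
        path = "local"                            # block already on local disk
    elif index in remote:
        # Overwrite of a reclaimed block: pull it back from object storage first.
        local[index] = bytearray(remote[index])
        path = "fetched"
    else:
        local[index] = bytearray()                # append into a new block
        path = "new"
    block = local[index]
    if len(block) < offset:
        block.extend(b"\x00" * (offset - len(block)))   # zero-fill any gap
    block[offset:offset + len(payload)] = payload
    return path
```

In a fuller implementation the written block would then be added to the synchronization queue, as described for step S120; that step is left out of this sketch.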
In another preferred embodiment of the present invention, as shown in FIG. 4, a file storage system 200 includes a processing module 210, a synchronization module 220, a splicing module 230, and a recycler 240.
The processing module 210 is configured to, at the file system layer, divide a file into a plurality of data blocks and perform out-of-order write operations on the data blocks.
Specifically, at the file system layer, the processing module 210 splits the file into a plurality of data blocks; the size of each data block may be set by the system or specified by the user. The file system layer supports out-of-order writes to a file, and a single write to the file is split into writes to multiple data blocks.
The synchronization module 220 is configured to synchronize the data blocks to an object storage queue; if the data of a data block changes, a task is added to the object storage queue, and the tasks in the object storage queue are executed cyclically at a first preset period.
The synchronization module 220 maintains, in the file system layer, a queue Q for synchronizing data blocks to the object storage; after data is written or modified, a task is added to the synchronization queue and duplicate tasks are merged, and the queued tasks are executed at the first preset period. The length of the first preset period may be set automatically by the system or set by the user.
The splicing module 230 is configured to, at the object storage layer, splice the plurality of data blocks into a file in a preset order according to an operation instruction.
The operation instruction may be an instruction to splice a plurality of data blocks into a file. When the user triggers a close or sync operation on the file at the file system layer, the splicing module 230 triggers, in the object storage, a task that splices the data blocks into a file.
The recycler 240 is configured to perform a cyclic reclamation task in the file system layer, the cyclic reclamation task comprising: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and that meet a preset condition, deleting those data blocks, and marking the addresses of those data blocks as object storage.
The file system layer runs the recycler 240, which reclaims, periodically and according to a policy, data that has already been synchronized to the object storage; when that data is later read through the file system layer, it is pulled back from the object storage. With the recycler, the large number of files maintained at the file system layer can be kept compact and transferred to the object storage in a reasonable manner. A cyclic reclamation task is started in the file system layer; according to the preset policy, it obtains the files in the file system that have been synchronized to the object storage and that meet the conditions set by the user, deletes those files, and marks their addresses as object storage. The preset policy may be a user-specified policy based on, for example, modification date and frequency of use.
Optionally, the processing module is further configured to set a cyclic retransmission task in the file system layer; the cyclic retransmission task comprises: acquiring, at a second preset period, data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks according to those data blocks, and adding the synchronization tasks to the object storage queue.
Optionally, the processing module is further configured to write directly if the data block of the file is in the file system layer; and, if the data block of the file is in the object storage, to read the data block from the object storage, store it in the file system, and then perform the overwrite.
In the file system layer, when an out-of-order read or write is performed, the write is performed directly if the file is already in the file system layer; if the file is in the object storage, the corresponding data block of the file is read from the object storage and stored in the file system, and the overwrite is then performed.
Optionally, the splicing module is further configured to: use a file data block for splicing the file if the file data block is in the object storage and has not expired; re-upload a file data block to the object storage if the file data block has been overwritten at the file system layer; and, if a file data block is in the object storage but has expired, read the data block from the object storage by offset, download it to the file system, and re-upload it to form a data block.
When the end user triggers splicing of the file, the number of file data blocks is determined from the size of the file at that moment, and each data block is checked in turn to determine whether it is already in the object storage and still valid; as needed, data blocks are then read from disk, re-downloaded from the object storage, or read directly from memory, and re-uploaded. The data blocks needed for splicing the file fall into four cases:
Case 1: The file data block is already in the object storage, has not expired, and can still be used for splicing the file; it is reused, that is, read from disk.
Case 2: The file data block has been overwritten at the file system layer; it is re-uploaded to the object storage and re-downloaded from the object storage. Version variables may also be used. Specifically, each file data block has two version variables. One is the file content version, which is zero when the file data block is created and is incremented on every subsequent update. The other is the file data block upload version, which is set to the file content version number after each upload of the data block. In this case, the file content version is compared with the upload version, and the block is re-uploaded if they differ.
Case 3: The file data block is already in the object storage but has expired; it is read from the object storage by offset, downloaded to the file system, and re-uploaded to form a block.
Case 4: The data block has never been uploaded; an upload is triggered. If splicing the file fails at any point, the splicing is aborted and is triggered again later by the retransmission module.
When a truncate operation is performed on a file, the file size in the local file system is updated, the data block at the boundary of the truncate size is updated, and the file content version of that block is incremented; only in this way is the truncated file correctly reflected when file splicing is triggered in the object storage.
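The truncate handling described above could be sketched as follows for the shrinking case. The `FileMeta` structure and the decision to drop version entries for blocks that lie wholly beyond the new end of the file are assumptions of this sketch; the original only states that the file size is updated and that the content version of the boundary block is incremented.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FileMeta:
    size: int
    block_size: int
    # Maps a block index to its file content version.
    content_version: Dict[int, int] = field(default_factory=dict)


def truncate(meta: FileMeta, new_size: int) -> None:
    """Shrink the file and bump the version of the block that holds the new end."""
    if not 0 <= new_size <= meta.size:
        raise ValueError("this sketch only covers shrinking a file")
    index, remainder = divmod(new_size, meta.block_size)
    # Blocks entirely beyond the new end are no longer part of the file.
    first_dropped = index + 1 if remainder else index
    for i in [i for i in meta.content_version if i >= first_dropped]:
        del meta.content_version[i]
    if remainder:
        # The boundary block is cut short, so its content has changed.
        meta.content_version[index] = meta.content_version.get(index, 0) + 1
    meta.size = new_size
```

Because the boundary block's content version then differs from its upload version, the splice step re-uploads it, and the object in the object storage ends at the truncated length.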
Optionally, the processing module is further configured to, when an out-of-order read of a file is performed in the file system layer and the file is in the file system, read the data blocks directly from the file system to form the file.
In the file system layer, when an out-of-order read is performed, the data is read directly from the file system if the file is in the file system; a certain number of consecutive data blocks are read at a time, according to the user configuration, to reduce the number of network requests.
Another preferred embodiment of the present invention is a computer storage medium storing a program which, when executed, performs the steps of any of the methods described above.
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention shall not be regarded as limited to this description. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of protection of the invention.

Claims (11)

  1. A file storage method, comprising:
    at the file system layer, dividing a file into a plurality of data blocks, and performing out-of-order write operations on the data blocks;
    synchronizing the data blocks to an object storage queue, adding a task to the object storage queue if the data of a data block changes, and executing the tasks in the object storage queue cyclically at a first preset period;
    at the object storage layer, splicing the plurality of data blocks into a file in a preset order according to an operation instruction;
    setting a cyclic reclamation task in the file system layer, the cyclic reclamation task comprising: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and that meet a preset condition, deleting the data blocks, and marking the addresses of the data blocks as object storage.
  2. The file storage method according to claim 1, characterized by further comprising:
    setting a cyclic retransmission task in the file system layer;
    the cyclic retransmission task comprising: acquiring, at a second preset period, data blocks in the file system that have not been synchronized to the object storage, generating synchronization tasks according to the data blocks, and adding the synchronization tasks to the object storage queue.
  3. The file storage method according to claim 1, characterized in that performing the out-of-order write operations on the data blocks comprises:
    if a data block of the file is in the file system layer, writing directly;
    if a data block of the file is in the object storage, reading the data block from the object storage, storing it in the file system, and then performing the overwrite.
  4. The file storage method according to claim 3, characterized in that splicing the plurality of data blocks into a file in a preset order according to the operation instruction comprises:
    if a file data block is in the object storage and has not expired, using it to splice the file;
    if a file data block has been overwritten at the file system layer, re-uploading the file data block to the object storage;
    if a file data block is in the object storage but has expired, reading the data block from the object storage by offset, downloading the data block to the file system, and re-uploading it to form a data block.
  5. The file storage method according to claim 1, characterized by further comprising:
    in the file system layer, when an out-of-order read of a file is performed and the file is in the file system, reading the data blocks directly from the file system to form the file.
  6. A file storage system, characterized by comprising:
    a processing module, configured to, at the file system layer, divide a file into a plurality of data blocks and perform out-of-order write operations on the data blocks;
    a synchronization module, configured to synchronize the data blocks to an object storage queue, add a task to the object storage queue if the data of a data block changes, and execute the tasks in the object storage queue cyclically at a first preset period;
    a splicing module, configured to, at the object storage layer, splice the plurality of data blocks into a file in a preset order according to an operation instruction;
    a recycler, configured to perform a cyclic reclamation task in the file system layer, the cyclic reclamation task comprising: reclaiming, according to a preset policy, data blocks in the file system that have been synchronized to the object storage and that meet a preset condition, deleting the data blocks, and marking the addresses of the data blocks as object storage.
  7. The file storage system according to claim 6, wherein the processing module is further configured to set up a cyclic retransmission task in the file system layer; the cyclic retransmission task comprises: at a second preset period, obtaining the data blocks in the file system that have not been synchronized to the object storage, generating a synchronization task from those data blocks, and adding the synchronization task to the object storage queue.
  8. The file storage system according to claim 6, wherein the processing module is further configured to write directly if the data block of the file is in the file system layer; and the processing module is further configured to, if the data block of the file is in the object storage, read the data block from the object storage, store it in the file system, and then perform the overwrite.
  9. The file storage system according to claim 8, wherein the splicing module is further configured to use a file data block for splicing the file if the file data block is in the object storage and has not expired;
    the splicing module is further configured to re-upload the file data block to the object storage if the file data block has completed an overwrite at the file system layer;
    the splicing module is further configured to, if the file data block is in the object storage but has expired, read the data block from the object storage by offset, download the data block to the file system, and re-upload it to form a data block.
  10. The file storage system according to claim 6, wherein the processing module is further configured to, when an out-of-order file read is performed in the file system layer and the file is in the file system, read the data blocks directly from the file system to form the file.
  11. A computer storage medium, wherein the computer storage medium can store a program, and execution of the program comprises the steps according to any one of claims 1-5.
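
The behaviour recited in claims 7, 8 and 10 can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the class `TieredStore`, its field and method names, the in-memory dictionaries standing in for the file system layer and the object storage, and the fallback to object storage in `read` are all hypothetical choices made for illustration. The timer that would call `retransmit_once` once per "second preset period" is omitted.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Toy model (hypothetical): a file system layer in front of an object store."""
    fs: dict = field(default_factory=dict)            # block_id -> bytes in the file system layer
    object_store: dict = field(default_factory=dict)  # block_id -> bytes in object storage
    synced: set = field(default_factory=set)          # blocks whose local copy matches object storage
    queue: deque = field(default_factory=deque)       # object storage queue of pending sync tasks

    def write(self, block_id, offset, data):
        """Claim 8: write directly if the block is local, otherwise fetch it first."""
        if block_id not in self.fs:
            # Block is only in object storage (or is new): bring it into the file system layer.
            self.fs[block_id] = self.object_store.get(block_id, b"")
        block = bytearray(self.fs[block_id])
        if len(block) < offset:
            # Pad with zero bytes so the write can land at the requested offset.
            block.extend(b"\0" * (offset - len(block)))
        block[offset:offset + len(data)] = data
        self.fs[block_id] = bytes(block)
        self.synced.discard(block_id)  # the local copy is now newer than object storage

    def retransmit_once(self):
        """Claim 7: one pass of the cyclic retransmission task."""
        for block_id in sorted(self.fs):
            if block_id not in self.synced and block_id not in self.queue:
                self.queue.append(block_id)  # the sync task is modelled as the block id

    def drain_queue(self):
        """Upload every queued block to object storage."""
        while self.queue:
            block_id = self.queue.popleft()
            self.object_store[block_id] = self.fs[block_id]
            self.synced.add(block_id)

    def read(self, block_id):
        """Claim 10: read from the file system layer when the block is there.

        Falling back to object storage otherwise is an assumption of this sketch.
        """
        if block_id in self.fs:
            return self.fs[block_id]
        return self.object_store[block_id]


store = TieredStore(object_store={"b1": b"hello world"})
store.write("b1", 6, b"there")   # block is fetched into the file system, then overwritten
store.retransmit_once()          # the unsynchronized block becomes a queued sync task
store.drain_queue()              # the overwritten block is re-uploaded
```

Tracking synchronized blocks in a set keeps a retransmission pass idempotent: a block that is already queued or already uploaded is not enqueued again.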
PCT/CN2018/079683 2017-06-22 2018-03-20 File storage method and system and computer storage medium WO2018233331A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710480364.8 2017-06-22
CN201710480364.8A CN107229427B (en) 2017-06-22 2017-06-22 A kind of file memory method, system and computer storage medium

Publications (1)

Publication Number Publication Date
WO2018233331A1 true WO2018233331A1 (en) 2018-12-27

Family

ID=59936588

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/079683 WO2018233331A1 (en) 2017-06-22 2018-03-20 File storage method and system and computer storage medium

Country Status (2)

Country Link
CN (1) CN107229427B (en)
WO (1) WO2018233331A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107229427B (en) * 2017-06-22 2019-10-18 上海七牛信息技术有限公司 A kind of file memory method, system and computer storage medium
CN116048424B (en) * 2023-03-07 2023-06-06 浪潮电子信息产业股份有限公司 IO data processing method, device, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090228511A1 (en) * 2008-03-05 2009-09-10 Nec Laboratories America, Inc. System and Method for Content Addressable Storage
CN103077245A (en) * 2013-01-18 2013-05-01 浪潮电子信息产业股份有限公司 Method for expanding parallel file system by free hard disk space of cluster computing node
CN106021256A (en) * 2015-03-31 2016-10-12 Emc 公司 De-duplicating distributed file system using cloud-based object store
CN107229427A (en) * 2017-06-22 2017-10-03 上海七牛信息技术有限公司 A kind of file memory method, system and computer-readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102722506A (en) * 2009-12-29 2012-10-10 华为数字技术(成都)有限公司 Data storage method and equipment
US8954408B2 (en) * 2011-07-28 2015-02-10 International Business Machines Corporation Allowing writes to complete without obtaining a write lock to a file
US9959207B2 (en) * 2015-06-25 2018-05-01 Vmware, Inc. Log-structured B-tree for handling random writes
CN106021536A (en) * 2016-05-27 2016-10-12 成都索贝数码科技股份有限公司 Data insertion method and system based on storage of FICS objects
CN106406981A (en) * 2016-09-18 2017-02-15 深圳市深信服电子科技有限公司 Disk data reading/writing method and virtual machine monitor
CN106776967B (en) * 2016-12-05 2020-03-27 哈尔滨工业大学(威海) Method and device for storing massive small files in real time based on time sequence aggregation algorithm

Also Published As

Publication number Publication date
CN107229427B (en) 2019-10-18
CN107229427A (en) 2017-10-03

Similar Documents

Publication Publication Date Title
US8738883B2 (en) Snapshot creation from block lists
JP6236533B2 (en) Method and apparatus for creating differential update package, system differential update method and apparatus
US9436556B2 (en) Customizable storage system for virtual databases
CN108628874B (en) Method and device for migrating data, electronic equipment and readable storage medium
CN107193615B (en) Project code information updating and deploying method and device
CN109634774B (en) Data backup and recovery method and device
WO2015107666A1 (en) Storage apparatus and cache control method for storage apparatus
US10474537B2 (en) Utilizing an incremental backup in a decremental backup system
US10810035B2 (en) Deploying a cloud instance of a user virtual machine
WO2015034827A1 (en) Replication of snapshots and clones
CN109144785B (en) Method and apparatus for backing up data
US8966200B1 (en) Pruning free blocks out of a decremental backup chain
JP2016045869A (en) Data recovery method, program, and data processing system
CN111433760A (en) Enhanced techniques for replicating cloud-stored files
WO2017113694A1 (en) File synchronizing method, device and system
WO2018233331A1 (en) File storage method and system and computer storage medium
CN110442648B (en) Data synchronization method and device
CN113254394B (en) Snapshot processing method, system, equipment and storage medium
CN112579550B (en) Metadata information synchronization method and system of distributed file system
CN106339176B (en) Intermediate file processing method, client, server and system
US9921918B1 (en) Cloud-based data backup and management
US10031961B1 (en) Systems and methods for data replication
US20230108138A1 (en) Techniques for preserving clone relationships between files
WO2016054886A1 (en) Software upgrade method and apparatus, electronic device and storage medium
CN113971041A (en) Version synchronization method and device of cross-version control system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18820563; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18820563; Country of ref document: EP; Kind code of ref document: A1)