WO2021238246A1 - Method and apparatus for processing operation requests for aggregated small files - Google Patents

Method and apparatus for processing operation requests for aggregated small files

Info

Publication number
WO2021238246A1
WO2021238246A1 PCT/CN2021/073259 CN2021073259W WO2021238246A1 WO 2021238246 A1 WO2021238246 A1 WO 2021238246A1 CN 2021073259 W CN2021073259 W CN 2021073259W WO 2021238246 A1 WO2021238246 A1 WO 2021238246A1
Authority
WO
WIPO (PCT)
Prior art keywords
metadata
request
invalid space
record
invalid
Prior art date
Application number
PCT/CN2021/073259
Other languages
English (en)
French (fr)
Inventor
王帅阳
李文鹏
张端
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2021238246A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations

Definitions

  • This application relates to the field of computer technology, and in particular to a method and apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium.
  • The purpose of this application is to provide a method and apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium, so as to solve the problem in current solutions that an invalid-space flush-to-disk operation must be performed every time an operation request for an aggregated small file is received, which severely degrades the performance of the distributed file system.
  • The specific scheme is as follows:
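  • In a first aspect, this application provides a method for processing operation requests for aggregated small files, which is applied to a client and includes: receiving an operation request for an aggregated small file, where the operation request is a delete request or a modification request; recording the storage space of the aggregated small file as invalid space and generating an invalid space record; executing the operation request and sending a metadata update request to a metadata server according to the execution result; after receiving the metadata-update-complete message fed back by the metadata server, adding the invalid space record to a target queue; and, when a preset time point is reached, flushing the invalid space records in the target queue to disk.
  • Preferably, the invalid space record includes the identification information of the aggregated large file corresponding to the aggregated small file, as well as the offset and length of the aggregated small file within the aggregated large file.
  • Preferably, executing the operation request and sending the metadata update request to the metadata server according to the execution result includes:
  • Executing the operation request, updating the number of the metadata of the aggregated small file to obtain a number update result, and sending a metadata update request to the metadata server, where the metadata update request includes the number update result;
  • Correspondingly, adding the invalid space record to the target queue includes:
  • After receiving the metadata-update-complete message fed back by the metadata server, if the message includes the number update result, adding the invalid space record corresponding to that number update result to the target queue.
  • Preferably, after the number update result is obtained, the method further includes adding the number update result to the invalid space record.
  • Preferably, adding the invalid space record to the target queue includes: determining the target number update result included in the message, and adding every invalid space record whose number update result is less than or equal to the target number update result to the target queue.
  • Preferably, flushing the invalid space records in the target queue to disk includes:
  • Periodically flushing the invalid space records in the target queue to disk.
  • Preferably, before the invalid space records in the target queue are flushed to disk when the preset time point is reached, the method further includes:
  • Based on the invalid space records in the target queue, scanning the aggregated large file at object granularity and determining whether storage space that is not yet recorded as invalid space is in fact invalid space; if so, generating the corresponding invalid space record and adding it to the target queue.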
  • In a second aspect, this application provides an apparatus for processing operation requests for aggregated small files, including:
  • Request receiving module: used to receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
  • Record generation module: used to record the storage space of the aggregated small file as invalid space and generate an invalid space record;
  • Request update module: used to execute the operation request and send a metadata update request to the metadata server according to the execution result of the operation request;
  • Record adding module: used to add the invalid space record to the target queue after receiving the metadata-update-complete message fed back by the metadata server;
  • Record flushing module: used to flush the invalid space records in the target queue to disk when the preset time point is reached.
  • In a third aspect, this application provides a terminal device of a distributed file system, including:
  • Memory: used to store a computer program;
  • Processor: used to execute the computer program to implement the steps of the method for processing operation requests for aggregated small files as described above.
  • In a fourth aspect, this application provides a readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method for processing operation requests for aggregated small files as described above.
  • The method for processing operation requests for aggregated small files provided by this application includes: receiving an operation request for an aggregated small file, where the operation request is a delete request or a modification request; recording the storage space of the aggregated small file as invalid space and generating an invalid space record; executing the operation request and sending a metadata update request to the metadata server according to the execution result; after receiving the metadata-update-complete message fed back by the metadata server, adding the invalid space record to the target queue; and, when the preset time point is reached, flushing the invalid space records in the target queue to disk.
  • Thus, after receiving a delete request or modification request for an aggregated small file, the method first generates the corresponding invalid space record and then has the metadata server complete the metadata update. Once the metadata server has completed the update, the invalid space record is added to the target queue, and the invalid space records in the target queue are flushed to disk in a batch when the preset time point is reached.
  • On the one hand, the method avoids flushing invalid space to disk every time a delete request or modification request is executed, which improves the modification and deletion performance for aggregated small files and reduces the load on the metadata server. On the other hand, the method adds an invalid space record to the target queue only after the metadata server has completed the metadata update, so the accuracy of the invalid space to be flushed is guaranteed.
  • In addition, this application provides an apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium, whose technical effects correspond to those of the above method and are not repeated here.
  • FIG. 1 is an implementation flowchart of Embodiment 1 of the method for processing operation requests for aggregated small files provided by this application;
  • FIG. 2 is an implementation flowchart of the invalid-space flushing process in Embodiment 2 of the method for processing operation requests for aggregated small files provided by this application;
  • FIG. 3 is an implementation flowchart of the scanning and merging process in Embodiment 2 of the method for processing operation requests for aggregated small files provided by this application;
  • FIG. 4 is a functional block diagram of an embodiment of the apparatus for processing operation requests for aggregated small files provided by this application.
  • To address this problem, this application provides a method and apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium.
  • After a delete request or modification request for an aggregated small file is received, the corresponding invalid space record is generated.
  • After the metadata server completes the metadata update, the invalid space record is added to the target queue to wait for flushing, and when the preset time point is reached the invalid space records in the target queue are flushed to disk in a batch.
  • This application therefore has at least the following advantages. First, it does not need to flush invalid space to disk every time a delete request or modification request for an aggregated small file is received, which lowers the flush frequency, improves the modification and deletion performance for aggregated small files, and reduces the computing load on the metadata server. Second, the client performs the invalid-space flush, which further reduces the computing load on the metadata server. Third, the invalid space record is added to the target queue if and only if the metadata server has completed the metadata update, which guarantees the accuracy of the invalid space to be flushed.
  • Embodiment 1 of the method is introduced below. Referring to FIG. 1, Embodiment 1 includes:
  • S101: Receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
  • S102: Record the storage space of the aggregated small file as invalid space and generate an invalid space record;
  • S103: Execute the operation request and send a metadata update request to the metadata server according to the execution result of the operation request;
  • S104: After receiving the metadata-update-complete message fed back by the metadata server, add the invalid space record to the target queue;
  • S105: When the preset time point is reached, flush the invalid space records in the target queue to disk.
  • This embodiment is applied to a client of a distributed file system based on object storage technology.
  • In small-file application scenarios, aggregated storage of small files is an important means of improving storage system utilization and file read speed.
  • In the small-file aggregation scenario, multiple small files are aggregated in units of objects, and the objects are stored on disk as the data of the aggregated large file.
  • In this embodiment, the operation request specifically refers to a delete request or a modification request.
  • When an aggregated small file is deleted, its original storage space becomes invalid space.
  • When an aggregated small file is modified, it is handled in one of two ways: it is written to another location, or it becomes a non-aggregated file. Either way, the original storage space of the aggregated small file becomes invalid space.
  • Therefore, after receiving an operation request for an aggregated small file, this embodiment generates an invalid space record. The invalid space record records the original storage space of the aggregated small file, and may specifically include the identification information of the aggregated large file in which the aggregated small file is located, the offset and length of the aggregated small file within the aggregated large file, and so on.
  • Executing the operation request specifically includes modifying or deleting the data of the aggregated small file and, at the same time, updating the metadata of the aggregated small file. A metadata update request is then sent to the metadata server according to the metadata update result on the client. After the metadata server completes the metadata update, it sends a message to the client. The client thereby learns that the update has completed on the metadata server side, and adds the previously generated invalid space record to the target queue.
  • It can be understood that the metadata update request sent by the client to the metadata server includes the identification information of the metadata, so the metadata server knows which metadata to update. Correspondingly, the metadata-update-complete message sent by the metadata server to the client also includes the identification information of the metadata, so the client knows which metadata has completed its update on the metadata server side and adds the corresponding invalid space record to the target queue.
  • The target queue stores the invalid space records waiting to be flushed. Specifically, the invalid space records in the target queue can be flushed on a timer, for example periodically at a certain frequency.
  • Flushing in this way reduces the number and frequency of invalid-space flushes, improves the deletion and modification performance for aggregated small files, reduces the computing load on the metadata server, and ultimately improves the performance of the distributed file system.
  • This embodiment provides a method for processing operation requests for aggregated small files. After receiving a delete request or modification request for an aggregated small file, it first generates the corresponding invalid space record and then has the metadata server complete the metadata update. Once the metadata server has completed the update, the invalid space record is added to the target queue, and the invalid space records in the target queue are flushed to disk in a batch when the preset time point is reached.
  • On the one hand, the method avoids flushing invalid space to disk every time a delete request or modification request is executed, which improves the modification and deletion performance for aggregated small files and reduces the load on the metadata server. On the other hand, the method adds an invalid space record to the target queue only after the metadata server has completed the metadata update, so the accuracy of the invalid space to be flushed is guaranteed.
  • Embodiment 2 of the method for processing operation requests for aggregated small files provided by this application is introduced in detail below.
  • Embodiment 2 is implemented on the basis of Embodiment 1 and extends it to a certain degree.
  • Specifically, Embodiment 2 specifies that the client updates the identification information of the metadata, that is, the metadata number, when executing the operation request, and that the invalid space record includes the updated metadata number.
  • Because the metadata server performs metadata updates in ascending order of metadata number, after receiving a metadata-update-complete message the client first determines the metadata number the message corresponds to, and then directly adds all invalid space records whose number is less than or equal to that metadata number to the target queue. This avoids unnecessary waiting, reduces the number of metadata number comparisons, and further improves the performance of the distributed file system.
  • Referring to FIG. 2, Embodiment 2 specifically includes:
  • S201: Receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
  • S202: Record the storage space of the aggregated small file as invalid space and generate an invalid space record. The invalid space record includes the identification information of the aggregated large file corresponding to the aggregated small file, and also the offset and length of the aggregated small file within the aggregated large file. The invalid space record can therefore take the form (ino number of the aggregated large file, offset, length len). After the invalid space record is generated, it can first be put into a first queue in the local cache.
  • S203: Execute the operation request and update the number of the metadata of the aggregated small file to obtain a number update result. The number update result can be denoted tid, which specifically refers to the updated metadata id.
  • The client increments the metadata id every time it modifies metadata.
  • The metadata server persists the corresponding metadata according to the tid and then responds to the client.
  • S204: Add the number update result to the invalid space record. Specifically, the invalid space record is taken from the first queue, the corresponding number update result is added to it, and the completed invalid space record is placed in a second queue.
  • S205: Send a metadata update request to the metadata server, where the metadata update request includes the number update result. This embodiment does not limit the order of S204 and S205.
  • S206: After receiving the metadata-update-complete message fed back by the metadata server, determine the target number update result included in the message. Specifically, the invalid space records whose number update result is less than or equal to the target number update result are selected from the second queue and added to the target queue.
  • S207: Periodically flush the invalid space records in the target queue to disk.
  • In addition, this embodiment provides a process that supplements and merges invalid space by scanning.
  • As shown in FIG. 3, the process specifically includes: loading from disk the previously recorded invalid space of the aggregated large file, and periodically scanning the aggregated large file at object granularity. For each object, scanning starts from the end of the first invalid record; the metadata of the aggregated small file to which the space belongs is obtained from the object's characteristics, and that metadata is used to judge whether the current storage space is invalid. If the current storage space is invalid, an invalid space record is generated and scanning continues with the next data region. If it is not invalid, the current object is skipped and scanning continues with the next object. After the aggregated large file has been scanned, the newly generated invalid space records are merged with the original invalid space records.
  • Thus, for aggregated files, the method provided in this embodiment combines invalid space recording, batched asynchronous flushing, and a scan-based supplement-and-merge scheme. It improves the modify-write and delete performance for aggregated small files and reduces the load on the metadata cluster, while ensuring that invalid garbage data is recorded completely.
  • The following describes an apparatus for processing operation requests for aggregated small files provided in an embodiment of this application.
  • The apparatus described below and the method for processing operation requests for aggregated small files described above correspond to each other and may be read with reference to each other.
  • As shown in FIG. 4, the apparatus for processing operation requests for aggregated small files in this embodiment includes:
  • Request receiving module 401: used to receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
  • Record generation module 402: used to record the storage space of the aggregated small file as invalid space and generate an invalid space record;
  • Request update module 403: used to execute the operation request and send a metadata update request to the metadata server according to the execution result of the operation request;
  • Record adding module 404: used to add the invalid space record to the target queue after receiving the metadata-update-complete message fed back by the metadata server;
  • Record flushing module 405: used to flush the invalid space records in the target queue to disk when the preset time point is reached.
  • The apparatus in this embodiment is used to implement the aforementioned method for processing operation requests for aggregated small files, so its specific implementation can be found in the method embodiments above. For example, the request receiving module 401, the record generation module 402, the request update module 403, the record adding module 404, and the record flushing module 405 implement steps S101, S102, S103, S104, and S105 of the method respectively. Their specific implementation can therefore refer to the description of the corresponding embodiments and is not repeated here.
  • Because the apparatus in this embodiment implements the foregoing method, its effect corresponds to that of the method and is not repeated here.
  • In addition, this application provides a terminal device of a distributed file system, including:
  • Memory: used to store a computer program;
  • Processor: used to execute the computer program to implement the steps of the method for processing operation requests for aggregated small files as described above.
  • Finally, this application provides a readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method for processing operation requests for aggregated small files as described above.
  • The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two.
  • The software module can reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

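A method and apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium. After receiving a delete request or modification request for an aggregated small file, the method first generates an invalid space record and then has the metadata server complete the metadata update. Once the metadata server has completed the update, the invalid space record is added to a target queue, and when a preset time point is reached the invalid space records in the target queue are flushed to disk in a batch. The method avoids having to flush invalid space to disk every time a delete request or modification request is executed, improves the modification and deletion performance for aggregated small files, reduces the load on the metadata server, and guarantees the accuracy of the invalid space to be flushed.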

Description

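Method and apparatus for processing operation requests for aggregated small files
This application claims priority to Chinese patent application No. 202010469827.2, entitled "Method and apparatus for processing operation requests for aggregated small files", filed with the China Patent Office on May 28, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a method and apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium.
Background
In the small-file aggregation scenario of a distributed file system based on object storage technology, an invalid space record is generated whenever the mapped position of an aggregated small file within its aggregated large file changes. To ensure that garbage objects can be cleaned up later, every invalid space record that is generated must be flushed to disk. Consequently, each time the distributed file system receives a modification request or delete request for an aggregated small file, the metadata server must not only update the metadata but also perform one invalid-space flush, which severely reduces the efficiency of operations on aggregated small files and degrades the performance of the distributed file system.
How to avoid performing an invalid-space flush every time an operation request for an aggregated small file is received, and the resulting impact on distributed file system performance, is therefore a problem that those skilled in the art urgently need to solve.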
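Summary of the Invention
The purpose of this application is to provide a method and apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium, so as to solve the problem in current solutions that an invalid-space flush must be performed every time an operation request for an aggregated small file is received, which severely degrades the performance of the distributed file system. The specific scheme is as follows:
In a first aspect, this application provides a method for processing operation requests for aggregated small files, applied to a client and including:
receiving an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
recording the storage space of the aggregated small file as invalid space and generating an invalid space record;
executing the operation request and sending a metadata update request to a metadata server according to the execution result of the operation request;
after receiving the metadata-update-complete message fed back by the metadata server, adding the invalid space record to a target queue;
when a preset time point is reached, flushing the invalid space records in the target queue to disk.
Preferably, recording the storage space of the aggregated small file as invalid space and generating an invalid space record includes:
recording the storage space of the aggregated small file as invalid space and generating an invalid space record, where the invalid space record includes the identification information of the aggregated large file corresponding to the aggregated small file, and also the offset and length of the aggregated small file within the aggregated large file.
Preferably, executing the operation request and sending a metadata update request to the metadata server according to the execution result of the operation request includes:
executing the operation request and updating the number of the metadata of the aggregated small file to obtain a number update result; sending a metadata update request to the metadata server, where the metadata update request includes the number update result;
correspondingly, adding the invalid space record to the target queue after receiving the metadata-update-complete message fed back by the metadata server includes:
after receiving the metadata-update-complete message fed back by the metadata server, if the message includes the number update result, adding the invalid space record corresponding to the number update result to the target queue.
Preferably, after executing the operation request and updating the number of the metadata of the aggregated small file to obtain the number update result, the method further includes:
adding the number update result to the invalid space record.
Preferably, adding the invalid space record to the target queue if the message includes the number update result, after receiving the metadata-update-complete message fed back by the metadata server, includes:
after receiving the metadata-update-complete message fed back by the metadata server, determining the target number update result included in the message, and adding the invalid space records whose number update result is less than or equal to the target number update result to the target queue.
Preferably, flushing the invalid space records in the target queue to disk when the preset time point is reached includes:
periodically flushing the invalid space records in the target queue to disk.
Preferably, before flushing the invalid space records in the target queue to disk when the preset time point is reached, the method further includes:
based on the invalid space records in the target queue, scanning the aggregated large file at object granularity and determining whether storage space that is not recorded as invalid space is invalid space;
if so, generating the corresponding invalid space record and adding it to the target queue.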
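In a second aspect, this application provides an apparatus for processing operation requests for aggregated small files, including:
a request receiving module, used to receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
a record generation module, used to record the storage space of the aggregated small file as invalid space and generate an invalid space record;
a request update module, used to execute the operation request and send a metadata update request to the metadata server according to the execution result of the operation request;
a record adding module, used to add the invalid space record to the target queue after receiving the metadata-update-complete message fed back by the metadata server;
a record flushing module, used to flush the invalid space records in the target queue to disk when the preset time point is reached.
In a third aspect, this application provides a terminal device of a distributed file system, including:
a memory, used to store a computer program;
a processor, used to execute the computer program to implement the steps of the method for processing operation requests for aggregated small files as described above.
In a fourth aspect, this application provides a readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method for processing operation requests for aggregated small files as described above.
The method for processing operation requests for aggregated small files provided by this application includes: receiving an operation request for an aggregated small file, where the operation request is a delete request or a modification request; recording the storage space of the aggregated small file as invalid space and generating an invalid space record; executing the operation request and sending a metadata update request to the metadata server according to the execution result of the operation request; after receiving the metadata-update-complete message fed back by the metadata server, adding the invalid space record to the target queue; and, when the preset time point is reached, flushing the invalid space records in the target queue to disk.
Thus, after receiving a delete request or modification request for an aggregated small file, the method first generates the corresponding invalid space record and then has the metadata server complete the metadata update. Once the metadata server has completed the update, the invalid space record is added to the target queue, and the invalid space records in the target queue are flushed to disk in a batch when the preset time point is reached. On the one hand, the method avoids flushing invalid space to disk every time a delete request or modification request is executed, which improves the modification and deletion performance for aggregated small files and reduces the load on the metadata server. On the other hand, the method adds an invalid space record to the target queue only after the metadata server has completed the metadata update, so the accuracy of the invalid space to be flushed is guaranteed.
In addition, this application provides an apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium, whose technical effects correspond to those of the above method and are not repeated here.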
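Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is an implementation flowchart of Embodiment 1 of the method for processing operation requests for aggregated small files provided by this application;
FIG. 2 is an implementation flowchart of the invalid-space flushing process in Embodiment 2 of the method for processing operation requests for aggregated small files provided by this application;
FIG. 3 is an implementation flowchart of the scanning and merging process in Embodiment 2 of the method for processing operation requests for aggregated small files provided by this application;
FIG. 4 is a functional block diagram of an embodiment of the apparatus for processing operation requests for aggregated small files provided by this application.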
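Detailed Description
To help those skilled in the art better understand the solution of this application, the application is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
In the traditional scheme, every time the client receives a delete request or modification request for an aggregated small file, the metadata server has to perform both a metadata update and an invalid-space flush. This puts heavy computing load on the metadata server, makes deletion and modification of aggregated small files inefficient, and severely affects the performance of the distributed file system.
To address this problem, this application provides a method and apparatus for processing operation requests for aggregated small files, a terminal device of a distributed file system, and a readable storage medium. After a delete request or modification request for an aggregated small file is received, the corresponding invalid space record is generated. After the metadata server completes the metadata update, the invalid space record is added to the target queue to wait for flushing, and when the preset time point is reached the invalid space records in the target queue are flushed to disk in a batch.
This application therefore has at least the following advantages. First, it does not need to flush invalid space to disk every time a delete request or modification request for an aggregated small file is received, which lowers the flush frequency, improves the modification and deletion performance for aggregated small files, and reduces the computing load on the metadata server. Second, the client performs the invalid-space flush, which further reduces the computing load on the metadata server. Third, the invalid space record is added to the target queue if and only if the metadata server has completed the metadata update, which guarantees the accuracy of the invalid space to be flushed.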
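Embodiment 1 of the method for processing operation requests for aggregated small files provided by this application is introduced below. Referring to FIG. 1, Embodiment 1 includes:
S101. Receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
S102. Record the storage space of the aggregated small file as invalid space and generate an invalid space record;
S103. Execute the operation request and send a metadata update request to the metadata server according to the execution result of the operation request;
S104. After receiving the metadata-update-complete message fed back by the metadata server, add the invalid space record to the target queue;
S105. When the preset time point is reached, flush the invalid space records in the target queue to disk.
This embodiment is applied to a client of a distributed file system based on object storage technology. In small-file application scenarios, aggregated storage of small files is an important means of improving storage system utilization and file read speed. In the small-file aggregation scenario, multiple small files are aggregated in units of objects, and the objects are stored on disk as the data of the aggregated large file.
In this embodiment, the operation request specifically refers to a delete request or a modification request. When an aggregated small file is deleted, its original storage space becomes invalid space. When an aggregated small file is modified, it is handled in one of two ways: it is written to another location, or it becomes a non-aggregated file. Either way, the original storage space of the aggregated small file becomes invalid space.
Therefore, after receiving an operation request for an aggregated small file, this embodiment generates an invalid space record. The invalid space record records the original storage space of the aggregated small file, and may specifically include the identification information of the aggregated large file in which the aggregated small file is located, the offset and length of the aggregated small file within the aggregated large file, and so on.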
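As an illustration of the invalid space record described above, the following is a minimal sketch rather than the applicant's implementation. The class name `InvalidSpaceRecord`, the helper `record_for`, and the choice of an integer `ino` as the large-file identifier are assumptions made for the example; the source only states that the record carries the large file's identification information, an offset, and a length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvalidSpaceRecord:
    """Hypothetical record of space freed inside an aggregated large file."""
    ino: int     # identifier of the aggregated large file (assumed to be an int)
    offset: int  # byte offset of the small file inside the large file
    length: int  # number of bytes the small file occupied

    @property
    def end(self) -> int:
        # First byte after the invalid region.
        return self.offset + self.length


def record_for(ino: int, offset: int, length: int) -> InvalidSpaceRecord:
    """Build a record for a deleted or relocated small file, rejecting bad ranges."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return InvalidSpaceRecord(ino, offset, length)
```

A frozen dataclass is used here so that a record cannot be altered by accident while it waits in a queue.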
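Executing the operation request specifically includes modifying or deleting the data of the aggregated small file and, at the same time, updating the metadata of the aggregated small file. A metadata update request is then sent to the metadata server according to the metadata update result on the client. After the metadata server completes the metadata update, it sends a message to the client. The client thereby learns that the update has completed on the metadata server side, and adds the previously generated invalid space record to the target queue.
It can be understood that the metadata update request sent by the client to the metadata server includes the identification information of the metadata, so the metadata server knows which metadata to update. Correspondingly, the metadata-update-complete message sent by the metadata server to the client also includes the identification information of the metadata, so the client knows which metadata has completed its update on the metadata server side and adds the corresponding invalid space record to the target queue.
The target queue stores the invalid space records waiting to be flushed. Specifically, the invalid space records in the target queue can be flushed on a timer, for example periodically at a certain frequency. Flushing in this way reduces the number and frequency of invalid-space flushes, improves the deletion and modification performance for aggregated small files, reduces the computing load on the metadata server, and ultimately improves the performance of the distributed file system.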
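The client-side flow of Embodiment 1 can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the class `InvalidSpaceTracker`, its method names, the string metadata identifiers, and the `persist` callback that stands in for the actual flush to disk are all hypothetical, and the timer that would call `flush` at the preset time point is left out.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Record = Tuple[int, int, int]  # (large-file id, offset, length), assumed layout


class InvalidSpaceTracker:
    """Hypothetical client-side bookkeeping for steps S102, S104 and S105."""

    def __init__(self, persist: Callable[[List[Record]], None]) -> None:
        self._pending: Dict[str, Record] = {}  # waiting for the metadata ack
        self._target: Deque[Record] = deque()  # confirmed, waiting for flush
        self._persist = persist                # stand-in for the disk write

    def on_request(self, meta_id: str, record: Record) -> None:
        # S102: remember the invalid space record when the request arrives.
        self._pending[meta_id] = record

    def on_metadata_ack(self, meta_id: str) -> None:
        # S104: only an acknowledged update moves its record to the target queue.
        record = self._pending.pop(meta_id, None)
        if record is not None:
            self._target.append(record)

    def flush(self) -> int:
        # S105: write everything queued so far in one batch.
        batch = list(self._target)
        self._target.clear()
        if batch:
            self._persist(batch)
        return len(batch)
```

Keeping unacknowledged records out of the target queue mirrors the accuracy argument in the text: a record is never persisted for a metadata update that the server has not confirmed.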
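This embodiment provides a method for processing operation requests for aggregated small files. After receiving a delete request or modification request for an aggregated small file, it first generates the corresponding invalid space record and then has the metadata server complete the metadata update. Once the metadata server has completed the update, the invalid space record is added to the target queue, and the invalid space records in the target queue are flushed to disk in a batch when the preset time point is reached. On the one hand, the method avoids flushing invalid space to disk every time a delete request or modification request is executed, which improves the modification and deletion performance for aggregated small files and reduces the load on the metadata server. On the other hand, the method adds an invalid space record to the target queue only after the metadata server has completed the metadata update, so the accuracy of the invalid space to be flushed is guaranteed.
Embodiment 2 of the method for processing operation requests for aggregated small files provided by this application is introduced in detail below. Embodiment 2 is implemented on the basis of Embodiment 1 and extends it to a certain degree.
Specifically, Embodiment 2 specifies that the client updates the identification information of the metadata, that is, the metadata number, when executing the operation request, and that the invalid space record includes the updated metadata number. Given these conditions, and because the metadata server performs metadata updates in ascending order of metadata number, after receiving a metadata-update-complete message the client first determines the metadata number the message corresponds to and then directly adds all invalid space records whose number is less than or equal to that metadata number to the target queue. This avoids unnecessary waiting, reduces the number of metadata number comparisons, and further improves the performance of the distributed file system.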
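Referring to FIG. 2, Embodiment 2 specifically includes:
S201. Receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
S202. Record the storage space of the aggregated small file as invalid space and generate an invalid space record;
The invalid space record includes the identification information of the aggregated large file corresponding to the aggregated small file, and also the offset and length of the aggregated small file within the aggregated large file. The invalid space record can therefore take the form (ino number of the aggregated large file, offset, length len). After the invalid space record is generated, it can first be put into a first queue in the local cache.
S203. Execute the operation request and update the number of the metadata of the aggregated small file to obtain a number update result;
The number update result can be denoted tid, which specifically refers to the updated metadata id. The client increments the metadata id every time it modifies metadata; the metadata server persists the corresponding metadata according to the tid and then responds to the client.
S204. Add the number update result to the invalid space record;
Specifically, the invalid space record is taken from the first queue, the corresponding number update result is added to it, and the completed invalid space record is placed in a second queue.
S205. Send a metadata update request to the metadata server, where the metadata update request includes the number update result;
This embodiment does not limit the order of S204 and S205.
S206. After receiving the metadata-update-complete message fed back by the metadata server, determine the target number update result included in the message, and add the invalid space records whose number update result is less than or equal to the target number update result to the target queue;
Specifically, the invalid space records whose number update result is less than or equal to the target number update result are selected from the second queue and added to the target queue.
S207. Periodically flush the invalid space records in the target queue to disk.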
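The tid-based promotion in S203 to S206 can be sketched as below. This is an illustrative simplification, not the patented implementation: the names `TaggedRecord` and `TidQueues` are hypothetical, the first queue is folded into `add` for brevity, and the sketch assumes the client assigns tids in strictly increasing order so that the second queue stays sorted by tid.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class TaggedRecord:
    ino: int     # aggregated large file identifier
    offset: int  # offset of the invalid region
    length: int  # length of the invalid region
    tid: int     # updated metadata id attached in S204


class TidQueues:
    """Hypothetical second queue and target queue for Embodiment 2."""

    def __init__(self) -> None:
        self.second: Deque[TaggedRecord] = deque()  # tagged, not yet acknowledged
        self.target: List[TaggedRecord] = []        # acknowledged, awaiting flush
        self._last_tid = 0

    def add(self, ino: int, offset: int, length: int) -> int:
        # S203/S204: increment the metadata id and attach it to the record.
        self._last_tid += 1
        self.second.append(TaggedRecord(ino, offset, length, self._last_tid))
        return self._last_tid

    def on_ack(self, acked_tid: int) -> int:
        # S206: one ack releases every record whose tid <= acked_tid.
        moved = 0
        while self.second and self.second[0].tid <= acked_tid:
            self.target.append(self.second.popleft())
            moved += 1
        return moved
```

Because the queue is ordered by tid, a single acknowledgement can release a whole prefix of records by comparing only the head of the queue, which is one way to read the text's point about fewer number comparisons.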
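In addition, this embodiment provides a process that supplements and merges invalid space by scanning. As shown in FIG. 3, the process specifically includes: loading from disk the previously recorded invalid space of the aggregated large file, and periodically scanning the aggregated large file at object granularity. For each object, scanning starts from the end of the first invalid record; the metadata of the aggregated small file to which the space belongs is obtained from the object's characteristics, and that metadata is used to judge whether the current storage space is invalid. If the current storage space is invalid, an invalid space record is generated and scanning continues with the next data region. If it is not invalid, the current object is skipped and scanning continues with the next object. After the aggregated large file has been scanned, the newly generated invalid space records are merged with the original invalid space records.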
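The final merge step of the scan can be illustrated with a short sketch. The source does not say how records are merged, so the rule used here is an assumption: records in the same aggregated large file that overlap or touch are coalesced into one range. The function name `merge_records` and the tuple layout are likewise hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[int, int, int]  # (large-file id, offset, length), assumed layout


def merge_records(existing: Iterable[Record], scanned: Iterable[Record]) -> List[Record]:
    """Combine old and newly scanned records, coalescing adjacent or overlapping ranges."""
    spans_by_ino: Dict[int, List[Tuple[int, int]]] = {}
    for ino, offset, length in list(existing) + list(scanned):
        spans_by_ino.setdefault(ino, []).append((offset, offset + length))

    merged: List[Record] = []
    for ino in sorted(spans_by_ino):
        spans = sorted(spans_by_ino[ino])
        start, end = spans[0]
        for s, e in spans[1:]:
            if s <= end:
                # Overlaps or touches the current range: extend it.
                end = max(end, e)
            else:
                merged.append((ino, start, end - start))
                start, end = s, e
        merged.append((ino, start, end - start))
    return merged
```

For example, merging the existing records `(1, 0, 10)` and `(1, 30, 5)` with the scanned records `(1, 10, 10)` and `(2, 0, 4)` yields `(1, 0, 20)`, `(1, 30, 5)` and `(2, 0, 4)`.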
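Thus, for aggregated files, the method for processing operation requests for aggregated small files provided in this embodiment combines invalid space recording, batched asynchronous flushing, and a scan-based supplement-and-merge scheme. It improves the modify-write and delete performance for aggregated small files and reduces the load on the metadata cluster, while ensuring that invalid garbage data is recorded completely.
An apparatus for processing operation requests for aggregated small files provided in an embodiment of this application is introduced below. The apparatus described below and the method for processing operation requests for aggregated small files described above correspond to each other and may be read with reference to each other.
As shown in FIG. 4, the apparatus for processing operation requests for aggregated small files in this embodiment includes:
Request receiving module 401: used to receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
Record generation module 402: used to record the storage space of the aggregated small file as invalid space and generate an invalid space record;
Request update module 403: used to execute the operation request and send a metadata update request to the metadata server according to the execution result of the operation request;
Record adding module 404: used to add the invalid space record to the target queue after receiving the metadata-update-complete message fed back by the metadata server;
Record flushing module 405: used to flush the invalid space records in the target queue to disk when the preset time point is reached.
The apparatus in this embodiment is used to implement the aforementioned method for processing operation requests for aggregated small files, so its specific implementation can be found in the method embodiments above. For example, the request receiving module 401, the record generation module 402, the request update module 403, the record adding module 404, and the record flushing module 405 implement steps S101, S102, S103, S104, and S105 of the method respectively. Their specific implementation can therefore refer to the description of the corresponding embodiments and is not expanded on here.
In addition, because the apparatus in this embodiment implements the foregoing method, its effect corresponds to that of the method and is not repeated here.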
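In addition, this application provides a terminal device of a distributed file system, including:
Memory: used to store a computer program;
Processor: used to execute the computer program to implement the steps of the method for processing operation requests for aggregated small files as described above.
Finally, this application provides a readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method for processing operation requests for aggregated small files as described above.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and the parts that are the same or similar across embodiments can be cross-referenced. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The solution provided by this application has been described in detail above. Specific examples are used herein to explain the principle and implementation of this application, and the description of the above embodiments is only intended to help in understanding the method of this application and its core idea. At the same time, those of ordinary skill in the art may, following the idea of this application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting this application.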

Claims (10)

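  1. A method for processing operation requests for aggregated small files, characterized in that it is applied to a client and includes:
    receiving an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
    recording the storage space of the aggregated small file as invalid space and generating an invalid space record;
    executing the operation request and sending a metadata update request to a metadata server according to the execution result of the operation request;
    after receiving the metadata-update-complete message fed back by the metadata server, adding the invalid space record to a target queue;
    when a preset time point is reached, flushing the invalid space records in the target queue to disk.
  2. The method according to claim 1, characterized in that recording the storage space of the aggregated small file as invalid space and generating an invalid space record includes:
    recording the storage space of the aggregated small file as invalid space and generating an invalid space record, where the invalid space record includes the identification information of the aggregated large file corresponding to the aggregated small file, and also the offset and length of the aggregated small file within the aggregated large file.
  3. The method according to claim 2, characterized in that executing the operation request and sending a metadata update request to the metadata server according to the execution result of the operation request includes:
    executing the operation request and updating the number of the metadata of the aggregated small file to obtain a number update result; sending a metadata update request to the metadata server, where the metadata update request includes the number update result;
    correspondingly, adding the invalid space record to the target queue after receiving the metadata-update-complete message fed back by the metadata server includes:
    after receiving the metadata-update-complete message fed back by the metadata server, if the message includes the number update result, adding the invalid space record corresponding to the number update result to the target queue.
  4. The method according to claim 3, characterized in that, after executing the operation request and updating the number of the metadata of the aggregated small file to obtain the number update result, the method further includes:
    adding the number update result to the invalid space record.
  5. The method according to claim 4, characterized in that adding the invalid space record to the target queue if the message includes the number update result, after receiving the metadata-update-complete message fed back by the metadata server, includes:
    after receiving the metadata-update-complete message fed back by the metadata server, determining the target number update result included in the message, and adding the invalid space records whose number update result is less than or equal to the target number update result to the target queue.
  6. The method according to claim 1, characterized in that flushing the invalid space records in the target queue to disk when the preset time point is reached includes:
    periodically flushing the invalid space records in the target queue to disk.
  7. The method according to any one of claims 1-6, characterized in that, before flushing the invalid space records in the target queue to disk when the preset time point is reached, the method further includes:
    based on the invalid space records in the target queue, scanning the aggregated large file at object granularity and determining whether storage space that is not recorded as invalid space is invalid space;
    if so, generating the corresponding invalid space record and adding it to the target queue.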
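  8. An apparatus for processing operation requests for aggregated small files, characterized in that it includes:
    a request receiving module, used to receive an operation request for an aggregated small file, where the operation request is a delete request or a modification request;
    a record generation module, used to record the storage space of the aggregated small file as invalid space and generate an invalid space record;
    a request update module, used to execute the operation request and send a metadata update request to the metadata server according to the execution result of the operation request;
    a record adding module, used to add the invalid space record to the target queue after receiving the metadata-update-complete message fed back by the metadata server;
    a record flushing module, used to flush the invalid space records in the target queue to disk when the preset time point is reached.
  9. A terminal device of a distributed file system, characterized in that it includes:
    a memory, used to store a computer program;
    a processor, used to execute the computer program to implement the steps of the method for processing operation requests for aggregated small files according to any one of claims 1-7.
  10. A readable storage medium, characterized in that a computer program is stored on the readable storage medium, and when executed by a processor the computer program implements the steps of the method for processing operation requests for aggregated small files according to any one of claims 1-7.
PCT/CN2021/073259 2020-05-28 2021-01-22 Method and apparatus for processing operation requests for aggregated small files WO2021238246A1 (zh)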

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010469827.2 2020-05-28
CN202010469827.2A CN111625515A (zh) 2020-05-28 2020-05-28 Method and apparatus for processing operation requests for aggregated small files

Publications (1)

Publication Number Publication Date
WO2021238246A1 (zh) 2021-12-02

Family

ID=72260087

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/073259 WO2021238246A1 (zh) 2020-05-28 2021-01-22 Method and apparatus for processing operation requests for aggregated small files

Country Status (2)

Country Link
CN (1) CN111625515A (zh)
WO (1) WO2021238246A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115576505A (zh) * 2022-12-13 2023-01-06 浪潮电子信息产业股份有限公司 Data storage method, apparatus, device and readable storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
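CN111625515A (zh) * 2020-05-28 2020-09-04 苏州浪潮智能科技有限公司 Method and apparatus for processing operation requests for aggregated small files
CN112148800B (zh) * 2020-10-20 2021-04-27 北京天华星航科技有限公司 Distributed data storage system
CN113704027B (zh) * 2021-10-29 2022-02-18 苏州浪潮智能科技有限公司 File aggregation compatibility method and apparatus, computer device and storage medium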

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095489A (zh) * 2015-08-18 2015-11-25 浪潮(北京)电子信息产业有限公司 Distributed file deletion method, apparatus and system
CN107704203A (zh) * 2017-09-27 2018-02-16 郑州云海信息技术有限公司 Method, apparatus and device for deleting aggregated large files, and computer storage medium
US20190114082A1 (en) * 2017-10-17 2019-04-18 HoneycombData Inc. Coordination Of Compaction In A Distributed Storage System
CN111125034A (zh) * 2019-12-27 2020-05-08 深信服科技股份有限公司 Aggregated object data processing method, system and related device
CN111625515A (zh) * 2020-05-28 2020-09-04 苏州浪潮智能科技有限公司 Method and apparatus for processing operation requests for aggregated small files

Also Published As

Publication number Publication date
CN111625515A (zh) 2020-09-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21812826; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21812826; Country of ref document: EP; Kind code of ref document: A1)