WO2023087231A1 - Directory reading system - Google Patents

Publication number
WO2023087231A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
directory
client
read
entry
Prior art date
Application number
PCT/CN2021/131620
Other languages
French (fr)
Chinese (zh)
Inventor
霍尔德阿比吉特
李楚
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2021/131620 (patent WO2023087231A1)
Priority to CN202180036862.8A (patent CN117280333A)
Publication of WO2023087231A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/18: File system types
    • G06F16/182: Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a query system and apparatus. The query system comprises a client and a storage device. The client sends the storage device a read request to read the directory entries under a target path. In response to the read request, the storage device obtains the directory entries under the target path, compresses some or all of them, and sends the compressed directory entries to the client in a read response. Because the storage device compresses the directory entries before sending them, one read response can carry more entries. This reduces the number of round trips needed to exchange directory entries between the storage device and the client, lowers the delay before the client has the complete set of directory entries, and improves directory reading performance.

Description

A directory reading system
Technical field
The present application relates to the field of computer technology, and in particular to a directory reading system.
Background
A distributed or shared storage system (for example, a network-attached storage (NAS) system or cluster) typically contains multiple NAS devices that provide file-based data storage services to other devices in the network (clients, such as application servers).
A client can access the file system on a NAS device over the network using a data sharing or file sharing protocol such as Network File System (NFS) or Common Internet File System (CIFS). When a NAS client requests the directory entries under a specified path on the NAS device, the NAS device returns the directory entries of that path to the NAS client. This may take multiple round trips between the NAS client and the NAS device, because when the directory entries are large, the NAS device can return only part of the entries to the client each time.
In summary, as the number of entries in a directory grows, reading the directory becomes increasingly costly, and the overall latency of a directory read is high.
Summary of the invention
The present application provides a directory reading system that reduces the latency with which a NAS client obtains directory entries and improves directory reading performance.
Brief description of the drawings
FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a file system on a NAS device provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a method executed by a directory reading system provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of obtaining a file handle provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of exchanging directory entries provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the format of a read response provided by an embodiment of the present application;
FIG. 7 is a schematic comparison of different directory reading methods provided by an embodiment of the present application.
Detailed description
FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application. The system includes a NAS client 10 and a NAS cluster. A NAS cluster usually includes multiple NAS devices 20 (FIG. 1 shows only one NAS device 20, but the present application is not limited to one NAS device 20).
The NAS client 10 is a computing device on the user side and may be a physical machine or a virtual machine. Physical machines include, but are not limited to, desktop computers, servers, laptops, and mobile devices. The NAS client 10 can communicate with any NAS device 20 in the NAS cluster through the network 150. The network 150 generally represents any telecommunications or computer network, including, for example, an enterprise intranet, a wide area network (WAN), a local area network (LAN), a personal area network (PAN), or the Internet.
The NAS device 20 may be a network-connected device with a data storage function on which a file system is deployed (see FIG. 2) to provide file-level sharing services. In other words, the NAS device 20 provides storage resources to different clients 10 through file-level data access and sharing. FIG. 2 is a schematic diagram of the file system deployed in the NAS device 20. The file system is introduced first:
A file system is a structured form of storing and organizing data files. All data in a computer consists of 0s and 1s, and a raw sequence of 0s and 1s stored on a hardware medium is impossible for people to distinguish or manage. The concept of a "file" is therefore used to organize this data: data serving the same purpose is assembled into files of different types according to the structure required by different applications. Different suffixes usually denote different types, and each file is given a name that is easy to understand and remember. When there are many files, they are grouped according to some scheme, and each group is placed in the same directory (also called a folder). Besides files, a directory can also contain lower-level directories (called subdirectories or subfolders), so that all files and directories form a tree structure. This tree structure has a dedicated name: the file system. There are many types of file system; common ones include FAT/FAT32/NTFS on Windows and EXT2/EXT3/EXT4/XFS/BtrFS on Linux. To make lookup easier, the names of the directories, subdirectories, and file encountered when descending level by level from the root node to the file itself are joined with a special character ("\" on Windows/DOS, "/" on Unix-like systems). The resulting string is called a file path, for example "/etc/systemd/system.conf" on Linux or "C:\Windows\System32\taskmgr.exe" on Windows. A path uniquely identifies a specific file to be accessed. For example, D:\data\file.exe on Windows is a file path denoting the file file.exe in the data directory on partition D.
A file system is built on top of a block device. It records not only file paths but also which blocks make up a file and which blocks hold directory/subdirectory information. Different file systems have different organizational structures. For ease of management, a block device such as a hard disk can usually be divided into multiple logical block devices, that is, hard disk partitions. Conversely, because the capacity and performance of a single medium are limited, multiple physical block devices can be combined into one logical block device by technical means such as the various RAID levels or JBOD. A file system can also be built on top of such logical block devices. In any case, an application on the application server does not need to know where on the underlying block device the file it wants to access is located. It only needs to send the file name/ID to the file system, which looks up the file path from that name/ID.
The NAS client 10 can access the file system of the NAS device 20 via the network 150 using various file access protocols (such as NFS, CIFS, or SMB; this embodiment places no limitation on the protocol). For example, the client 10 initiates a remote access request to the file system in the NAS device 20, such as a read-directory request, and sends it to the NAS device 20. In response to the read-directory request, the NAS device 20 obtains the directory entries under the specified path requested by the client 10 and returns them to the client 10.
In one implementation, the NAS client 10 may issue read-directory requests iteratively, with the number of requests depending on the number of directory entries under the specified path and on the configuration of the file access protocol. For example, suppose the specified path contains 1000 directory entries and the file access protocol is configured to return 100 entries at a time. For the first read-directory request from the NAS client 10, the NAS device 20 returns entries 1-100 of the 1000 directory entries. The NAS client 10 then sends a second read-directory request, the NAS device 20 returns entries 101-200, and so on until all entries have been returned.
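The iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol itself: `read_dir_page`, `read_whole_directory`, `PAGE_SIZE`, and the in-memory entry list are hypothetical names invented for this example, and the continuation position is modelled as a plain integer offset.

```python
from typing import List, Tuple

PAGE_SIZE = 100  # assumed protocol setting: entries returned per read response


def read_dir_page(entries: List[str], offset: int) -> Tuple[List[str], int, bool]:
    """Hypothetical device side: return one page, the next offset, and a 'done' flag."""
    page = entries[offset:offset + PAGE_SIZE]
    next_offset = offset + len(page)
    return page, next_offset, next_offset >= len(entries)


def read_whole_directory(entries: List[str]) -> Tuple[List[str], int]:
    """Hypothetical client side: repeat read requests until done; count round trips."""
    result: List[str] = []
    offset, round_trips, done = 0, 0, False
    while not done:
        page, offset, done = read_dir_page(entries, offset)
        result.extend(page)
        round_trips += 1
    return result, round_trips


directory = [f"file_{i:04d}" for i in range(1000)]
listing, trips = read_whole_directory(directory)
print(trips)  # 10 round trips for 1000 entries at 100 per response
```

With 1000 entries and 100 entries per response, the client needs ten request/response pairs, which is the cost the rest of the description sets out to reduce.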
When a remote file system is accessed in this way, frequent calls and round trips may occur, and network latency is high.
To address this, the embodiments of the present application provide a directory reading method. Implementing the method can reduce the amount of data transmitted over the network, reduce the latency of reading a directory, and improve the performance of access to a remote file system.
The following description, given with reference to FIG. 3, takes as an example the directory reading method provided by the embodiments of the present application applied to the system architecture shown in FIG. 1. The NAS client mentioned in the method may be the NAS client 10 in FIG. 1 (or a component of the NAS client 10, such as a processor), and the NAS device may be the NAS device 20 (or a component of the NAS device 20, such as a processor).
FIG. 3 is a schematic flowchart of a directory reading method provided by an embodiment of the present application. As shown in FIG. 3, the method includes the following steps:
Step 301: the NAS client sends a read request to the NAS device. Correspondingly, the NAS device receives the read request sent by the NAS client.
The read request asks to read the directory entries of a specified directory on the NAS device.
Step 302: the NAS device obtains the directory entries of the specified directory.
Step 303: the NAS device compresses some or all of the directory entries of the specified directory.
Step 304: the NAS device sends a read response to the NAS client. The read response includes the data obtained by compressing some or all of the directory entries of the specified directory. Correspondingly, the NAS client receives the read response.
In the above manner, the NAS device compresses the directory entries after obtaining them, so that one read response can carry more entries. This reduces the number of round trips needed to exchange directory entries between the NAS device and the NAS client, reduces the amount of data transmitted, lowers the latency with which the NAS client obtains the complete directory entries from the NAS device, and improves directory entry reading performance.
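Steps 302 to 304 and the matching client-side decompression can be sketched as below. This is only an illustration under stated assumptions: the source does not prescribe a compression algorithm or a wire format, so Python's `zlib` (DEFLATE) stands in for the compressor, entries are serialized as NUL-separated UTF-8 names, and all function names are hypothetical.

```python
import zlib
from typing import List


def encode_entries(names: List[str]) -> bytes:
    """Serialize entry names as NUL-separated UTF-8 (an assumed wire format)."""
    return b"\0".join(name.encode("utf-8") for name in names)


def decode_entries(raw: bytes) -> List[str]:
    """Inverse of encode_entries; an empty payload means an empty directory."""
    return [part.decode("utf-8") for part in raw.split(b"\0")] if raw else []


def nas_device_handle_read(names: List[str]) -> bytes:
    """Steps 302-304: take the obtained entries, compress them, return the payload."""
    return zlib.compress(encode_entries(names))


def nas_client_read(payload: bytes) -> List[str]:
    """Client side: decompress the read response and recover the entries."""
    return decode_entries(zlib.decompress(payload))


names = [f"report_{i:05d}.txt" for i in range(2000)]
payload = nas_device_handle_read(names)
print(len(encode_entries(names)), len(payload))  # raw size vs. compressed size
```

Directory listings tend to contain repetitive names, so the compressed payload is usually much smaller than the raw one; the exact saving depends on the data and on the algorithm chosen.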
The complete flow of the directory reading method of the embodiments of the present application is described below with reference to FIG. 4 and FIG. 5:
As those skilled in the art know, in the NFS protocol a computer device addresses a file or directory through an opaque file handle. In a file system, every file and every directory has its own file handle, which uniquely identifies that file or directory. As an example, the NAS client can therefore use a file handle to indicate the specified directory it is asking to read.
Refer to FIG. 4, which is a schematic flowchart of obtaining a file handle.
Exemplarily, the flow may include:
Step 1: an application on the NAS client calls the opendir() function, passing the path of the specified directory, to request that the directory be opened. The libc library intercepts the opendir() call and looks up the corresponding file handle based on the path of the specified directory. In one possible implementation, the NAS client may have the file handle of the specified directory cached locally, in which case the libc library obtains the file handle directly from the local cache. In another possible implementation (see step 2 in FIG. 4), the NAS client sends the path of the specified directory to the NAS device when the file handle is not found locally. The NAS client may also skip the local lookup and send the path directly to the NAS device. The NAS device determines the file handle of the specified directory and returns it to the NAS client 10, after which the file handle is returned to the libc library of the NAS client.
Step 3: the libc library allocates a file descriptor for the file handle and maintains the mapping between the file handle and the file descriptor. The file descriptor is for use by the application. The libc library returns to the application the file descriptor together with information about the buffer allocated for the application (hereinafter the application buffer). The application buffer information may include a pointer indicating the address of the application buffer and/or information indicating the length of the application buffer. In the present application, the application buffer holds the directory entries of the specified directory that the application requests to read. The application buffer length may be configured by the user or allocated by the NAS client, which is not limited in the embodiments of the present application.
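The descriptor-to-handle mapping kept in step 3 can be pictured with a small table. The sketch below is hypothetical: `HandleTable` and its methods are names invented for illustration, and the source does not say how descriptors are numbered, so counting from 3 is only an assumption that mirrors the usual convention of reserving 0 to 2.

```python
import itertools
from typing import Dict


class HandleTable:
    """Hypothetical table kept by the interposing library: descriptor -> opaque handle."""

    def __init__(self) -> None:
        # Assumption: 0-2 are left for stdin/stdout/stderr, as is conventional.
        self._next_fd = itertools.count(3)
        self._by_fd: Dict[int, bytes] = {}

    def register(self, file_handle: bytes) -> int:
        """Allocate a descriptor for a handle returned by the NAS device."""
        fd = next(self._next_fd)
        self._by_fd[fd] = file_handle
        return fd

    def lookup(self, fd: int) -> bytes:
        """Resolve the descriptor passed by a later readdir() call."""
        return self._by_fd[fd]


table = HandleTable()
fd = table.register(b"\x01\x02opaque-handle")
print(fd, table.lookup(fd))
```

The application only ever sees the integer descriptor; the opaque handle stays inside the library and is attached to the read request on the application's behalf.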
Notably, the libc library here is a new function library provided by the embodiments of the present application. It can intercept the aforementioned opendir() function and the subsequent readdir() function, so that the directory entry query of the embodiments is carried out without changing the original function library, which gives the technical solution of the present application better portability and compatibility.
Refer to FIG. 5, which is a schematic flowchart of querying directory entries. FIG. 5 shows the query flow that follows on from FIG. 4. Exemplarily, the flow may include:
Step 4: on the NAS client, after obtaining the file descriptor of the specified directory and the application buffer information, the application calls the readdir() function to request the directory entries of the specified directory, passing the file descriptor and the application buffer information in the call. Correspondingly, the libc library receives the readdir() call.
Step 5: based on the readdir() call, the NAS client sends a read request to the NAS device. Correspondingly, the NAS device receives the read request.
Exemplarily, sending the read request proceeds as follows. Using the previously maintained mapping between file descriptors and file handles, the libc library determines the file handle corresponding to the file descriptor passed in the readdir() call, and passes that file handle together with the readdir() call to the VFS. The VFS forwards the readdir() call to the corresponding file system client, such as the FS client. Then, in one implementation, the FS client sends the read request to the NAS device 20 through the RPC client. Exemplarily, the read request may be a message in RPC protocol frame format that encapsulates the information passed in the readdir() call. Exemplarily, the read request includes, but is not limited to, one or more of the following parameters: the first storage area length (the allocation buffer length) and a cookie.
1) First storage area length: indicates the size of the storage area (that is, the application buffer) that the NAS client allocated for this read call. The NAS device returns an amount of data that fits this storage area; the length can also be understood as the upper limit on the storage space occupied by the data part of one read response. For example, if the storage area allocated by the NAS client is 64 KB, the NAS device returns no more than 64 KB of data to the NAS client. In the present application, the information in the field of the read request that indicates the first storage area length may be called the first indication information. For ease of description, the first storage area is referred to below as the allocation buffer.
2) cookie: both an input parameter and an output parameter of the NAS device. The cookie is opaque data for the NAS client: the NAS client does not need to parse it and does not care about its content. Its purpose is to mark the position reached so far when the NAS device cannot return the complete directory entries of the specified directory in one read response. The cookie usually contains the start address of the entry that follows the entries already returned (for example, the offset of that entry). After receiving the cookie, the NAS client returns it unchanged to the NAS device, which parses it to determine where to start reading next time. It should be understood that the first read request for a specified directory may carry no cookie or an empty cookie.
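The two request parameters can be sketched as a small record plus a pair of device-side helpers. Everything here is an assumption made for illustration: `ReadRequest`, `make_cookie`, `parse_cookie`, and the 8-byte big-endian cookie layout are hypothetical, since the source only says that the cookie is opaque to the client and usually carries the offset of the next entry.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadRequest:
    buffer_length: int        # first indication information: allocation buffer size in bytes
    cookie: Optional[bytes]   # opaque to the client; None on the first request


def make_cookie(next_offset: int) -> bytes:
    """Device side: encode the offset of the next unread entry (assumed 8-byte layout)."""
    return struct.pack(">Q", next_offset)


def parse_cookie(cookie: Optional[bytes]) -> int:
    """Device side: an absent or empty cookie means 'start from the first entry'."""
    return struct.unpack(">Q", cookie)[0] if cookie else 0


first = ReadRequest(buffer_length=64 * 1024, cookie=None)
cookie_from_device = make_cookie(100)  # would arrive in the first read response
second = ReadRequest(buffer_length=64 * 1024, cookie=cookie_from_device)  # echoed unchanged
print(parse_cookie(first.cookie), parse_cookie(second.cookie))  # 0 100
```

Only the device ever calls `make_cookie` or `parse_cookie`; the client stores the bytes and sends them back untouched, which is what keeps the cookie opaque on the client side.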
In addition, the read request may carry the file handle of the specified directory, or the NAS client may send the file handle of the specified directory to the NAS device separately, in which case the read request does not include the file handle. Note that this is only one optional implementation based on the NFS protocol. The embodiments of the present application are not limited to the NAS client indicating the specified directory by file handle. For example, the NAS client 10 may send the NAS device a read request carrying the path of the specified directory, or may send the read request and the path of the specified directory to the NAS device together. Any way of indicating the specified directory is applicable to the embodiments of the present application.
Step 6: in response to the read request, the NAS device looks up the specified directory according to the file handle sent by the NAS client and obtains the directory entries of the specified directory.
Step 7: the NAS device generates a read response.
The NAS device generates the read response from the obtained directory entries and uses it to return some or all of the directory entries to the NAS client.
As shown in FIG. 6, the read response exemplarily includes a header and a data part, which are described below:
I. Header
The header includes, but is not limited to, one or more of the following:
1) cookie: as described above, both an input parameter and an output parameter of the NAS device. As an input parameter, it lets the NAS device determine the start address of the directory entries to be read. As an output parameter, it lets the NAS device record the start address of the directory entries to be read next time. See the description above for details, which are not repeated here.
2) End-of-file flag (end-of-file): an output parameter of the NAS device that indicates whether the directory entries of the specified directory have all been returned. The NAS device sets this field once all entries of the specified directory have been returned to the NAS client. The end-of-file field may consist of 1 bit whose value indicates whether the returned entries reach the end of the directory entries: for example, 0 means the end has not been reached and 1 means it has. The NAS device therefore normally sets this field in the read response that returns the last directory entry. After receiving a read response indicating that the end of the directory entries has been reached, the NAS client does not send the NAS device 20 any further read request for the same directory.
3) Valid data length (valid data size): an output parameter of the NAS device, giving the length of the valid data carried in the data part of the read response.
4) Second storage area length (shadow buffer size): an output parameter of the NAS device. For ease of description, the second storage area is referred to below as the shadow buffer. The shadow buffer is the storage area on the NAS client that holds the decompressed directory entries. Its size is determined by the compression ratio the NAS device achieved on the directory entries. For example, if the valid data in the read response consists entirely of compressed directory entries, is 64 KB in size, and was compressed at a ratio of 2, the shadow buffer size is 128 KB. In other words, the shadow buffer size equals the size of the valid data in the read response before compression. In the present application, the information in the field of the read response that indicates the shadow buffer length is called the second indication information.
II. Data part
The data part is the valid data output by the NAS device. It carries the directory entries of the specified directory that the NAS client requested to read. The maximum storage space occupied by the data part is the size of the allocation buffer.
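The header fields and data part described above can be modelled as a simple record. This is a sketch under assumptions, not the format of FIG. 6: `ReadResponse`, `build_response`, and `client_decode` are hypothetical names, `zlib` stands in for whatever compressor the device uses, and field widths and ordering are left out because the source does not give them.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ReadResponse:
    cookie: bytes             # where the next read should start
    eof: bool                 # end-of-file flag: True once the last entry is returned
    valid_data_size: int      # length of the valid data in the data part
    shadow_buffer_size: int   # size of that data before compression
    data: bytes               # data part, at most the allocation buffer size


def build_response(raw_entries: bytes, cookie: bytes, eof: bool) -> ReadResponse:
    """Device side: compress the entries and fill in both length fields."""
    compressed = zlib.compress(raw_entries)
    return ReadResponse(cookie, eof, len(compressed), len(raw_entries), compressed)


def client_decode(resp: ReadResponse) -> bytes:
    """Client side: decompress into a buffer of shadow_buffer_size bytes."""
    restored = zlib.decompress(resp.data[:resp.valid_data_size])
    assert len(restored) == resp.shadow_buffer_size
    return restored


raw = b"\0".join(f"entry_{i:06d}".encode() for i in range(5000))
resp = build_response(raw, cookie=b"", eof=True)
print(resp.valid_data_size, resp.shadow_buffer_size)
```

Carrying the pre-compression size in the header lets the client allocate the shadow buffer before it starts decompressing, which matches the 64 KB to 128 KB example at a compression ratio of 2.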
Note that the read request and read response described above are only illustrative. The embodiments of the present application do not limit the parameters or format of the read request or the read response. In practice, for example, the read response may also include an entry count (the number of entries carried in the data part), and when the specified directory has few directory entries and no compression is needed, the read response may omit the second indication information. In addition, the parameter names above are examples chosen for ease of description. They may differ in other scenarios, which is likewise not limited in the embodiments of the present application.
In the present application, the NAS device adopts different compression strategies depending on the size of the allocation buffer and the size of the directory entries of the specified directory. A compression strategy is the way in which directory entries are filled into the read response. The compression strategies are full compression, partial compression, and no compression.
Exemplarily, the NAS device determines the size of the allocation buffer from the first indication information in the read request, obtains the size of the complete directory entries of the specified directory, and compares the two. If the size of the complete directory entries is not greater than the size of the allocation buffer, the read response can be generated without compression. If the size of the complete directory entries is greater than the size of the allocation buffer, the read response can be generated with full or partial compression.
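The comparison described above reduces to a single size check. The sketch below uses hypothetical names (`Strategy`, `choose_strategy`) and does not try to distinguish full from partial compression, because the source settles that choice in the later scenarios rather than at this point.

```python
from enum import Enum


class Strategy(Enum):
    NO_COMPRESSION = "no compression"
    COMPRESS = "full or partial compression"


def choose_strategy(total_entry_bytes: int, allocation_buffer_bytes: int) -> Strategy:
    """Compare the complete directory entries against the allocation buffer size."""
    if total_entry_bytes <= allocation_buffer_bytes:
        return Strategy.NO_COMPRESSION
    return Strategy.COMPRESS


BUFFER = 64 * 1024
print(choose_strategy(48 * 1024, BUFFER).value)   # no compression
print(choose_strategy(200 * 1024, BUFFER).value)  # full or partial compression
```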
如下列举几个具体的场景对上述压缩策略进行详细介绍,下文中为便于说明,均以分配的分配缓冲区大小为64KB为例予以说明:A few specific scenarios are listed below to introduce the above compression strategy in detail. For the convenience of explanation, the allocated allocation buffer size is 64KB as an example to illustrate:
Scenario 1: partial compression / full compression
If the complete directory entries of the specified directory are larger than the allocation buffer, the NAS device can fill the data part of the read response through one or more rounds of compression. For ease of understanding, the data part is referred to below as the allocation buffer:
(1) First compression:
The NAS device takes directory entries sequentially from the head of the complete directory entries until the allocation buffer of the read response is full. Assuming the allocation buffer is 64KB, the entries filled in this first round also total 64KB (for example, the entries occupying the 0 to 64KB range of the complete directory entries). After filling, the NAS device compresses these 64KB of directory entries. Assuming a compression ratio of 2, the compressed entries occupy 32KB.
Here, the compression ratio is CR = So / Sc, where So is the data size before compression and Sc is the data size after compression. In the example above, the entries are 64KB before compression, the compression ratio is 2, and the entries are 32KB after compression. It should be understood that the compression ratio depends on both the compression algorithm and the data being compressed. The embodiments of this application do not limit the compression algorithm: it may be, for example, the Shannon-Fano algorithm, Huffman coding, arithmetic coding, or LZ77/LZ78 coding. Any existing algorithm that can compress data, as well as compression algorithms that may be applied in the future, is applicable.
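The ratio CR = So / Sc can be measured for a given buffer as in the sketch below. It assumes Python's standard `zlib` module (DEFLATE, which combines LZ77 with Huffman coding) purely as a stand-in compressor, since the application does not mandate an algorithm; the function name `compression_ratio` is illustrative.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return CR = So / Sc for one buffer, using zlib as a stand-in codec."""
    compressed = zlib.compress(data)
    return len(data) / len(compressed)


# Repetitive directory-entry-like data compresses well, so CR is above 1.
sample = b"".join(b"report_%05d.txt\n" % i for i in range(4000))
ratio = compression_ratio(sample)
```

The value obtained depends on both the algorithm and the data, as the text notes, so the ratio of 2 used in the running example is only an assumption for the arithmetic.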
(2) The first compression yields a compressed data block (chunk), referred to below as a compressed block; the first one can be called compressed block 1. As described above, compressed block 1 is 32KB. Because the allocation buffer is 64KB, the space remaining in it is close to, but slightly less than, 32KB: the NAS device may also generate header information for compressed block 1, and that header occupies part of the allocation buffer. It should be understood that the NAS device can generate header information for every compressed block, not only for compressed block 1. For ease of description, the headers are ignored here and the remaining space is assumed to be exactly 32KB.
For example, if the directory entries that have not yet been filled (those from the 64KB position of the complete directory entries to the end of the directory) fit entirely in the remaining space of the allocation buffer, they do not need to be compressed and can be placed directly in the allocation buffer of the read response.
As another example, if the unfilled directory entries are larger than the remaining space of the allocation buffer, the NAS device continues from the position where the previous selection ended, sequentially selects directory entries equal in size to the remaining space, and fills the allocation buffer again. If the remaining space is 32KB, it takes the next 32KB (for example, the entries occupying the 64KB to 96KB range of the complete directory entries), fills them into the remaining space, and compresses these newly filled 32KB to obtain a new compressed block, called compressed block 2. Assuming compressed block 2 is 20KB, the space remaining in the allocation buffer after this round is 12KB (64KB - 32KB - 20KB). The NAS device then checks whether the remaining directory entries exceed 12KB. If they do not, it fills them directly into the allocation buffer without compression. If they do, it takes the next 12KB of entries, fills them, compresses them, and so on until an end-of-filling condition is met.
The end-of-filling conditions include, but are not limited to: (1) The allocation buffer has no remaining space, that is, it has been filled to its end. (2) The remaining space in the allocation buffer has fallen to a set threshold (referred to as the first threshold). For example, with a first threshold of 2KB and a 64KB allocation buffer, filling and compression can stop once the effective data length reaches 62KB. Note that the 62KB here refers to data after compression; before compression, the allocation buffer can still be filled completely, as in the example above. (3) No directory entries remain, that is, all of the complete directory entries of the specified directory have been filled.
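The fill-compress-refill loop and its three end conditions can be sketched as follows. This is a simplified illustration under stated assumptions: `zlib` stands in for the unspecified compression algorithm, per-block headers are ignored (as in the running example), the directory entries are treated as one byte string, and the fallback for incompressible data is an addition not described in the application. The name `fill_response` is hypothetical.

```python
import zlib


def fill_response(entries: bytes, buf_size: int, threshold: int = 0):
    """Fill one read response; return (blocks, consumed).

    blocks is a list of (is_compressed, payload) tuples and consumed is the
    number of uncompressed entry bytes carried by this response.
    """
    blocks, used, pos = [], 0, 0
    while pos < len(entries):              # end condition 3: nothing left
        free = buf_size - used
        if free <= threshold:              # end conditions 1 and 2
            break
        remaining = len(entries) - pos
        if remaining <= free:
            # The tail fits as-is, so it is stored without compression.
            blocks.append((False, entries[pos:]))
            pos = len(entries)
            break
        raw = entries[pos:pos + free]      # fill the free space completely
        packed = zlib.compress(raw)
        if len(packed) >= len(raw):
            # Assumption: incompressible data is stored raw, filling the buffer.
            blocks.append((False, raw))
            pos += len(raw)
            break
        blocks.append((True, packed))      # compression frees space for more
        used += len(packed)
        pos += free
    return blocks, pos
```

A caller would use `consumed` to derive the cookie for the next read request when entries remain.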
At this point the data part (allocation buffer) of the read response has been filled. The NAS device also needs to generate the related parameters in the read response header, such as the effective data length and the shadow buffer length. The effective data length is the storage space occupied by the data in the data part of the read response; when the allocation buffer is completely full, the effective data length equals the length of the allocation buffer, for example 64KB.
As described above, the shadow buffer is the buffer in the NAS client that holds the decompressed directory entries, so its length is the actual length, before compression, of all the directory entries that the NAS device filled into the allocation buffer. The shadow buffer length therefore depends on the compression ratio and the effective data length. Note that the allocation buffer may contain uncompressed data, so: shadow buffer length = length of compressed data * compression ratio + length of uncompressed data, where length of compressed data + length of uncompressed data = effective data length. For example, if the effective data length is 64KB, of which 32KB is compressed data and 32KB is uncompressed data, and the compression ratio is 2, then the shadow buffer length = 32KB*2 + 32KB = 96KB. As another example, if the effective data length is 64KB and all of it is compressed data, the shadow buffer length is 64KB*2 = 128KB.
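The two worked examples above can be checked with a short calculation. The function name `shadow_buffer_length` is illustrative; a real server would more likely sum the exact pre-compression sizes it already knows rather than multiply by an average ratio, which is an assumption on our part.

```python
KB = 1024


def shadow_buffer_length(compressed_bytes: int,
                         uncompressed_bytes: int,
                         compression_ratio: float) -> int:
    """Shadow length = compressed length * ratio + uncompressed length."""
    return int(compressed_bytes * compression_ratio) + uncompressed_bytes


partial = shadow_buffer_length(32 * KB, 32 * KB, 2)  # 96KB: partial compression
full = shadow_buffer_length(64 * KB, 0, 2)           # 128KB: full compression
```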
The read response is now complete.
Scenario 2: no compression
If the complete directory entries of the specified directory are no larger than the allocation buffer, for example when there are only a few entries, the NAS device can place them directly in the allocation buffer of the read response. A single read response then returns all the directory entries the user requested, and the NAS device does not need to compress them.
In summary: under the partial compression strategy, the allocation buffer contains both compressed and uncompressed data; under the full compression strategy, all directory entries in the allocation buffer are compressed; under the no-compression strategy, the allocation buffer contains only the original, uncompressed directory entries.
Under the full or partial compression strategy, the data part of the read response may contain multiple data blocks, including compressed and uncompressed blocks. The embodiments of this application can generate a compression header for each data block. The compression header includes indication information (referred to as third indication information) that indicates whether the data in the block is compressed; the NAS client can use it to determine whether the data in the block needs to be decompressed. Optionally, the compression header may also include other information, such as the compression algorithm, verification information (for example a checksum), and block length information.
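One possible encoding of such a per-block compression header is sketched below. The field widths, their order, the algorithm identifiers, and the use of CRC32 as the checksum are all assumptions made for illustration; the application only lists the kinds of information the header may carry.

```python
import struct
import zlib

# Hypothetical layout: compressed flag (1 byte), algorithm id (1 byte),
# payload length (4 bytes), CRC32 of the payload (4 bytes), network order.
HEADER = struct.Struct("!BBII")
ALGO_NONE, ALGO_ZLIB = 0, 1


def pack_block(payload: bytes, compressed: bool) -> bytes:
    """Prefix a data block with its compression header."""
    algo = ALGO_ZLIB if compressed else ALGO_NONE
    header = HEADER.pack(int(compressed), algo, len(payload), zlib.crc32(payload))
    return header + payload


def unpack_block(buf: bytes, offset: int = 0):
    """Return (is_compressed, algo, payload, next_offset); verify the checksum."""
    flag, algo, length, crc = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    payload = buf[start:start + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        raise ValueError("corrupt or truncated data block")
    return bool(flag), algo, payload, start + length
```

Carrying the block length in the header lets the client walk the data part block by block without any external index.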
Step 8: the NAS device sends the read response to the NAS client; correspondingly, the NAS client receives the read response sent by the NAS device.
Step 9: the data abstraction layer of the NAS client checks whether the read response contains a compressed block. If so, it allocates a shadow buffer to hold the decompressed data. This process may include: the data abstraction layer determines the shadow buffer length from the corresponding field of the read response header, allocates a shadow buffer of that length for the read response, and then writes a pointer to the shadow buffer's address into the corresponding field of the read response header. This field may be reserved by the NAS device or generated by the NAS client itself.
The data abstraction layer then passes the read response to the decompression module, which decompresses the data in it. Alternatively, the data abstraction layer can perform the decompression itself. The data abstraction layer and the decompression module may each be a software module, a hardware module, or a combination of the two; this application does not limit this. If the data abstraction layer decompresses directly, the field indicating the shadow buffer pointer need not be added to the read response; the data abstraction layer only has to maintain the correspondence between the shadow buffer pointer and the read response.
The NAS client can determine from the compression header of each data block whether that block needs to be decompressed. For example, when the third indication information indicates that the data in the block is compressed, the NAS client determines that the data needs to be decompressed. Optionally, when the compression header includes compression information such as the compression algorithm and a checksum, the NAS client can use the corresponding decompression algorithm to decompress the data in the block, or use the checksum to verify the data first and decompress the block only after confirming that the data is complete and correct.
Decompression strategies include serial decompression and parallel decompression. In serial decompression, the NAS client takes one compressed block at a time, in the order in which the blocks are arranged, decompresses its data to obtain multiple directory entries, and fills them sequentially into the shadow buffer. When one compressed block has been decompressed, the client takes the next compressed block from the read response, decompresses it, and fills the resulting directory entries into the shadow buffer, and so on until the directory entries of all data blocks have been filled into the shadow buffer. Note that some data blocks are uncompressed and do not need to be decompressed.
In parallel decompression, the NAS client takes multiple compressed blocks at the same time, decompresses them, and fills the decompressed directory entries into the shadow buffer in the order of the data blocks, so that the directory entries in the shadow buffer keep their actual order.
If the third indication information indicates that the data in a data block is uncompressed, the NAS client places the data of that block directly into the shadow buffer.
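The serial and parallel strategies can be sketched on the client side as follows. The sketch assumes each block has already been parsed into an `(is_compressed, payload)` pair, that `zlib` is the agreed algorithm, and that a thread pool is an acceptable way to decompress blocks concurrently; none of these choices comes from the application, and the function names are illustrative.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def _expand(block):
    """Decompress a block if its flag says so; otherwise pass it through."""
    is_compressed, payload = block
    return zlib.decompress(payload) if is_compressed else payload


def decompress_serial(blocks, shadow_len: int) -> bytes:
    """Expand blocks one at a time, in order, into a shadow buffer."""
    shadow = bytearray(shadow_len)
    pos = 0
    for block in blocks:
        data = _expand(block)
        if pos + len(data) > shadow_len:
            raise ValueError("shadow buffer too small")
        shadow[pos:pos + len(data)] = data
        pos += len(data)
    return bytes(shadow[:pos])


def decompress_parallel(blocks, shadow_len: int) -> bytes:
    """Expand blocks concurrently, then copy them in block order."""
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(_expand, blocks))  # map() keeps input order
    return decompress_serial([(False, part) for part in parts], shadow_len)
```

Both functions produce the same shadow buffer contents; the parallel variant preserves entry order because the results are copied in block order regardless of which block finishes first.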
Referring again to FIG. 6, the shadow buffer stores the decompressed directory entries in order. For example, each directory entry includes, but is not limited to, one or more of the following: entry type (D_type), entry inode (D_INO), entry length (D_RECLEN), entry offset (D_OFFSET), entry name (D_NAME), and so on. It should be noted that the information listed and its ordering are only examples; the embodiments of this application do not limit them.
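To make the entry fields concrete, the sketch below serializes and parses variable-length entries in a shadow buffer. The binary layout (8-byte inode, 8-byte offset, 2-byte record length, 1-byte type, then a UTF-8 name) is purely hypothetical; the application names the fields but does not define their widths or order.

```python
import struct
from dataclasses import dataclass

# Hypothetical fixed part: D_INO, D_OFFSET, D_RECLEN, D_type (network order).
FIXED = struct.Struct("!QQHB")


@dataclass
class DirEntry:
    d_ino: int
    d_offset: int
    d_type: int
    d_name: str

    def pack(self) -> bytes:
        name = self.d_name.encode("utf-8")
        reclen = FIXED.size + len(name)  # D_RECLEN covers the whole record
        return FIXED.pack(self.d_ino, self.d_offset, reclen, self.d_type) + name


def parse_entries(shadow: bytes):
    """Walk the shadow buffer record by record using D_RECLEN."""
    entries, pos = [], 0
    while pos < len(shadow):
        ino, offset, reclen, dtype = FIXED.unpack_from(shadow, pos)
        if reclen < FIXED.size:
            raise ValueError("invalid record length")
        name = shadow[pos + FIXED.size:pos + reclen].decode("utf-8")
        entries.append(DirEntry(ino, offset, dtype, name))
        pos += reclen
    return entries
```

The record length is what allows names of different sizes to be packed back to back and still be read sequentially.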
Step 10: return the directory entries to the application.
The directory entries in the shadow buffer are written sequentially into the application buffer.
FIG. 7 is a schematic diagram of the directory-reading interaction without compression and with compression. Without compression, the NAS client and the NAS server may need many round trips to exchange the complete directory entries, which takes a long time and consumes a large amount of network bandwidth. The compressed approach is that of the embodiments of this application: the NAS device compresses the directory entries before sending them to the NAS client. Experiments by those skilled in the art show that this can yield a performance gain of about 2x under a fairly large workload (about one million directory entries per directory). Standard compression techniques such as LZ4 can provide a compression ratio of about 2, and compressing and decompressing a 64KB buffer takes less than 100μs, an overhead of less than 10% of the total response time of a readdir() request. The directory-reading approach of the embodiments of this application can therefore significantly reduce the number of round trips, shorten latency, and improve directory-reading performance.
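The effect on round trips can be estimated with a rough calculation. The sketch assumes that, with repeated fill-and-compress rounds, one response carries at most about buffer size times compression ratio of uncompressed entries, and it assumes an average entry size of 32 bytes; both figures are illustrative assumptions, not measurements from the application.

```python
import math


def round_trips(total_entry_bytes: int, buffer_bytes: int,
                compression_ratio: float = 1.0) -> int:
    """Estimate read responses needed to return a whole directory."""
    if total_entry_bytes == 0:
        return 1  # an empty directory still costs one request/response
    per_response = buffer_bytes * compression_ratio
    return math.ceil(total_entry_bytes / per_response)


total = 1_000_000 * 32                    # one million entries, 32 bytes each
plain = round_trips(total, 64 * 1024)     # without compression
packed = round_trips(total, 64 * 1024, 2) # with a compression ratio of 2
```

Under these assumptions the estimate is 489 round trips without compression and 245 with a ratio of 2, which is in line with the roughly 2x gain reported above.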

Claims (7)

  1. A query system, characterized in that it comprises a client and a storage device:
    the client is configured to send a read request to the storage device, the read request being used to request reading of the directory entries of a target path;
    the storage device is configured to obtain the directory entries of the target path in response to the read request;
    the storage device is further configured to send a read response to the client, the read response comprising a data part, the data part comprising data obtained by the storage device compressing some or all of the directory entries;
    the client is configured to receive the read response sent by the storage device.
  2. The system according to claim 1, characterized in that the client is further configured to decompress the compressed data in the read response to obtain some or all of the directory entries.
  3. The system according to claim 1 or 2, characterized in that the read request comprises first indication information, or the read request comprises the first indication information and a cookie;
    the first indication information is used to indicate the maximum size of the storage area occupied by the data part in the read response; the cookie is used to indicate the starting address of the entries to be read among the directory entries of the specified directory.
  4. The system according to claim 3, characterized in that the data part in the read response comprises at least a first data block and a second data block;
    the data in the first data block is obtained by the storage device compressing first data, the first data comprising a first portion of the directory entries; the size of the first portion of entries is not larger than the maximum size of the storage area occupied by the data part;
    the data in the second data block is obtained by the storage device compressing second data, the second data comprising a second portion of the directory entries, the second portion following the first portion; or the second data is the entries remaining in the directory entries other than the first data.
  5. The system according to claim 4, characterized in that the first data block comprises a block header; the block header comprises compression information of the data in the first data block, the compression information comprising one or more of the following parameters:
    third indication information used to indicate whether the data in the first data block is compressed, a compression algorithm, and a checksum.
  6. The system according to any one of claims 1 to 5, characterized in that the read response further comprises a header, the header comprising one or more of the following:
    second indication information, an end-of-file flag, a cookie, and an effective data length;
    wherein the second indication information is used to indicate the size of a storage area, the size being determined by the storage device according to the total length, before compression, of the data part of the read response; the storage area is used to hold some or all of the directory entries obtained after the client decompresses the data in the data part of the read response;
    the end-of-file flag is used to indicate whether the directory entries of the specified directory have been completely returned;
    the effective data length is used to indicate the length of the data part in the read response.
  7. The system according to claim 6, characterized in that the client is further configured to determine the storage area according to the second indication information, and to store in the storage area some or all of the directory entries obtained by decompression based on the read response.
PCT/CN2021/131620 2021-11-19 2021-11-19 Directory reading system WO2023087231A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/131620 WO2023087231A1 (en) 2021-11-19 2021-11-19 Directory reading system
CN202180036862.8A CN117280333A (en) 2021-11-19 2021-11-19 Catalog reading system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/131620 WO2023087231A1 (en) 2021-11-19 2021-11-19 Directory reading system

Publications (1)

Publication Number Publication Date
WO2023087231A1 true WO2023087231A1 (en) 2023-05-25

Family

ID=86396140

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131620 WO2023087231A1 (en) 2021-11-19 2021-11-19 Directory reading system

Country Status (2)

Country Link
CN (1) CN117280333A (en)
WO (1) WO2023087231A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1570886A (en) * 2003-07-21 2005-01-26 万国电脑股份有限公司 Memory unit capable of improving transmission speed
CN1574795A (en) * 2002-11-20 2005-02-02 微软公司 System and method for using packed compressed buffers for improved client server communication
CN101196929A (en) * 2007-12-29 2008-06-11 中国科学院计算技术研究所 Metadata management method for splitting name space
US20150339314A1 (en) * 2014-05-25 2015-11-26 Brian James Collins Compaction mechanism for file system
CN111209259A (en) * 2018-11-22 2020-05-29 杭州海康威视系统技术有限公司 NAS distributed file system and data processing method

Also Published As

Publication number Publication date
CN117280333A (en) 2023-12-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21964387

Country of ref document: EP

Kind code of ref document: A1