CN118625993A - Data storage method and device - Google Patents
- Publication number
- CN118625993A (application CN202310245340.XA)
- Authority
- CN
- China
- Prior art keywords
- file
- inode
- pba
- data
- store
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/06—Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
- G06F12/0615—Address space extension
- G06F12/063—Address space extension for I/O modules, e.g. memory mapped I/O
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0643—Management of files
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/065—Replication mechanisms
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical Field
The present application relates to the field of data storage technologies, and in particular to a data storage method and apparatus.
Background Art
Currently, applications on a mobile phone can store data in cloud storage. Cloud storage may use a differential technique: a file is divided into multiple chunks, a hash value is calculated for each chunk to identify the chunks that have been modified, and only the modified chunks are uploaded to the cloud for storage.
The differential technique reduces the number of network input/output (I/O) operations. However, this method consumes a large amount of central processing unit (CPU) resources.
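The differential technique described above can be sketched in Python as follows. This is a minimal illustration rather than the scheme used by any particular cloud service: the fixed 4096-byte chunk size, the choice of SHA-256, and the function names `chunk_hashes` and `modified_chunks` are all assumptions made for the example. It shows why the CPU cost is high: every chunk of the file is hashed, even when only one chunk has changed.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size; the source does not specify one


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks and hash every chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def modified_chunks(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return the indices of chunks in `new` that differ from `old`."""
    old_h = chunk_hashes(old, chunk_size)
    new_h = chunk_hashes(new, chunk_size)
    return [
        i for i, h in enumerate(new_h)
        if i >= len(old_h) or h != old_h[i]
    ]


# Three chunks; only the middle one is changed.
old = b"a" * 4096 + b"b" * 4096 + b"c" * 4096
new = b"a" * 4096 + b"X" * 4096 + b"c" * 4096
changed = modified_chunks(old, new)  # only chunk index 1 needs uploading
```

Only the chunks listed in `changed` would be uploaded, which saves network I/O, but both versions of the file are hashed in full to find them.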
Summary of the Invention
The present application provides a data storage method and apparatus that reduce the number of network I/O operations while also lowering CPU resource overhead.
To achieve the above objective, the present application adopts the following technical solutions:
In a first aspect, a data storage method is provided, applied to a first device. The method includes: in response to a data storage request for a first file, obtaining an index node (inode) corresponding to the first file, where the inode includes a first inode and a second inode, the first inode is used to store a first physical block address (PBA), the first PBA is the PBA corresponding to unupdated data of the first file, the second inode is used to store a second PBA, and the second PBA is the PBA corresponding to updated data of the first file; determining the updated data according to the second PBA; and sending the updated data to a second device.
Based on the above technical solution, after a data storage request for a file is received, the inode corresponding to the file is obtained. The inode has a double-layer structure: one inode stores the PBAs corresponding to the file's unupdated data, and the other inode stores the PBAs corresponding to the file's updated data. The updated data can therefore be determined directly from the PBAs that correspond to the updated data. In other words, a simple comparison is enough to identify the updated data, and no compute-intensive hash calculation is needed, which lowers CPU resource overhead. Sending only the updated data to cloud storage then reduces the number of network I/O operations and saves network bandwidth.
In a possible design, the second inode is further used to store a third PBA, and the third PBA indicates that the PBA corresponding to the data of the first file is stored in the first inode. Based on this design, when a file uses a double-layer inode, the second inode stores both second PBAs and third PBAs. A second PBA shows that the real PBA of the data is located in the second inode, and a third PBA shows that the real PBA of the data is located in the first inode, which makes it possible to address the data.
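As an illustration of this addressing rule, the following Python sketch models the two inodes as dictionaries keyed by logical block address (LBA). The sentinel value `IN_FIRST_INODE`, the dictionary layout, and the function names are assumptions made for the example; the application does not specify how the third PBA is encoded.

```python
IN_FIRST_INODE = -1  # hypothetical encoding of the "third PBA" marker


def resolve_pba(lba: int, first_inode: dict, second_inode: dict) -> int:
    """Return the real PBA for a logical block of the file."""
    entry = second_inode.get(lba, IN_FIRST_INODE)
    if entry == IN_FIRST_INODE:
        # Third PBA: the real address is stored in the first inode.
        return first_inode[lba]
    # Second PBA: the entry itself is the address of updated data.
    return entry


def updated_lbas(second_inode: dict) -> list:
    """List the logical blocks whose data has been updated."""
    return sorted(
        lba for lba, pba in second_inode.items() if pba != IN_FIRST_INODE
    )


# Example values only: block 2 was rewritten to physical block 300.
first_inode = {1: 100, 2: 101, 3: 201}
second_inode = {1: IN_FIRST_INODE, 2: 300, 3: IN_FIRST_INODE}
```

Finding the updated blocks is a scan of the second inode that compares each entry against the marker, so no file data is read and no hash is computed.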
In a possible design, before the inode corresponding to the first file is obtained in response to the data storage request for the first file, the method further includes: creating a second file that contains no data, where the second file corresponds to a third inode and the third inode includes a fourth PBA; generating the second inode according to the third inode, the fourth PBA, and the third PBA; and generating the inode corresponding to the first file according to the second inode and the first inode corresponding to the first file. Based on this design, in some scenarios an application runs in user mode while inodes reside in kernel mode, so the application cannot create an inode directly. Instead, an inode can be created by creating an empty file, which allows the file's single-layer inode structure to be converted into a double-layer inode structure.
In a possible design, before the updated data is sent to the second device, the method further includes: creating a third file that contains no data, where the third file corresponds to a fourth inode and the fourth inode includes a fourth PBA; setting the fourth PBA to the third PBA; and generating a snapshot of the first file according to the third file, the fourth inode, and the first inode. Based on this design, the third file shares the first inode with the first file, and the first inode stores the PBAs corresponding to the unupdated data of the first file. The first inode is therefore read-only from the point of view of upper-layer applications, so the third file becomes a snapshot of the first file. Apart from the overhead of creating the third file, this snapshot scheme neither reads nor writes the data of the first file. It creates the file snapshot with zero copy, reduces the number of write I/O operations generated, and can improve user experience.
In a possible design, the first file is in a first state, and the first state indicates that the first file has been modified. Before the updated data is sent to the second device, the method further includes: saving the second PBA to the first inode; and setting the state of the first file to a second state, where the second state indicates that the first file has not been modified. Based on this design, a file may be modified multiple times. If the PBAs of the data updated this time and the PBAs of the data updated last time were all kept in the second inode, the difference between the data after this update and the data after the previous update could not be distinguished. With the state transition, it is possible to determine whether the file is currently modified, and the modified locations can then be found simply by traversing the PBAs in the second inode. This not only avoids compute-intensive hash calculation and lowers CPU resource overhead, but also distinguishes the data after this update from the data after the previous update.
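The state transition can be sketched as follows. This is a simplified model under stated assumptions, not the claimed implementation: the class `TwoLayerFile`, its method names, the string state values, and the `IN_FIRST_INODE` marker are hypothetical. In particular, resetting the second-inode entries to the marker after a commit is an assumption added here so that the next round of updates can be told apart from this one.

```python
from dataclasses import dataclass, field

IN_FIRST_INODE = -1  # hypothetical encoding of the "third PBA" marker
MODIFIED, UNMODIFIED = "modified", "unmodified"  # first and second state


@dataclass
class TwoLayerFile:
    first_inode: dict                                  # LBA -> PBA, unupdated data
    second_inode: dict = field(default_factory=dict)   # LBA -> PBA or marker
    state: str = UNMODIFIED

    def write(self, lba: int, new_pba: int) -> None:
        """Record that a logical block now lives at a new physical block."""
        self.second_inode[lba] = new_pba
        self.state = MODIFIED

    def commit(self) -> dict:
        """Save the second PBAs to the first inode and mark the file unmodified."""
        changed = {
            lba: pba for lba, pba in self.second_inode.items()
            if pba != IN_FIRST_INODE
        }
        self.first_inode.update(changed)
        for lba in changed:
            # Assumption: entries fall back to the marker after the commit.
            self.second_inode[lba] = IN_FIRST_INODE
        self.state = UNMODIFIED
        return changed


f = TwoLayerFile(first_inode={1: 100, 2: 101})
f.write(2, 300)
first_round = f.commit()   # blocks changed in the first update
f.write(1, 400)
second_round = f.commit()  # blocks changed in the second update only
```

Each call to `commit` returns only the blocks changed since the previous call, which is the distinction between successive updates that the paragraph above describes.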
In a possible design, determining the updated data according to the second PBA includes: determining the updated data according to the second PBA saved in the first inode corresponding to the snapshot of the first file. Based on this design, the updated data in the snapshot of the file can be uploaded to the cloud. This avoids the consistency problems that arise when the file is changed during the upload, while still reducing the number of network I/O operations and saving network bandwidth.
In a second aspect, a data storage apparatus is provided, applied to a first device. The apparatus includes a processing unit and a communication unit. The processing unit is configured to: in response to a data storage request for a first file, obtain an inode corresponding to the first file, where the inode includes a first inode and a second inode, the first inode is used to store a first PBA, the first PBA is the PBA corresponding to unupdated data of the first file, the second inode is used to store a second PBA, and the second PBA is the PBA corresponding to updated data of the first file; and determine the updated data according to the second PBA. The communication unit is configured to send the updated data to a second device.
In a possible design, the second inode is further used to store a third PBA, and the third PBA indicates that the PBA corresponding to the data of the first file is stored in the first inode.
In a possible design, the processing unit is further configured to: create a second file that contains no data, where the second file corresponds to a third inode and the third inode includes a fourth PBA; generate the second inode according to the third inode, the fourth PBA, and the third PBA; and generate the inode corresponding to the first file according to the second inode and the first inode corresponding to the first file.
In a possible design, the processing unit is further configured to: create a third file that contains no data, where the third file corresponds to a fourth inode and the fourth inode includes a fourth PBA; set the fourth PBA to the third PBA; and generate a snapshot of the first file according to the third file, the fourth inode, and the first inode.
In a possible design, the first file is in a first state, and the first state indicates that the first file has been modified. The processing unit is further configured to: save the second PBA to the first inode; and set the state of the first file to a second state, where the second state indicates that the first file has not been modified.
In a possible design, the processing unit is specifically configured to determine the updated data according to the second PBA saved in the first inode corresponding to the snapshot of the first file.
In a third aspect, a device is provided, including a processor, a memory, and a communication interface. The memory and the communication interface are coupled to the processor; the communication interface is used to communicate with other apparatuses; the memory is used to store computer program code that includes computer instructions; and the processor reads the computer instructions from the memory so that the device performs the method according to the first aspect or any design thereof.
Optionally, the memory may be coupled to the processor or may be independent of the processor.
Exemplarily, the communication interface may be a transceiver, an input/output interface, an interface circuit, an output circuit, an input circuit, a pin, a related circuit, or the like.
In a possible design, the device further includes a display screen, which the device can use to perform display operations.
In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium includes a computer program, and when the computer program runs on a device, the device performs the method according to the first aspect or any design thereof.
In a fifth aspect, a computer program product is provided. When the computer program product runs on a computer, the computer performs the method according to the first aspect or any design thereof.
In a sixth aspect, a chip system is provided, including at least one processor and at least one interface circuit. The at least one interface circuit is used to perform transceiver functions and to send instructions to the at least one processor. When the at least one processor executes the instructions, the at least one processor performs the method according to the first aspect or any design thereof.
In a seventh aspect, a communication system is provided, including a first device and a second device. The first device and the second device interact to implement the method provided in the embodiments of the present application.
It should be noted that, for the technical effects of any design in the second to seventh aspects, reference may be made to the technical effects of the corresponding design in the first aspect. Details are not repeated here.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of determining the physical blocks of a file on a disk according to a mapping relationship between LBAs and PBAs, according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a file system according to an embodiment of the present application;
FIG. 3 is a schematic architectural diagram of a communication system according to an embodiment of the present application;
FIG. 4 is a schematic architectural diagram of a centralized storage system according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the hardware structure of a host according to an embodiment of the present application;
FIG. 6 is a block diagram of the software structure of a host according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a double-layer inode structure according to an embodiment of the present application;
FIG. 8 is a schematic diagram of the PBAs saved in a double-layer inode structure according to an embodiment of the present application;
FIG. 9 is a schematic diagram of the mapping relationship between LBAs and PBAs saved in a double-layer inode structure according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a process of converting a single-layer inode structure into a double-layer inode structure according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a process of creating a file snapshot according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a file state transition process according to an embodiment of the present application;
FIG. 13 is a schematic flowchart of a data storage method according to an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a data storage apparatus according to an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a chip system according to an embodiment of the present application.
Detailed Description
In the description of the present application, unless otherwise specified, "/" indicates an "or" relationship between the associated objects; for example, A/B may represent A or B. The term "and/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent three cases: A alone, both A and B, and B alone, where A and B may each be singular or plural.
In the description of the present application, unless otherwise specified, "a plurality of" means two or more. "At least one of the following" or a similar expression refers to any combination of the listed items, including a single item or any combination of multiple items. For example, at least one of a, b, or c may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c may be single or multiple.
In addition, to describe the technical solutions of the embodiments of the present application clearly, words such as "first" and "second" are used in the embodiments to distinguish between identical or similar items whose functions and effects are substantially the same. A person skilled in the art will understand that words such as "first" and "second" do not limit quantity or execution order, and that items labeled "first" and "second" are not necessarily different.
In the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, an illustration, or a description. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, words such as "exemplary" or "for example" are intended to present the related concepts in a concrete way that is easy to understand.
The features, structures, or characteristics in the present application may be combined in one or more embodiments in any suitable manner. In the various embodiments of the present application, the sequence numbers of the processes do not imply an execution order. The execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present application.
In some scenarios, some optional features in the embodiments of the present application may be implemented independently of other features to solve the corresponding technical problem and achieve the corresponding effect. In other scenarios, they may be combined with other features as required.
In the present application, unless otherwise specified, identical or similar parts of the embodiments may be cross-referenced. Where there is no special description or logical conflict, the terms and descriptions in different embodiments are consistent and may be cited by one another, and the technical features of different embodiments may be combined according to their inherent logical relationships to form new embodiments. The implementations of the present application do not limit the scope of protection of the present application.
In addition, the network architectures and service scenarios described in the embodiments of the present application are intended to explain the technical solutions of the embodiments more clearly and do not limit those technical solutions. A person of ordinary skill in the art will appreciate that, as network architectures evolve and new service scenarios emerge, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems.
To facilitate understanding, the technical terms and related concepts that may be involved in the embodiments of the present application are introduced first.
1. I/O operations
In a computer, every element is a file, and a file is the external manifestation of a stream. During information exchange, the computer operates on these streams, and such operations are referred to as I/O operations. For example, reading data from a stream and writing data to a stream are both I/O operations.
2. Cloud storage
Cloud storage is an online storage model in which data is stored on multiple virtual servers, usually hosted by a third party, rather than on a computer's hard disk or another local storage device. Cloud storage refers to a system that uses functions such as cluster applications, network technologies, or distributed file systems to bring together, through application software, the various types of storage devices in a network so that they work cooperatively and jointly provide data storage and service access functions to the outside.
3. Snapshot
A snapshot is a point-in-time data copy technology that records the data at a given moment and saves it. Snapshots can be used to restore data. For example, when a failure occurs and data needs to be restored, a snapshot can be used to restore the data to its state at the time the snapshot was taken.
Snapshot technologies fall into two categories: physical copy and logical copy. A physical copy is a complete copy of the original data. A logical copy copies only the data that has changed.
4. Zero copy
Zero copy means that when a computer performs an I/O operation, the CPU does not need to copy data from one storage area to another, which reduces context switches and the time the CPU spends copying.
5. Index node (inode)
Files are stored on a hard disk. The smallest storage unit of a hard disk is called a sector, and each sector stores 512 bytes. When the operating system reads the hard disk, it does not read one sector at a time, because that would be too inefficient. Instead, it reads multiple consecutive sectors at once, that is, it reads one block (also called a physical block) at a time. A block is the smallest unit of file access. The most common block size is 4 kilobytes (KB), that is, 8 consecutive sectors form one block. The data of a file is stored in blocks, and the area used to store the file's meta information (such as the file's creator, creation date, and size) is called the inode.
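The sector and block sizes above can be checked with a short calculation. The helper `blocks_needed` is a hypothetical name introduced for this example; it rounds a file size up to whole blocks, since a block is the smallest unit of file access.

```python
SECTOR_SIZE = 512        # bytes per sector, as stated in the text
BLOCK_SIZE = 4 * 1024    # bytes per block (4 KB), the common case in the text

SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE  # 4096 / 512


def blocks_needed(file_size: int) -> int:
    """Number of whole blocks required to hold file_size bytes (rounded up)."""
    return -(-file_size // BLOCK_SIZE)  # ceiling division
```

With these values, `SECTORS_PER_BLOCK` evaluates to 8, and a file of 8 KB occupies two blocks.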
In addition to the file's meta information, an inode can also store index information for the blocks in which the file is stored. This index information may be called a physical block address (PBA). The addressing rules for PBAs are defined according to the hardware characteristics of the storage device, so PBA encoding differs between physical storage devices. The unique PBA is the most primitive addressing mechanism, set when the hard disk leaves the factory, and it is fixed.
For example, with a block size of 4 KB, a file of 8 KB is stored in two blocks on the disk, and the inode can index these two blocks through their saved PBAs. If the first data block of the file is located at the 100th block on the disk and the second data block is located at the 200th block on the disk, the PBAs of the two blocks on the disk corresponding to the file can be PBA=100 and PBA=200.
The first data block, the second data block, and so on of a file may be called logical blocks (also called data blocks), and each logical block has a corresponding logical block address (LBA). An LBA is the address of a particular data block. For example, if the data blocks of a file are numbered starting from 1, the first data block of the above file corresponds to LBA=1, the second data block corresponds to LBA=2, and so on.
In some embodiments, a mapping relationship between LBAs and PBAs can be constructed. Continuing the example above, two mapping relationships can be constructed: one between LBA=1 and PBA=100, and one between LBA=2 and PBA=200. The mapping relationship is used to find the physical block on the disk that corresponds to a data block of the file. This mapping relationship between LBAs and PBAs may be called a block map (or block index).
In some embodiments, the mapping relationship between LBAs and PBAs can be saved in the inode.
By way of example, FIG. 1 is a schematic diagram, provided by an embodiment of the present application, of determining the physical blocks of a file on a disk according to the mapping between LBAs and PBAs.
As shown in FIG. 1, file 1 contains three data blocks: data block 1, data block 2, and data block 3. Their LBAs are LBA=1, LBA=2, and LBA=3, respectively. Looking up the LBA-to-PBA mapping in the inode shows that LBA=1 corresponds to PBA=100, LBA=2 to PBA=101, and LBA=3 to PBA=201. The PBAs then identify the physical blocks on the disk that hold the file's data blocks (the black squares in FIG. 1).
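The lookup described for FIG. 1 amounts to a dictionary from LBA to PBA held alongside the file's metadata. The following is a minimal sketch under that assumption; the `Inode` class and its field names are hypothetical and are not the on-disk layout of any particular file system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inode:
    """Toy inode: file metadata plus an LBA -> PBA block map."""
    size_bytes: int
    block_map: Dict[int, int] = field(default_factory=dict)

    def resolve(self, lbas: List[int]) -> List[int]:
        """Translate the file's logical block addresses to physical ones."""
        return [self.block_map[lba] for lba in lbas]


# File 1 from FIG. 1: three data blocks of 4 KB each.
file1_inode = Inode(size_bytes=3 * 4096,
                    block_map={1: 100, 2: 101, 3: 201})
physical_blocks = file1_inode.resolve([1, 2, 3])
```

Note that consecutive LBAs need not map to consecutive PBAs: here blocks 1 and 2 are adjacent on disk, while block 3 is stored elsewhere.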
6. File System
A file system is a structured way of storing and organizing data files. By way of example, FIG. 2 is a schematic structural diagram of a file system provided by an embodiment of the present application.
All data in a computer consists of 0s and 1s, and a raw sequence of 0s and 1s on a storage medium cannot by itself be distinguished or managed. Computers therefore organize data using the concept of a "file": data serving the same purpose is arranged, in the structure required by the relevant application, into files of different types. Different suffixes usually denote different file types, and each file is given a name that is easy to understand and remember.
When there are many files, the computer groups them (the white squares in FIG. 2) according to some scheme, and each group of files is placed in the same directory (also called a folder; the black squares in FIG. 2). A directory can contain further directories (subdirectories or subfolders) as well as files, so all files and directories form a tree. This tree structure has a dedicated name: the file system. There are many types of file system; common ones include FAT/FAT32/NTFS on Windows and EXT2/EXT3/EXT4/XFS/BtrFS on Linux.
To make files easy to locate, the names of the directories, subdirectories, and file encountered on the way from the root node down to the file itself are joined with a special separator character ("\" on Windows/DOS, "/" on Unix-like systems). The resulting string is called the file path, for example "/etc/systemd/system.conf" on Linux or "C:\Windows\System32\taskmgr.exe" on Windows. A path uniquely identifies a specific file. For example, D:\data\file.exe on Windows is the path of the file file.exe in the data directory on the D partition.
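The path construction just described can be reproduced with Python's standard `pathlib` module, which knows both separator conventions. This is only an illustration of the naming scheme, using paths taken from the examples above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Join the names from the root down to the file with the platform separator.
posix_path = PurePosixPath("/", "etc", "systemd", "system.conf")
windows_path = PureWindowsPath("D:\\", "data", "file.exe")

# The components can be recovered from the path again.
posix_parts = posix_path.parts          # root, directories, file name
windows_drive = windows_path.drive      # the partition ("D:")
windows_dir = windows_path.parent.name  # the containing directory
```

The "pure" path classes only manipulate strings, so the example runs on any operating system without touching the disk.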
In some scenarios, when an application on a mobile phone stores data in the cloud, it first creates a snapshot of each of its files by physical copying and then uploads the data in the snapshot, in order to avoid consistency problems caused by the file changing during the upload. A consistency problem arises when a file is modified while it is being uploaded, so that the data later restored from the cloud differs from the original data. By creating a snapshot and uploading the snapshot's data, modifications to the original file do not affect the upload, and data consistency is preserved.
However, because this approach creates the snapshot by physical copying, it generates a large amount of write I/O, which blocks writes from other applications and degrades the user experience.
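The copy-based snapshot criticized above can be sketched as follows. The sketch is illustrative only: `snapshot_by_copy` and `demo` are hypothetical names, and the point is simply that the snapshot costs as many bytes of write I/O as the file is large, while protecting the upload from later changes to the original.

```python
import os
import shutil
import tempfile
from typing import Tuple


def snapshot_by_copy(src_path: str, snapshot_dir: str) -> Tuple[str, int]:
    """Create a snapshot by physically copying the file.

    Returns the snapshot path and the number of bytes written for it.
    """
    snap_path = os.path.join(snapshot_dir, os.path.basename(src_path) + ".snap")
    shutil.copyfile(src_path, snap_path)  # full physical copy: heavy write I/O
    return snap_path, os.path.getsize(snap_path)


def demo() -> Tuple[int, bool]:
    """Snapshot an 8 KB file, then modify the original during the 'upload'."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "data.bin")
        original = b"x" * 8192
        with open(src, "wb") as f:
            f.write(original)
        snap, bytes_written = snapshot_by_copy(src, workdir)
        with open(src, "wb") as f:          # the original changes afterwards
            f.write(b"modified")
        with open(snap, "rb") as f:         # the snapshot is unaffected
            unchanged = f.read() == original
        return bytes_written, unchanged
```

The write cost grows linearly with file size, which is the overhead a zero-copy snapshot is meant to avoid.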
To address this, in related solutions the XFS file system and the B-tree file system (BtrFS) provide a reflink feature that creates a snapshot of a file with zero copying. The reflink feature copies a file without generating a large amount of write I/O. However, XFS and BtrFS were designed for server workloads; their random-write performance is poor, and they are not suitable for use on mobile phones.
In other scenarios, an application on a mobile phone that stores data in the cloud may use differential upload. Differential upload splits a file into multiple chunks, each with a corresponding hash value. When a chunk is modified, its hash value changes. The modified chunks are therefore identified by computing the chunk hashes, and only the modified chunks are uploaded to cloud storage. This reduces the number of network I/O operations and saves network bandwidth, but it consumes a large amount of CPU resources. On a device such as a mobile phone, which has limited hardware and strict power requirements, this approach not only competes with other applications for CPU resources but also shortens battery life.
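The hash-based differential technique described above can be sketched in a few lines. The chunk size and the choice of SHA-256 are assumptions made for the example; the text does not specify either. Note that every chunk of the new version must be hashed to find the changed ones, which is where the CPU cost comes from.

```python
import hashlib
from typing import List

CHUNK_SIZE = 4096  # hypothetical chunk size; not specified in the text


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Hash each fixed-size chunk of the data."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def changed_chunks(old: bytes, new: bytes,
                   chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Indices of chunks in `new` whose hash differs from `old` (or are new)."""
    old_h = chunk_hashes(old, chunk_size)
    new_h = chunk_hashes(new, chunk_size)
    return [i for i, h in enumerate(new_h)
            if i >= len(old_h) or old_h[i] != h]
```

Only the chunks returned by `changed_chunks` would need to be uploaded; in practice the hashes of the previous version would be cached rather than recomputed.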
To address this, in related solutions remote synchronization (Rsync) is combined with Inotify to monitor information about every file operation in real time (such as the address offset of newly written data and the file size), so that the modified locations in a file can be determined and only the data at those locations is uploaded to the cloud. Inotify is used to monitor which file has been modified, and Rsync is used to determine which locations within the modified file have changed. However, Inotify is used on servers. Because mobile phones enforce strict file permission management, one application cannot monitor another application's file operations; that is, file operations cannot be monitored across applications. Inotify therefore cannot be used on a mobile phone to monitor file operation information.
In view of the above technical problems, an embodiment of the present application provides a data storage method that creates file snapshots with zero copying, reducing the amount of write I/O generated and improving the user experience. The method can also determine which locations in a file have been modified without computing hash values, so that only the data at the modified locations is uploaded to the cloud. This reduces the number of network I/O operations and saves network bandwidth while also lowering CPU overhead.
By way of example, FIG. 3 is a schematic architecture diagram of a communication system to which a data storage method provided by an embodiment of the present application is applied. As shown in FIG. 3, the communication system 10 includes one or more hosts 11 (only one is shown in FIG. 3) and a storage device 12.
In the application scenario shown in FIG. 3, users access data through applications. A computer running these applications may be called a "host". A host may be a physical machine or a virtual machine (VM). Physical hosts include, but are not limited to, desktop computers, servers, and mobile devices. Mobile devices may include, but are not limited to, mobile phones, tablet computers, laptop computers, personal digital assistants (PDAs), augmented reality (AR) devices, virtual reality (VR) devices, artificial intelligence (AI) devices, and wearable devices.
Optionally, the host 11 may run any of a variety of operating systems. This application does not limit the specific type of the host 11 or the operating system installed on it.
In an optional implementation, the host may run on the storage device 12.
It is worth noting that in some cases the host may also be called a client.
The storage device 12 may be any device with storage capability, such as a server or a cloud device. The server may be a single server or a server cluster composed of multiple servers. This application does not limit the specific type of the storage device 12.
Optionally, the storage device 12 and the host 11 may be devices of the same type or of different types.
By way of example, FIG. 3 shows the host 11 as a mobile phone and the storage device 12 as a server.
In one possible example, the host accesses the storage device 12 over a network to access data; the network may include, for example, a switch (not shown in the figure).
In another possible example, the host may communicate with the storage device 12 over a wired connection to access data, for example a universal serial bus (USB) or a peripheral component interconnect express (PCIe) high-speed bus.
It can be understood that the embodiments of the present application do not limit the communication method used between the host 11 and the storage device 12.
Optionally, FIG. 3 is a simplified schematic diagram given for ease of understanding. In practice, the system may also include other devices that are not shown in the figure.
The data storage method provided by the embodiments of the present application may be executed by the host 11. Optionally, the storage system used by the host 11 may be a centralized storage system or a distributed storage system. A centralized storage system is characterized by a single unified entry point through which all data from external devices must pass; this entry point is the engine of the centralized storage system. The engine is the core component of a centralized storage system, and many advanced storage functions are implemented in it.
Taking a host 11 that uses a centralized storage system as an example, FIG. 4 is a schematic architecture diagram of a centralized storage system provided by an embodiment of the present application.
As shown in FIG. 4, the engine 121 may contain one or more controllers; FIG. 4 uses an engine 121 with one controller as an example. In one possible example, if the engine 121 has multiple controllers, any two controllers may be connected by a mirror channel so that they back each other up, preventing a hardware failure from making the entire host 11 unavailable.
The engine 121 also includes a front-end interface 1211 and a back-end interface 1214. The front-end interface 1211 is used to communicate with other devices, and the back-end interface 1214 is used to communicate with hard disks in order to expand the capacity of the host 11. Through the back-end interface 1214, the engine 121 can connect to more hard disks, forming a very large storage resource pool.
In terms of hardware, as shown in FIG. 4, the controller includes at least a processor 1212 and a memory 1213. The processor 1212 is a central processing unit (CPU) that processes data access requests from outside the host 11 (from a server or another storage system) as well as requests generated inside the host 11. The processor in the embodiments of the present application may also be a neural processing unit (NPU), a graphics processing unit (GPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. A general-purpose processor may be a microprocessor or any conventional processor.
By way of example, when the processor 1212 receives write requests from other devices through the front-end interface 1211, it temporarily stores the data from these requests in the memory 1213. When the total amount of data in the memory 1213 reaches a certain threshold, the processor 1212 sends the data stored in the memory 1213 through the back-end interface to at least one of the hard disk drive (HDD) 1221, the hard disk drive 1222, the solid-state drive (SSD) 1223, or another hard disk 1224 for persistent storage. In one example, the other hard disk 1224 may be an HDD, an SSD, a shingled magnetic recording (SMR) disk, an SSD that supports zoned namespaces (ZNS), or another type of storage such as persistent memory (PMEM).
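The buffer-then-flush behaviour described for the processor 1212 and memory 1213 can be modelled as below. This is a simplified sketch: the `WriteBuffer` class, its threshold parameter, and the `flush_fn` callback (standing in for the back-end interface to the disks) are hypothetical and only illustrate the threshold logic.

```python
from typing import Callable, List


class WriteBuffer:
    """Hold incoming write data in memory and flush once a threshold is hit."""

    def __init__(self, threshold_bytes: int,
                 flush_fn: Callable[[bytes], None]) -> None:
        self.threshold_bytes = threshold_bytes
        self.flush_fn = flush_fn          # stand-in for the back-end interface
        self._pending: List[bytes] = []
        self._pending_size = 0

    def write(self, data: bytes) -> None:
        """Buffer the data of one write request; flush if the threshold is reached."""
        self._pending.append(data)
        self._pending_size += len(data)
        if self._pending_size >= self.threshold_bytes:
            self.flush()

    def flush(self) -> None:
        """Send everything buffered so far to persistent storage."""
        if self._pending:
            self.flush_fn(b"".join(self._pending))
            self._pending.clear()
            self._pending_size = 0
```

A caller would pass a function that writes to the HDD or SSD as `flush_fn`; in the test below a plain list collects the flushed data instead.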
The memory 1213 is the internal memory that exchanges data directly with the processor. It can be read and written at any time and at high speed, and it serves as temporary data storage for the operating system and other running programs. The memory includes at least two types of memory; for example, it may be random access memory or read-only memory (ROM). The random access memory may be, for example, DRAM or storage-class memory (SCM). DRAM is a semiconductor memory and, like most random access memory (RAM), is a volatile memory device. DRAM and SCM are only examples in this embodiment; the memory may also include other random access memory, such as static random access memory (SRAM). The read-only memory may be, for example, programmable read-only memory (PROM) or erasable programmable read-only memory (EPROM). In addition, the memory 1213 may be a dual in-line memory module (DIMM), that is, a module composed of dynamic random access memory (DRAM), or it may be an SSD. In practice, the controller may be configured with multiple memories 1213 and with memories 1213 of different types. This embodiment does not limit the number or type of the memory 1213. Furthermore, the memory 1213 can be configured to have a power-retention function, meaning that the data stored in the memory 1213 is not lost when the system loses power and is powered on again. Memory with a power-retention function is called non-volatile memory.
The memory 1213 stores software programs, and the processor 1212 manages the hard disks by running the software programs in the memory 1213.
By way of example, the memory 1213 may also be another memory used to store a set of computer instructions; when the processor 1212 executes these instructions, the method provided by the embodiments of the present invention is implemented. This other memory may be, but is not limited to, volatile memory or non-volatile memory, or may include both. The non-volatile memory may be ROM, PROM, EPROM, electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may be RAM, which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as SRAM, DRAM, synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
In addition, FIG. 4 shows only one engine 121, but in practice the storage system may contain two or more engines 121, with redundancy or load balancing among them.
In addition, the host 11 shown in FIG. 4 may instead use a distributed storage system, which includes a compute node cluster and a storage node cluster. The compute node cluster includes one or more compute nodes that can communicate with one another. A compute node may be a server, a desktop computer, or the controller of a storage array. In terms of hardware, a compute node may include a processor, memory, and a network card. The processor is a CPU that processes data access requests from outside the compute node as well as requests generated inside it. By way of example, when the processor receives write requests from a user, it temporarily stores the data from these requests in memory. When the total amount of data in memory reaches a certain threshold, the processor sends the data to the storage nodes for persistent storage. The processor also performs computation and processing on the data, such as metadata management, deduplication, data compression, storage space virtualization, and address translation.
By way of example, FIG. 5 is a schematic diagram of the hardware structure of a host 11 provided by an embodiment of the present application.
The host 11 includes at least one processor 201, a communication line 202, a memory 203, and at least one communication interface 204.
The processor 201 may be a general-purpose CPU, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the program of the present application.
The communication line 202 may include a path for transferring information between the above components.
The communication interface 204 is used to communicate with other devices. In the embodiments of the present application, the communication interface 204 may be a module, a circuit, a bus, an interface, a transceiver, or any other apparatus capable of implementing a communication function. Optionally, when the communication interface is a transceiver, the transceiver may be a standalone transmitter used to send information to other devices, or a standalone receiver used to receive information from other devices. The transceiver may also be a component that integrates both the sending and the receiving functions. The embodiments of the present application do not limit the specific implementation of the transceiver.
The memory 203 may be read-only memory (ROM) or another type of static storage device that can store static information and instructions; random access memory (RAM) or another type of dynamic storage device that can store information and instructions; electrically erasable programmable read-only memory (EEPROM); compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, and the like); a magnetic disk storage medium or other magnetic storage device; or any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 203 may exist independently and be connected to the processor 201 through the communication line 202, or it may be integrated with the processor 201.
The memory 203 stores the computer-executable instructions that implement the solution of the present application. The processor 201 executes the computer-executable instructions stored in the memory 203, thereby implementing the method provided by the embodiments of the present application.
Optionally, the computer-executable instructions in the embodiments of the present application may also be called application code, instructions, a computer program, or other names; the embodiments of the present application impose no specific limitation on this.
In a specific implementation, as one embodiment, the processor 201 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 5.
In a specific implementation, as one embodiment, the host 11 may include multiple processors, such as the processor 201 and the processor 205 in FIG. 5. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
The host 11 may be a general-purpose device or a dedicated device; the embodiments of the present application do not limit the type of the host 11.
In other embodiments of the present application, the host 11 may include more or fewer components than shown in FIG. 5, may combine, split, or replace certain components, or may arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
In some embodiments, the software system of the host 11 may use a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of the present invention use a layered architecture as an example to describe the software structure of the host 11.
FIG. 6 is a block diagram of the software structure of a host 11 provided by an embodiment of the present application.
A layered architecture divides the software into several layers, each with a clear role and responsibility. The layers communicate with one another through software interfaces. In some embodiments, the host 11 includes two layers, which are, from top to bottom, the application layer and the kernel layer.
The application layer may include a series of applications, such as a first application, together with a file upload module, an interface module, and so on. The applications are used to access data. The file upload module uploads data from the host 11 to another device, such as the cloud, for storage. The interface module starts the file difference detection module contained in the file system to determine where a file has been modified, so that the data at those locations can be uploaded to another device for storage.
The kernel layer includes the file system and the virtual file system (VFS) layer.
By way of example, the file system may be the flash-friendly file system (F2FS). F2FS is log-structured: when the file system writes new data, it does not write it to the storage location holding the old data; instead it writes the new data to a new storage location and then releases the old data's storage location for use by later data. This property significantly improves performance in random-write scenarios. The file system may of course be another file system, such as EXT4; this application imposes no specific limitation on this.
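The log-structured write pattern described above can be illustrated with a toy block store. This is a conceptual sketch only and does not reflect how F2FS is implemented; in particular, the immediate reuse of freed blocks and the `LogStructuredStore` class itself are simplifying assumptions.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class LogStructuredStore:
    """Toy store in which an overwrite goes to a new physical block."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks: List[Optional[bytes]] = [None] * num_blocks
        self.block_map: Dict[int, int] = {}           # LBA -> PBA
        self.free: Deque[int] = deque(range(num_blocks))

    def write(self, lba: int, data: bytes) -> int:
        """Write data for an LBA to a fresh block and return the new PBA."""
        if not self.free:
            raise RuntimeError("no free physical blocks")
        new_pba = self.free.popleft()
        self.blocks[new_pba] = data                   # never overwrite in place
        old_pba = self.block_map.get(lba)
        self.block_map[lba] = new_pba                 # remap the LBA
        if old_pba is not None:
            self.blocks[old_pba] = None               # release the old location
            self.free.append(old_pba)
        return new_pba

    def read(self, lba: int) -> Optional[bytes]:
        """Read the current data of an LBA through the block map."""
        return self.blocks[self.block_map[lba]]
```

Because an update changes the PBA that an LBA maps to, comparing block maps taken at two points in time reveals which logical blocks were rewritten.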
VFS层作为抽象层,向上为应用层提供统一的文件访问接口,向下可兼容各种不同的文件系统。VFS层中包括inode转换模块以及快照创建模块。其中,inode转换模块可用于将单层inode结构转换成双层inode结构。在一些实施例中,转换后的双层inode结构中可以包括上层inode(U-inode)和下层inode(L-inode)。关于U-inode和L-inode的介绍可参考后文所述。As an abstract layer, the VFS layer provides a unified file access interface for the application layer upwards and is compatible with various file systems downwards. The VFS layer includes an inode conversion module and a snapshot creation module. Among them, the inode conversion module can be used to convert a single-layer inode structure into a double-layer inode structure. In some embodiments, the converted double-layer inode structure can include an upper inode (U-inode) and a lower inode (L-inode). For the introduction of U-inode and L-inode, please refer to the following description.
如图7所示,文件1具备单层inode结构,如仅包括inode1。通过inode转换模块可以将文件1的单层inode结构转换成双层inode结构,如包括U-inode以及L-inode。As shown in Fig. 7, file 1 has a single-layer inode structure, such as only including inode 1. The single-layer inode structure of file 1 can be converted into a double-layer inode structure, such as including U-inode and L-inode, by the inode conversion module.
The snapshot creation module can create a snapshot of a file with zero copy; the creation process is described later.
The file system includes a difference detection module, which can look up the LBAs stored in the U-inode and/or the L-inode, determine where the file was modified from the differences between the stored LBAs, and thereby locate the modified data.
It can be understood that the division into layers shown in FIG. 6 and the modules in each layer are only examples and do not limit this application. In practice, other divisions are possible and modules may be located at different layers; for example, the difference detection module may also be located in the VFS layer.
The technical solutions in the following embodiments can all be implemented in devices having the structures shown in FIG. 4, FIG. 5, and FIG. 6, and in a system having the architecture shown in FIG. 1.
In the following embodiments, host 11 is taken as the host and storage device 12 as the storage device.
An embodiment of this application provides a data storage method. After receiving a data storage request for a file, the host can send the data of the file to the storage device according to the request. Correspondingly, after receiving the data from the host, the storage device can store it.
In some embodiments, a file in the host may use a single-layer inode structure, that is, one file has one inode. After receiving a data storage request for the file, the host can respond by first converting the single-layer inode structure of the file into a double-layer inode structure, so that one file has two inodes, such as a U-inode and an L-inode. If the file in the host already has a double-layer inode structure, this conversion need not be performed.
Optionally, in the embodiments of this application, a data storage request for a file is a request to store the data of that file in the storage device. For example, the request may be triggered by the application corresponding to the file or by a user.
In some embodiments, the L-inode can store one or more of the following for the data (or data blocks) that already existed before the file was converted into the double-layer inode structure, that is, the original data: the LBA, the PBA, and the mapping between the LBA and the PBA. The U-inode can store the LBA, the PBA, and the mapping between the LBA and the PBA for the data of the file that is updated after the conversion. Optionally, the updated data includes, but is not limited to, data obtained by changing the original data and newly written data.
In some embodiments, after a file is converted into the double-layer inode structure and before its data is modified, every PBA stored in the U-inode has the value R. R indicates that the real value of the PBA is located in the L-inode. Other identifiers may be used instead of R; this application imposes no limitation.
For example, as shown in (1) of FIG. 8, in the double-layer inode structure of file 1 every PBA in the U-inode has the value R, and the PBAs in the L-inode are PBA=101, PBA=102, PBA=103, and PBA=104. When an application accesses file 1, it can first access the U-inode. Because every PBA stored there is R, the application determines that the real PBAs are stored in the L-inode and accesses the L-inode. The application then reads the data at PBA 101 to PBA 104.
In some scenarios the data of file 1 is updated; for example, the data at PBA 103 and PBA 104 is modified (or updated). The host then saves the PBAs of the updated data in the U-inode. As shown in (2) of FIG. 8, PBA=301 and PBA=302 stored in the U-inode are the PBAs of the updated data. When the application later accesses the data of file 1 again, it can first access the U-inode. For the non-R PBAs stored there, the application reads the data at those PBAs directly, without accessing the L-inode. In other words, the application reads the data at PBA 101, 102, 301, and 302.
It can be understood that FIG. 8 shows only the PBAs stored in the inodes; an inode may also store LBAs and the mapping between LBAs and PBAs. For example, building on FIG. 8, FIG. 9 shows an example of the LBA-to-PBA mappings stored in a U-inode and an L-inode according to an embodiment of this application.
Similarly, as shown in (1) of FIG. 9, when the application accesses the data block of file 1 whose LBA is 1, it can first access the U-inode. Because the PBA corresponding to LBA=1 in the U-inode is R, the application determines that it needs to access the L-inode. The PBA corresponding to LBA=1 in the L-inode is 101, so the application ultimately reads the data at PBA 101.
As shown in (2) of FIG. 9, if file 1 is updated, for example the data blocks at LBA=3 and LBA=4 are changed, the host saves the PBAs of the updated data blocks, PBA=301 and PBA=302, in the U-inode. When the application later accesses the data block of file 1 whose LBA is 3, it can again access the U-inode first. Because the PBA corresponding to LBA=3 in the U-inode is 301 rather than R, the application reads the data at PBA 301 directly.
Optionally, the U-inode is accessed first and the L-inode second. In this way, when the U-inode contains a non-R PBA, the application can locate the data from that PBA by accessing the U-inode alone, with no further access to the L-inode for that PBA. This reduces data access overhead, speeds up data access, and saves power.
Alternatively, the L-inode may be accessed first and the U-inode second. In that case the application may have to finish accessing both the L-inode and the U-inode before it can determine which inode's PBA to use for the data access.
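The U-inode-first lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the inodes are modelled as plain dictionaries from LBA to PBA, the sentinel is the string "R", the function name `resolve_pba` is hypothetical, and the entries for LBA 2 and LBA 4 are assumed by analogy with FIG. 8.

```python
R = "R"  # placeholder value: the real PBA is in the L-inode

def resolve_pba(u_inode, l_inode, lba):
    """Return the PBA to read for a logical block, checking the U-inode first."""
    pba = u_inode[lba]
    if pba != R:
        return pba        # updated block: no L-inode access needed
    return l_inode[lba]   # unmodified block: fall through to the L-inode

# State corresponding to (2) of FIG. 9: LBA 3 and LBA 4 were rewritten.
l_inode = {1: 101, 2: 102, 3: 103, 4: 104}
u_inode = {1: R, 2: R, 3: 301, 4: 302}
resolved = [resolve_pba(u_inode, l_inode, lba) for lba in (1, 2, 3, 4)]
```

With these inputs the resolved PBAs are 101, 102, 301, and 302, matching the access pattern in the text. Only the two unmodified blocks require a second lookup in the L-inode.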
In this embodiment, as one possible implementation, FIG. 10 is a schematic diagram of a process for converting the single-layer inode structure of a file into a double-layer inode structure according to an embodiment of this application.
As shown in FIG. 10, file 1 has a single-layer inode structure with only one corresponding inode, inode1. This inode stores the PBAs of the data of file 1, such as PBA=101, PBA=102, and PBA=103. The host first creates an empty file (or intermediate file), file 2. Because the empty file contains no data, its inode (inode2) contains no data either; that is, every PBA in inode2 is NULL. The host then initializes every PBA in inode2 to R. After initialization, inode2 replaces inode1 in file 1, and finally inode2 is linked to inode1 through a pointer; that is, inode2 stores a pointer (or index, or address) to inode1 so that inode1 can be found. As a result, inode2 is the U-inode of file 1 and inode1 is the L-inode of file 1, which completes the conversion of file 1 to the double-layer inode structure. Because no physical copy takes place during this process, the overhead on the host system is low.
Optionally, the solution in FIG. 10 creates inode2 by creating file 2. In other embodiments, inode2 may be created directly to convert file 1 to the double-layer inode structure; this application imposes no limitation.
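The conversion steps of FIG. 10 can be sketched in a few lines. The classes `Inode` and `File`, the field `lower`, and the function `convert_to_double_layer` are hypothetical names for illustration; NULL is modelled as `None` and R as the string "R". This is a sketch of the described sequence, not kernel code.

```python
R = "R"  # placeholder value: the real PBA is in the L-inode

class Inode:
    def __init__(self, pbas, lower=None):
        self.pbas = pbas     # list of PBA slots
        self.lower = lower   # pointer to the L-inode, if any

class File:
    def __init__(self, inode):
        self.inode = inode   # the inode an application reaches first

def convert_to_double_layer(f):
    old = f.inode                                  # inode1, will become the L-inode
    empty = File(Inode([None] * len(old.pbas)))    # file 2: every PBA is NULL
    new = empty.inode                              # inode2
    new.pbas = [R] * len(new.pbas)                 # initialize every PBA to R
    f.inode = new                                  # inode2 replaces inode1 in file 1
    new.lower = old                                # inode2 links to inode1
    return f

file1 = File(Inode([101, 102, 103]))
inode1 = file1.inode
convert_to_double_layer(file1)
```

Note that the PBA list of inode1 is never copied or rewritten; only a new inode and one pointer are created, which is why the text describes the overhead as low.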
In some embodiments, after receiving a data storage request for a file and before sending the data of the file to the storage device, the host can also create a snapshot of the file in response to the request and then send the data in the snapshot to the storage device. This avoids consistency problems caused by the file being modified during the upload.
For example, FIG. 11 is a schematic diagram of a process for creating a file snapshot according to an embodiment of this application.
As shown in FIG. 11, file 1 is the file whose data is to be uploaded to the storage device. File 1 has a double-layer inode structure: inode1 is its L-inode and inode2 is its U-inode. The host first creates file 3 with a single-layer inode structure. File 3 is an empty file that contains no data, so its inode (inode3) contains no data either; that is, every PBA in inode3 is NULL. The host then initializes every PBA in inode3 to R and, after initialization, links inode3 to inode1 through a pointer, implemented as described above. As a result, inode3 is the U-inode of file 3 and inode1 is the L-inode of file 3, which establishes the double-layer inode structure of file 3.
Because the PBAs of the updated data of file 1 are all stored in its U-inode (inode2), and inode1, as the L-inode, stores only the PBAs of the original data of file 1, inode1 is read-only for upper-layer applications; that is, the PBAs in inode1 do not change. File 3 thus becomes a snapshot of file 1. Apart from the cost of creating file 3, this approach neither reads nor writes the data of file 1. It creates the snapshot with zero copy, reduces the number of write I/Os, and can improve user experience.
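A minimal sketch of the zero-copy snapshot in FIG. 11 follows. The names `Inode`, `create_snapshot`, and `read_pbas` are hypothetical, NULL is modelled as `None`, and R as the string "R". The point it illustrates is that the snapshot and the original file share one L-inode, so a later write to the original, recorded only in its U-inode, is not visible through the snapshot.

```python
R = "R"  # placeholder value: the real PBA is in the L-inode

class Inode:
    def __init__(self, pbas, lower=None):
        self.pbas = pbas     # list of PBA slots
        self.lower = lower   # pointer to the shared L-inode, if any

def create_snapshot(l_inode):
    """Create inode3 for the snapshot file: all R, linked to the shared L-inode."""
    inode3 = Inode([None] * len(l_inode.pbas))   # empty file 3: every PBA is NULL
    inode3.pbas = [R] * len(inode3.pbas)         # initialize every PBA to R
    inode3.lower = l_inode                       # link by pointer; no data is copied
    return inode3

def read_pbas(u_inode):
    """Resolve every slot, falling back to the L-inode where the value is R."""
    return [u_inode.lower.pbas[i] if p == R else p
            for i, p in enumerate(u_inode.pbas)]

inode1 = Inode([101, 102, 103, 104])     # L-inode of file 1
inode2 = Inode([R] * 4, lower=inode1)    # U-inode of file 1
inode3 = create_snapshot(inode1)         # U-inode of file 3, the snapshot
inode2.pbas[2] = 301                     # file 1 is modified after the snapshot
```

After the write, resolving through inode2 reaches PBA 301 for the third block, while resolving through inode3 still reaches PBA 103.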
In some embodiments, after receiving a data storage request for a file and before sending the data of the file to the storage device, the host can first determine which data of the file has been modified and then send only the modified data to the storage device, which reduces the number of network I/Os and saves network bandwidth.
Based on the double-layer inode structure described with FIG. 8, the U-inode of a file stores the PBAs of the file's updated data, and the L-inode stores the PBAs of the file's original data. The host can therefore determine the modified data from the PBA values in the U-inode: any PBA in the U-inode whose value is not R is the PBA of modified data.
With this solution, the positions at which a file was modified can be determined simply by traversing the PBAs in the file's U-inode, with no need for compute-intensive approaches such as calculating hash values, and only the modified data is uploaded to the cloud. This reduces the number of network I/Os and saves network bandwidth while also lowering CPU overhead.
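The traversal described above amounts to a single filter over the U-inode. The sketch below assumes the U-inode is a dictionary from LBA to PBA and that R is the string "R"; the function name `modified_blocks` is hypothetical.

```python
R = "R"  # placeholder value: the block is unmodified, real PBA is in the L-inode

def modified_blocks(u_inode):
    """Return {lba: pba} for every U-inode entry whose PBA is not R."""
    return {lba: pba for lba, pba in u_inode.items() if pba != R}

u_inode = {1: R, 2: R, 3: 301, 4: 302}
changed = modified_blocks(u_inode)
```

The cost is one comparison per entry, independent of the size of the data blocks, whereas a hash-based comparison would have to read and hash the block contents.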
In some scenarios a file is modified multiple times. If the PBAs of the current update and the PBAs of the previous update are both kept in the U-inode, the host cannot tell the data of the current update apart from the data of the previous update.
For this reason, an embodiment of this application provides a state machine solution. Each file has two states, S0 and S1. S0 indicates that the file has not been modified; every PBA in the file's U-inode has the value R. S1 indicates that the file has been modified; the file's U-inode contains at least one PBA whose value is not R.
For example, (a) in FIG. 12 shows a file in state S0: every PBA stored in its U-inode is R, and the PBAs in its L-inode (such as PBA=101 to PBA=106) are the PBAs of the file's original data. The file is then modified. Suppose the data at PBA 103 to PBA 106 is modified and the modified data is at PBA 301 to PBA 304; these PBA values are saved in the U-inode, and the file moves from the S0 state shown in (a) of FIG. 12 to the S1 state shown in (b) of FIG. 12. From the PBA values in the U-inode of a file in state S1, the host can determine where the file was modified: the data at the non-R PBAs in the U-inode is the modified data, and the host can send only this data to the storage device.
Further, the host synchronizes the PBAs of this modification, that is, the non-R PBAs in the U-inode, to the L-inode, and the file moves from the S1 state shown in (b) of FIG. 12 back to the S0 state shown in (c) of FIG. 12. The U-inode of the file in the S0 state shown in (c) of FIG. 12 can then record the PBAs of the next modification.
Optionally, the data at the non-R PBAs in the U-inode of a file in state S1 is uploaded to the cloud first, and the non-R PBAs are then synchronized to the L-inode. Alternatively, the non-R PBAs in the U-inode are synchronized to the L-inode first, and the data at those PBAs is then uploaded to the cloud.
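The S0/S1 cycle of FIG. 12 can be sketched as a small class. The names `TrackedFile`, `write`, and `sync` are hypothetical, the inodes are modelled as lists indexed by block position, and R is the string "R". The state is derived from the U-inode contents rather than stored separately, which is one possible reading of the description.

```python
R = "R"  # placeholder value: the real PBA is in the L-inode
S0, S1 = "S0", "S1"

class TrackedFile:
    def __init__(self, l_pbas):
        self.l_inode = list(l_pbas)
        self.u_inode = [R] * len(l_pbas)

    @property
    def state(self):
        # S0: every U-inode PBA is R. S1: at least one non-R PBA is present.
        return S0 if all(p == R for p in self.u_inode) else S1

    def write(self, index, new_pba):
        self.u_inode[index] = new_pba      # first write moves the file S0 -> S1

    def sync(self):
        """Return the modified PBAs, fold them into the L-inode, go back to S0."""
        dirty = {i: p for i, p in enumerate(self.u_inode) if p != R}
        for i, p in dirty.items():
            self.l_inode[i] = p
            self.u_inode[i] = R
        return dirty

f = TrackedFile([101, 102, 103, 104, 105, 106])      # (a) of FIG. 12
for i, pba in zip(range(2, 6), (301, 302, 303, 304)):
    f.write(i, pba)                                   # (b) of FIG. 12
state_after_write = f.state
dirty = f.sync()                                      # (c) of FIG. 12
```

Because `sync` clears the U-inode, the next call to `sync` reports only the blocks written since the previous one, which is how consecutive updates are kept apart.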
It can be understood that the technical solutions described in the embodiments of this application can be used separately or, where they do not conflict, in any combination.
It can also be understood that, in the embodiments of this application, the first device may perform some or all of the steps described; these steps or operations are only examples, and other operations or variations of the operations may also be performed. In addition, the steps may be performed in an order different from that presented in the embodiments, and not all of the operations need to be performed.
For example, FIG. 13 is a schematic flowchart of a data storage method according to an embodiment of this application. The method can be performed by a first device, which may be, for example, the host described above or a processor in the host. The method includes the following steps:
S1301. In response to a data storage request for a first file, obtain the inode corresponding to the first file.
For example, the first file may be any file stored in the host.
The inode corresponding to the first file has a double-layer inode structure: it includes a first inode (the L-inode) and a second inode (the U-inode). The first inode stores a first PBA, which is the PBA corresponding to the un-updated data of the first file. The second inode stores a second PBA, which is the PBA corresponding to the updated data of the first file.
In some embodiments, the second inode also stores a third PBA, that is, a PBA whose value is R. The third PBA indicates that the PBA corresponding to the data of the first file is stored in the first inode; in other words, the real PBA of that data is located in the first inode.
In some embodiments of this application, "updated data" and "un-updated data" describe whether the current data has changed relative to the previous data. For example, after a file is modified for the first time, the updated data is the data after the first modification and the un-updated data is the data before the first modification. After the file is modified for the second time, the updated data is the data after the second modification and the un-updated data is the data before the second modification, which includes the data produced by the first modification.
S1302. Determine the updated data according to the second PBA.
S1303. Send the updated data to a second device. Correspondingly, the second device receives the updated data from the first device. Optionally, the second device stores the updated data.
With this solution, after a data storage request for a file is received, the inode corresponding to the file is obtained. This inode has a double-layer structure in which one inode stores the PBAs of the file's un-updated data and the other stores the PBAs of the file's updated data. The updated data can therefore be determined directly from the PBAs stored for it, with no need for compute-intensive approaches such as calculating hash values, which lowers CPU overhead. Sending only the updated data to cloud storage then reduces the number of network I/Os and saves network bandwidth.
In some embodiments, a file in the first device, such as the first file, may use a single-layer inode structure. In that case, before step S1301, the method in FIG. 13 may further include steps S1301a to S1301c (not shown in the figure).
S1301a. Create a second file that contains no data.
The second file, that is, the empty file, corresponds to a third inode. The third inode includes a fourth PBA, that is, a PBA whose value is NULL.
S1301b. Generate the second inode according to the third inode, the fourth PBA, and the third PBA.
For example, every fourth PBA can be set to the third PBA, so that the third inode becomes the second inode.
S1301c. Generate the inode corresponding to the first file according to the second inode and the first inode corresponding to the first file.
It can be understood that the first inode corresponding to the first file is the inode that corresponds to the first file while the first file uses a single-layer inode structure.
For example, the second inode can be linked to the first inode through a pointer or a similar mechanism, which produces the inode corresponding to the first file.
In some scenarios the application runs in user mode while inodes reside in kernel mode, so the application cannot create an inode directly. With this solution, an inode is instead created by creating an empty file, after which the single-layer inode structure of the file can be converted into a double-layer inode structure.
In some embodiments, before step S1303 shown in FIG. 13, the method shown in FIG. 13 further includes steps S1303a to S1303c (not shown in the figure).
S1303a. Create a third file that contains no data.
The third file, that is, the empty file, corresponds to a fourth inode. The fourth inode includes the fourth PBA, that is, a PBA whose value is NULL.
S1303b. Set the fourth PBA to the third PBA.
That is, the value of the PBA is changed from NULL to R.
S1303c. Generate a snapshot of the first file according to the third file, the fourth inode, and the first inode.
For example, the fourth inode can be linked to the first inode through a pointer, so that the third file becomes a snapshot of the first file.
With this solution, the third file shares the first inode with the first file, and the first inode stores the PBAs of the un-updated data of the first file. The first inode is therefore read-only for upper-layer applications, and the third file becomes a snapshot of the first file. Apart from the cost of creating the third file, this approach neither reads nor writes the data of the first file. It creates the snapshot with zero copy, reduces the number of write I/Os, and can improve user experience.
In some embodiments, the first file is in a first state, that is, the S1 state described above, which indicates that the first file has been modified. Before step S1303 shown in FIG. 13, the method shown in FIG. 13 further includes steps S1303d and S1303e (not shown in the figure).
S1303d. Save the second PBA into the first inode.
In some embodiments, when the first file is in the first state, the second inode corresponding to the first file may store only second PBAs, or may store both second PBAs and third PBAs. After the second PBAs in the second inode are saved into the first inode, the second inode may store only third PBAs or only fourth PBAs.
S1303e. Set the state of the first file to a second state.
The second state, that is, the S0 state described above, indicates that the first file has not been modified.
A file may be modified multiple times. If the PBAs of the current update and the PBAs of the previous update were both kept in the second inode, the two updates could not be told apart. With this solution, the state transitions of the file show whether the file is currently modified, and the modified positions can then be determined merely by traversing the PBAs in the second inode. This avoids compute-intensive approaches such as calculating hash values and also distinguishes the data of the current update from the data of the previous update.
In this embodiment, as a specific implementation, step S1302 of the method shown in FIG. 13 can be implemented as: determining the updated data according to the second PBA stored in the first inode corresponding to the snapshot of the first file. Because the first file and its snapshot share the first inode, the first inode corresponding to the snapshot is also the first inode corresponding to the first file. The second PBA in the first inode of the first file can therefore be accessed through the snapshot; this second PBA is the PBA of the updated data that was recorded in the second inode and then saved into the first inode in step S1303d. In this way, the updated data in the snapshot of the file can be uploaded to the cloud, which avoids consistency problems caused by the file being changed during the upload while reducing the number of network I/Os and saving network bandwidth.
It can be understood that a snapshot of a file may be created before or after the file is modified; this application does not limit when the snapshot is created.
下面以第一文件为文件1,未接收到对文件1的数据存储请求时,文件1采用单层inode结构为例,对该实现进行具体介绍。The following takes the first file as file 1, and when no data storage request for file 1 is received, file 1 adopts a single-layer inode structure as an example to specifically introduce the implementation.
当第一设备接收到对文件1的数据存储请求时,响应于该数据存储请求,将文件1的单层inode结构转换成双层inode结构。随后,创建文件1的快照,文件1的快照与文件1共享一个L-inode,文件1的快照与文件1的U-inode中的PBA的取值均为R,此时文件1的数据与文件1的快照中的数据是相同的。后续,文件1被修改,文件1的更新数据的PBA被存储到文件1的U-inode中。此时,第一设备根据文件1的快照即可确定文件1,进而根据文件1的U-inode中保存的PBA即可确定文件1的更新数据。随后,将该更新数据对应的PBA记录到内存,然后将文件1的U-inode中更新数据的PBA同步到L-inode中,同步之后,文件1的U-inode中的PBA的取值为R或者NULL。最后,根据内存记录,通过文件1的快照访问L-inode中保存的新数据对应的PBA即可确定更新数据,进而可以仅将该更新数据上传到云端。When the first device receives a data storage request for file 1, in response to the data storage request, the single-layer inode structure of file 1 is converted into a double-layer inode structure. Subsequently, a snapshot of file 1 is created, and the snapshot of file 1 shares an L-inode with file 1. The values of PBA in the snapshot of file 1 and the U-inode of file 1 are both R. At this time, the data of file 1 is the same as the data in the snapshot of file 1. Subsequently, file 1 is modified, and the PBA of the updated data of file 1 is stored in the U-inode of file 1. At this time, the first device can determine file 1 according to the snapshot of file 1, and then determine the updated data of file 1 according to the PBA stored in the U-inode of file 1. Subsequently, the PBA corresponding to the updated data is recorded in the memory, and then the PBA of the updated data in the U-inode of file 1 is synchronized to the L-inode. After synchronization, the value of PBA in the U-inode of file 1 is R or NULL. Finally, according to the memory record, the PBA corresponding to the new data stored in the L-inode can be accessed through the snapshot of file 1 to determine the updated data, and then only the updated data can be uploaded to the cloud.
It will be appreciated that this example creates the snapshot of file 1 first and then synchronizes the PBA of the updated data from the U-inode to the L-inode. The order may also be reversed: the PBA of the updated data may first be synchronized from the U-inode to the L-inode, and the snapshot of file 1 created afterwards. The embodiments of this application do not limit this.
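The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation claimed in the application: the class names `LInode` and `UInode`, the helper functions, and the use of Python dictionaries keyed by logical block number are all hypothetical, and the marker `R` is assumed here to mean "resolve this block through the shared L-inode".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union

R = "R"  # assumed meaning: "look this block up in the shared L-inode"


@dataclass
class LInode:
    """Lower-layer inode: logical block number -> PBA (hypothetical layout)."""
    pba: Dict[int, int] = field(default_factory=dict)


@dataclass
class UInode:
    """Upper-layer inode: holds R, or the PBA of data updated since the last sync."""
    lower: LInode
    pba: Dict[int, Union[int, str]] = field(default_factory=dict)

    def resolve(self, block: int) -> Optional[int]:
        entry = self.pba.get(block, R)
        return self.lower.pba.get(block) if entry == R else entry


def to_two_layer(single_layer: Dict[int, int]) -> UInode:
    """Convert a single-layer block map into a U-inode over a new L-inode."""
    lower = LInode(dict(single_layer))
    return UInode(lower, {block: R for block in single_layer})


def create_snapshot(file_inode: UInode) -> UInode:
    """The snapshot gets its own U-inode (all R) and shares the file's L-inode."""
    return UInode(file_inode.lower, {block: R for block in file_inode.lower.pba})


def write_block(file_inode: UInode, block: int, new_pba: int) -> None:
    """A modification stores the new PBA in the file's U-inode only."""
    file_inode.pba[block] = new_pba


def sync_to_lower(file_inode: UInode) -> Dict[int, int]:
    """Record the updated PBAs in memory, push them to the L-inode, reset to R."""
    record = {b: p for b, p in file_inode.pba.items() if p != R}
    file_inode.lower.pba.update(record)
    for block in record:
        file_inode.pba[block] = R
    return record


def blocks_to_upload(snapshot: UInode, record: Dict[int, int]) -> Dict[int, int]:
    """Resolve only the recorded blocks through the snapshot."""
    return {block: snapshot.resolve(block) for block in record}
```

In this sketch only the blocks named in the in-memory record are resolved and uploaded, which is the point of separating updated from non-updated PBAs across the two layers. Reversing the order of `create_snapshot` and `sync_to_lower` corresponds to the alternative ordering mentioned above.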
It will further be appreciated that the embodiments of this application use a double-layer inode structure to distinguish updated data from non-updated data. The structure may also have more than two layers, for example a three-layer or four-layer inode structure. Updated and non-updated data may likewise be distinguished across inodes at different layers in order to determine where the file was modified.
Taking a three-layer inode structure as an example, the first-layer inode may store the PBA corresponding to non-updated data, the second-layer inode may store the PBA corresponding to the data after the first update, and the third-layer inode may store the PBA corresponding to the data after the second update. Optionally, the PBA stored in the third-layer inode for the data after the second update may later be synchronized to the second-layer inode and/or the first-layer inode. Optionally, the PBA stored in the second-layer inode for the data after the first update may also be synchronized to the first-layer inode. Further, the second-layer inode may then store the PBA corresponding to the data after the third update, the third-layer inode may store the PBA corresponding to the data after the fourth update, and so on.
It will be appreciated that the solutions provided in the embodiments of this application for the double-layer inode structure apply equally to inode structures with more than two layers.
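The multi-layer variant can be illustrated with a similar sketch. The class `LayeredInode` and its methods are hypothetical names, layers are indexed from 0 (so index 0 is the "first layer"), and the rule that the highest layer holding an entry wins on lookup is an assumption made so that the example is self-consistent; the application itself does not spell out a lookup order.

```python
from typing import Dict, List, Optional


class LayeredInode:
    """Hypothetical N-layer inode: layer 0 holds non-updated PBAs,
    higher layers hold PBAs written by successive updates."""

    def __init__(self, base: Dict[int, int], depth: int = 3) -> None:
        if depth < 2:
            raise ValueError("need at least two layers")
        self.layers: List[Dict[int, int]] = [dict(base)]
        self.layers += [{} for _ in range(depth - 1)]

    def record_update(self, layer: int, block: int, pba: int) -> None:
        """Store the PBA of updated data in the given upper layer."""
        if not 1 <= layer < len(self.layers):
            raise ValueError("updates go to an upper layer")
        self.layers[layer][block] = pba

    def resolve(self, block: int) -> Optional[int]:
        """Assumed lookup rule: the highest layer with an entry wins."""
        for layer in reversed(self.layers):
            if block in layer:
                return layer[block]
        return None

    def modified_blocks(self) -> List[int]:
        """Blocks with an entry in any upper layer, i.e. where the file changed."""
        return sorted(set().union(*self.layers[1:]))

    def sync_down(self, src: int, dst: int) -> None:
        """Move all PBAs from layer src to a lower layer dst."""
        if not 0 <= dst < src < len(self.layers):
            raise ValueError("dst must be a lower layer than src")
        moved = self.layers[src]
        # Drop stale entries in the layers between dst and src so that
        # they cannot shadow the newer PBA after it lands in dst.
        for mid in range(dst + 1, src):
            for block in moved:
                self.layers[mid].pop(block, None)
        self.layers[dst].update(moved)
        self.layers[src] = {}
```

After `sync_down(2, 0)` the second and third layers are free to hold the PBAs of the next updates, which matches the "and so on" rotation described above.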
The foregoing describes the solutions provided in the embodiments of this application mainly from the perspective of the method. It will be appreciated that, to implement the foregoing functions, the data storage device includes hardware structures and/or software modules corresponding to each function. In combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the embodiments of this application can be implemented in hardware or in a combination of hardware and software. Whether a given function is performed by hardware or by software driving hardware depends on the specific application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the technical solutions of the embodiments of this application.
In the embodiments of this application, the data storage device may be divided into functional modules according to the foregoing method examples. For example, one functional module may be defined for each function, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into units in the embodiments of this application is illustrative and is merely a logical division of functions; other divisions are possible in an actual implementation.
FIG. 14 is a schematic structural diagram of a data storage device according to an embodiment of this application. The data storage device 1400 may be applied to the first device to implement the methods described in the foregoing method embodiments. For example, the data storage device 1400 may include a processing unit 1401 and a communication unit 1402.
The processing unit 1401 is configured to support the data storage device 1400 in performing steps S1301 to S1302 in FIG. 13, and/or other steps performed by the first device in the embodiments of this application.
The communication unit 1402 is configured to support the data storage device 1400 in performing step S1303 in FIG. 13, and/or other steps performed by the first device in the embodiments of this application.
Optionally, the data storage device 1400 shown in FIG. 14 may further include a display unit (not shown in FIG. 14). The display unit may be configured to perform display operations and the like.
Optionally, the data storage device 1400 shown in FIG. 14 may further include a storage unit 1403 that stores a program or instructions. When the processing unit 1401 executes the program or instructions, the data storage device 1400 shown in FIG. 14 is enabled to perform the methods shown in FIG. 13 and elsewhere.
For the technical effects of the data storage device 1400 shown in FIG. 14, refer to the technical effects of the methods shown in FIG. 13 and elsewhere; details are not repeated here. The processing unit 1401 in the data storage device 1400 shown in FIG. 14 may be implemented by a processor or processor-related circuit components, and may be a processor or a processing module. The communication unit 1402 may be implemented by a transceiver or transceiver-related circuit components, and may be a transceiver or a transceiver module.
An embodiment of this application further provides a chip system. As shown in FIG. 15, the chip system includes at least one processor 1501 and at least one interface circuit 1502. The processor 1501 and the interface circuit 1502 may be interconnected by lines. For example, the interface circuit 1502 may be configured to receive signals from other devices. As another example, the interface circuit 1502 may be configured to send signals to other devices (for example, the processor 1501). For example, the interface circuit 1502 may read instructions stored in a memory and send them to the processor 1501. When the instructions are executed by the processor 1501, the first device is caused to perform the steps performed by the first device in the foregoing embodiments. The chip system may also include other discrete components, which is not specifically limited in the embodiments of this application.
Optionally, the chip system may include one or more processors. The processor may be implemented in hardware or in software. When implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like. When implemented in software, the processor may be a general-purpose processor that operates by reading software code stored in a memory.
Optionally, the chip system may also include one or more memories. The memory may be integrated with the processor or provided separately from it, which is not limited in this application. For example, the memory may be a non-transitory memory, such as a read-only memory (ROM), which may be integrated with the processor on the same chip or placed on a separate chip. This application does not specifically limit the type of memory or the way the memory and the processor are arranged.
For example, the chip system may be a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a system on chip (SoC), a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or another integrated chip.
It should be understood that each step in the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The method steps disclosed in the embodiments of this application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
An embodiment of this application provides a computer-readable storage medium that includes a computer program. When the computer program runs on a device, the device is caused to perform the method described in the foregoing method embodiments.
An embodiment of this application provides a computer program product that includes a computer program or instructions. When the computer program or instructions run on a device or a computer, the device or computer is caused to perform the method described in the foregoing method embodiments.
An embodiment of this application provides a communication system that includes a first device and a second device, which interact to implement the method described in the foregoing embodiments.
In addition, an embodiment of this application further provides an apparatus, which may specifically be a chip, a component, or a module. The apparatus may include a processor and a memory connected to each other. The memory is configured to store executable instructions; when the apparatus runs, the processor executes the instructions stored in the memory so that the apparatus performs the methods in the foregoing method embodiments.
The device, communication system, computer-readable storage medium, computer program product, and chip provided in these embodiments are all used to perform the corresponding methods provided above. For the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods provided above; details are not repeated here.
From the foregoing description of the implementations, a person skilled in the art will understand that the division into the functional modules above is used only as an example, for convenience and brevity of description. In practice, the foregoing functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatuses and methods may be implemented in other ways. The embodiments may be combined with or refer to one another where they do not conflict. The apparatus embodiments described above are merely illustrative. For example, the division into modules or units is merely a logical division of functions, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
Units described as separate components may or may not be physically separate, and components shown as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed across multiple places. Some or all of the units may be selected as needed to achieve the purpose of the solutions in these embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. On this understanding, the technical solutions of the embodiments of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied as a software product. The software product is stored in a storage medium and includes instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the scope of protection of this application is not limited to them. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application. The scope of protection of this application shall therefore be subject to the scope of protection of the claims.
Claims (15)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310245340.XA CN118625993A (en) | 2023-03-07 | 2023-03-07 | Data storage method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310245340.XA CN118625993A (en) | 2023-03-07 | 2023-03-07 | Data storage method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118625993A true CN118625993A (en) | 2024-09-10 |
Family
ID=92600765
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310245340.XA Pending CN118625993A (en) | 2023-03-07 | 2023-03-07 | Data storage method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118625993A (en) |
-
2023
- 2023-03-07 CN CN202310245340.XA patent/CN118625993A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||