WO2020125362A1 - File system and data layout method - Google Patents
- Publication number
- WO2020125362A1 (PCT/CN2019/121301)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- module
- file system
- cost
- file
- area
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
Definitions
- the invention belongs to the technical field of data layout, and particularly relates to a file system and a data layout method.
- GPFS is the abbreviation of General Parallel File System.
- GPFS from IBM is a scalable, high-performance, general-purpose parallel file system based on shared disks. GPFS can provide parallel, high-speed, safe, and reliable data access for all nodes in the storage system.
- PanFS is a parallel file system developed by Panasas.
- PanFS is a general-purpose parallel file system. At present, its main application field is similar to that of Lustre.
- PanFS is scalable and provides strong consistency through distributed locks.
- the performance gap between solid-state drive-based servers and hard disk drive-based servers significantly reduces the performance of a parallel file system, because solid-state drive-based servers always have higher performance than hard disk drive-based servers and therefore need less I/O time to complete the same amount of data access.
- if an existing layout scheme is applied, it allocates the same stripe size to solid-state drive-based servers and hard disk drive-based servers, which may result in severe load imbalance between the heterogeneous servers.
- complex I/O workloads may also jeopardize the efficiency of I/O systems.
- the present invention provides a file system that includes an I/O tracer, a cost calculation module, and an area division module that are electrically connected to each other. The I/O tracer provides the area division module with the I/O information it collects while the file system is running.
- the I/O tracer also provides the cost calculation module with the file system configuration file it collects; the cost calculation module calculates or estimates the access cost of file requests in the file system and outputs a cost model to the area division module; the area division module generates a distribution of areas with a minimum total cost according to the cost model, divides the file into the different areas, and obtains the stripe size corresponding to each area.
- the file system further includes a daemon process module, which executes the daemon process in the background; the FUSE module acts as an agent of the daemon process.
- the file system further includes an update data layout module, which is connected to the daemon module, the I/O tracer, the area division module, and the hybrid storage system, respectively, and is used to dynamically detect and update area changes.
- the calculation formula of the replication time is: $T_c(r, h, s) \approx 3(mh + ns)\,t_c$, where $t_c$ is the unit data replication time from kernel space to user space, $h$ is the stripe size on HServer, $s$ is the stripe size on SServer, $m$ is the number of HServers, and $n$ is the number of SServers;
- the area division module is used to obtain the minimum cost of dividing $l$ events into $k$ areas starting from event $i$
- the invention also provides a data layout method, which includes:
- Step S1: collect the runtime I/O information of data access and the file system configuration file used for cost modeling into a trace file; the file system configuration file is used to establish the cost model, and the I/O information is used for area division
- Step S2 calculate or estimate the access cost of the file request to form a cost model
- Step S3 generate a distribution area with a minimum total cost according to the cost model, and divide the file into different areas;
- FIG. 1 is a schematic diagram of a data layout scheme using fixed-size strips in the prior art
- Figure 2 is a schematic diagram of the data layout scheme based on area division
- Figure 3 is a schematic diagram of a file system based on regional data layout
- FIG. 5 is an application example diagram of a file system in an embodiment of the present invention.
- the regional scheme in RLFS is a more fine-grained and more adaptive data layout scheme than the traditional data layout, and corresponds to different stripe sizes in all storage servers. Therefore, the regional scheme in RLFS can be seen as a variant of the 1-DH layout scheme. RLFS can aggregate the bandwidth of all storage servers to maximize I/O performance. RLFS matches the hybrid storage system 100 very well.
- RLFS aims to support area-based data layout by using file strips of different sizes.
- RLFS uses a partitioned processing method to achieve the optimal data layout.
- a cost model is generated in RLFS. According to the cost model, RLFS divides a large file into a set of regions, and each region stores its own strip size separately. When the total cost of all I/O requests of the application is minimized, the optimal regions and their stripe sizes are obtained.
- the storage system involved in this embodiment is a hybrid storage system 100.
- the hybrid storage system 100 includes solid-state drive-based servers 102, referred to as SServers, and hard disk drive-based servers 101, referred to as HServers.
- An embodiment of the present invention provides a file system, which is called a region-level file system, that is, Region Level File System, or RLFS for short.
- the file system can support regional data layout and solve the data distribution problem in the existing parallel file system.
- RLFS relies on a defined cost model and a heterogeneous sensing scheme based on each region to determine the optimal file stripe size for each server, and further uses the changed access mode to adjust the regional scheme at runtime.
- RLFS is storage system-aware and application-aware. RLFS essentially represents a change from the traditional one-dimensional layout with a fixed stripe size to a two-dimensional layout with varying stripe sizes, and it adapts well to variations in server performance and application behavior. In addition, RLFS updates the generated data layout scheme when it detects a change in the access mode, which solves the static data layout problem and makes the layout better suited to file access at runtime.
- RLFS consists of a kernel part and a user-level daemon module 20; the kernel part includes the FUSE module 10.
- the file system RLFS provided by the embodiments of the present invention is preferably designed based on the FUSE framework.
- FUSE refers to the user space file system, which is an abbreviation of Filesystem in Userspace.
- the kernel part is preferably a Linux kernel module. It further includes a VFS module 11; VFS is an abbreviation of Virtual File System.
- the VFS module 11 is used to register RLFS.
- a block device is created in the kernel part. The block device acts as an interface between the daemon process module 20 and the kernel part.
- the FUSE module 10 acts as an agent of the daemon process module 20 for the various file system requests issued by the application.
- an application program on client 1 can access RLFS by mounting RLFS into its namespace; thereafter, all file system calls directed to the mount point are forwarded to the FUSE module 10 through the VFS module 11. The FUSE module 10 then relays the calls in its request queue to the daemon module 20 through the block device, and the daemon invokes an appropriate service handler for each file system call, contacting the metadata server 200 and/or the storage servers as needed.
- the response propagates through the kernel part along the reverse path and eventually propagates back to the application.
- the application is usually in a waiting state after making a request, waiting for a response.
- the RLFS daemon and storage server should complete all PFS semantics.
- the read handler should first identify which storage servers store the requested data segments, and then issue sub-requests to these servers for parallel access.
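The read path described above can be sketched as follows. This is a minimal illustration, not the RLFS handler itself: it assumes that, within one region, data is striped round-robin over the HServers first (stripe size `h`) and then over the SServers (stripe size `s`); the function name `split_request` and the server labels `H0`, `S0`, ... are hypothetical.

```python
from typing import List, Tuple


def split_request(offset: int, length: int, m: int, n: int,
                  h: int, s: int) -> List[Tuple[str, int, int]]:
    """Map a byte range inside one region onto per-server sub-requests.

    Returns (server, offset_on_server, length) tuples in file order.
    Assumed layout: each round places h bytes on each of the m HServers,
    then s bytes on each of the n SServers.
    """
    sizes = [h] * m + [s] * n
    names = [f"H{i}" for i in range(m)] + [f"S{j}" for j in range(n)]
    row = sum(sizes)  # bytes covered by one full round over all servers
    if row <= 0:
        raise ValueError("at least one server needs a positive stripe size")

    subs: List[Tuple[str, int, int]] = []
    pos, end = offset, offset + length
    while pos < end:
        row_idx, in_row = divmod(pos, row)
        start = 0
        for name, size in zip(names, sizes):
            if in_row < start + size:
                within = in_row - start
                take = min(size - within, end - pos)
                # Each server stores its stripes back to back, one per round.
                subs.append((name, row_idx * size + within, take))
                pos += take
                break
            start += size
    return subs


if __name__ == "__main__":
    # One HServer with 32-unit stripes and one SServer with 160-unit stripes.
    for sub in split_request(16, 32, m=1, n=1, h=32, s=160):
        print(sub)
```

Each returned tuple corresponds to one sub-request that could be issued to its server in parallel with the others.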
- the kernel part also includes a file log module 12 for recording operation logs for the metadata server 200.
- in addition to the general semantics of a PFS, RLFS also needs to implement region-based data layout functions. To achieve this goal, RLFS is equipped with three user-level components: an I/O tracer 3, a cost calculation module 4 and an area division module 5. Through these three components RLFS completes a three-phase data layout cycle. The cycle starts with the tracking phase, during which the I/O tracer 3 collects the runtime statistics of data access and the file system profile used for cost modeling (for example, FUSE queue information) into the trace file while the application executes.
- RLFS can greatly improve the I/O performance of the application in subsequent operations.
- RLFS also includes an update data layout module 8, which is connected to the daemon module 20, the I/O tracer 3, the area division module 5, and the hybrid storage system 100, respectively.
- the update data layout module 8 is used to dynamically update the data layout, that is, to dynamically detect and update area changes. The specific functions of the I/O tracer 3, the cost calculation module 4 and the area division module 5 are explained separately below:
- I/O tracer 3 is used in RLFS to collect both runtime I/O information and file system configuration files.
- the file system provided by the embodiment of the present invention is designed based on the FUSE framework, so existing I/O data collection tools such as IOSIG [42] cannot be directly applied to RLFS; this is determined by the inherent characteristics of the FUSE framework structure. The I/O tracer 3 in this embodiment therefore follows the N-1 log mode, in which all RLFS daemons write to a single shared file. In this way the I/O tracer 3 can collect all information about I/O operations, including file access type, operation time, and other process-related data.
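The N-1 log mode can be illustrated with a small sketch in which every writer appends one record per I/O operation to the same trace file. The record fields, the JSON-lines format and the function names are assumptions made for illustration; the source does not specify the trace format. The sketch relies on append-mode writes (`O_APPEND`), which on a local POSIX file system keep small records from different processes from overwriting each other; a shared or network file system may need explicit locking.

```python
import json
import os
import time


def append_trace(path, op, offset, size, pid=None):
    """Append one I/O record to the shared trace file (N writers, 1 file)."""
    record = {
        "ts": time.time(),                                # operation time
        "pid": os.getpid() if pid is None else pid,       # issuing process
        "op": op,                                         # access type
        "offset": offset,
        "size": size,
    }
    line = (json.dumps(record) + "\n").encode("utf-8")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, line)  # a single write call per record
    finally:
        os.close(fd)


def load_trace(path):
    """Read all records back for the later area division phase."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A later phase could then sort the loaded records by offset or timestamp before feeding them to the area division step.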
- the cost calculation module 4 can generate a cost model, and the cost model aims to find the minimum total cost.
- the file system proposed in the embodiment of the present invention needs to rely on the cost calculation module 4.
- the cost is defined as the I/O completion time of each file request.
- the cost calculation module 4 is used to calculate the cost of file request access in the file system.
- the file system is compatible with the hybrid storage system.
- the cost calculation module 4 should include the system cost of the file system and the network and storage costs.
- the cost calculation module 4 includes a system cost calculation module 41 and a network and storage cost calculation module 42.
- the system cost of the file system mainly refers to the time overhead in the FUSE data path. Since the main goal of RLFS is to optimize read requests through the optimal placement of file data on the hybrid storage system 100, only the system cost related to read requests is defined in this embodiment; the cost of write requests can be derived in the same way with the same parameters.
- the service time is divided into three parts: the first is the waiting time in the FUSE module 10 queue; the second is the time of the two context switches between the FUSE module 10 and the daemon module 20; the third is the time of the three copy operations.
- the time a read request waits in the FUSE module 10 queue is closely related to the application running between the client 1 and RLFS.
- this waiting time depends not only on the I/O request mode of the application but also on other factors caused by the file system, such as page caching or interrupts. Therefore, it is difficult to estimate accurately.
- the queue waiting time $T_q$ is therefore taken to be 0.
- the time of the three copy operations is collectively referred to as the replication time $T_c$. $T_c$ is proportional to the data size $r$ of the requested file, and is calculated as:
- the file request data size is $r$, and the calculation formula for the file request data size $r$ is:
- $s_m$ and $s_n$ represent the maximum sub-request size on HServer and the maximum sub-request size on SServer, with $s_m \le h$ and $s_n \le s$, where $h$ is the stripe size on HServer and $s$ is the stripe size on SServer. The replication time $T_c$ can therefore be expressed as: $T_c(r, h, s) \approx 3(mh + ns)\,t_c$
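The intermediate equations for $r$ and $T_c$ did not survive in this text. As a hedged reconstruction only, the chain below is one reading that is consistent with the final expression $T_c(r, h, s) \approx 3(mh + ns)\,t_c$ given in the claims. The forms $T_c = 3\,r\,t_c$ and $r \approx m\,s_m + n\,s_n$, and the reading of the relation between $s_m$, $s_n$ and the stripe sizes as $\le$, are assumptions rather than formulas quoted from the source.

```latex
\begin{align*}
T_c &= 3\, r\, t_c
    && \text{(assumed: three copies of $r$ units of data)}\\
r &\approx m\, s_m + n\, s_n
    && \text{(assumed: sub-requests on $m$ HServers and $n$ SServers)}\\
T_c &\approx 3\,(m\, s_m + n\, s_n)\, t_c \;\le\; 3\,(m h + n s)\, t_c
    && \text{(using $s_m \le h$ and $s_n \le s$)}
\end{align*}
```

Under this reading the bound becomes an equality when every sub-request fills a whole stripe, which would explain why the claims state the result as an approximation in terms of $h$ and $s$.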
- the network and storage cost calculated by the network and storage cost calculation module 42 comprises the network connection time $T_e$, the storage access time $T_a$ and the network transmission time $T_x$.
- PFS requests are divided into a set of subtasks, and each subtask is forwarded to a separate storage server for parallel execution. Therefore, the cost of request subcomponents in the network and storage server is determined by the maximum cost of all subrequests.
- the network transmission time $T_x$ is determined by the data sizes ($s_m$ and $s_n$) and the unit network data transmission time $t$. The specific formula is:
- $s_m$ and $s_n$ represent the maximum sub-request size on HServer and the maximum sub-request size on SServer, respectively.
- the storage access time T a is determined by the sub-request.
- the specific formula is:
- s m and s n represent the maximum sub-request size on the HServer and the maximum sub-request size on the SServer, respectively.
- t h and t s represent the unit data transmission time on the HServer and the unit data transmission time on the SServer.
- the network and storage cost $T_2$ calculated by the network and storage cost calculation module 42 can be expressed by the formula: $T_2 \approx T_e + \max\{h(t_h + t),\ s(t_s + t)\}$
- $h$ indicates the stripe size on the HServer
- $s$ indicates the stripe size on the SServer.
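The read-cost terms can be combined numerically. The sketch below is a minimal illustration that assumes the total cost $T = T_s + T_c + T_2$ and the two approximations $T_c \approx 3(mh + ns)t_c$ and $T_2 \approx T_e + \max\{h(t_h + t), s(t_s + t)\}$ stated in the claims. The function and parameter names are hypothetical, and the numbers in the usage lines are made up for illustration, not measured values.

```python
def replication_time(m, n, h, s, t_c):
    """T_c ~ 3 * (m*h + n*s) * t_c: three copies of one full stripe round."""
    return 3 * (m * h + n * s) * t_c


def network_storage_cost(h, s, t, t_h, t_s, T_e):
    """T_2 ~ T_e + max(h*(t_h + t), s*(t_s + t)).

    Sub-requests run in parallel, so the slower server class dominates.
    """
    return T_e + max(h * (t_h + t), s * (t_s + t))


def total_cost(m, n, h, s, *, T_s, t_c, t, t_h, t_s, T_e):
    """T = T_s + T_c + T_2 for one request under stripe sizes (h, s)."""
    return (T_s
            + replication_time(m, n, h, s, t_c)
            + network_storage_cost(h, s, t, t_h, t_s, T_e))


if __name__ == "__main__":
    # Illustrative parameters only (arbitrary time units per data unit).
    params = dict(T_s=2, t_c=1, t=1, t_h=3, t_s=1, T_e=5)
    for h, s in [(8, 8), (4, 8), (4, 12)]:
        print((h, s), total_cost(2, 1, h, s, **params))
```

Evaluating the function over candidate pairs `(h, s)` is one simple way to see how a larger stripe on the faster SServers shifts the balance between the two terms inside the `max`.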
- a write request involves more operations than a read request: for writes, the time $T_s$ of the two context switches between the FUSE module 10 and the daemon module 20 also needs to include the time for write amplification, garbage collection and wear leveling.
- the area division module 5 can divide the file into different areas, trying to minimize the total cost of a given access set featuring parallel applications.
- an existing area division scheme is HARL, which handles area division and stripe size determination in two separate stages.
- the layout strategy of RLFS, by contrast, is a unified method that considers area division and stripe size determination together, so RLFS can determine both in one pass.
- RLFS does not scan trace files in a heuristic way to find logical regions like HARL, but puts logical regions and physical blocks together with the goal of minimizing the total cost. This consideration is easy to understand because the smallest unit of file access is a block, such as 64MB or 128MB, and the logical area can naturally span a sequence of adjacent physical blocks.
- the first algorithm can be executed in the area division module 5.
- the first algorithm is an offline and relatively fast algorithm.
- the first algorithm can be repeated periodically to adapt to the dynamic characteristics of the access. "Relatively fast" means that the algorithm runs in pseudo-polynomial time.
- the essence of the algorithm is to first represent the shared file as a sequence of blocks, then partition the file at block granularity according to the given access requests, and finally use dynamic programming to find the optimal area division among these partitions.
- within each area the data is striped across the HServers and SServers, and a logical I/O request can be served by one or more physical requests for the requested data.
- the total access cost is minimized according to the defined cost model compared to traditional strategies.
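The dynamic programming step can be sketched in a generic form. The recurrence used by the area division module is not reproduced in this text, so the code below is a standard interval-partition dynamic program, not the patented algorithm: it assumes a caller-supplied function `region_cost(i, j)` that returns the cost of serving blocks `i` to `j - 1` as one region under that region's best stripe sizes. All names are hypothetical.

```python
from typing import Callable, List, Tuple


def divide_regions(n_blocks: int, max_regions: int,
                   region_cost: Callable[[int, int], float]
                   ) -> Tuple[float, List[Tuple[int, int]]]:
    """Split blocks [0, n_blocks) into at most max_regions contiguous
    regions so that the summed region cost is minimal."""
    if n_blocks < 1 or max_regions < 1:
        raise ValueError("need at least one block and one region")
    INF = float("inf")
    # best[k][j]: minimum cost of covering blocks [0, j) with exactly k regions
    best = [[INF] * (n_blocks + 1) for _ in range(max_regions + 1)]
    cut = [[-1] * (n_blocks + 1) for _ in range(max_regions + 1)]
    best[0][0] = 0.0
    for k in range(1, max_regions + 1):
        for j in range(k, n_blocks + 1):
            for i in range(k - 1, j):
                if best[k - 1][i] == INF:
                    continue
                c = best[k - 1][i] + region_cost(i, j)
                if c < best[k][j]:
                    best[k][j], cut[k][j] = c, i

    # Pick the region count with the lowest total, then walk the cuts back.
    k = min(range(1, max_regions + 1), key=lambda q: best[q][n_blocks])
    total = best[k][n_blocks]
    regions: List[Tuple[int, int]] = []
    j = n_blocks
    while k > 0:
        i = cut[k][j]
        regions.append((i, j))
        j, k = i, k - 1
    regions.reverse()
    return total, regions


if __name__ == "__main__":
    # Toy cost: fixed overhead per region plus a penalty for mixing blocks
    # with very different request sizes in the same region.
    loads = [1, 1, 1, 9, 9, 9]

    def toy_cost(i: int, j: int) -> float:
        window = loads[i:j]
        return 1 + (j - i) * (max(window) - min(window))

    print(divide_regions(len(loads), 3, toy_cost))
```

The table has `max_regions × n_blocks` entries and each entry scans up to `n_blocks` cut points, so the sketch makes on the order of `max_regions × n_blocks²` calls to `region_cost`.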
- an area is defined together with its size, which can be expressed as:
- $\alpha \ge 1$ is the expansion factor of SServer relative to HServer
- B represents the block size in the configuration.
- FIG. 5 is an application example diagram of a file system in an embodiment of the present invention.
- the file client 1 issues a request on behalf of the application program from the computing server 301
- the hybrid storage system 100 is responsible for storing and managing the stripped area
- the metadata server 200 contains the descriptions of the files stored in RLFS.
- the client 1 first contacts the MDS to obtain file metadata, and then uses it to perform data access with the hybrid storage system 100 through the RLFS daemon.
- a file server in the parallel file system is used to measure, in read/write mode, the context switching time, the unit data copy time and the unit data transfer time of HServer and SServer. These parameters can vary with the I/O mode.
- a pair of nodes, a client node and a file server are used to estimate the network transmission time. The network transmission time test can be repeated thousands of times, and then the average of them is calculated as the parameter value of the generated cost model.
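The repeat-and-average procedure can be sketched as follows. This is only an illustration of the measurement idea: `operation` stands in for whatever transfer or copy is being timed, and the helper name `estimate_unit_time` is hypothetical.

```python
import statistics
import time


def estimate_unit_time(operation, data_size, repeats=1000):
    """Run operation(data_size) repeatedly and return the mean time
    per unit of data, for use as a cost-model parameter."""
    if data_size <= 0 or repeats <= 0:
        raise ValueError("data_size and repeats must be positive")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation(data_size)
        samples.append((time.perf_counter() - start) / data_size)
    return statistics.mean(samples)


if __name__ == "__main__":
    # Stand-in workload: allocate a zero-filled buffer of the given size.
    print(estimate_unit_time(lambda size: bytes(size), 64 * 1024, repeats=100))
```

Averaging over many repetitions smooths out transient effects such as caching or scheduling noise in individual measurements.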
- to perform the optimal data layout for a specific file, RLFS first uses its area division module 5 to calculate the optimal area division of the file, and then uses the cost model and the I/O trace data to determine the stripe size of each area.
- once the optimal area information has been calculated, the file is written to the servers accordingly, and at the same time the area division module 5 creates an RST in the MDS for subsequent reads of the file.
- MDS holds the RLFS namespace, RST, and other information about each file.
- the size of the data held by the MDS is tightly controlled and kept small.
- the hybrid storage system 100 maintains a flat namespace, where each file can be identified by "filename_region#_stripe#" in the local disk.
- filename can contain path information specified by the application.
- the background I/O daemon receives incoming requests from client 1, each characterized by "filename", "region#" and "stripe#", and serves a request by sending back the requested stripe file. Stripe files are then combined with other stripe files to meet the needs of the application.
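The flat naming scheme can be illustrated with a pair of helpers. The source gives only the pattern "filename_region#_stripe#"; the sketch assumes that "region#" and "stripe#" are decimal indices, and the function names are hypothetical.

```python
def stripe_file_name(filename: str, region: int, stripe: int) -> str:
    """Build the local name of one stripe file on a storage server."""
    return f"{filename}_{region}_{stripe}"


def parse_stripe_file_name(name: str):
    """Recover (filename, region, stripe) from a stripe file name.

    Splitting from the right keeps underscores and path separators
    inside the application-supplied filename intact.
    """
    filename, region, stripe = name.rsplit("_", 2)
    return filename, int(region), int(stripe)
```

Splitting from the right matters because the filename part may itself contain path information and underscores.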
- the file system (RLFS) proposed by the present invention supports region-level data layout by dividing a file into a set of optimal regions and determining the stripe size of each region. The file system can therefore optimize the data layout of the hybrid storage system 100.
- using the FUSE module 10 not only greatly simplifies the development work but also allows applications to access RLFS transparently through the standard file system interface. The variable-size stripes of RLFS alleviate load imbalance and adapt flexibly to workload changes and server heterogeneity, thereby significantly improving I/O system performance.
- RLFS uses the optimal data layouts of {32KB, 160KB} and {36KB, 148KB}, respectively, which improve I/O performance by 73.4% and 176.7% compared to the default layout with 64KB fixed-size stripes. Compared with other layouts with different but fixed-size stripes, RLFS improves read performance to 138.6% and write performance to 177.6%. Compared with the randomly selected stripe strategy, RLFS improves read performance to 154.5% and write performance to 215.4%.
- the invention also provides a data layout method, which includes:
- Step S1: collect the runtime I/O information of data access and the file system configuration file used for cost modeling into a trace file; the file system configuration file is used to establish the cost model, and the I/O information is used for area division
- Step S2 calculate or estimate the access cost of the file request to form a cost model
- Step S3 generate a distribution area with a minimum total cost according to the cost model, and divide the file into different areas;
- Step S4 Obtain the stripe size corresponding to the area.
- This method supports regional data layout by dividing the file into a set of optimal regions and determining the stripe size of each region, and thereby optimizes the data layout of the hybrid storage system.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (9)
- A file system, characterized in that the file system includes an I/O tracer, a cost calculation module and an area division module that are electrically connected to each other; the I/O tracer is used to provide the area division module with the I/O information it collects while the file system is running; the I/O tracer is also used to provide the cost calculation module with the configuration file of the file system that it collects; the cost calculation module is used to calculate or estimate the access cost of file requests in the file system and to output a cost model to the area division module; the area division module is used to generate, according to the cost model, a distribution of areas that minimizes the total cost and to divide the file into the different areas, and is also used to obtain the stripe size corresponding to each area.
- The file system according to claim 1, characterized in that the file system further includes a kernel part, which is used to carry out the exchange of information or data among the metadata server, the hybrid storage system and the client; the kernel part includes a FUSE module.
- The file system according to claim 2, characterized in that the file system further includes a daemon process module, which is used to execute the daemon process in the background; the FUSE module is used as an agent of the daemon process.
- The file system according to claim 3, characterized in that the file system further includes an update data layout module, which is connected to the daemon module, the I/O tracer, the area division module and the hybrid storage system, respectively, and is used to dynamically detect and update area changes.
- The file system according to claim 4, characterized in that the cost calculation module is used to calculate the total cost of a request, with the total cost given by $T = T_s + T_c + T_2$, where $T_s$ is the time of the two context switches between the FUSE module and the daemon module, $T_c$ is the replication time, and $T_2$ is the network and storage cost.
- The file system according to claim 5, characterized in that the hybrid storage system includes solid-state drive-based servers (SServer) and hard disk drive-based servers (HServer); the replication time is calculated as $T_c(r, h, s) \approx 3(mh + ns)\,t_c$, where $t_c$ is the unit data replication time from kernel space to user space, $h$ is the stripe size on HServer, $s$ is the stripe size on SServer, $m$ is the number of HServers, and $n$ is the number of SServers; the network and storage cost is calculated as $T_2 \approx T_e + \max\{h(t_h + t),\ s(t_s + t)\}$, where $t$ is the unit network data transmission time, $t_h$ and $t_s$ are the unit data transmission times on HServer and SServer, respectively, and $T_e$ is the network connection time.
- The file system according to claim 3, 5 or 6, characterized in that the area division module is used to obtain the minimum cost of dividing $l$ events into $k$ areas starting from event $i$; the minimum cost is calculated as:
- The file system according to claim 7, characterized in that the solid-state drive-based servers and the hard disk drive-based servers can stripe each area to obtain $h_i$ and $s_i$, respectively; $s_i$ is calculated as $s_i = \alpha h_i$, and $h_i$ is calculated as: ; in the formula, $\alpha \ge 1$ is the expansion factor of SServer relative to HServer, and $B$ is the block size in the configuration.
- A data layout method, characterized in that it includes: Step S1, collecting the runtime I/O information of data access and the file system configuration file used for cost modeling into a trace file, using the file system configuration file to establish the cost model, and using the I/O information for area division; Step S2, calculating or estimating the access cost of file requests to form a cost model; Step S3, generating, according to the cost model, a distribution of areas that minimizes the total cost, and dividing the file into the different areas; Step S4, obtaining the stripe size corresponding to each area.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811547400.9 | 2018-12-18 | ||
CN201811547400.9A CN109840247B (en) | 2018-12-18 | 2018-12-18 | File system and data layout method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020125362A1 true WO2020125362A1 (en) | 2020-06-25 |
Family
ID=66883264
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/121301 WO2020125362A1 (en) | 2018-12-18 | 2019-11-27 | File system and data layout method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109840247B (en) |
WO (1) | WO2020125362A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109840247B (en) * | 2018-12-18 | 2020-12-18 | 深圳先进技术研究院 | File system and data layout method |
CN110825698B (en) * | 2019-11-07 | 2021-02-09 | 重庆紫光华山智安科技有限公司 | Metadata management method and related device |
CN114578299A (en) * | 2021-06-10 | 2022-06-03 | 中国人民解放军63698部队 | Method and system for generating radio frequency signal by wireless remote control beacon device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1692356A (en) * | 2002-11-14 | 2005-11-02 | 易斯龙系统公司 | Systems and methods for restriping files in a distributed file system |
US20090248756A1 (en) * | 2008-03-27 | 2009-10-01 | Akidau Tyler A | Systems and methods for a read only mode for a portion of a storage system |
CN102566942A (en) * | 2011-12-28 | 2012-07-11 | 华为技术有限公司 | File striping writing method, device and system |
CN103778222A (en) * | 2014-01-22 | 2014-05-07 | 浪潮(北京)电子信息产业有限公司 | File storage method and system for distributed file system |
WO2015153671A1 (en) * | 2014-03-31 | 2015-10-08 | Amazon Technologies, Inc. | File storage using variable stripe sizes |
CN109840247A (en) * | 2018-12-18 | 2019-06-04 | 深圳先进技术研究院 | File system and data layout method |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005302152A (en) * | 2004-04-12 | 2005-10-27 | Sony Corp | Composite type storage device, data writing method, and program |
CN105872031B (en) * | 2016-03-26 | 2019-06-14 | 天津书生云科技有限公司 | Storage system |
US9916311B1 (en) * | 2013-12-30 | 2018-03-13 | Emc Corporation | Storage of bursty data using multiple storage tiers with heterogeneous device storage |
CN104020961B (en) * | 2014-05-15 | 2017-07-25 | 深信服科技股份有限公司 | Distributed data storage method, apparatus and system |
JP6346880B2 (en) * | 2014-10-17 | 2018-06-20 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America | recoding media |
CN105760164B (en) * | 2016-02-15 | 2020-01-10 | 苏州浪潮智能科技有限公司 | Method for realizing ACL authority in user space file system |
CN106326344B (en) * | 2016-08-05 | 2018-09-18 | 中国水产科学研究院东海水产研究所 | A kind of method of the management of distributing big data and retrieval |
CN106528761B (en) * | 2016-11-04 | 2019-06-18 | 郑州云海信息技术有限公司 | A kind of file caching method and device |
CN107479827A (en) * | 2017-07-24 | 2017-12-15 | 上海德拓信息技术股份有限公司 | A kind of mixing storage system implementation method based on IO and separated from meta-data |
CN107734026B (en) * | 2017-10-11 | 2020-10-16 | 苏州浪潮智能科技有限公司 | Method, device and equipment for designing network additional storage cluster |
-
2018
- 2018-12-18 CN CN201811547400.9A patent/CN109840247B/en active Active
-
2019
- 2019-11-27 WO PCT/CN2019/121301 patent/WO2020125362A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN109840247B (en) | 2020-12-18 |
CN109840247A (en) | 2019-06-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19899730 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19899730 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 10/11/2021) |