WO2018233332A1 - Distributed storage internal storage management method and system, and computer storage medium - Google Patents

Distributed storage internal storage management method and system, and computer storage medium

Publication number
WO2018233332A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
node
data block
file
version
Prior art date
Application number
PCT/CN2018/079685
Other languages
French (fr)
Chinese (zh)
Inventor
江汛洋
梁松涛
李道兵
许式伟
Original Assignee
上海七牛信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 上海七牛信息技术有限公司
Publication of WO2018233332A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • the present invention relates to the field of storage technologies, and in particular, to a distributed storage memory management method, system, and computer storage medium.
  • the traditional network storage system uses a centralized storage server to store all data.
  • the storage server becomes a bottleneck of system performance, and is also the focus of reliability and security, and cannot meet the needs of large-scale storage applications.
  • the distributed network storage system adopts a scalable system structure, uses multiple storage servers to share the storage load, and uses the location server to locate the storage information, which not only improves the reliability, availability and access efficiency of the system, but also is easy to expand.
  • in a distributed storage system, existing memory management methods cannot efficiently handle read and write operations performed on files by multiple machines.
  • the technical problem to be solved by the present invention is to provide an efficient distributed storage memory management method, system and computer storage medium.
  • a distributed storage memory management method includes:
  • in the memory, the file is mapped to one data block type and divided into one or more data blocks of the same size, and an access context of the data block is generated; the access context includes the data block index and the ID of the file to which the block belongs;
  • when a file is written at the second node and the lease of the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the local storage of the first node, and updates the version.
  • if the data is in the second node, the data is requested from the second node; after the data is obtained, it is cached to the first node and the version is updated.
  • caching the data to the first node and updating the version further includes:
  • if the data version is lower than the global version, the latest version of the data is requested and then cached to the first node.
  • the method further includes: setting a working data block chain and an idle data block chain in the memory management module;
  • a data block is taken from the free data block chain and placed in the working data block chain.
  • the data blocks in the free data block chain are released at a preset period and at a preset ratio.
  • the operating on the data block includes:
  • the method further includes: reading only some of the file's data blocks into memory for operation.
  • a distributed storage memory management system comprising:
  • a memory management module for dividing the memory management module into a plurality of data block types of different sizes;
  • a writing module configured to write a file at the first node and set a lease on the file, so that the file is bound to the first node;
  • a memory processing module configured to map the file, in memory, to one data block type, divide the file into one or more data blocks of the same size, and generate a data block access context, where the access context includes the data block index and the ID of the file to which the block belongs;
  • a storage module configured to store the data block and the access context in the first node and update the version;
  • the writing module is further configured so that, when a file is written at the second node and the lease of the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the local storage of the first node, and updates the version.
  • the system further includes:
  • a reading module configured to read data from one or more data blocks at the first node; to read the data directly if it is in the first node; and, if the data is in the second node, to request it from the second node, cache it to the first node, and update the version.
  • the reading module is further configured to compare, after requesting the data, the locally stored version of the data with the global version; to load the data directly if the local version is the latest; and, if the local version is lower than the global version, to request the latest version of the data and cache it to the first node.
  • the memory management module includes a working data block chain and a free data block chain, and is further configured to operate on the data block and insert the data block into the working data block chain or the free data block chain according to its state;
  • the memory management module is further configured to release a data block in the working data block chain and then insert it into the free data block chain, and to take a data block from the free data block chain and put it into the working data block chain.
  • the memory management module is further configured to release the data blocks in the free data block chain at a preset period and at a preset ratio.
  • the memory processing module is further configured to perform, on a single data block of the file in memory, at least one of the following operations: write, read, release, flush to disk, synchronize to the network, delete, update file size, and version invalidation.
  • a computer storage medium storing a program, the program performing the steps of any of the above.
  • in the distributed storage, a single file is stored in memory as one or more data blocks of the same size; a file is written at the first node and a lease is set on the file, so that the file is bound to the first node.
  • the memory management module provides data block types of several sizes, and a single file is mapped to the data block type of one size; an access context of the data block is generated, which includes the data block index and the ID of the file to which the block belongs; the data block and the access context are stored in the first node and the version is updated.
  • when a file is written at the second node and its lease is already held by the first node, the first node receives the file forwarded from the second node, writes it to the local storage of the first node, and updates the version.
  • data is therefore written on a per-data-block basis, and because writing a file sets a lease that binds the file to the first node, writes of that file from other nodes are all merged into the first node. The file can thus support random writes without conflict, and concurrent out-of-order writes are handled easily.
  • FIG. 1 is a flowchart of a distributed storage memory management method according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a method for requesting and releasing data blocks in an internal memory according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a method for writing a file according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a processing method when data is written to node A but the lease is held by node B according to an embodiment of the present invention
  • FIG. 5 is a schematic diagram of a method for reading data from a node according to an embodiment of the present invention.
  • FIG. 6 is a block diagram of a distributed storage memory management system in accordance with an embodiment of the present invention.
  • a distributed storage memory management method includes steps S110-S150, as follows:
  • S110: Divide the memory management module into a plurality of data block types of different sizes;
  • S120: Write a file at the first node, and set a lease on the file, so that the file is bound to the first node;
  • S130: In the memory, map the file to one data block type, divide the file into one or more data blocks of the same size, and generate an access context of the data block, where the access context includes the data block index and the ID of the file to which the block belongs;
  • S140: Store the data block and the access context in the first node and update the version;
  • S150: When a file is written at the second node and the lease of the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the local storage of the first node, and updates the version.
  • the writing of data is a data-block-based operation, and because writing a file sets a lease that binds the file to the first node, writes of that file from other nodes are merged into the first node, so the file supports random writes without conflict and concurrent write operations are handled easily. Consistency is achieved by using the lease so that all writes to the same file land on one machine, and the version number is used to implement read caching across the network.
  • the method further includes:
  • if the data is in the second node, the data is requested from the second node; after the data is obtained, it is cached to the first node and the version is updated.
  • caching the data to the first node and updating the version further includes:
  • if the data version is lower than the global version, the latest version of the data is requested and then cached to the first node.
  • the reading of data in the distributed storage is likewise a data-block-based operation: if the node reading the file holds the lease, that is, the data is stored on that node, the data is read directly; if the data is on another node, it is requested from that node, which keeps the data consistent and supports out-of-order reads.
  • a version check is also applied when reading data, so that the data read is the latest. On top of supporting random writes, the file can therefore also support random reads.
  • the method further includes:
  • a data block is taken from the free data block chain and placed in the working data block chain.
  • the data blocks are operated on individually, which is more convenient and flexible.
  • the data blocks are added to the working data block chain and the free data block chain and can be moved between the two chains, which makes it easy to manage and control multiple data blocks and files and makes overall memory management more efficient.
  • the method further includes: setting a different number of blocks for each of the data block types, which facilitates efficient and reasonable allocation of memory resources.
  • the method further includes: releasing the data blocks in the idle data block chain by a preset period and at a preset ratio.
  • the memory management module releases the data blocks in the free data block chain at a preset period and at a preset ratio, so that a large amount of memory is not occupied indefinitely.
  • the preset period can be set automatically by the system or set by the user, and the same applies to the preset ratio.
  • the operation on the data block includes: performing at least one of writing, reading, releasing, flushing to disk, synchronizing to the network, deleting, updating the file size, and version invalidation on a single data block of the file in memory.
  • a single block of data in memory can be manipulated in the following order: write, read, release, flush to disk, synchronize to the network, delete, update file size, version invalidation, and so on.
  • the method further includes: reading only some of the file's data blocks into memory for operation, which improves efficiency.
  • node roles are divided into node and mds (metadata service): a node is responsible for the actual reading and writing, and the mds node is responsible for metadata management and for coordinating the nodes.
  • the method for requesting and releasing data blocks in memory is as follows: the read module, the write module, cache reclamation, data synchronization to other nodes, read-ahead from other nodes, and similar operations generate a data block access context, which includes the data block index (guide) and the file ID; a memory data block request is then made; the LRU (Least Recently Used) algorithm, combined with the access context, then triggers the release of data blocks from the working data block chain.
  • a released data block is put into the free data block chain; memory data block requests draw blocks from the free data block chain and place them in the working data block chain; the working data block chain also flushes data to physical storage as required; and the working data block chain and the free data block chain trigger block release periodically.
  • the writing of the file specifically includes:
  • the node requests a write lease for the file from the mds node.
  • the node divides the file into data block A and data block B; more data blocks may be included, and this embodiment uses two data blocks only as an example without limiting their number. Multiple blocks are created in memory and the data blocks are written into different blocks. If the content of a data block is not in local storage, the data block is loaded from another node according to the reading process.
  • data block A and data block B are flushed to disk asynchronously and updated according to policy.
  • node A requests a write lease for the file from the mds node, but the lease is already held by node B.
  • node A divides the file into data block A and data block B; more data blocks may be included, and this embodiment uses two data blocks only as an example without limiting their number.
  • the write content of data block A and data block B is forwarded to node B.
  • node B then writes the data locally and updates the file version.
  • the method for reading data from a node includes:
  • data is read at node A, from data block A and data block B respectively.
  • more data blocks may be included; this embodiment uses two data blocks only as an example and does not limit their number.
  • the locally stored version of the file is compared with the global version. If the local version is the latest, the file is loaded directly from local storage. If the local version is lower than the global version, the latest version is requested from node B; after the file is obtained from node B, the data is cached locally and the version number is recorded.
  • if the data block is not stored locally, the file is requested from node B, the data is cached locally, and the version number is recorded.
  • Another preferred embodiment of the present invention is a distributed storage memory management system including a memory management module 210, a write module 220, a memory processing module 230, and a storage module 240.
  • the memory management module 210 is configured to divide the memory management module into a plurality of data block types of different sizes.
  • the writing module 220 is configured to write a file at the first node, and set a lease for the file, so that the file is bound to the first node.
  • the memory processing module 230 is configured to map the file, in memory, to one data block type, divide the file into one or more data blocks of the same size, and generate a data block access context, where the access context includes the data block index and the ID of the file to which the block belongs.
  • the storage module 240 is configured to store the data block and the access context in the first node and update the version.
  • the writing module is further configured so that, when a file is written at the second node and the lease of the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the local storage of the first node, and updates the version.
  • the writing of data is a data-block-based operation, and because writing a file sets a lease that binds the file to the first node, writes of that file from other nodes are merged into the first node, so the file supports random writes without conflict and concurrent write operations are handled easily. Consistency is achieved by using the lease so that all writes to the same file land on one machine, and the version number is used to implement read caching across the network.
  • the system further includes a reading module configured to read data from the one or more data blocks at the first node; to read the data directly if it is in the first node; and, if the data is in the second node, to request it from the second node, cache it to the first node, and update the version.
  • the reading module is further configured to compare, after requesting the data, the locally stored version of the data with the global version; to load the data directly if the local version is the latest; and, if the local version is lower than the global version, to request the latest version of the data and cache it to the first node.
  • the reading of data in the distributed storage is likewise a data-block-based operation: if the node reading the file holds the lease, that is, the data is stored on that node, the data is read directly; if the data is on another node, it is requested from that node, which keeps the data consistent and supports out-of-order reads.
  • a version check is also applied when reading data, so that the data read is the latest. On top of supporting random writes, the file can therefore also support random reads.
  • the memory processing module 230 includes a working data block chain and a free data block chain, and is further configured to operate on the data block and insert it into the working data block chain or the free data block chain according to its state; the memory management module is further configured to release a data block in the working data block chain and then insert it into the free data block chain, and to take a data block from the free data block chain and put it into the working data block chain.
  • the data blocks are operated on individually, which is more convenient and flexible.
  • the data blocks are added to the working data block chain and the free data block chain and can be moved between the two chains, which makes it easy to manage and control multiple data blocks and files and makes overall memory management more efficient.
  • the memory processing module is further configured to set a different number of blocks for each of the data block types, which facilitates efficient and reasonable allocation of resources.
  • the memory processing module is further configured to release the data blocks in the idle data block chain by a preset period and at a preset ratio.
  • the memory processing module releases the data blocks in the idle data block chain according to a preset period and according to a preset ratio, so as to avoid occupying a large amount of memory all the time.
  • the preset period may be set automatically by the system or set by the user, and the same applies to the preset ratio.
  • the memory processing module is further configured to perform, on a single data block of the file in memory, at least one of the following operations: write, read, release, flush to disk, synchronize to the network, delete, update file size, and version invalidation.
  • a single block of data in memory can be manipulated in the following order: write, read, release, flush to disk, synchronize to the network, delete, update file size, version invalidation, and so on.
  • Another preferred embodiment of the present invention is a computer storage medium having a program stored thereon, the program execution comprising the steps of any of the above embodiments.
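The embodiment of FIG. 2 combines the LRU (Least Recently Used) algorithm with the access context to trigger the release of blocks from the working data block chain. A minimal Python sketch of that idea follows. The class name `LruWorkingChain`, the `capacity` limit and the `touch` method are hypothetical, and flushing a dirty block to physical storage before release is not modelled.

```python
from collections import OrderedDict


class LruWorkingChain:
    """Working chain ordered from least to most recently used (assumed design)."""

    def __init__(self, capacity):
        self.capacity = capacity      # assumed limit on working blocks
        self.working = OrderedDict()  # (file_id, block_index) -> payload
        self.free = []                # released blocks waiting for reuse

    def touch(self, file_id, block_index, payload=b""):
        """Record an access; return the keys of blocks released by LRU."""
        key = (file_id, block_index)  # the access context identifies the block
        if key in self.working:
            self.working.move_to_end(key)  # now the most recently used
        else:
            self.working[key] = payload
        released = []
        while len(self.working) > self.capacity:
            old_key, old_payload = self.working.popitem(last=False)
            self.free.append(old_payload)  # released block joins the free chain
            released.append(old_key)
        return released
```

With a capacity of two, touching blocks 0, 1, 0 and then 2 of one file releases block 1, because block 0 was used more recently.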

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A distributed storage internal storage management method and system, and a computer storage medium, the method comprising: dividing an internal storage management module into a plurality of different size data block types (S110); writing a file to a first node, and setting a lease for the file, such that the file is bound to the first node (S120); in the internal storage, corresponding the file to a data block type, dividing the file in the internal storage into one or a plurality of data blocks of the same size, and simultaneously generating data block access context, the data block access context comprising data block navigation and the associated file coding (S130); storing the data blocks and the access context in the first node and updating the version (S140); when a file is written to the second node and the lease of the file is already held by the first node, then the first node receives the file forwarded by the second node, writes the file to the local storage of the first node, and updates the version (S150). The data writing is a data block-based operation, such that the file can support random writing in the distributed storage.

Description

Distributed storage memory management method, system and computer storage medium
Technical field
The present invention relates to the field of storage technologies, and in particular to a distributed storage memory management method, system, and computer storage medium.
Background
A distributed storage system stores data dispersed across multiple independent devices. A traditional network storage system uses a centralized storage server to hold all data; the storage server becomes the bottleneck of system performance and the focal point for reliability and security, and cannot meet the needs of large-scale storage applications. A distributed network storage system adopts a scalable architecture, uses multiple storage servers to share the storage load, and uses a location server to locate stored information, which improves the reliability, availability and access efficiency of the system and also makes it easy to expand.
In a distributed storage system, existing memory management methods cannot efficiently handle read and write operations performed on files by multiple machines.
Summary of the invention
The technical problem to be solved by the present invention is to provide an efficient distributed storage memory management method, system and computer storage medium.
The object of the present invention is achieved by the following technical solutions:
A distributed storage memory management method includes:
dividing the memory management module into a plurality of data block types of different sizes;
writing a file at the first node, and setting a lease on the file, so that the file is bound to the first node;
in the memory, mapping the file to one data block type, dividing the file into one or more data blocks of the same size, and generating an access context of the data block, where the access context includes the data block index and the ID of the file to which the block belongs;
storing the data block and the access context in the first node and updating the version;
when a file is written at the second node and the lease of the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the local storage of the first node, and updates the version.
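The division step can be illustrated with a minimal Python sketch. The block sizes in `BLOCK_TYPES`, the selection policy in `pick_block_type`, and the names `AccessContext`, `DataBlock` and `split_file` are assumptions made for illustration; the source states only that several block sizes exist and that each block carries an index and the ID of its file.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical block sizes in bytes (4 KiB, 64 KiB, 1 MiB).
BLOCK_TYPES = (4 * 1024, 64 * 1024, 1024 * 1024)


@dataclass(frozen=True)
class AccessContext:
    file_id: str      # ID of the file the block belongs to
    block_index: int  # position of the block within the file


@dataclass
class DataBlock:
    context: AccessContext
    payload: bytes


def pick_block_type(file_size: int) -> int:
    """Assumed policy: smallest type that covers the file in at most 16 blocks."""
    for size in BLOCK_TYPES:
        if file_size <= size * 16:
            return size
    return BLOCK_TYPES[-1]


def split_file(file_id: str, data: bytes) -> List[DataBlock]:
    """Split data into blocks of one type; the last payload may be shorter."""
    block_size = pick_block_type(len(data))
    blocks = []
    for index, start in enumerate(range(0, len(data), block_size)):
        context = AccessContext(file_id=file_id, block_index=index)
        blocks.append(DataBlock(context, data[start:start + block_size]))
    return blocks
```

Under these assumed sizes, a 10 KiB file maps to the 4 KiB type and yields three blocks with indices 0, 1 and 2.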
Further, the method also includes:
reading data from one or more data blocks at the first node;
if the data is in the first node, reading it directly;
if the data is in the second node, requesting the data from the second node, then caching the data to the first node and updating the version.
Further, caching the data to the first node and updating the version after requesting the data also includes:
after requesting the data, comparing the locally stored version of the data with the global version;
if the data version is the latest version, loading the data directly;
if the data version is lower than the global version, requesting the latest version of the data and then caching it to the first node.
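The version check on the read path can be sketched in Python as follows. `NodeCache`, its `read` method and the `fetch_remote` callback are hypothetical names; how the global version is obtained is not specified in this passage and is passed in as a plain argument here.

```python
class NodeCache:
    """Local read cache of one node: file_id -> (version, data)."""

    def __init__(self):
        self._entries = {}

    def read(self, file_id, global_version, fetch_remote):
        """Return the data, refreshing the local copy when it is stale.

        fetch_remote(file_id) is an assumed callback that asks the node
        holding the data and returns a (version, data) pair.
        """
        cached = self._entries.get(file_id)
        if cached is not None and cached[0] >= global_version:
            return cached[1]  # local copy is the latest: load it directly
        version, data = fetch_remote(file_id)  # request the latest version
        self._entries[file_id] = (version, data)  # cache and record version
        return data
```

A second read at the same global version is served locally, so the remote node is contacted only when the file is missing or stale.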
Further, the method also includes: setting a working data block chain and a free data block chain in the memory management module;
operating on the data block, and inserting the data block into the working data block chain or the free data block chain according to its state;
releasing a data block in the working data block chain and then inserting it into the free data block chain;
taking a data block from the free data block chain and putting it into the working data block chain.
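The movement of blocks between the two chains can be sketched in Python as below. The class `BlockPool` and its methods are hypothetical, and allocating a fresh block when the free chain is empty is an assumption that the source does not state.

```python
from collections import deque


class BlockPool:
    """Fixed-size blocks moving between a free chain and a working chain."""

    def __init__(self, block_size, initial_free):
        self.block_size = block_size
        self.free = deque(bytearray(block_size) for _ in range(initial_free))
        self.working = {}  # (file_id, block_index) -> bytearray

    def acquire(self, file_id, block_index):
        """Take a block from the free chain and put it in the working chain."""
        key = (file_id, block_index)
        if key in self.working:
            return self.working[key]
        if self.free:
            block = self.free.popleft()
        else:
            block = bytearray(self.block_size)  # assumed fallback allocation
        self.working[key] = block
        return block

    def release(self, file_id, block_index):
        """Release a working block and insert it into the free chain."""
        block = self.working.pop((file_id, block_index))
        block[:] = bytes(self.block_size)  # clear old contents before reuse
        self.free.append(block)
```

Reusing released blocks rather than allocating new ones each time is the usual motivation for keeping a free chain, although the source does not spell this out.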
Further, the method also includes:
releasing the data blocks in the free data block chain at a preset period and at a preset ratio.
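One way to read "a preset period and a preset ratio" is that, once per period, a fixed fraction of the free chain is dropped so its memory can be reclaimed. The Python sketch below follows that reading. The function names, the rounding down, and the injectable `sleep` parameter are assumptions.

```python
import math
import time


def trim_free_chain(free_chain, ratio):
    """Drop a fraction `ratio` of the free blocks; return how many were dropped."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    count = math.floor(len(free_chain) * ratio)  # assumed: round down
    for _ in range(count):
        free_chain.pop()  # dropping the reference lets the memory be reclaimed
    return count


def run_release_loop(free_chain, ratio, period_seconds, rounds, sleep=time.sleep):
    """Trim the free chain once per period, for a fixed number of rounds."""
    for _ in range(rounds):
        sleep(period_seconds)
        trim_free_chain(free_chain, ratio)
```

With a ratio of 0.5, a free chain of ten blocks shrinks to five after one period, so idle memory decays gradually instead of being released all at once.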
Further, operating on the data block includes:
performing at least one of the following operations on a single data block of the file in memory: write, read, release, flush to disk, synchronize to the network, delete, update file size, and version invalidation.
Further, the method also includes: reading only some of the file's data blocks into memory for operation.
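Because blocks of one file share the same size, the blocks needed for a byte range follow from integer division. A minimal Python sketch, with the hypothetical helper `blocks_for_range`:

```python
def blocks_for_range(offset, length, block_size):
    """Return the indices of the blocks that cover [offset, offset + length)."""
    if offset < 0 or length < 0 or block_size <= 0:
        raise ValueError("offset and length must be >= 0, block_size > 0")
    if length == 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)
```

For example, with 4096-byte blocks, a read of 4000 bytes at offset 5000 touches only blocks 1 and 2, so the remaining blocks of the file need not be in memory.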
A distributed storage memory management system includes:
a memory management module, configured to divide the memory management module into a plurality of data block types of different sizes;
a writing module, configured to write a file at the first node and set a lease on the file, so that the file is bound to the first node;
a memory processing module, configured to map the file, in memory, to one data block type, divide the file into one or more data blocks of the same size, and generate a data block access context, where the access context includes the data block index and the ID of the file to which the block belongs;
a storage module, configured to store the data block and the access context in the first node and update the version;
the writing module is further configured so that, when a file is written at the second node and the lease of the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the local storage of the first node, and updates the version.
Further, the system also includes:
a reading module, configured to read data from one or more data blocks at the first node; to read the data directly if it is in the first node; and, if the data is in the second node, to request it from the second node, cache it to the first node, and update the version.
Further, the reading module is also configured to compare, after requesting the data, the locally stored version of the data with the global version; to load the data directly if the local version is the latest; and, if the local version is lower than the global version, to request the latest version of the data and cache it to the first node.
Further, the memory management module includes a working data block chain and a free data block chain, and is also configured to operate on the data block and insert the data block into the working data block chain or the free data block chain according to its state;
the memory management module is also configured to release a data block in the working data block chain and then insert it into the free data block chain, and to take a data block from the free data block chain and put it into the working data block chain.
Further, the memory management module is also configured to release the data blocks in the free data block chain at a preset period and at a preset ratio.
Further, the memory processing module is also configured to perform, on a single data block of the file in memory, at least one of the following operations: write, read, release, flush to disk, synchronize to the network, delete, update file size, and version invalidation.
A computer storage medium storing a program, where execution of the program includes the steps of any one of the above.
In the present invention, a single file in distributed storage is divided in memory into one or more data blocks of the same size. A file is written at the first node and a lease is set on the file, so that the file is bound to the first node. The memory management module provides data block types of multiple sizes; a single file corresponds to one block size type, and an access context is generated for each data block, including the data block index and the ID of the file to which the block belongs. The data blocks and the access context are stored at the first node and the version is updated. When a file is written at a second node and the lease on the file is already held by the first node, the first node receives the file forwarded from the second node, writes it to the first node's local storage, and updates the version. Data is therefore written block by block, and because a file is bound to the first node through its lease once it is written, writes to that file from other nodes are all merged into the first node. The file can thus support random writes without problems, and concurrent out-of-order writes are handled easily.
Brief Description of the Drawings
FIG. 1 is a flowchart of a distributed storage memory management method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a method for requesting and releasing data blocks in memory according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a method for writing a file according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the handling when data is written to node A but the lease is held by node B, according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a method for reading data from a node according to an embodiment of the present invention;
FIG. 6 is a block diagram of a distributed storage memory management system according to an embodiment of the present invention.
Detailed Description
Before the exemplary embodiments are discussed in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although the flowcharts describe the operations as sequential processing, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the figures.
It should also be noted that in some alternative implementations, the functions/actions mentioned may occur in an order different from that indicated in the figures. For example, depending on the functions/actions involved, two figures shown in succession may in fact be executed substantially simultaneously, or may sometimes be executed in the reverse order.
The invention is further described below with reference to the drawings and preferred embodiments.
As shown in FIG. 1, a distributed storage memory management method includes steps S110-S150, in which:
S110: dividing the memory management module into a plurality of data block types of different sizes;
S120: writing a file at a first node and setting a lease on the file, so that the file is bound to the first node;
S130: in memory, mapping the file to one data block type and dividing the file into one or more data blocks of the same size, and at the same time generating an access context for each data block, the access context including a data block index and the ID of the file to which the block belongs;
S140: storing the data blocks and the access context at the first node and updating the version;
S150: when a file is written at a second node and the lease on the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the first node's local storage, and updates the version.
In this embodiment, data in the distributed storage is written block by block, and because a file is bound to the first node through a lease once it is written, writes to the file from other nodes are all merged into the first node. The file can therefore support random writes without problems, and concurrent out-of-order writes are handled easily. The lease concept merges all writes to the same file onto one machine, which achieves consistency. The version number concept is used to implement a read cache across the network.
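The lease-and-forward behavior of steps S120 and S150 can be illustrated with a minimal in-process sketch. It is not the patented implementation: the class names `MetadataService` and `Node`, the dictionary-based "cluster", and the rule that the first requester obtains the lease are all assumptions made for illustration, and lease expiry is not modeled.

```python
class MetadataService:
    """Hypothetical stand-in for a lease coordinator: tracks which node holds each file's lease."""

    def __init__(self):
        self.leases = {}  # file_id -> name of the node holding the write lease

    def acquire(self, file_id, node_name):
        # Assumption: the first requester gets the lease; later requesters learn the holder.
        return self.leases.setdefault(file_id, node_name)


class Node:
    """Hypothetical storage node that writes locally or forwards to the lease holder."""

    def __init__(self, name, mds, cluster):
        self.name = name
        self.mds = mds
        self.cluster = cluster   # name -> Node, used to forward writes
        self.blocks = {}         # (file_id, block_index) -> bytes
        self.versions = {}       # file_id -> version number
        cluster[name] = self

    def write(self, file_id, block_index, data):
        holder = self.mds.acquire(file_id, self.name)
        if holder != self.name:
            # The lease is held elsewhere: forward the write to the holder.
            return self.cluster[holder].write(file_id, block_index, data)
        self.blocks[(file_id, block_index)] = data
        self.versions[file_id] = self.versions.get(file_id, 0) + 1
        return self.name, self.versions[file_id]


mds, cluster = MetadataService(), {}
first, second = Node("first", mds, cluster), Node("second", mds, cluster)
print(first.write("f1", 0, b"aaaa"))   # handled locally by the lease holder
print(second.write("f1", 1, b"bbbb"))  # forwarded to the lease holder
```

Because every write to one file ends up at a single node, the version counter for that file is only ever incremented in one place, which is what keeps concurrent writers consistent in this sketch.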
Optionally, the method further includes:
reading data at the first node from one or more data blocks respectively;
if the data is at the first node, reading it directly;
if the data is at a second node, requesting the data from the second node and, once it is obtained, caching it at the first node and updating the version.
Further, requesting the data, caching it at the first node, and updating the version also includes:
comparing the locally stored version of the data with the global version after requesting the data;
if the data version is the latest, loading the data directly;
if the data version is lower than the global version, requesting the latest version of the data and then caching it at the first node.
In this embodiment, data in the distributed storage is likewise read block by block. If the node reading the file holds the lease, that is, the data is stored at that node, the data is read directly; if the data is at another node, a read request is sent to that node. This keeps the data consistent and supports out-of-order reads. A version check is also applied when reading, so that the data read is always the latest. The file thus supports random reads in addition to random writes.
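The version check on the read path can be sketched as a small cache that compares the version recorded with each cached block against the global version before serving it. The class `VersionedCache` and the two callables passed to it are hypothetical names; how the global version is actually obtained is not specified here and is simply assumed to be a lookup.

```python
class VersionedCache:
    """Hypothetical per-node read cache keyed by (file_id, block_index)."""

    def __init__(self, fetch_remote, global_version):
        self.fetch_remote = fetch_remote      # callable(file_id, block_index) -> bytes
        self.global_version = global_version  # callable(file_id) -> int
        self.cache = {}                       # (file_id, block_index) -> (version, data)
        self.remote_fetches = 0

    def read(self, file_id, block_index):
        key = (file_id, block_index)
        latest = self.global_version(file_id)
        cached = self.cache.get(key)
        if cached is not None and cached[0] >= latest:
            return cached[1]                  # cached copy is current: load directly
        data = self.fetch_remote(file_id, block_index)  # stale or missing: ask the owner
        self.remote_fetches += 1
        self.cache[key] = (latest, data)      # cache the data and record its version
        return data


owner_blocks = {("f1", 0): b"v1"}
owner_version = {"f1": 1}
cache = VersionedCache(lambda f, i: owner_blocks[(f, i)], lambda f: owner_version[f])

cache.read("f1", 0)              # miss: fetched from the owning node
cache.read("f1", 0)              # hit: served from the local cache
owner_blocks[("f1", 0)] = b"v2"  # the owner rewrites the block ...
owner_version["f1"] = 2          # ... and the global version advances
print(cache.read("f1", 0), cache.remote_fetches)
```

The sketch reads the global version before fetching, so a write that lands between the two steps would be recorded under the older version number and simply refetched on the next read.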
Optionally, the method further includes:
setting a working data block chain and a free data block chain in the memory management module;
operating on data blocks and inserting each data block into the working data block chain or the free data block chain according to its state;
releasing a data block from the working data block chain and inserting it into the free data block chain;
taking a data block from the free data block chain and putting it into the working data block chain.
In this embodiment, because the file is divided into multiple data blocks that are operated on individually, operation is more convenient and flexible. Data blocks are placed on the working data block chain and the free data block chain and can move between the two, which makes it easier to manage and control multiple data blocks and files and makes memory management as a whole more efficient.
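The movement of blocks between the two chains can be sketched as follows. The class `BlockPool` is a hypothetical name, and using a `deque` for the free chain and an `OrderedDict` for the working chain is an assumption for illustration; the text does not prescribe a data structure.

```python
from collections import OrderedDict, deque


class BlockPool:
    """Hypothetical pool for one block size, with a working chain and a free chain."""

    def __init__(self, block_size, count):
        self.block_size = block_size
        self.free = deque(bytearray(block_size) for _ in range(count))
        self.working = OrderedDict()  # (file_id, block_index) -> bytearray

    def acquire(self, file_id, block_index):
        key = (file_id, block_index)
        if key not in self.working:
            if not self.free:
                raise MemoryError("no free block; a release must run first")
            self.working[key] = self.free.popleft()  # free chain -> working chain
        return self.working[key]

    def release(self, file_id, block_index):
        block = self.working.pop((file_id, block_index))
        block[:] = bytes(self.block_size)            # clear contents before reuse
        self.free.append(block)                      # working chain -> free chain


pool = BlockPool(block_size=4, count=2)
pool.acquire("f1", 0)[:] = b"abcd"
pool.release("f1", 0)
print(len(pool.working), len(pool.free))
```

Reusing buffers from the free chain avoids allocating a new buffer for every block that enters memory, which is one plausible reading of the efficiency claim above.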
Optionally, the method further includes: setting a different number of blocks for each of the data block types, which helps allocate memory resources effectively and reasonably.
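One way to picture block types of different sizes, each with its own block count, is a table of size classes plus a rule for choosing a class per file. The sizes, the counts, and the selection rule (smallest class that covers the file within a fixed number of blocks) are all assumptions for illustration; the text does not state how a file is matched to a block type.

```python
# Hypothetical size classes: block size in bytes -> number of blocks reserved for that class.
SIZE_CLASSES = {4 * 1024: 1024, 64 * 1024: 256, 1024 * 1024: 32}


def pick_block_size(file_size, max_blocks_per_file=16):
    """Pick the smallest block size that covers the file in at most max_blocks_per_file blocks."""
    for size in sorted(SIZE_CLASSES):
        if file_size <= size * max_blocks_per_file:
            return size
    return max(SIZE_CLASSES)  # very large files fall back to the largest class


def block_count(file_size, block_size):
    # Ceiling division; in this sketch an empty file still occupies one block.
    return max(1, -(-file_size // block_size))


size = pick_block_size(300 * 1024)
print(size, block_count(300 * 1024, size))
```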
Optionally, the method further includes: releasing data blocks in the free data block chain at a preset period and in a preset proportion.
The memory management module releases data blocks in the free data block chain at a preset period and in a preset proportion, to avoid holding a large amount of memory indefinitely. The preset period can be set automatically by the system or set by the user, and the preset proportion can likewise be set automatically by the system or set by the user.
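The proportional release can be sketched as a function that a timer would call once per preset period. The function name `trim_free_chain` and the `ratio` parameter are hypothetical, and the periodic scheduling itself is left out of the sketch.

```python
from collections import deque


def trim_free_chain(free_chain, ratio):
    """Drop a fraction `ratio` of the blocks on the free chain and return how many were dropped."""
    to_drop = int(len(free_chain) * ratio)
    for _ in range(to_drop):
        free_chain.pop()  # dropping the last reference lets the runtime reclaim the buffer
    return to_drop


free_chain = deque(bytearray(4096) for _ in range(10))
dropped = trim_free_chain(free_chain, ratio=0.5)
print(dropped, len(free_chain))
```

Releasing only a proportion each period, rather than the whole chain, leaves some blocks ready for immediate reuse while still letting memory use shrink over time.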
The operations on a data block include: performing at least one of the following operations on a single data block of the file in memory: writing, reading, releasing, flushing to disk, synchronizing to the network, deleting, updating the file size, and invalidating the version. A single data block of a file in memory can undergo the following operations in different orders: write, read, release, flush to disk, synchronize to the network, delete, update the file size, invalidate the version, and so on.
Optionally, the method further includes: reading only some of the file's data blocks into memory for operation, which improves efficiency.
Node roles in the distributed storage are divided into node and mds (meta data service): a node is responsible for the actual reads and writes, and the mds node is responsible for metadata management and for coordinating the nodes.
As shown in FIG. 2, the method for requesting and releasing data blocks in memory is as follows. Operations such as the read module and write module in memory, buffer reclamation, synchronizing data to other nodes, and read-ahead from other nodes all generate a data block access context, which includes the data block index and the ID of the file to which the block belongs, and then enter the memory data block request stage. Next, the LRU (Least Recently Used) algorithm, combined with the access context, triggers the release of data blocks from the working data block chain, and the released data blocks are placed in the free data block chain. The free data block chain requests data blocks from the memory data block request stage. The working data block chain also flushes data to physical storage as needed, and the memory data block request stage also periodically triggers data block release on the working data block chain and the free data block chain.
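The LRU-triggered release described for FIG. 2 can be sketched with an ordered map keyed by the access context. `AccessContext` and `LruWorkingChain` are hypothetical names, the fixed `capacity` is an assumption, and the free chain is reduced to a list of released contexts to keep the example short.

```python
from collections import OrderedDict, namedtuple

# Access context as described in the text: the block index plus the ID of the owning file.
AccessContext = namedtuple("AccessContext", ["file_id", "block_index"])


class LruWorkingChain:
    """Hypothetical working chain that releases the least recently used block when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.working = OrderedDict()  # AccessContext -> block data, oldest first
        self.free = []                # released contexts, standing in for the free chain

    def touch(self, ctx, data=None):
        if ctx in self.working:
            self.working.move_to_end(ctx)  # mark as most recently used
        else:
            if len(self.working) >= self.capacity:
                victim, _ = self.working.popitem(last=False)  # least recently used
                self.free.append(victim)
            self.working[ctx] = data
        return self.working[ctx]


chain = LruWorkingChain(capacity=2)
a, b, c = (AccessContext("f1", i) for i in range(3))
chain.touch(a, b"A")
chain.touch(b, b"B")
chain.touch(a)        # block 0 becomes the most recently used
chain.touch(c, b"C")  # releases block 1, the least recently used
print(chain.free)
```

A fuller version would flush a dirty block to physical storage before releasing it, as the paragraph above mentions; that step is omitted here.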
Throughout the distributed system, data moves in units of data blocks, which avoids conflicts between the various modules.
As shown in FIG. 3, writing a file specifically includes:
A file is written to a node.
The node requests a write lease for the file from the mds node.
The node divides the file into data block A and data block B; more data blocks may be included, since this embodiment uses two data blocks only as an example and does not limit their number. Multiple blocks are created in memory and the data is written into the respective blocks; if the content of a data block does not exist in local storage, the block is loaded from another node following the read procedure.
Data block A and data block B are both flushed to disk asynchronously according to policy, and the version is updated.
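How a write at an arbitrary file offset lands in fixed-size blocks can be sketched with simple offset arithmetic. The helpers `split_write` and `apply_write` are hypothetical names, the tiny block size is chosen only to keep the example readable, and the asynchronous flush to disk is not modeled.

```python
def split_write(offset, data, block_size):
    """Yield (block_index, offset_in_block, chunk) for a write at an arbitrary file offset."""
    pos = 0
    while pos < len(data):
        block_index, in_block = divmod(offset + pos, block_size)
        take = min(block_size - in_block, len(data) - pos)
        yield block_index, in_block, data[pos:pos + take]
        pos += take


def apply_write(blocks, offset, data, block_size):
    # blocks: dict of block_index -> bytearray of block_size bytes, created on demand.
    for index, in_block, chunk in split_write(offset, data, block_size):
        block = blocks.setdefault(index, bytearray(block_size))
        block[in_block:in_block + len(chunk)] = chunk


blocks = {}
apply_write(blocks, 6, b"WORLD", block_size=4)   # writes may arrive out of order
apply_write(blocks, 0, b"HELLO ", block_size=4)
print(b"".join(bytes(blocks[i]) for i in sorted(blocks)))
```

Since each chunk is addressed by block index and in-block offset, the order in which the two writes arrive does not affect the final contents, which is the property random writes rely on.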
As shown in FIG. 4, when data is written to node A but the lease is held by node B, the handling is as follows:
Data is written to node A.
Node A requests a write lease for the file from the mds node, but the lease is already held by node B.
Node A divides the file into data block A and data block B; more data blocks may be included, since this embodiment uses two data blocks only as an example and does not limit their number.
The write contents of data block A and data block B are forwarded to node B.
Node B then writes the data locally and updates the file version.
As shown in FIG. 5, the method for reading data from a node specifically includes:
Data is read at node A, from data block A and data block B respectively; more data blocks may of course be included, since this embodiment uses two data blocks only as an example and does not limit their number.
If the data is at node B, the data is requested from node B.
If the data block is not stored locally, the locally stored version of the file is compared with the global version: if the file version is the latest, the data is loaded directly into local storage; if the locally stored version is lower than the global version, the latest version is requested from node B, and after the file has been requested from node B the data is cached locally and the version number is recorded.
If the data block is not stored locally, the file is requested from node B, the data is then cached locally, and the version number is recorded.
Another preferred embodiment of the present invention is a distributed storage memory management system, including a memory management module 210, a writing module 220, a memory processing module 230, and a storage module 240.
The memory management module 210 is configured to divide the memory management module into a plurality of data block types of different sizes.
The writing module 220 is configured to write a file at a first node and set a lease on the file, so that the file is bound to the first node.
The memory processing module 230 is configured to, in memory, map the file to one data block type and divide the file into one or more data blocks of the same size, and at the same time generate a data block access context, the access context including a data block index and the ID of the file to which the block belongs.
The storage module 240 is configured to store the data blocks and the access context at the first node and update the version.
The writing module is further configured so that, when a file is written at a second node and the lease on the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the first node's local storage, and updates the version.
In this embodiment, data in the distributed storage is written block by block, and because a file is bound to the first node through a lease once it is written, writes to the file from other nodes are all merged into the first node. The file can therefore support random writes without problems, and concurrent out-of-order writes are handled easily. The lease concept merges all writes to the same file onto one machine, which achieves consistency. The version number concept is used to implement a read cache across the network.
Optionally, the system further includes a reading module, configured to read data at the first node from one or more data blocks respectively; if the data is at the first node, it is read directly; if the data is at a second node, the data is requested from the second node and, once obtained, cached at the first node with the version updated.
Further, the reading module is also configured to compare the locally stored version of the data with the global version after requesting the data; if the data version is the latest, the data is loaded directly; if the data version is lower than the global version, the latest version of the data is requested and then cached at the first node.
In this embodiment, data in the distributed storage is likewise read block by block. If the node reading the file holds the lease, that is, the data is stored at that node, the data is read directly; if the data is at another node, a read request is sent to that node. This keeps the data consistent and supports out-of-order reads. A version check is also applied when reading, so that the data read is always the latest. The file thus supports random reads in addition to random writes. The memory processing module 230 includes a working data block chain and a free data block chain; it is also configured to operate on data blocks and insert each data block into the working data block chain or the free data block chain according to its state. The memory management module is further configured to release a data block from the working data block chain and insert it into the free data block chain, and to take a data block from the free data block chain and put it into the working data block chain.
In this embodiment, because the file is divided into multiple data blocks that are operated on individually, operation is more convenient and flexible. Data blocks are placed on the working data block chain and the free data block chain and can move between the two, which makes it easier to manage and control multiple data blocks and files and makes memory management as a whole more efficient.
Optionally, the memory processing module is further configured to set a different number of blocks for each of the data block types, which helps allocate memory resources effectively and reasonably.
Optionally, the memory processing module is further configured to release data blocks in the free data block chain at a preset period and in a preset proportion. The memory processing module releases data blocks in the free data block chain at a preset period and in a preset proportion, to avoid holding a large amount of memory indefinitely. The preset period can be set automatically by the system or set by the user, and the preset proportion can likewise be set automatically by the system or set by the user.
Optionally, the memory processing module is further configured to perform at least one of the following operations on a single data block of the file in memory: writing, reading, releasing, flushing to disk, synchronizing to the network, deleting, updating the file size, and invalidating the version. A single data block of a file in memory can undergo the following operations in different orders: write, read, release, flush to disk, synchronize to the network, delete, update the file size, invalidate the version, and so on.
Another preferred embodiment of the present invention is a computer storage medium, wherein the computer storage medium may store a program, and execution of the program includes the steps described in any of the above embodiments.
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention shall not be deemed limited to this description. For those of ordinary skill in the art to which the invention belongs, a number of simple deductions or substitutions may be made without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of protection of the invention.

Claims (14)

  1. A distributed storage memory management method, characterized by comprising:
    dividing the memory management module into a plurality of data block types of different sizes;
    writing a file at a first node and setting a lease on the file, so that the file is bound to the first node;
    in memory, mapping the file to one data block type and dividing the file into one or more data blocks of the same size, and at the same time generating an access context for each data block, the access context including a data block index and the ID of the file to which the block belongs;
    storing the data blocks and the access context at the first node and updating the version;
    when a file is written at a second node and the lease on the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the first node's local storage, and updates the version.
  2. The distributed storage memory management method according to claim 1, characterized by further comprising:
    reading data at the first node from one or more data blocks respectively;
    if the data is at the first node, reading it directly;
    if the data is at a second node, requesting the data from the second node and, once it is obtained, caching it at the first node and updating the version.
  3. The distributed storage memory management method according to claim 2, characterized in that requesting the data, caching it at the first node, and updating the version further comprises:
    comparing the locally stored version of the data with the global version after requesting the data;
    if the data version is the latest, loading the data directly;
    if the data version is lower than the global version, requesting the latest version of the data and then caching it at the first node.
  4. The distributed storage memory management method according to claim 1, characterized by further comprising: setting a working data block chain and a free data block chain in the memory management module;
    operating on data blocks and inserting each data block into the working data block chain or the free data block chain according to its state;
    releasing a data block from the working data block chain and inserting it into the free data block chain;
    taking a data block from the free data block chain and putting it into the working data block chain.
  5. The distributed storage memory management method according to claim 4, characterized by further comprising: releasing data blocks in the free data block chain at a preset period and in a preset proportion.
  6. The distributed storage memory management method according to claim 4, characterized in that operating on data blocks comprises:
    performing at least one of the following operations on a single data block of the file in memory: writing, reading, releasing, flushing to disk, synchronizing to the network, deleting, updating the file size, and invalidating the version.
  7. The distributed storage memory management method according to claim 1, characterized by further comprising: reading only some of the file's data blocks into memory for operation.
  8. A distributed storage memory management system, characterized by comprising:
    a memory management module, configured to divide the memory management module into a plurality of data block types of different sizes;
    a writing module, configured to write a file at a first node and set a lease on the file, so that the file is bound to the first node;
    a memory processing module, configured to, in memory, map the file to one data block type and divide the file into one or more data blocks of the same size, and at the same time generate a data block access context, the access context including a data block index and the ID of the file to which the block belongs;
    a storage module, configured to store the data blocks and the access context at the first node and update the version;
    the writing module is further configured so that, when a file is written at a second node and the lease on the file is already held by the first node, the first node receives the file forwarded from the second node, writes the file to the first node's local storage, and updates the version.
  9. The distributed storage memory management system according to claim 8, characterized in that the system further comprises:
    a reading module, configured to read data at the first node from one or more data blocks respectively; if the data is at the first node, it is read directly; if the data is at a second node, the data is requested from the second node and, once obtained, cached at the first node with the version updated.
  10. The distributed storage memory management system according to claim 9, characterized in that the reading module is further configured to compare the locally stored version of the data with the global version after requesting the data; if the data version is the latest, the data is loaded directly; if the data version is lower than the global version, the latest version of the data is requested and then cached at the first node.
  11. The distributed storage memory management system according to claim 8, characterized in that the memory management module includes a working data block chain and a free data block chain; it is also configured to operate on data blocks and insert each data block into the working data block chain or the free data block chain according to its state; the memory management module is further configured to release a data block from the working data block chain and insert it into the free data block chain, and to take a data block from the free data block chain and put it into the working data block chain.
  12. The distributed storage memory management system according to claim 11, characterized in that the memory management module is further configured to release data blocks in the free data block chain at a preset period and in a preset proportion.
  13. The distributed storage memory management system according to claim 8, characterized in that the memory processing module is further configured to perform at least one of the following operations on a single data block of the file in memory: writing, reading, releasing, flushing to disk, synchronizing to the network, deleting, updating the file size, and invalidating the version.
  14. A computer storage medium, characterized in that the computer storage medium may store a program, and execution of the program includes the steps according to any one of claims 1-7.
PCT/CN2018/079685 2017-06-22 2018-03-20 Distributed storage internal storage management method and system, and computer storage medium WO2018233332A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710480382.6A CN107329695B (en) 2017-06-22 2017-06-22 Distributed storage memory management method, system and computer storage medium
CN201710480382.6 2017-06-22

Publications (1)

Publication Number Publication Date
WO2018233332A1 (en) 2018-12-27

Family

ID=60194386

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/079685 WO2018233332A1 (en) 2017-06-22 2018-03-20 Distributed storage internal storage management method and system, and computer storage medium

Country Status (2)

Country Link
CN (1) CN107329695B (en)
WO (1) WO2018233332A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329695B (en) * 2017-06-22 2020-03-20 上海七牛信息技术有限公司 Distributed storage memory management method, system and computer storage medium
CN109710194A (en) * 2018-12-29 2019-05-03 武汉思普崚技术有限公司 Storage method and device for uploaded files
CN111752919A (en) * 2020-06-16 2020-10-09 北京字节跳动网络技术有限公司 Data writing method, data reading method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622284A (en) * 2012-02-21 2012-08-01 上海交通大学 Data asynchronous replication method directing to mass storage system
CN103577500A (en) * 2012-08-10 2014-02-12 腾讯科技(深圳)有限公司 Method for carrying out data processing by distributed file system and distributed file system
US20150088827A1 (en) * 2013-09-26 2015-03-26 Cygnus Broadband, Inc. File block placement in a distributed file system network
CN104516967A (en) * 2014-12-25 2015-04-15 国家电网公司 Electric power system mass data management system and use method thereof
CN107329695A (en) * 2017-06-22 2017-11-07 上海七牛信息技术有限公司 Distributed storage memory management method, system and computer storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101188569B (en) * 2006-11-16 2011-05-04 饶大平 Method for constructing data quanta space in network and distributed file storage system
US10264071B2 (en) * 2014-03-31 2019-04-16 Amazon Technologies, Inc. Session management in distributed storage systems
CN106445409A (en) * 2016-09-13 2017-02-22 郑州云海信息技术有限公司 Distributed block storage data writing method and device

Also Published As

Publication number Publication date
CN107329695B (en) 2020-03-20
CN107329695A (en) 2017-11-07 Distributed storage memory management method, system and computer storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18820659

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.04.2020)

122 EP: PCT application non-entry in European phase

Ref document number: 18820659

Country of ref document: EP

Kind code of ref document: A1