WO2018077092A1 - Saving method applied to distributed file system, apparatus and distributed file system - Google Patents

Publication number
WO2018077092A1
WO2018077092A1 PCT/CN2017/106690 CN2017106690W WO2018077092A1 WO 2018077092 A1 WO2018077092 A1 WO 2018077092A1 CN 2017106690 W CN2017106690 W CN 2017106690W WO 2018077092 A1 WO2018077092 A1 WO 2018077092A1
Authority
WO
WIPO (PCT)
Prior art keywords
bitmap
snapshot
data
save
aggregation degree
Prior art date
Application number
PCT/CN2017/106690
Inventor
柴军红
尹丹
汪雷
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2018077092A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/128 Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion

Definitions

  • the present disclosure relates to computer storage technology, and more particularly to a storage method for a distributed file system, a storage device, a distributed file system having the storage device, and a computer storage medium thereof.
  • the term "big data" is increasingly used to describe and define the vast amounts of data generated in the era of the information explosion.
  • IDC: Internet Data Center.
  • by 2020, the world will produce 44 times as much data as today, a growth rate equivalent to more than 200 GB of data per person per year worldwide.
  • a long save time can cause metadata to be lost. In addition, existing saving methods sort by record number and traverse the data to be saved from smallest to largest each time; when the amount of data to be saved is too large, data positioned toward the end may go unsaved for several hours, resulting in data loss and degrading the system's access performance and data integrity.
  • the technical problem to be solved by the embodiments of the present invention is to provide a saving method applied to a distributed file system, and a saving device and system thereof, which preset a save snapshot period and maintain, in memory, a current bitmap and a snapshot bitmap corresponding to the file system table in order to record whether each record has been modified.
  • writing to disk in batches in order of priority from high to low increases the aggregation degree of the files and reduces the amount of save IO data, ensuring data integrity and accessibility.
  • an embodiment of the present invention provides a saving method applied to a distributed file system. A save snapshot period is preset, and two bitmaps are created in advance for the file system table: a current bitmap representing the change records of the metadata in the current save snapshot period, and a snapshot bitmap representing the change records of the metadata in the last save snapshot period. The saving method includes the following steps:
  • saving according to the new snapshot bitmap while restarting the recording of new change records of the metadata with the new current bitmap.
  • the save is performed according to the save priority sequence of the data segments corresponding to the new snapshot bitmap.
  • determining whether the data aggregation degree of each data segment is greater than or equal to a preset data aggregation degree threshold, thereby obtaining a plurality of first data segments whose data aggregation degree is greater than or equal to the threshold and a plurality of second data segments whose data aggregation degree is less than the threshold;
  • arranging the first data segments according to a preset rule, thereby obtaining the corresponding save priority sequence.
  • the preset rule refers to arranging the data segments by data aggregation degree from largest to smallest; and/or every two save snapshot periods constitute one check cycle.
  • an embodiment of the present invention further provides a disk storage device applied to a distributed file system, including:
  • a processing module configured to preset a save snapshot period and to create in advance, for the file system table, a current bitmap representing the change records of the metadata in the current save snapshot period and a snapshot bitmap representing the change records of the metadata in the last save snapshot period;
  • a data access module configured to receive, in real time, change records of the metadata input by the user;
  • an update module configured to update the current bitmap in real time according to the change records of the metadata received by the data access module;
  • a save disk module configured to, when the save snapshot period is reached, swap the snapshot bitmap with the updated current bitmap to obtain a new current bitmap and a new snapshot bitmap, to save according to the new snapshot bitmap, and at the same time to trigger the update module to update the new current bitmap according to new change records of the metadata.
  • the save disk module includes:
  • a determining unit configured to determine whether the save snapshot period has been reached;
  • a replacement unit configured to swap the updated current bitmap with the snapshot bitmap when the determining unit determines that the save snapshot period has been reached, to obtain a new current bitmap and a new snapshot bitmap;
  • a priority sorting unit configured to calculate the save priority sequence of the data segments according to the new snapshot bitmap obtained after the swap;
  • a save thread unit configured to save the corresponding data segments according to the save priority sequence.
  • the processing module is further configured to preset a data aggregation degree threshold, and the priority sorting unit includes:
  • a data aggregation degree calculation subunit configured to calculate the data aggregation degree of each corresponding data segment according to the new snapshot bitmap obtained after the swap;
  • a comparison subunit configured to compare the data aggregation degree of each data segment with the preset data aggregation degree threshold, to obtain a plurality of first data segments whose data aggregation degree is greater than or equal to the threshold and a plurality of second data segments whose data aggregation degree is less than the threshold;
  • a sorting subunit configured to arrange, according to the comparison result of the comparison subunit, the first data segments according to a preset rule to obtain the corresponding save priority sequence.
  • the priority sorting unit further includes:
  • a write-record subunit configured to extract the records corresponding to each second data segment and write them to a log file; the comparison subunit is further configured to check each second data segment periodically according to a preset check cycle until the data aggregation degree of the second data segment is greater than or equal to the preset data aggregation degree threshold, whereupon the sorting subunit is triggered to place the second data segment in the corresponding priority queue according to the preset rule.
  • the embodiment of the present invention further provides a distributed file system, which includes any of the above-mentioned storage devices, and the storage method thereof is the same as the above-described storage method.
  • an embodiment of the present invention further provides a computer storage medium storing one or more programs executable by a computer; when the one or more programs are executed by the computer, the computer is caused to execute the saving method described above.
  • the saving method and the saving device preset a save snapshot period and maintain, in memory, the current bitmap and the snapshot bitmap corresponding to the file system table in order to record whether the metadata has been modified. Taking a snapshot of the bitmap extends the save cycle, so that the same record or file block needs to be saved only once within one cycle, and the disk is written in batch mode in order of priority from high to low.
  • this increases the aggregation degree of the files and reduces the amount of save IO data, ensuring data integrity and accessibility.
  • FIG. 1 is a schematic diagram of the basic architecture of the distributed file system on which the present invention is based;
  • FIG. 2 is a flowchart of an embodiment of the saving method applied to a distributed file system according to the present invention;
  • FIG. 3 is a schematic diagram of the swap of the current bitmap and the snapshot bitmap in step S15 of FIG. 2;
  • FIG. 4 is a flowchart of an embodiment of step S17 of FIG. 2;
  • FIG. 5 is a sequence diagram of an embodiment of writing a file based on the saving method of FIG. 2;
  • FIG. 6 is a functional block diagram of an embodiment of a disk storage device applied to a distributed file system according to the present invention.
  • the present invention is applied to a distributed file system (DFS), whose basic architecture is shown in FIG. 1.
  • when a user writes a file through the file access client FAC, that is, when metadata is changed, the full file path is first sent to the directory tree server DTS to obtain the globally unique identifier FILEID and the file location register FLR corresponding to the file; second, the FAC sends a write-file request to the FLR to obtain the data block replica location information of the file (a file is usually divided into several data blocks of the same size, for example 64M each, called CHUNKs); finally, the FAC establishes a connection with the data storage server and passes the data blocks to it, and they are written to disk.
  • the metadata of the DFS is organized as follows: the directory tree server DTS manages the file namespace, allocates the globally unique identifier FILEID, and allocates the FLR; the file location register FLR manages file attributes (such as FILEID, file size, file type, access rights, uid, and gid) and the storage location of the file contents.
  • the present invention is based on the distributed file system described above. A save snapshot period is set in advance and two bitmaps are created for the file system table, namely the current bitmap and the snapshot bitmap, which represent the change records of the metadata in the current save snapshot period and in the last save snapshot period respectively. Only when the save snapshot period is reached are the two bitmaps swapped, after which saving is performed according to the swapped-in snapshot bitmap.
  • saving is thus performed on a timed basis: the same file or data block is saved only once within a save snapshot period, which avoids the existing approach in which every change record triggers a save, together with the resulting risk of data loss.
  • in addition, the present invention saves according to the priority sequence, writing in batches in order from high to low, which increases the aggregation degree of the files and reduces the amount of save IO data, ensuring data integrity and accessibility.
  • FIG. 2 is a flowchart of an embodiment of a method for saving a disk in a distributed file system according to the present invention.
  • to avoid saving to disk every time a change record occurs, this embodiment sets the save snapshot period in advance so that saving is performed on a timed basis, and creates two bitmaps for the file system table, namely the current bitmap and the snapshot bitmap. In this embodiment, the saving method includes the following steps:
  • S11 Receive a change record of the metadata in real time, and update the corresponding current bitmap according to the change record in real time.
  • the metadata change record refers to N change records generated when operations such as adding metadata, modifying existing metadata, or deleting existing metadata are performed.
  • the save snapshot period is preset so that saving is performed only when a save snapshot period is reached. Saving is therefore done in a timed, batched manner, which avoids the problem in the existing approach that a save is performed for every change record.
  • the current bitmap cur_bit represents the change records of the metadata in the current save snapshot period; the snapshot bitmap snap_bit represents the change records of the last save snapshot period.
  • the size of the current bitmap and of the snapshot bitmap is proportional to the table capacity of the file system table, and 0 and 1 are used to record whether the corresponding record has been modified, that is, whether the metadata has been changed.
  • the save snapshot period is preset by means of a timer. When the timer expires, that is, when the save snapshot period is reached, a message such as a pulse signal is sent to trigger the save thread; whether the save snapshot period has been reached can therefore be judged directly from the message fed back by the timer.
  • the current bitmap represents the change records of the metadata in the current save snapshot period, and the snapshot bitmap represents the change records of the metadata in the last save snapshot period.
  • the current bitmap and the snapshot bitmap can therefore be swapped directly, as shown in FIG. 3: the snapshot bitmap is cleared and becomes the new current bitmap, so that recording of new change records of the metadata restarts immediately;
  • the original current bitmap becomes the new snapshot bitmap and serves as the basis for saving, that is, before the next save snapshot period arrives, the file system table records marked in the new snapshot bitmap are saved.
  • when saving, the data segments are saved in order of priority, that is, the amount of save IO data is reduced by increasing the aggregation degree of the files, as shown in FIG. 4.
  • the step S17 includes the steps of:
  • each 16K is taken as one unit, called a data segment DATA; the ratio of the data segment size to the file system table record length TupleLen is the maximum number of records stored in the segment, MaxTupleNumber. The data aggregation degree of a data segment is the ratio of the number of changed records in the bitmap (that is, bits set to 1) to MaxTupleNumber, multiplied by 100. The data aggregation degree DP of the data segment is used as the parameter for sorting the data segments by priority.
  • S173 Compare the data aggregation degree of each data segment with the preset data aggregation degree threshold to obtain a plurality of first data segments whose data aggregation degree is greater than or equal to the threshold and a plurality of second data segments whose data aggregation degree is less than the threshold.
  • S175 Arrange the first data segments according to a preset rule to obtain the corresponding save priority sequence, then perform step S179.
  • the data aggregation degree threshold is set in advance, for example to 30. When the data aggregation degree of a plurality of data segments is determined to be greater than or equal to the preset threshold, these first data segments are arranged in a certain order to obtain a priority queue of the first data segments.
  • the preset rule refers to arranging the first data segments whose data aggregation degree is greater than or equal to the preset threshold by data aggregation degree from largest to smallest. Arranging them from smallest to largest, or according to other rules, is of course also possible.
  • a second data segment whose data aggregation degree was below the preset threshold in the previous save snapshot period may reach or exceed the preset threshold after another save snapshot period.
  • each second data segment is checked once every two save snapshot periods to determine whether its data aggregation degree has reached the preset threshold; that is, every two save snapshot periods constitute one check cycle.
  • the save priority sequence is the priority sequence constructed from the data aggregation degree of each first data segment in the second save snapshot period.
  • while saving is in progress, change records of the metadata continue to be recorded in real time in the new current bitmap until the next save snapshot period arrives, at which point the new current bitmap is swapped out and another new current bitmap is obtained, and so on in a loop.
  • change records of the metadata are recorded in the current bitmap; when the save snapshot period is reached, the current bitmap is immediately swapped with the snapshot bitmap, yielding a new current bitmap that records new change records of the metadata.
  • at the same time, before the next save snapshot period arrives, saving can proceed directly from the new snapshot bitmap obtained by the swap. By taking a snapshot of the current bitmap, the save cycle is extended so that the same record or file block needs to be saved only once within one cycle, and the disk is written in batch mode in order of priority from high to low, which increases the aggregation degree of the files, reduces the amount of save IO data, and ensures data integrity and accessibility.
  • the modification of the metadata includes an addition, such as writing a file/metadata. Therefore, the method of saving the file when the file is written will be described in detail below with reference to the drawings and the exemplary embodiments.
  • FIG. 5 is a sequence diagram of an embodiment of writing a file based on the saving method of the first embodiment. In this embodiment, a file is written in the distributed file system as follows:
  • the file access client FAC sends a write file request to the directory tree server DTS.
  • the user sends a write file request to the DTS through the FAC, and the write file request carries the full path of the file object to be written.
  • the DTS determines whether the file exists. If not, the DTS generates a new file identifier FILEID, allocates an available FLR to it, generates a dictionary table record to store the file name, generates a file FILEID record to store the FILEID, FLRID, and similar information, and then feeds back a success message to the file access client FAC; if the file already exists, the DTS returns an error to the FAC.
  • the DTS searches for a presence in the namespace to determine whether the file exists.
  • the FAC After receiving the message, the FAC sends a create file message to the corresponding file location register FLR.
  • the FLR determines whether the file already exists. If so, it reports that the file already exists. If not, it creates the FILE record, stores the FILEID, the creation time, and similar information, and feeds back a create-file success response to the FAC.
  • the FLR traverses the file by FILEID to determine whether it exists.
  • the FAC receives the create file response, and sends a file block request to the FLR through the FILEID.
  • the FLR selects the destination disk of the file block according to the storage rule, generates the record corresponding to the file block, and feeds back to the FAC the disk information for creating the file block.
  • the FAC creates a file block on the FAS according to the returned disk information, and writes the file content.
  • the FAS writes the file in the timed batch manner and, after writing, returns the write result and the file block size information to the FAC.
  • writing the file in the timed batch manner means that, as in the first embodiment, the file content is received in real time and the corresponding preset current bitmap is updated in real time.
  • at each period, the updated current bitmap and the snapshot bitmap are swapped, and saving is performed according to the snapshot bitmap obtained after the swap, until the entire file content has been written; that is, the file content is written periodically and in batches.
  • during writing, the data segments are written according to their corresponding priority sequence.
  • the FAC reports the write result and the file block size information to the FLR.
  • the FLR records the reported content into the file block record and replies to the FAC.
  • the FAC when the FAC receives the reply returned by the FLR, it indicates that the writing of the file is completed, and the user is sent a file completion response.
  • the present invention also provides a distributed file system, which will be described in detail below with reference to the accompanying drawings and exemplary embodiments.
  • a disk storage device for a distributed file system includes:
  • the processing module 61 is configured to preset a save snapshot period, and pre-create a current bitmap for indicating a change record of the metadata in the current save snapshot period for the file system table, and used to represent the metadata in the last save snapshot period. Snapshot bitmap of the change record;
  • the data access module 62 is configured to receive, in real time, change records of the metadata input by the user;
  • the update module 63 is configured to update the current bitmap in real time according to the change record received by the data access module 62.
  • the current bitmap refers to the preset current bitmap in the initial state after the system is powered on, or to the new current bitmap obtained after a swap once the system is running;
  • the save disk module 64 is configured to, when the save snapshot period arrives, swap the snapshot bitmap representing the change records of the metadata in the last save period with the current bitmap updated in real time by the update module 63, to obtain a new current bitmap and a new snapshot bitmap, and to save according to the new snapshot bitmap obtained after the swap; at the same time, it triggers the update module 63 to update the new current bitmap according to new change records of the metadata.
  • when saving, the save disk module 64 saves according to the save priority sequence of the data segments.
  • the save module 64 includes:
  • the determining unit 641 is configured to determine whether the save snapshot period is currently reached.
  • a timer is set by the processing module 61 to perform timing, so that when the timing reaches the preset duration, a trigger signal is sent to the save disk module 64 to trigger the save operation. Whether the save snapshot period has been reached can therefore be determined by checking whether the trigger signal sent by the processing module 61 has been received.
  • the replacement unit 642 is configured to replace the updated current bitmap with the snapshot bitmap when the determining unit 641 determines that the save snapshot period is currently reached, to obtain a new current bitmap and a new snapshot bitmap.
  • when the determining unit 641 receives the trigger signal sent by the processing module 61, it sends a trigger signal to the replacement unit 642, so that the replacement unit 642 swaps the current bitmap, which represents the change records of the metadata in the current save snapshot period, with the snapshot bitmap, which represents the change records of the metadata in the previous save snapshot period, thereby obtaining a new current bitmap and a new snapshot bitmap. As shown in FIG. 3, the original snapshot bitmap is cleared and used as the new current bitmap, and the original current bitmap is used as the new snapshot bitmap;
  • the priority sorting unit 644 is configured to calculate the save priority sequence of the data segments according to the new snapshot bitmap obtained after the swap.
  • the priority sorting unit 644 includes: a data aggregation degree calculation subunit configured to calculate the data aggregation degree of each corresponding data segment according to the new snapshot bitmap obtained after the swap; and a comparison subunit configured to compare the data aggregation degree of each data segment with the data aggregation degree threshold preset by the processing module 61;
  • a sorting subunit configured to arrange the first data segments according to the preset rule, based on the comparison result of the comparison subunit, to obtain the corresponding save priority sequence; and a write-record subunit configured to extract, based on the comparison result of the comparison subunit, the records corresponding to each second data segment and write them to the log file;
  • the save thread unit 643 is configured to save the corresponding data segments according to the calculated save priority sequence.
  • the preset rule refers to arranging the first data segments whose data aggregation degree is greater than or equal to the preset threshold according to the data aggregation degree from large to small; of course, in order from small to large, Or it is understandable to arrange according to other rules.
  • the save priority sequence is the priority sequence constructed according to the data aggregation of each first data segment in the second save snapshot cycle.
  • the processing module sets up the current bitmap to record the change records of the metadata in the current period and the snapshot bitmap to record the change records of the metadata in the last save snapshot period.
  • when the save snapshot period is reached, the current bitmap is immediately swapped with the snapshot bitmap, yielding a new current bitmap that records new change records of the metadata; at the same time, before the next save snapshot period arrives, a batch save can be performed directly from the new snapshot bitmap obtained by the swap.
  • by taking a snapshot of the current bitmap, the save cycle is extended so that the same record or file block needs to be saved only once within one cycle, and the disk is written in batch mode in order of priority from high to low, which increases the aggregation degree of the files, reduces the amount of save IO data, and ensures data integrity and accessibility.
  • based on the saving method and the disk storage device described above, the present invention also provides a distributed file system that includes the disk storage device of the third embodiment. Its saving method and principles are the same as those of the first, second, and third embodiments above and are not repeated here.
  • the technical solution provided by the embodiment of the present invention can be applied to the technical field of computer storage.
  • whether the metadata has been modified is recorded by presetting the save snapshot period and maintaining, in memory, the current bitmap and the snapshot bitmap corresponding to the file system table. Taking a snapshot via the snapshot bitmap extends the save cycle, so that the same record or file block needs to be saved only once within one cycle; when saving, the disk is written in batch mode in order of priority from high to low, which increases the aggregation degree of the files, reduces the amount of save IO data, and ensures data integrity and accessibility.
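
The data aggregation degree and the priority ordering performed in step S17 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patent's implementation: the function names, the list-of-bits representation of the snapshot bitmap, and the choice to skip segments with no changed records are all hypothetical. The 16K segment size, the formula DP = changed records / MaxTupleNumber × 100, and the example threshold of 30 are taken from the text.

```python
from __future__ import annotations

SEGMENT_BYTES = 16 * 1024  # data segment size given in the text (16K)
DP_THRESHOLD = 30          # example threshold given in the text


def max_tuple_number(tuple_len: int) -> int:
    """Maximum number of records one data segment can hold (MaxTupleNumber)."""
    return SEGMENT_BYTES // tuple_len


def aggregation_degrees(snapshot_bits: list[int], tuple_len: int) -> list[float]:
    """Return DP = changed records / MaxTupleNumber * 100 for each segment."""
    per_segment = max_tuple_number(tuple_len)
    degrees = []
    for start in range(0, len(snapshot_bits), per_segment):
        changed = sum(snapshot_bits[start:start + per_segment])
        degrees.append(changed / per_segment * 100)
    return degrees


def build_save_priority(degrees: list[float], threshold: float = DP_THRESHOLD):
    """Split segments into first (DP >= threshold) and second (DP < threshold).

    First segments are ordered from largest to smallest DP. Segments with no
    changed records are skipped, which is an assumption of this sketch.
    """
    first = [i for i, dp in enumerate(degrees) if dp >= threshold]
    second = [i for i, dp in enumerate(degrees) if 0 < dp < threshold]
    first.sort(key=lambda i: degrees[i], reverse=True)
    return first, second


if __name__ == "__main__":
    # Hypothetical table with 4096-byte records, so 4 records per segment.
    bits = [1, 1, 1, 1,  1, 0, 0, 0,  0, 0, 0, 0,  1, 1, 0, 0]
    dps = aggregation_degrees(bits, tuple_len=4096)
    print(dps)                       # [100.0, 25.0, 0.0, 50.0]
    print(build_save_priority(dps))  # ([0, 3], [1])
```

In this sketch, segment 1 falls below the threshold, so it would be deferred (its records written to the log file and rechecked at the next check cycle) rather than saved in this period.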

Abstract

A saving method applied to a distributed file system, and a saving apparatus, system, and computer storage medium thereof. A save snapshot period, a current bitmap, and a snapshot bitmap are preset. The method comprises: receiving change records of metadata in real time and updating the corresponding current bitmap in real time according to the change records (S11); when it is determined that the save snapshot period has been reached (S13), swapping the snapshot bitmap with the updated current bitmap to obtain a new current bitmap and a new snapshot bitmap (S15); and saving according to the new snapshot bitmap while recording new change records of the metadata with the new current bitmap (S17). The save cycle is thus extended by saving on a timed basis, so that the same record or file block only needs to be saved once within a period, and writes are performed in batch mode in order of priority from high to low, which increases the aggregation degree of the files, reduces the amount of save IO data, and ensures the completeness and accessibility of the data.
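
The flow of steps S11 to S17 can be sketched in Python as below. This is a hedged illustration only: the class name `SaveBitmaps`, its method names, and the use of one byte per record instead of one bit are assumptions made for readability, and the timer that signals the end of a save snapshot period is left to the caller.

```python
from __future__ import annotations


class SaveBitmaps:
    """Current and snapshot bitmaps for one file system table (sketch)."""

    def __init__(self, table_capacity: int) -> None:
        # One flag per record; sizes follow the table capacity.
        self.current = bytearray(table_capacity)
        self.snapshot = bytearray(table_capacity)

    def record_change(self, record_no: int) -> None:
        """S11: mark a record as changed in the current bitmap."""
        self.current[record_no] = 1

    def on_period_reached(self) -> list[int]:
        """S15 and S17: swap the bitmaps and return the records to save.

        The updated current bitmap becomes the new snapshot bitmap; the old
        snapshot bitmap is cleared and becomes the new current bitmap, so
        new changes can be recorded while the save is in progress.
        """
        self.current, self.snapshot = self.snapshot, self.current
        self.current[:] = bytes(len(self.current))  # clear the new current bitmap
        return [i for i, flag in enumerate(self.snapshot) if flag]


if __name__ == "__main__":
    bitmaps = SaveBitmaps(table_capacity=8)
    bitmaps.record_change(2)
    bitmaps.record_change(2)  # same record changed twice in one period
    bitmaps.record_change(5)
    print(bitmaps.on_period_reached())  # [2, 5]
```

Record 2 is changed twice within the period but appears only once in the list to save, which is the effect the abstract attributes to timed saving.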

Description

应用于分布式文件系统的存盘方法、装置及分布式文件系统Storage method, device and distributed file system applied to distributed file system 技术领域Technical field
本公开涉及计算机存储技术,尤其涉及一种应用于分布式文件系统的存盘方法、存储装置、具有该存盘装置的分布式文件系统及其计算机存储介质。The present disclosure relates to computer storage technology, and more particularly to a storage method for a distributed file system, a storage device, a distributed file system having the storage device, and a computer storage medium thereof.
背景技术Background technique
在数字化信息时代,大数据(big data)一词越来越多的被人们提及,它用来描述和定义信息爆炸时代所产生海量数据的名词。据互联网数据中心(Internet Data Center,IDC)的调研结果显示,2011年全球产生的数据量为1.8ZB(1ZB=1024EB,1EB=1024PB,1PB=1024TB,1TB=1024GB),与2010年同期相比,又增长了超过1ZB的数据量。而到了2020年,全世界所产生的数据规模将达到今天的44倍。其增长速度相当于全球每人每年产生200GB以上的数据。In the era of digital information, the term big data is increasingly mentioned as a term used to describe and define the vast amounts of data generated by the information explosion era. According to the Internet Data Center (IDC) survey, the global data generated in 2011 was 1.8ZB (1ZB=1024EB, 1EB=1024PB, 1PB=1024TB, 1TB=1024GB), compared with the same period in 2010. , and increased the amount of data more than 1ZB. By 2020, the world will produce 44 times the size of today's data. Its growth rate is equivalent to more than 200GB of data per person per year worldwide.
在这种数据快速增长的情况下,海量数据存储技术成为了支撑数据高速增长的技术基础。一方面对信息数据的存储、计算、提取提出了严峻的考验,另一方面对信息数据的容灾系统、备份、归档提出了更严格的要求。进而分布式存储技术也应运而生。现有分布式文件系统的研究主要分为元数据与实际数据存储分开管理,文件系统中元数据请求占据所有请求的50%以上,因此,元数据管理问题成为分布式文件系统研究中的一个重要研究方向。In the case of such rapid growth of data, massive data storage technology has become the technical basis to support the rapid growth of data. On the one hand, it puts a severe test on the storage, calculation and extraction of information data. On the other hand, it puts more stringent requirements on the disaster recovery system, backup and archiving of information data. In turn, distributed storage technology has emerged. The existing research on distributed file system is mainly divided into metadata and actual data storage. The metadata request in the file system occupies more than 50% of all requests. Therefore, the metadata management problem becomes an important issue in the research of distributed file system. research direction.
Many current distributed file systems use caching to make metadata access and storage efficient. Any user operation on a data object, such as adding, deleting, or renaming, necessarily triggers a metadata save operation. When operations are frequent and the changed metadata records are widely scattered, writing from the in-memory mirror buffer to the save file of the corresponding data table requires locating each record by its record number, so the metadata disk incurs a large number of random read/write IO operations. For files whose table records are very sparsely distributed, this greatly increases the number of internal interactions in the metadata management system and the random read/write IO on the metadata disk. The metadata disk becomes busy, saving takes too long, and metadata can be lost. In addition, existing saving approaches sort by record number and traverse the data to be saved from smallest to largest every time. When the amount of data to be saved is too large, data near the end may go unsaved for several hours, causing data loss and degrading the system's access performance and data integrity.
Summary of the Invention
The technical problem to be solved by the embodiments of the present invention is to provide a saving method applied to a distributed file system, together with a saving apparatus and system. A save snapshot period is preset, and a current bitmap and a snapshot bitmap corresponding to a file system table are maintained in memory to record whether each record has been modified. Taking a snapshot of the bitmap once per period extends the save period, so that the same record or file block needs to be saved only once per period. Saving is performed in batches in descending order of priority, which increases the aggregation degree of the written data, reduces the amount of save IO, and protects data integrity and accessibility.
To solve the above technical problem, an embodiment of the present invention provides a saving method applied to a distributed file system. A save snapshot period is preset, and two bitmaps are created in advance for a file system table: a current bitmap representing the metadata change records in the current save snapshot period, and a snapshot bitmap representing the metadata change records in the previous save snapshot period. The saving method includes the steps of:
receiving metadata change records in real time, and updating the corresponding current bitmap in real time according to the change records;
determining whether the save snapshot period has been reached and, if so, swapping the updated current bitmap with the snapshot bitmap to obtain a new current bitmap and a new snapshot bitmap, saving according to the new snapshot bitmap obtained by the swap, and at the same time using the new current bitmap to start recording new metadata change records.
Saving is performed according to the save priority sequence of the data segments corresponding to the new snapshot bitmap.
The save priority sequence of the data segments is computed by the steps of:
calculating the data aggregation degree of each data segment according to the new snapshot bitmap obtained by the swap;
determining whether the data aggregation degree of each data segment is greater than or equal to a preset data aggregation degree threshold, thereby obtaining a plurality of first data segments whose data aggregation degree is greater than or equal to the threshold and a plurality of second data segments whose data aggregation degree is less than the threshold;
arranging the first data segments according to a preset rule to obtain the corresponding save priority sequence.
According to an exemplary embodiment, the records corresponding to each second data segment are extracted and written to a log file, and each second data segment is checked at a preset check period until its data aggregation degree is greater than or equal to the preset data aggregation degree threshold, at which point the second data segment is placed into the corresponding save priority sequence according to the preset rule.
The preset rule is to arrange the data segments in descending order of data aggregation degree; and/or one check period equals two save snapshot periods.
Correspondingly, an embodiment of the present invention further provides a saving apparatus applied to a distributed file system, including:
a processing module, configured to preset a save snapshot period and to create in advance, for a file system table, a current bitmap representing the metadata change records in the current save snapshot period and a snapshot bitmap representing the metadata change records in the previous save snapshot period;
a data access module, configured to receive, in real time, metadata change records input by a user;
an update module, configured to update the current bitmap in real time according to the metadata change records received by the data access module;
a saving module, configured to, when the save snapshot period is reached, swap the snapshot bitmap with the updated current bitmap to obtain a new current bitmap and a new snapshot bitmap, save according to the new snapshot bitmap, and at the same time trigger the update module to update the new current bitmap according to new metadata change records.
The saving module includes:
a determining unit, configured to determine whether the save snapshot period has been reached;
a swap unit, configured to, when the determining unit determines that the save snapshot period has been reached, swap the updated current bitmap with the snapshot bitmap to obtain a new current bitmap and a new snapshot bitmap;
a priority sorting unit, configured to compute the save priority sequence of the data segments according to the new snapshot bitmap obtained by the swap;
a saving thread unit, configured to save the corresponding data segments according to the save priority sequence.
The processing module is further configured to preset a data aggregation degree threshold, and the priority sorting unit includes:
a data aggregation degree calculation subunit, configured to calculate the data aggregation degree of each corresponding data segment according to the new snapshot bitmap obtained by the swap;
a comparison subunit, configured to compare the data aggregation degree of each data segment with the preset data aggregation degree threshold, thereby obtaining a plurality of first data segments whose data aggregation degree is greater than or equal to the threshold and a plurality of second data segments whose data aggregation degree is less than the threshold;
a sorting subunit, configured to arrange, based on the comparison result of the comparison subunit and according to a preset rule, the first data segments whose data aggregation degree is greater than or equal to the preset threshold, to obtain the corresponding save priority sequence.
According to an exemplary embodiment, the priority sorting unit further includes:
a record writing subunit, configured to extract the records corresponding to each second data segment and write them to a log file; the comparison subunit is further configured to check each second data segment at a preset check period until its data aggregation degree is greater than or equal to the preset data aggregation degree threshold, and then to trigger the sorting subunit to place the second data segment into the corresponding save priority sequence according to the preset rule.
Based on the above saving apparatus, an embodiment of the present invention further provides a distributed file system that includes any of the saving apparatuses described above and whose saving method is the same as the saving method described above.
An embodiment of the present invention further provides a computer storage medium storing one or more computer-executable programs which, when executed by a computer, cause the computer to perform the above saving method applied to a distributed file system.
Implementing the embodiments of the present invention has the following beneficial effects:
The saving method and saving apparatus provided by the embodiments of the present invention preset a save snapshot period and maintain in memory a current bitmap and a snapshot bitmap corresponding to a file system table, to record whether metadata has been modified. Taking a snapshot of the bitmap once per period extends the save period, so that the same record or file block needs to be saved only once per period. Saving is performed in batches in descending order of priority, which increases the aggregation degree of the written data, reduces the amount of save IO, and protects data integrity and accessibility.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the basic architecture of the distributed file system on which the present invention is based;
FIG. 2 is a flowchart of an embodiment of a saving method applied to a distributed file system according to the present invention;
FIG. 3 is a schematic diagram of swapping the current bitmap with the snapshot bitmap in step S15 of FIG. 2;
FIG. 4 is a flowchart of an embodiment of step S17 of FIG. 2;
FIG. 5 is a sequence diagram of an embodiment of writing a file based on the saving method of FIG. 2;
FIG. 6 is a functional block diagram of an embodiment of a saving apparatus applied to a distributed file system according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The present invention is applied to a distributed file system (DFS) whose basic architecture is shown in FIG. 1. When a user writes a file through the file access client (FAC), that is, changes metadata, the FAC first sends the full path of the file to the directory tree server (DTS) and obtains a globally unique identifier, FILEID, and the file location register (FLR) assigned to the file. Next, the FAC sends a write-file request to that FLR and obtains the replica location information of the file's data blocks (a file is usually split into data blocks of the same size, for example 64M each, and each block is called a CHUNK). Finally, the FAC establishes a connection with the data storage server and transfers the data blocks to it, and the blocks are written to disk.
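The chunking mentioned above can be illustrated with a minimal sketch. The 64M chunk size comes from the text; the function name `split_into_chunks` and the handling of a shorter final chunk are assumptions made only for illustration, since the source does not say how a trailing partial block is treated.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64M per CHUNK, the example size given in the text


def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (offset, length) pairs covering a file of file_size bytes.

    Assumption: the last chunk may be shorter than chunk_size.
    """
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


# A hypothetical 150M file is covered by two full chunks and one 22M chunk.
chunks = split_into_chunks(150 * 1024 * 1024)
```

Each pair would correspond to one data block whose replica locations the FAC asks the FLR for.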
The metadata in this DFS is organized as follows: the DTS manages the file namespace, the allocation of the globally unique FILEID, and the allocation of FLRs; the FLR manages file attributes (such as FILEID, file size, file type, access permissions, uid, and gid) and the storage locations of file contents.
The present invention is based on the distributed file system described above. A save snapshot period is set in advance, and two bitmaps are set up for the file system table: a current bitmap, representing the metadata change records in the current save snapshot period, and a snapshot bitmap, representing the metadata change records in the previous save snapshot period. Only when the save snapshot period is reached are the two bitmaps swapped, after which saving proceeds according to the swapped snapshot bitmap. Save snapshots are thus taken on a timer, so the same file or data block is snapshotted and saved only once per save snapshot period. This avoids the data loss that arises in existing approaches, where a snapshot is taken every time a record needs saving. When saving, the present invention follows a priority sequence and writes to disk in batches from high priority to low, which increases the aggregation degree of the written data, reduces the amount of save IO, and protects data integrity and accessibility.
Embodiment 1
Referring to FIG. 2, which is a flowchart of an embodiment of a saving method in a distributed file system according to the present invention. To avoid saving once for every change record produced, this embodiment presets a save snapshot period so that saving happens on a timer, and creates in advance, for the file system table, a current bitmap and a snapshot bitmap of the same size. The saving method in this embodiment includes the steps of:
S11: receive metadata change records in real time, and update the corresponding current bitmap in real time according to the change records.
In this embodiment, metadata change records are the N change records produced by operations such as adding new metadata, modifying existing metadata, or deleting existing metadata.
In this embodiment, presetting the save snapshot period means that a save snapshot is taken only when a save snapshot period is reached. Save snapshots are therefore taken in timed batches, which avoids the problem in existing approaches of taking a snapshot every time a record needs saving.
In this embodiment, the current bitmap cur_bit represents the metadata change records in the current save snapshot period, and the snapshot bitmap snap_bit represents the records changed in the previous save period. In an exemplary embodiment, the size of the current bitmap and the snapshot bitmap is proportional to the capacity of the file system table, and 0 and 1 are used to record whether the corresponding record has been modified: when metadata changes, the current bitmap is traversed and the positions corresponding to the change records are set to 1 in order, as shown in FIG. 3. In addition, a mirror cache must be created for the table when the table is created; at system power-on the mirror cache is initialized to 0, and the corresponding mirror positions are also initialized to 0.
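A minimal sketch of such a change bitmap follows, assuming one bit per record number of the file system table. The class name `ChangeBitmap` and its methods are hypothetical; only the name cur_bit and the 0/1 convention come from the text.

```python
class ChangeBitmap:
    """One bit per record slot of a file system table; 1 means 'changed'."""

    def __init__(self, table_capacity):
        self.capacity = table_capacity
        # bytearray starts zero-filled, matching the power-on initial state.
        self.bits = bytearray((table_capacity + 7) // 8)

    def mark(self, record_no):
        """Set the bit for a changed record; marking twice is harmless."""
        if not 0 <= record_no < self.capacity:
            raise IndexError(record_no)
        self.bits[record_no // 8] |= 1 << (record_no % 8)

    def is_set(self, record_no):
        return bool(self.bits[record_no // 8] & (1 << (record_no % 8)))

    def changed_records(self):
        """Record numbers whose bit is 1, in ascending order."""
        return [n for n in range(self.capacity) if self.is_set(n)]


cur_bit = ChangeBitmap(table_capacity=32)
for record_no in (3, 17, 3):  # record 3 is modified twice in this period
    cur_bit.mark(record_no)
changed = cur_bit.changed_records()
```

Because a bit can only be set once, a record modified several times within a period still appears only once among the records to save, which is the property the method relies on.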
S13: determine whether the save snapshot period has been reached. If so, perform step S15; otherwise, perform step S11.
In this embodiment, the save snapshot period is set in advance and implemented with a timer. When the timer expires, that is, when the save snapshot period is reached, it feeds back a message, such as a pulse signal, to trigger the saving thread. Whether the save snapshot period has been reached can therefore be determined directly from the message fed back by the timer.
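The period check of step S13 can be sketched as follows. Note that this is a simplification: the text describes a timer that pushes a message to the saving thread, whereas the sketch polls a deadline. The class name `SavePeriodTimer` and the injectable `clock` parameter are assumptions introduced so the behaviour can be exercised without waiting.

```python
import time


class SavePeriodTimer:
    """Polling stand-in for the timer that signals a save snapshot period."""

    def __init__(self, period_s, clock=time.monotonic):
        self.period_s = period_s
        self.clock = clock
        self.deadline = clock() + period_s

    def expired(self):
        """Return True once when a period has elapsed, then re-arm."""
        if self.clock() >= self.deadline:
            self.deadline += self.period_s
            return True
        return False


# Drive the timer with a fake clock instead of real time.
fake_now = [0.0]
timer = SavePeriodTimer(60.0, clock=lambda: fake_now[0])
before_period = timer.expired()   # still inside the first period
fake_now[0] = 61.0
at_period = timer.expired()       # period reached: go to step S15
right_after = timer.expired()     # re-armed: back to step S11
```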
S15: swap the snapshot bitmap with the updated current bitmap to obtain a new current bitmap and a new snapshot bitmap.
In this embodiment, the current bitmap represents the metadata change records in the current save snapshot period and the snapshot bitmap represents those in the previous save snapshot period. Once the save snapshot period is reached, the two can therefore be swapped directly, as shown in FIG. 3: the snapshot bitmap is cleared and becomes the new current bitmap, which immediately starts recording new metadata change records, and the current bitmap becomes the new snapshot bitmap, which serves as the basis for saving. From the completion of the swap until the next save snapshot period is reached, save operations can be performed directly on the file system table records marked in the new snapshot bitmap.
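The swap in step S15 might look like the sketch below, with each bitmap held as a `bytearray`. The function name `swap_bitmaps` is hypothetical, and the sketch assumes that everything marked in the old snapshot bitmap has already been saved before it is cleared; the source does not state what happens if a save is still in progress.

```python
def swap_bitmaps(cur_bit, snap_bit):
    """Return (new_cur_bit, new_snap_bit) at a save snapshot period boundary.

    Assumption: the old snapshot bitmap has been fully saved, so clearing it
    loses nothing.
    """
    for i in range(len(snap_bit)):
        snap_bit[i] = 0           # cleared, ready to record new changes
    return snap_bit, cur_bit      # roles are exchanged, no copying needed


cur_bit = bytearray([0b00001010, 0b00000001])   # changes of this period
snap_bit = bytearray([0b11110000, 0b00000000])  # changes of the last period
cur_bit, snap_bit = swap_bitmaps(cur_bit, snap_bit)
```

Exchanging references rather than copying bits keeps the pause at the period boundary short, so change recording can resume on the new current bitmap almost immediately.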
S17: save according to the new snapshot bitmap obtained by the swap while using the new current bitmap to record new metadata change records, then perform step S13.
To reduce the amount of save IO, saving in this embodiment is performed in batches according to the save priority sequence of the data segments; that is, the amount of save IO is reduced by increasing the aggregation degree of the written data. Referring to FIG. 4, step S17 includes the steps of:
S171: calculate the data aggregation degree of each data segment according to the new snapshot bitmap obtained by the swap.
In this embodiment, data marked in the snapshot bitmap is saved in units of 16K, each called a data segment (DATA). The ratio of the data segment size to the record length of the file system table, TupleLen, is the maximum number of records the data segment can hold, MaxTupleNumber. The data aggregation degree of a data segment is the ratio of the number of changed records in the bitmap (records whose bit is set to 1) to MaxTupleNumber, multiplied by 100. The data aggregation degree DP of a data segment serves as the parameter for ordering that segment's save priority.
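As a worked illustration of the definition above, the sketch below computes DP for each 16K data segment. Representing the snapshot bitmap as a flat list of 0/1 flags in record-number order, using floor division for MaxTupleNumber, and the function names are all assumptions made for the example.

```python
SEGMENT_SIZE = 16 * 1024  # bytes per data segment (DATA), from the text


def max_tuple_number(tuple_len):
    """MaxTupleNumber: records that fit in one segment (floor division assumed)."""
    return SEGMENT_SIZE // tuple_len


def aggregation_degrees(snap_bits, tuple_len):
    """Return DP = changed / MaxTupleNumber * 100 for each data segment."""
    per_segment = max_tuple_number(tuple_len)
    degrees = []
    for start in range(0, len(snap_bits), per_segment):
        changed = sum(snap_bits[start:start + per_segment])
        degrees.append(changed * 100 / per_segment)
    return degrees


# With a hypothetical TupleLen of 4096 bytes, each segment holds 4 records.
bits = [1, 1, 0, 0,   0, 0, 0, 1,   1, 1, 1, 1]
dp = aggregation_degrees(bits, tuple_len=4096)
```

Here the three segments have 2, 1, and 4 changed records out of 4, giving aggregation degrees of 50, 25, and 100.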
S173: compare the data aggregation degree of each data segment with the preset data aggregation degree threshold one by one, obtaining a plurality of first data segments whose data aggregation degree is greater than or equal to the threshold and a plurality of second data segments whose data aggregation degree is less than the threshold.
S175: arrange the first data segments according to the preset rule to obtain the corresponding save priority sequence, then perform step S179.
In this embodiment, the data aggregation degree threshold is preset, for example to 30. When the data aggregation degrees of multiple data segments are found to be greater than or equal to the preset threshold, these first data segments need to be arranged in a definite order to obtain the save priority sequence for the first data segments.
In this embodiment, the preset rule is to arrange the first data segments, whose data aggregation degree is greater than or equal to the preset threshold, in descending order of data aggregation degree. Arranging them in ascending order, or according to other rules, is of course also possible.
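Steps S173 and S175 can be sketched as a partition followed by a descending sort. The threshold of 30 is the example value from the text; the function name `build_save_priority` and the use of segment indices as identifiers are assumptions.

```python
DP_THRESHOLD = 30  # example threshold value given in the text


def build_save_priority(degrees, threshold=DP_THRESHOLD):
    """Split segments by threshold and order the first group for saving.

    degrees: aggregation degree per segment, indexed by segment number.
    Returns (priority, deferred): first data segments in descending order of
    aggregation degree, and second data segments in segment order.
    """
    first = [(seg, dp) for seg, dp in enumerate(degrees) if dp >= threshold]
    second = [seg for seg, dp in enumerate(degrees) if dp < threshold]
    # sorted() is stable, so equal degrees keep their segment order.
    first = sorted(first, key=lambda item: item[1], reverse=True)
    return [seg for seg, _ in first], second


priority, deferred = build_save_priority([50.0, 25.0, 100.0, 30.0, 10.0])
```

Segment 2 (DP 100) is saved first, then segment 0 (DP 50), then segment 3 (DP exactly at the threshold); segments 1 and 4 are left for the log-and-recheck path of step S177.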
S177: extract the records corresponding to each second data segment and write them to a log file, and check each second data segment at a preset check period until its data aggregation degree is greater than or equal to the preset data aggregation degree threshold; then place the second data segment into the corresponding save priority sequence according to the preset rule and perform step S179.
In general, second data segments whose data aggregation degree was below the preset threshold in the previous save snapshot period reach or exceed the threshold after one more save snapshot period. In this embodiment, the second data segments are therefore checked once every two save snapshot periods to determine whether their data aggregation degree has reached the preset threshold; that is, one check period is set to two save snapshot periods. Depending on actual conditions, a check period may also be set to three or more save snapshot periods.
In this embodiment, when a check period has elapsed and a second data segment is found to have a data aggregation degree greater than or equal to the preset threshold, this means that a data segment classified as a second data segment in the first save snapshot period has, through further changes over time, increased its data aggregation degree, so that in the second save snapshot period it is classified as a first data segment. It is therefore placed directly, according to its data aggregation degree, into the corresponding save priority sequence (here, the priority sequence built in the second save snapshot period from the data aggregation degrees of the first data segments).
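The deferral and recheck of second data segments might be sketched as below. The check period of two save snapshot periods and the threshold of 30 are the example values from the text; the class `DeferredSegments`, the in-memory list standing in for the log file, and the way fresh aggregation degrees are passed in are all assumptions, since the source does not specify how the degree of a deferred segment is recomputed.

```python
CHECK_EVERY = 2    # save snapshot periods per check period (example value)
DP_THRESHOLD = 30  # example threshold value


class DeferredSegments:
    """Tracks second data segments until they reach the threshold."""

    def __init__(self):
        self.log = []        # stand-in for the log file: (segment, record)
        self.pending = {}    # segment number -> degree when it was deferred
        self.periods_seen = 0

    def defer(self, seg, dp, records):
        """Log the changed records of a below-threshold segment."""
        self.log.extend((seg, rec) for rec in records)
        self.pending[seg] = dp

    def on_save_period(self, current_dp):
        """Call once per save snapshot period with the latest degrees.

        Returns the segments promoted to the save priority sequence, in
        descending order of aggregation degree; empty between checks.
        """
        self.periods_seen += 1
        if self.periods_seen % CHECK_EVERY != 0:
            return []
        promoted = [s for s in self.pending
                    if current_dp.get(s, 0) >= DP_THRESHOLD]
        promoted.sort(key=lambda s: current_dp[s], reverse=True)
        for s in promoted:
            del self.pending[s]
        return promoted


deferred = DeferredSegments()
deferred.defer(1, 25.0, records=[4, 5])
deferred.defer(4, 10.0, records=[16])
deferred.defer(7, 5.0, records=[28])
first_period = deferred.on_save_period({1: 35.0, 4: 50.0, 7: 5.0})
second_period = deferred.on_save_period({1: 35.0, 4: 50.0, 7: 5.0})
```

In this run nothing is promoted after the first period because no check is due; after the second period segments 4 and 1 are promoted and segment 7 stays pending.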
S179: save according to the save priority sequence, then perform step S11.
In this embodiment, the swap yields a new current bitmap (the cleared snapshot bitmap), so metadata change records continue to be recorded in real time in the new current bitmap while saving is in progress. When the next save snapshot period is reached, this current bitmap is swapped in turn, yielding yet another new current bitmap, and the cycle repeats.
In this embodiment, a current bitmap records metadata change records, and when the save snapshot period is reached the current bitmap is immediately swapped with the snapshot bitmap, yielding a new current bitmap that records new metadata change records. Until the next save snapshot period arrives, batch saving can proceed directly from the new snapshot bitmap obtained by the swap. Snapshotting the current bitmap thus extends the save period, so that the same record or file block needs to be saved only once per period. Saving is performed in batches in descending order of priority, which increases the aggregation degree of the written data, reduces the amount of save IO, and protects data integrity and accessibility.
Embodiment 2
As the above embodiment shows, metadata changes include additions, such as writing a file or metadata. The saving method during a file write is therefore described in detail below with reference to the drawings and exemplary embodiments.
Referring to FIG. 5, which is a sequence diagram of an embodiment of writing a file based on the saving method of Embodiment 1. In this embodiment, writing a file in the distributed file system includes the steps of:
S21: the file access client FAC sends a write-file request to the directory tree server DTS.
In this embodiment, the user sends the write-file request to the DTS through the FAC, and the request carries the full path of the file object to be written.
S22: the DTS determines whether the file exists. If it does not exist, the DTS generates a new file identifier FILEID and assigns an available FLR to it, generates a dictionary table record to store the file name, generates a file FILEID record storing information such as the FILEID and FLRID, and then returns a creation-success message to the FAC. If the file already exists, the DTS reports an error to the FAC.
In this embodiment, the DTS determines whether the file exists by looking it up in the namespace.
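A minimal sketch of the DTS side of step S22 follows, assuming that a path already present in the namespace is rejected. The class `DirectoryTreeServer`, the dictionary-based namespace, the sequential FILEID counter, and the round-robin choice of FLR are all hypothetical; the source only says that a new FILEID is generated and an available FLR is assigned.

```python
import itertools


class DirectoryTreeServer:
    """Toy DTS: namespace lookup, FILEID allocation, FLR assignment."""

    def __init__(self, flr_ids):
        self.namespace = {}      # full path -> FILEID (dictionary table)
        self.file_records = {}   # FILEID -> record holding the FLRID
        self._next_id = itertools.count(1)
        self._flr_cycle = itertools.cycle(flr_ids)  # assumed round-robin

    def create(self, full_path):
        """Create the metadata for a new file, or fail if it exists."""
        if full_path in self.namespace:
            raise FileExistsError(full_path)
        file_id = next(self._next_id)
        flr_id = next(self._flr_cycle)
        self.namespace[full_path] = file_id
        self.file_records[file_id] = {"flr_id": flr_id}
        return file_id, flr_id


dts = DirectoryTreeServer(["flr-a", "flr-b"])
first = dts.create("/a/b.txt")
second = dts.create("/a/c.txt")
```

The returned pair is what the FAC needs for step S23: the FILEID to name the file and the FLR to contact next.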
S23: after receiving the message, the FAC sends a create-file message to the corresponding file location register FLR.
S24: the FLR determines whether the file already exists. If it exists, the FLR reports that it already exists; if not, the FLR creates a FILE record storing information such as the FILEID and creation time, and returns a create-file success response to the FAC.
In this embodiment, the FLR looks up the file by traversing on the FILEID to determine whether it exists.
S25: the FAC receives the create-file response and sends a create-file-block request to the FLR using the FILEID.
S26: the FLR selects the destination disk for the file block according to the storage rules, generates the record corresponding to the file block, and returns to the FAC the information of the disk on which the file block is to be created.
S27: according to the returned disk information, the FAC creates the file block on the FAS and writes the file content.
S28: the FAS writes the file in timed batches and, after writing, replies to the FAC with the write result and the file block size information.
本实施例中,该FAS按照定时批量的方式写入文件,是指采用上述实施例一中的方式,即实时接收写入的文件内容,并实时更新对应的预设当前位图,然后通过周期性的将更新后的当前位图与快照位图进行置换,以及根据置换后得到的快照位图进行存盘,直至将整个文件内容全部写入,即将文件内容是周期性、分批次的写入,并且写入过程中按照各个数据段对应的优先级序列来写入的。In this embodiment, the FAS writes the file according to the timed batch mode, which means that the file content in the first embodiment is received in real time, and the corresponding preset current bitmap is updated in real time, and then the cycle is adopted. The current bitmap and the snapshot bitmap are replaced by the updated bitmap, and the snapshot bitmap obtained after the replacement is saved until the entire file content is written, that is, the file content is periodically and batch-written. And the writing process is written according to the priority sequence corresponding to each data segment.
S29: The FAC reports the write result, the file block size, and related information to the FLR.
S210: The FLR records the reported content in the file block record and replies to the FAC.
In this embodiment, when the FAC receives the reply from the FLR, the file write is complete, and the FAC sends a write-complete response to the user.
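The existence check and identifier allocation in step S22 can be sketched as follows. This is a minimal illustration only: the class name `DirectoryServer`, the round-robin FLR allocation, and the shape of the reply dictionaries are assumptions made for the example and are not specified in this document.

```python
import itertools


class DirectoryServer:
    """Hypothetical stand-in for the DTS namespace logic of step S22."""

    def __init__(self, flr_ids):
        self._namespace = {}      # dictionary table: file name -> FILEID
        self._file_records = {}   # FILEID -> record holding FILEID and FLRID
        self._next_id = itertools.count(1)
        # Assumption: FLRs are handed out round-robin; the document only
        # says that an "available" FLR is allocated.
        self._flrs = itertools.cycle(flr_ids)

    def create_file(self, name):
        # Look the name up in the namespace to decide whether it exists.
        if name in self._namespace:
            return {"ok": False, "error": "file already exists"}
        file_id = next(self._next_id)
        flr_id = next(self._flrs)
        self._namespace[name] = file_id
        self._file_records[file_id] = {"FILEID": file_id, "FLRID": flr_id}
        return {"ok": True, "FILEID": file_id, "FLRID": flr_id}


dts = DirectoryServer(flr_ids=["flr-0", "flr-1"])
first = dts.create_file("/data/a.log")    # new file: success reply
second = dts.create_file("/data/a.log")   # same name again: error reply
```

In this sketch the FAC would use the returned FLRID to address the create-file message of step S23.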
Embodiment 3
Corresponding to the saving method described above, the present invention also provides a distributed file system, which is described in detail below with reference to the accompanying drawings and exemplary embodiments.
Referring to FIG. 6, a saving apparatus applied to a distributed file system according to the present invention includes:
a processing module 61, configured to preset a save snapshot period and, for the file system table, to create in advance a current bitmap that records metadata changes within the current save snapshot period and a snapshot bitmap that records metadata changes within the previous save snapshot period;
a data access module 62, configured to receive, in real time, change records of metadata input by the user;
an update module 63, configured to update the current bitmap in real time according to the change records received by the data access module 62; in this embodiment, the current bitmap is either the preset current bitmap in its initial state after system power-on, or the new current bitmap obtained from a swap while the system is running;
a save module 64, configured to, when the save snapshot period arrives, swap the snapshot bitmap, which records the metadata changes of the previous save period, with the current bitmap as updated in real time by the update module 63, obtaining a new current bitmap and a new snapshot bitmap, and to save to disk according to the new snapshot bitmap; at the same time, it triggers the update module to update the new current bitmap according to new metadata change records.
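The interplay of the update module and the save module can be sketched as a pair of buffers that trade places at each period boundary. This is a minimal sketch under stated assumptions: the class `BitmapPair` and its method names are hypothetical, and one byte per record is used instead of one bit purely for readability.

```python
class BitmapPair:
    """Hypothetical current/snapshot bitmap pair for one file system table."""

    def __init__(self, num_records):
        # One flag per record; 1 means "modified during the period".
        self.current = bytearray(num_records)
        self.snapshot = bytearray(num_records)

    def mark_dirty(self, record_index):
        # Role of the update module: record a metadata change in real time.
        self.current[record_index] = 1

    def swap(self):
        # Role of the save module at a period boundary: the old snapshot
        # bitmap is cleared and becomes the new current bitmap, and the old
        # current bitmap becomes the new snapshot bitmap.
        old_snapshot = self.snapshot
        for i in range(len(old_snapshot)):
            old_snapshot[i] = 0
        self.snapshot = self.current
        self.current = old_snapshot
        # Records that must be saved during the coming period.
        return [i for i, flag in enumerate(self.snapshot) if flag]


pair = BitmapPair(8)
pair.mark_dirty(2)
pair.mark_dirty(5)
pair.mark_dirty(2)        # a repeated change to record 2 sets the same flag
to_save = pair.swap()     # records 2 and 5 are pending
pair.mark_dirty(7)        # new changes land in the fresh current bitmap
```

Because the swap only exchanges references and clears one buffer, new changes can keep arriving while the records listed in `to_save` are being written.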
Referring to FIG. 6, in this embodiment the save module 64 saves to disk according to the save priority sequence of the data segments. The save module 64 includes:
a judging unit 641, configured to determine whether the save snapshot period has been reached; in an exemplary embodiment, the processing module 61 sets a timer and, when the timer reaches the preset duration, sends a trigger signal to the save module 64 to trigger the save operation, so whether the save snapshot period has been reached can be determined by checking whether a trigger signal from the processing module 61 has been received;
a replacement unit 642, configured to, when the judging unit 641 determines that the save snapshot period has been reached, swap the updated current bitmap with the snapshot bitmap to obtain a new current bitmap and a new snapshot bitmap; in this embodiment, when the save snapshot period is reached, that is, after the judging unit 641 receives the trigger signal from the processing module 61, the judging unit 641 sends a trigger signal to the replacement unit 642, which then swaps the current bitmap, recording metadata changes within the current save snapshot period, with the snapshot bitmap, recording metadata changes within the previous save snapshot period (or preset in its initial state at system power-on); as shown in FIG. 3, the original snapshot bitmap is cleared and becomes the new current bitmap, and the original current bitmap becomes the new snapshot bitmap;
a priority sorting unit 644, configured to compute the save priority sequence of the data segments from the new snapshot bitmap obtained from the swap; in this embodiment, the priority sorting unit 644 includes: a data aggregation degree calculation subunit, configured to compute the data aggregation degree of each data segment from the new snapshot bitmap; a comparison subunit, configured to compare the data aggregation degree of each data segment with the data aggregation degree threshold preset by the processing module 61, yielding a plurality of first data segments whose data aggregation degree is greater than or equal to the threshold and a plurality of second data segments whose data aggregation degree is below the threshold, and further configured to check each second data segment at the check period preset by the processing module 61 and, once the data aggregation degree of a second data segment is equal to or greater than the preset threshold, to generate a trigger signal that causes the sorting subunit described below to save that second data segment into the corresponding priority queue according to the preset rule; a sorting subunit, configured to arrange the first data segments according to the preset rule, based on the comparison result of the comparison subunit, to obtain the corresponding save priority sequence; and a record writing subunit, configured to extract, based on the comparison result of the comparison subunit, the records corresponding to each second data segment and write them to a log file;
a save thread unit 643, configured to save the corresponding data segments to disk according to the computed save priority sequence.
In this embodiment, the preset rule is to arrange the first data segments, whose data aggregation degree is greater than or equal to the preset threshold, in descending order of data aggregation degree; of course, arranging them in ascending order, or according to other rules, is also possible.
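The threshold split and the descending ordering can be sketched as follows. Note the assumptions: this section does not restate how the data aggregation degree is computed, so the sketch takes it to be the fraction of modified records in a segment; the function names, the fixed segment size, and the threshold value are likewise illustrative only.

```python
def aggregation_degree(segment_flags):
    # Assumed definition: share of records in the segment marked as modified.
    return sum(segment_flags) / len(segment_flags)


def build_save_priority(snapshot_flags, segment_size, threshold):
    """Split segments by threshold and order the first segments by degree."""
    first, second = [], []
    starts = range(0, len(snapshot_flags), segment_size)
    for seg_index, start in enumerate(starts):
        degree = aggregation_degree(snapshot_flags[start:start + segment_size])
        if degree >= threshold:
            first.append((seg_index, degree))   # first data segment
        else:
            second.append(seg_index)            # second data segment (logged)
    # Preset rule of this embodiment: descending data aggregation degree.
    first.sort(key=lambda item: item[1], reverse=True)
    return [seg for seg, _ in first], second


# Three segments of four records each, with degrees 0.75, 0.25 and 1.0.
snapshot = [1, 1, 1, 0,  0, 1, 0, 0,  1, 1, 1, 1]
priority, deferred = build_save_priority(snapshot, segment_size=4, threshold=0.5)
```

Here segment 2 is saved before segment 0, while segment 1 falls below the threshold and would have its records written to the log file instead.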
In this embodiment, when, after one check period, the data aggregation degree of a second data segment is found on re-checking to be greater than or equal to the preset data aggregation degree threshold, this means that a data segment classified as a second data segment in the first save snapshot period has, after a period of changes, increased in data aggregation degree, so that in the second save snapshot period it is classified as a first data segment. It is therefore saved directly into the corresponding save priority sequence according to its data aggregation degree (the save priority sequence here being the one built in the second save snapshot period from the data aggregation degrees of the first data segments).
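The periodic re-check of second data segments can be sketched as below. The function `recheck_deferred`, the `degrees` mapping, and the sample values are hypothetical; the sketch only assumes that each segment's current data aggregation degree is available when the check period elapses.

```python
def recheck_deferred(deferred, degrees, threshold, priority):
    """Promote deferred segments whose degree now meets the threshold.

    `priority` is updated in place and kept in descending order of degree;
    the segments that remain below the threshold are returned.
    """
    still_deferred = []
    for seg in deferred:
        if degrees[seg] >= threshold:
            priority.append(seg)        # now treated as a first data segment
        else:
            still_deferred.append(seg)
    priority.sort(key=lambda seg: degrees[seg], reverse=True)
    return still_deferred


degrees = {0: 0.75, 1: 0.25, 2: 1.0, 3: 0.1}
priority = [2, 0]        # first data segments from the earlier period
deferred = [1, 3]        # second data segments from the earlier period

degrees[1] = 0.9         # segment 1 accumulated changes during the check period
deferred = recheck_deferred(deferred, degrees, threshold=0.5, priority=priority)
```

After the re-check, segment 1 sits between segments 2 and 0 in the priority sequence, and segment 3 stays deferred.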
In this embodiment, the processing module sets up in advance a current bitmap, which records metadata changes within the current period, and a snapshot bitmap, which records metadata changes within the previous save snapshot period. When the save snapshot period is reached, the current bitmap is immediately swapped with the snapshot bitmap, yielding a new current bitmap that records new metadata changes; meanwhile, before the next save snapshot period arrives, batch saving can proceed directly from the new snapshot bitmap. Taking a snapshot of the current bitmap thus lengthens the save cycle, so that within one period the same record or file block needs to be saved only once. Saving is done in batches, in descending order of priority, which increases the aggregation of file writes, reduces the volume of save I/O, and preserves data integrity and accessibility.
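The claim that a record changed several times in one period is saved only once can be illustrated with a small count of disk writes. Both functions are hypothetical models written for this comparison, not part of the described apparatus, and the update sequence is made-up sample input.

```python
def writes_without_bitmap(updates):
    # Model in which every change is written to disk immediately.
    return len(updates)


def writes_with_bitmap(updates, num_records):
    # Model of one save snapshot period: each change only sets a flag, and
    # each flagged record is written once when the period ends.
    current = [0] * num_records
    for record_index in updates:
        current[record_index] = 1
    return sum(current)


# Six changes within one period, touching only records 3 and 7.
updates = [3, 3, 3, 7, 3, 7]
immediate = writes_without_bitmap(updates)
batched = writes_with_bitmap(updates, num_records=8)
```

In this toy input, six immediate writes collapse into two batched writes, one per modified record.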
Embodiment 4
Based on the saving method and saving apparatus applied to a distributed file system described above, the present invention also provides a distributed file system that includes the saving apparatus of Embodiment 3. Its saving method and principle are the same as those of Embodiments 1, 2, and 3, and are not repeated here.
What is disclosed above is merely a preferred embodiment of the present invention and of course cannot be used to limit the scope of the rights of the present invention. Those of ordinary skill in the art will understand that implementations of all or part of the processes of the above embodiments, and equivalent changes made in accordance with the claims of the present invention, still fall within the scope covered by the invention.
Industrial Applicability
The technical solution provided by the embodiments of the present invention can be applied in the field of computer storage technology. In the saving method and saving apparatus provided by the technical solution, a save snapshot period is preset, and a current bitmap and a snapshot bitmap corresponding to the file system table are maintained in memory to record whether metadata has been modified. Taking a snapshot of the current bitmap lengthens the save cycle, so that within one period the same record or file block needs to be saved only once. Saving is done in batches, in descending order of priority, which increases the aggregation of file writes, reduces the volume of save I/O, and preserves data integrity and accessibility.

Claims (11)

  1. A saving method applied to a distributed file system, wherein a save snapshot period is preset and, for a file system table, a current bitmap for recording metadata changes within the current save snapshot period and a snapshot bitmap for recording metadata changes within the previous save snapshot period are created in advance, the saving method comprising the steps of:
    receiving change records of metadata in real time, and updating the corresponding current bitmap in real time according to the change records;
    determining whether the save snapshot period has been reached and, if so, swapping the snapshot bitmap with the updated current bitmap to obtain a new current bitmap and a new snapshot bitmap, saving to disk according to the new snapshot bitmap and, at the same time, using the new current bitmap to start recording new change records of the metadata.
  2. The saving method according to claim 1, wherein saving to disk is performed according to a save priority sequence of the data segments corresponding to the new snapshot bitmap.
  3. The saving method according to claim 2, wherein computing the save priority sequence of the data segments comprises the steps of:
    computing the data aggregation degree of each data segment from the new snapshot bitmap obtained from the swap;
    comparing the data aggregation degree of each data segment, one by one, with a preset data aggregation degree threshold, to obtain a plurality of first data segments whose data aggregation degree is greater than or equal to the preset data aggregation degree threshold and a plurality of second data segments whose data aggregation degree is less than the preset data aggregation degree threshold;
    arranging the first data segments according to a preset rule, thereby obtaining the corresponding save priority sequence.
  4. The saving method according to claim 3, wherein computing the save priority sequence of the data segments further comprises the step of:
    extracting the records corresponding to each second data segment and writing them to a log file; and checking the second data segments at a preset check period until the data aggregation degree of a second data segment is equal to or greater than the preset data aggregation degree threshold, whereupon the second data segment is saved into the corresponding save priority sequence according to the preset rule.
  5. The saving method according to claim 3 or 4, wherein the preset rule is to arrange the data segments in descending order of aggregation degree; and/or one check period equals two save snapshot periods.
  6. A saving apparatus applied to a distributed file system, comprising:
    a processing module, configured to preset a save snapshot period and, for a file system table, to create in advance a current bitmap for recording metadata changes within the current save snapshot period and a snapshot bitmap for recording metadata changes within the previous save snapshot period;
    a data access module, configured to receive, in real time, change records of metadata input by a user;
    an update module, configured to update the current bitmap in real time according to the change records of metadata received by the data access module;
    a save module, configured to, when the save snapshot period arrives, swap the snapshot bitmap with the updated current bitmap to obtain a new current bitmap and a new snapshot bitmap, save to disk according to the new snapshot bitmap and, at the same time, trigger the update module to update the new current bitmap according to new change records of the metadata.
  7. The saving apparatus according to claim 6, wherein the save module comprises:
    a judging unit, configured to determine whether the save snapshot period has been reached;
    a replacement unit, configured to, when the judging unit determines that the save snapshot period has been reached, swap the updated current bitmap with the snapshot bitmap to obtain a new current bitmap and a new snapshot bitmap;
    a priority sorting unit, configured to compute the save priority sequence of the data segments from the new snapshot bitmap obtained from the swap;
    a save thread unit, configured to save the corresponding data segments to disk according to the save priority sequence.
  8. The saving apparatus according to claim 7, wherein the processing module is further configured to preset a data aggregation degree threshold, and the priority sorting unit comprises:
    a data aggregation degree calculation subunit, configured to compute the data aggregation degree of each corresponding data segment from the new snapshot bitmap obtained from the swap;
    a comparison subunit, configured to compare the data aggregation degree of each data segment with the preset data aggregation degree threshold, to obtain a plurality of first data segments whose data aggregation degree is greater than or equal to the preset data aggregation degree threshold and a plurality of second data segments whose data aggregation degree is less than the preset data aggregation degree threshold;
    a sorting subunit, configured to arrange the first data segments according to a preset rule, based on the comparison result of the comparison subunit, to obtain the corresponding save priority sequence.
  9. The saving apparatus according to claim 8, wherein the priority sorting unit further comprises:
    a record writing subunit, configured to extract, based on the comparison result of the comparison subunit, the records corresponding to each second data segment and write them to a log file; and
    the comparison subunit is further configured to check each second data segment at a preset check period until the data aggregation degree of the second data segment is equal to or greater than the preset data aggregation degree threshold, whereupon it triggers the sorting subunit to save the second data segment into the corresponding priority queue according to the preset rule.
  10. A distributed file system, comprising the saving apparatus according to any one of claims 6 to 9.
  11. A computer storage medium storing one or more computer-executable programs which, when executed by a computer, cause the computer to perform the saving method applied to a distributed file system according to any one of claims 1 to 5.
PCT/CN2017/106690 2016-10-31 2017-10-18 Saving method applied to distributed file system, apparatus and distributed file system WO2018077092A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610930085.2 2016-10-31
CN201610930085.2A CN108021562B (en) 2016-10-31 2016-10-31 Disk storage method and device applied to distributed file system and distributed file system

Publications (1)

Publication Number Publication Date
WO2018077092A1 true WO2018077092A1 (en) 2018-05-03

Family

ID=62024721

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/106690 WO2018077092A1 (en) 2016-10-31 2017-10-18 Saving method applied to distributed file system, apparatus and distributed file system

Country Status (2)

Country Link
CN (1) CN108021562B (en)
WO (1) WO2018077092A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108897822A (en) * 2018-06-21 2018-11-27 郑州云海信息技术有限公司 A kind of data-updating method, device, equipment and readable storage medium storing program for executing
CN111782702A (en) * 2020-06-29 2020-10-16 北京金山云网络技术有限公司 Metadata hot ranking method, device, equipment and storage medium
CN111782702B (en) * 2020-06-29 2024-05-03 北京金山云网络技术有限公司 Metadata heat sorting method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1567262A (en) * 2003-06-10 2005-01-19 联想(北京)有限公司 On-line data backup method based on data volume snapshot
US20050165722A1 (en) * 2004-01-27 2005-07-28 International Business Machines Corporation Method, system, and program for storing data for retrieval and transfer
US20050210209A1 (en) * 2004-03-22 2005-09-22 Koji Nagata Storage device and information management system
CN103116533A (en) * 2012-05-28 2013-05-22 北京智网科技股份有限公司 Snapshot implementation method
CN103593436A (en) * 2013-11-12 2014-02-19 华为技术有限公司 File merging method and device
CN104462290A (en) * 2014-11-27 2015-03-25 华为技术有限公司 File system copying method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8001580B1 (en) * 2005-07-25 2011-08-16 Netapp, Inc. System and method for revoking soft locks in a distributed storage system environment
US8769105B2 (en) * 2012-09-14 2014-07-01 Peaxy, Inc. Software-defined network attachable storage system and method
CN105589887B (en) * 2014-10-24 2020-04-03 中兴通讯股份有限公司 Data processing method of distributed file system and distributed file system

Also Published As

Publication number Publication date
CN108021562A (en) 2018-05-11
CN108021562B (en) 2022-11-18


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17863689; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 17863689; Country of ref document: EP; Kind code of ref document: A1)