CN105335098A - Storage-class memory based method for improving performance of log file system - Google Patents

Publication number
CN105335098A
Authority
CN
China
Prior art keywords
log
log recording
proceed
recording
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510621004.6A
Other languages
Chinese (zh)
Other versions
CN105335098B (en)
Inventor
曾令仿
涂盛霞
张晓祎
冯丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201510621004.6A
Publication of CN105335098A
Application granted
Publication of CN105335098B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an SCM (storage-class memory) based method for improving the performance of a journaling file system. SCM serves as the storage device for the file system's metadata and logs, which optimizes metadata reads and writes. Overwrites and append writes are distinguished: only overwrite data is written to the log, while append-write data is written directly to the file system, and the update order is controlled to ensure file system consistency, reduce logging overhead, and improve file system performance. The byte-level modifiability of SCM is exploited by computing the difference between new and old log blocks, so that the log is updated at byte granularity and the log data stream is reduced. The method mainly comprises five operations: storage system construction, log writing, garbage collection, data write-back, and system recovery. It can be used with various journaling file systems, is suitable for building large storage systems with high performance, high capacity, and high reliability, and addresses problems of journaling in existing journaling file systems such as high extra overhead, frequent metadata write-back, and slow recovery after a crash.

Description

A method for improving journaling file system performance based on storage-class memory
Technical field
The invention belongs to the technical field of computer data storage and, more specifically, relates to a method for improving journaling file system performance based on storage-class memory.
Background
Storage-class memory (SCM) is a new class of non-volatile memory that combines the characteristics of memory and storage. SCM's byte addressability and fast random access are close to those of main memory, so SCM can be attached directly to the memory bus, which leaves conventional storage devices far behind. Thanks to this performance, the emergence of SCM can not only relieve the I/O bottleneck and energy consumption problems of high-performance data centers but will also change computer system architecture. SCM is therefore not only a current focus of storage research but will remain at its core for a long time to come.
Traditional journaling is a commonly used method for guaranteeing file system consistency. Before the file system is changed, the updated data is first recorded in a log, which is usually kept in a disk partition or a log file. After a system crash, the system is recovered by writing the log records back to disk in order. Traditional journaling was designed for block-level storage devices and trades space for time. Because journaling faces the "double write" problem, in which data is written once to the log and once more to the file system, it can impose considerable overhead on the file system. In addition, traditional journaling writes to disk without making any distinction, so multiple log copies of the same data block occupy multiple log record slots and must be written back repeatedly during system recovery, which not only occupies a large amount of log space but also lengthens recovery time. Moreover, because space utilization is low, the log fills up easily, and traditional journaling has to write in-memory data back to disk frequently.
Metadata is the system data that describes the characteristics of a file. Metadata is stored in the file system in units of blocks, yet updating part of the metadata causes the whole block to be read and written, which amplifies disk I/O. Data shows that only 21% of requests are file I/O requests, while more than 50% are metadata operations. Although metadata updates are all small, disks perform poorly on small random writes. Frequent metadata access and large numbers of small writes therefore greatly reduce file system performance.
Summary of the invention
In view of the above defects of, or needs for improvement in, the prior art, the present invention proposes a method for improving journaling file system performance based on storage-class memory. Its purpose is to use SCM as the storage device for file system metadata and logs, optimize metadata access, and adopt a simplified logging mechanism combined with a garbage collection mechanism, thereby solving problems of existing journaling such as high overhead, frequent metadata write-back, and I/O bottlenecks.
To achieve the above purpose, the invention provides a method for improving journaling file system performance based on storage-class memory, comprising five operations: construction of the storage system, log writing, garbage collection, data write-back, and system recovery, wherein:
Storage system construction: a block of SCM is added to the original file system and attached to the memory bus, which it shares with main memory; the SCM is divided into a log area and a metadata area, where the log area records metadata logs and overwrite data logs and the metadata area persistently stores metadata;
Log writing: the append-write data of an in-memory transaction is written to the file system, and the metadata and overwrite data of the in-memory transaction are written to the log area;
Garbage collection: garbage collection is performed at regular intervals; if the log area runs short of space between intervals, garbage collection is also performed; that is, valid log tags at high addresses are swapped with invalid log tags at low addresses, so that all valid log records are gathered in one contiguous section of the log area starting from the log start position;
Data write-back: if log area space is still insufficient after garbage collection completes, the overwrite data cached in the SCM log area is written back to disk and the metadata is written back to the metadata area;
System recovery: after a system crash, the log records in the SCM are used to restore the system to a consistent state; after recovery, all log records in the SCM are invalidated.
In general, compared with the prior art, the above technical solution conceived by the present invention achieves the following beneficial effects:
1. The invention solves the disk I/O problem caused by frequent metadata access in existing file systems: metadata is kept permanently in SCM, which eliminates the disk I/O of frequent metadata access, and journaling is used in combination to guarantee file system consistency.
2. The invention improves the write performance of the file system. It simplifies the traditional logging mode, distinguishes append-write data from overwrite data, and uses byte-granularity updates, which reduces the log data stream and speeds up system recovery.
3. The invention refines the metadata granularity and also adopts a garbage collection mechanism, making full use of the storage-class memory and improving space utilization.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system architecture of the present invention;
Fig. 2 is a schematic diagram of the log record structure of the present invention;
Fig. 3 is a schematic diagram of the log record hash table of the present invention;
Fig. 4 is a schematic diagram of the log superblock of the present invention;
Fig. 5 is a flowchart of the log writing operation of the present invention;
Fig. 6 is a flowchart of the garbage collection operation of the present invention;
Fig. 7 is a flowchart of the data write-back operation of the present invention;
Fig. 8 is a flowchart of the system recovery operation of the present invention.
Embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.
Fig. 1 is a schematic diagram of the system construction of the present invention. In terms of hardware deployment, an SCM device must be attached to the memory bus to serve as the log device and the metadata device, sharing the memory bus with DRAM. The software comprises four operations: log writing, garbage collection, data write-back, and system recovery. In addition, the SCM log device holds three structures: the log superblock, the log record hash table, and the log records.
Implementing the method therefore first requires adding a block of SCM to the original file system and attaching it to the memory bus, which it shares with main memory. The SCM is divided into a log area and a metadata area: the log area records metadata logs and overwrite data logs, and the metadata area persistently stores metadata.
Second, three main storage structures are established in the SCM: the log superblock, the log record hash table, and the log records. The log superblock records information such as the first available log block, log capacity, remaining space, number of valid log records, log record start position, previous transaction end position, and current transaction end position. The log record hash table records the newest and second-newest log records corresponding to each data block. A log record comprises a log block and a log tag: the log block stores the buffer contents that a transaction commits to the log area, and the log tag describes the log record, including the disk block number corresponding to the log record, the log record validity flag, and the metadata flag bit.
Finally, because metadata items are small, the metadata granularity in the metadata area is set to 128 bytes to avoid write amplification and wasted space.
Fig. 2 is a schematic diagram of the log record structure of the present invention. A log record comprises two parts, a log block and a log tag. The log block stores the file system's overwritten data blocks and metadata blocks, and the log tag stores the descriptive information of the log block. The log tag consists of four fields: the disk block number blknr, the valid bit valid, the metadata flag bit meta, and the position scmnr of the log block. Locating, looking up, and invalidating log records are all done through the log tag.
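The record layout described above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the class names `LogTag` and `LogRecord`, the helper `invalidate`, and the 4096-byte block size are assumptions, and only the four tag field names (`blknr`, `valid`, `meta`, `scmnr`) come from the description.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed log block size in bytes; not stated in the source


@dataclass
class LogTag:
    blknr: int   # disk block number that the log block belongs to
    valid: bool  # validity bit
    meta: bool   # True for a metadata block, False for an overwritten data block
    scmnr: int   # position of the log block inside the SCM log area


@dataclass
class LogRecord:
    tag: LogTag
    block: bytes  # contents of the logged block


def invalidate(record: LogRecord) -> None:
    """Mark a record invalid through its tag; the log block itself is untouched."""
    record.tag.valid = False


rec = LogRecord(LogTag(blknr=42, valid=True, meta=False, scmnr=0), bytes(BLOCK_SIZE))
invalidate(rec)
```

Keeping the validity bit in the tag means that invalidating a record is a small write to the tag alone, which fits the statement that such operations go through the tag.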
Fig. 3 is a schematic diagram of the log record hash table of the present invention. The log record hash table consists of three fields: the disk block number blknr, the newest log record scmlog, and the second-newest log record old_scmlog. Each entry of the hash table records the newest log record scmlog and the second-newest log record old_scmlog corresponding to disk block blknr.
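A hash table entry of this shape could look like the sketch below. It assumes that `scmlog` and `old_scmlog` hold log-area positions and uses a plain Python `dict` keyed by `blknr` as the hash table; `HashEntry` and `note_new_record` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HashEntry:
    blknr: int                        # disk block number (the lookup key)
    scmlog: Optional[int] = None      # assumed: log-area position of the newest record
    old_scmlog: Optional[int] = None  # assumed: position of the second-newest record


def note_new_record(table: Dict[int, HashEntry], blknr: int, scmnr: int) -> HashEntry:
    """Register a record freshly appended at `scmnr` as the newest for `blknr`.

    The previous newest record, if any, becomes the second-newest one.
    """
    entry = table.setdefault(blknr, HashEntry(blknr))
    entry.old_scmlog = entry.scmlog
    entry.scmlog = scmnr
    return entry


table: Dict[int, HashEntry] = {}
note_new_record(table, blknr=7, scmnr=0)  # first update of block 7
note_new_record(table, blknr=7, scmnr=3)  # second update of block 7
```

The helper only covers updates that append a new log block. Once a second-newest record exists, the description updates that record in place with XOR instead of appending.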
Fig. 4 is a schematic diagram of the log superblock of the present invention. The log superblock comprises a magic number, which identifies the log superblock, and seven further fields: the log size, which is the size of the log; the available log space, which is how much free space the log still has; the next available log block, which is the starting point of the allocatable blocks in the log; the number of valid log records; the log start position, which is the start of the log area; the previous transaction end position, which is the end of the previous transaction and also the starting point for the sequential writes of the current transaction; and the current transaction end position, which is the end of the current transaction. The current transaction is recorded in full between the previous transaction end position and the current transaction end position. The log superblock records the basic information of the log and guarantees the atomicity of in-memory transactions.
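As a rough illustration of how the two transaction end positions delimit a transaction, the sketch below models the superblock as a Python dataclass. The field names, the magic value, and the choice to count positions in log records are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LogSuperblock:
    magic: int          # identifies the log superblock (value below is arbitrary)
    log_size: int       # size of the log
    free_space: int     # remaining free space in the log
    next_free: int      # next available log block
    valid_records: int  # number of valid log records
    log_start: int      # start position of the log area
    prev_tx_end: int    # end of the previous transaction
    cur_tx_end: int     # end of the current transaction

    def current_tx_extent(self) -> range:
        """Log positions covered by the current transaction."""
        return range(self.prev_tx_end, self.cur_tx_end)


sb = LogSuperblock(magic=0x53434D4C, log_size=1024, free_space=1024, next_free=0,
                   valid_records=0, log_start=0, prev_tx_end=0, cur_tx_end=0)
sb.cur_tx_end = 5               # five records written for the current transaction
pending = len(sb.current_tx_extent())
sb.prev_tx_end = sb.cur_tx_end  # transaction complete: the extent becomes empty
```

Under this reading, a non-empty extent after a crash would indicate a transaction whose commit had not finished, although the source does not spell out that check.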
Fig. 5 is a flowchart of the log writing operation of the present invention. The log writing operation commits an in-memory transaction to the log device and ensures file system consistency by controlling the flow of written data.
An in-memory transaction is managed by two circular doubly linked lists: one is the append-write list, which manages the buffer contents of append-write data; the other is the overwrite and metadata list, which manages the buffer contents of overwrite data and metadata. The log writing operation mainly comprises two processes:
1. Write the append-write data of the in-memory transaction to the file system;
2. Write the metadata and overwrite data of the in-memory transaction to the log area.
Append-write data can guarantee file system consistency by controlling the write order: the data is first written to the file system, then the metadata is written to the log and written back to the metadata area at a suitable time. Metadata and overwrite data, by contrast, need write-ahead logging to guarantee integrity.
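The split of a transaction into the two lists can be sketched as below. The source does not say how an append write is recognized, so this example assumes that a data write starting at or beyond the current end of the file is an append and that everything else, including all metadata, goes to the log. `Write` and `split_transaction` are hypothetical names, and plain Python lists stand in for the circular doubly linked lists.

```python
from typing import List, NamedTuple, Tuple


class Write(NamedTuple):
    offset: int                # byte offset of the write within the file
    data: bytes
    is_metadata: bool = False


def split_transaction(file_size: int, writes: List[Write]) -> Tuple[List[Write], List[Write]]:
    """Return (append_list, log_list) for one in-memory transaction."""
    append_list: List[Write] = []
    log_list: List[Write] = []
    for w in writes:
        if not w.is_metadata and w.offset >= file_size:
            append_list.append(w)  # goes straight to the file system
        else:
            log_list.append(w)     # metadata or overwrite: goes to the log area
    return append_list, log_list


appends, logged = split_transaction(
    file_size=8192,
    writes=[Write(8192, b"new tail"), Write(0, b"patched"), Write(0, b"inode", True)],
)
```

Appended data can skip the log because no existing block is modified: until the metadata that references the new blocks is logged, the old file contents remain intact.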
The focus of the log writing operation is process 2, which commits the contents of the in-memory transaction to the log device and must guarantee that interrupting the write at any moment cannot destroy data consistency. For any given data block, only the newest and second-newest log records are kept in the log area.
An XOR update method is used during log writing: the bytes that differ between the log block about to be written and the second-newest log record are identified, which enables log updates at byte granularity.
The specific steps of the log writing operation are as follows:
(1.1) Start committing a new in-memory transaction: write the data in the append-write list to the file system and cache the transaction in the in-memory log buffer; once the file system data writes involved in this operation have completed, go to step (1.2);
(1.2) Examine a log block to be committed in the overwrite and metadata list, and look up in the log record hash table whether the file system block corresponding to that log block has a second-newest log record; if not, go to step (1.3); if so, go to step (1.4);
(1.3) Write the log block to be committed sequentially to a free position in the log area, and add its information to the log record hash table. There are two cases here. First, the log record hash table contains no information for the log block to be committed, which means this is the first update of that block, so it is marked as the newest log record. Second, the log record hash table contains a newest log record but no second-newest log record; the existing newest log record then becomes the second-newest log record, and the log block to be committed is marked as the newest log record. Go to step (1.5);
(1.4) Apply the XOR update method: find the bytes in which the log block to be committed, B3, differs from the second-newest log record, B1, i.e. P = B1 ⊕ B3 (⊕ denotes XOR), and examine each byte of P. If a byte is 0, B3 and B1 are identical at that byte and no update is made; if it is non-zero, B3 and B1 differ at that byte, and that byte of P is XORed with the corresponding byte of B1 to update B1. Because B1 is the second-newest log record and is marked invalid, an interruption of the update cannot destroy the consistency and integrity of the logged transaction. Update the log record hash table: mark the newest log record corresponding to B3 as the second-newest log record, and mark B1, once the XOR update has completed, as the newest log record. Go to step (1.5);
(1.5) If all log blocks to be committed in the current in-memory transaction have been committed, go to step (1.6); otherwise, go to step (1.2);
(1.6) Set the current transaction end position to the position of the last log record. Go to step (1.7);
(1.7) Mark the data blocks of the current in-memory transaction in the log buffer as invalid, and go to step (1.8);
(1.8) Set the previous transaction end position to the current transaction end position. This indicates that the current in-memory transaction is complete and that the commit of the next in-memory transaction can be accepted.
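The byte-granularity XOR update of step (1.4) can be illustrated with the short sketch below, in which a `bytearray` stands in for the SCM-resident block B1 and the function name `xor_update` is an assumption. Since P = B1 ⊕ B3, XORing a non-zero byte of P into B1 yields the corresponding byte of B3, so only the differing bytes are rewritten.

```python
def xor_update(b1: bytearray, b3: bytes) -> int:
    """Turn the old log block `b1` into `b3` in place, touching only differing bytes.

    Returns the number of bytes actually rewritten.
    """
    if len(b1) != len(b3):
        raise ValueError("log blocks must have the same length")
    rewritten = 0
    for i in range(len(b1)):
        p = b1[i] ^ b3[i]  # one byte of P = B1 xor B3
        if p != 0:         # the two blocks differ at this byte
            b1[i] ^= p     # B1 xor P equals B3 at this byte
            rewritten += 1
    return rewritten


old_block = bytearray(b"hello world")
changed = xor_update(old_block, b"hello WORLD")
```

In this example only the five bytes of the last word are rewritten. On byte-addressable SCM the saving grows with block size whenever an update touches a small part of a block.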
Fig. 6 is a flowchart of the garbage collection operation of the present invention. The purpose of garbage collection is to reclaim the invalid log records in the log area, and the specific method used is tag swapping.
Because the log area contains a large number of invalid log records, garbage collection must be performed at regular intervals. If the log area runs short of space between intervals, garbage collection can also be performed. The distinguishing feature of garbage collection in the present invention is that valid log tags at high addresses are swapped with invalid log tags at low addresses, so that all valid log records are gathered in one contiguous section of the log area starting from the log start position.
The specific steps of the garbage collection operation are as follows:
(2.1) Set up a pointer first and a pointer last, where first points to the first log record and last points to the last log record. Check whether first and last coincide. If they coincide, go to step (2.8); otherwise, go to step (2.2);
(2.2) Check whether the log record that first points to is invalid. If it is valid, advance first to the next log record and go to step (2.3); if it is invalid, go to step (2.4);
(2.3) Check whether first and last coincide. If they coincide, go to step (2.8); otherwise, go to step (2.2);
(2.4) Check whether the log record that last points to is invalid. If it is valid, go to step (2.5); otherwise, go to step (2.6);
(2.5) Write the log record that last points to into the log record that first points to, and mark the log record that first points to as valid. Let blknr be the disk block number corresponding to the log record that first points to; look up the log record hash table and update the newest log record corresponding to disk block blknr to the log record that last pointed to. Go to step (2.7);
(2.6) Move last to the previous log record, and go to step (2.3);
(2.7) Advance first to the next log record, and go to step (2.2);
(2.8) Mark the position of the first pointer as the current transaction end position, and go to step (2.9);
(2.9) Finally, release the contiguous space of invalid log records, which completes the garbage collection operation.
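The tag-swapping idea behind steps (2.1) to (2.9) can be sketched as a two-pointer compaction over the tag array. This is a simplified rendering under stated assumptions, not a literal transcription of the flowchart: tags live in a Python list whose index stands in for the tag address, `newest` is a hypothetical map from `blknr` to the tag slot of the newest record, and each `Tag` keeps its `scmnr` so that log blocks never move.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tag:
    blknr: int   # disk block number
    valid: bool  # validity bit
    scmnr: int   # position of the log block, unchanged by compaction


def compact(tags: List[Tag], newest: Dict[int, int]) -> int:
    """Gather all valid tags at the low end; return the index of the first free slot."""
    first, last = 0, len(tags) - 1
    while first < last:
        if tags[first].valid:
            first += 1                # already in place
        elif not tags[last].valid:
            last -= 1                 # nothing worth moving here
        else:
            # swap the high-address valid tag with the low-address invalid tag
            tags[first], tags[last] = tags[last], tags[first]
            newest[tags[first].blknr] = first
            first += 1
            last -= 1
    if first == last and tags[first].valid:
        first += 1                    # the meeting slot is still in use
    return first


tags = [Tag(10, True, 0), Tag(11, False, 1), Tag(12, True, 2),
        Tag(13, False, 3), Tag(14, True, 4)]
newest = {10: 0, 12: 2, 14: 4}
free_from = compact(tags, newest)
```

Swapping tags instead of copying whole blocks keeps the amount of data written during collection small, and the returned index corresponds to the position that step (2.8) records before the trailing invalid space is released.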
Fig. 7 is a flowchart of the data write-back operation of the present invention. Data write-back is performed when log area space is insufficient.
If log area space is still insufficient after garbage collection completes, the overwrite data cached in the SCM log area is written back to disk and the metadata is written back to the metadata area. The write-back policy mainly considers two points: 1. the sequentiality of the data being written back; 2. how hot or cold the data being written back is.
Regarding sequentiality, the invention preferentially writes back data logs. In general, data is more sequential than metadata, so writing data back first improves write-back efficiency. Regarding hotness, because of the logical order of the log area, the data distributed from low to high addresses naturally forms a cold-to-hot distribution: the hotter the data, the higher the tag address of its valid log record; the colder the data, the lower the tag address of its valid log record.
Therefore, each time write-back is performed, a number of valid log records are selected starting from the first log record in the log area and proceeding from low to high addresses, which writes log data back from cold to hot.
The specific steps of the data write-back operation are as follows:
(3.1) Starting from the first log record, check whether the log record is valid. If it is invalid, go to step (3.2); otherwise, go to step (3.3);
(3.2) Check whether the next log record is valid. If it is invalid, go to step (3.2); otherwise, go to step (3.3);
(3.3) Examine the metadata flag bit of this log record. If it is metadata, go to step (3.4); otherwise, go to step (3.5);
(3.4) Write the metadata back to the metadata area, and go to step (3.6);
(3.5) Write the overwrite data block back to disk, and go to step (3.6);
(3.6) Let blknr be the disk block number corresponding to the block that was written back; look up the log record hash table and mark the log record corresponding to disk block blknr as invalid. Delete the hash node corresponding to disk block blknr. Go to step (3.7);
(3.7) Repeat step (3.2) until the data blocks of all valid log records have been written back to disk and the metadata blocks have been written back to the metadata area.
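Steps (3.1) to (3.7) amount to a single low-to-high scan, sketched below. The disk and the metadata area are modelled as Python dicts keyed by block number, the list index stands in for the log address, and the optional `limit` parameter is an assumption that reflects the remark about selecting only a number of records per pass; none of these names come from the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    blknr: int
    valid: bool
    meta: bool
    block: bytes


def write_back(log: List[Record], hash_table: Dict[int, int],
               disk: Dict[int, bytes], meta_area: Dict[int, bytes],
               limit: Optional[int] = None) -> int:
    """Flush valid records from low to high address; return how many were flushed."""
    flushed = 0
    for rec in log:                      # index order stands in for address order
        if limit is not None and flushed >= limit:
            break
        if not rec.valid:
            continue                     # skip invalid records
        target = meta_area if rec.meta else disk
        target[rec.blknr] = rec.block    # metadata to SCM area, data to disk
        rec.valid = False
        hash_table.pop(rec.blknr, None)  # drop the hash node for this block
        flushed += 1
    return flushed


log = [Record(1, True, False, b"d1"), Record(2, False, False, b"stale"),
       Record(3, True, True, b"m3"), Record(4, True, False, b"d4")]
hash_table = {1: 0, 3: 2, 4: 3}
disk: Dict[int, bytes] = {}
meta_area: Dict[int, bytes] = {}
n = write_back(log, hash_table, disk, meta_area, limit=2)
```

Because colder records sit at lower addresses, stopping after a few records leaves the hottest blocks in SCM, where further updates to them remain cheap.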
Fig. 8 is a flowchart of the system recovery operation of the present invention. After a system crash, the log records in the SCM are used to restore the system to a consistent state. After recovery, all log records in the SCM are invalidated.
The specific steps of the system recovery operation are as follows:
(4.1) Starting from the first log record, check whether the log record is valid. If it is invalid, go to step (4.2); otherwise, go to step (4.3);
(4.2) Check whether the next log record is valid. If it is invalid, go to step (4.2); otherwise, go to step (4.3);
(4.3) Let blknr be the disk block number corresponding to the current log record; write the current log record back to the corresponding position on disk and mark the current log record as invalid. Go to step (4.4);
(4.4) Look up the log record hash table and delete the hash node corresponding to disk block blknr. Go to step (4.5);
(4.5) Repeat steps (4.2) and (4.4) until all data log records have been restored to disk and all metadata log records have been restored to the metadata area. Go to step (4.6);
(4.6) Set both the previous transaction end marker and the current transaction end marker to the log start position.
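A compact sketch of the recovery pass follows, assuming records are plain dicts and the superblock is a dict with the hypothetical keys `log_start`, `prev_tx_end`, and `cur_tx_end`. Unlike a partial write-back, recovery replays every valid record and then resets both transaction end markers to the log start.

```python
from typing import Dict, List


def recover(log: List[dict], hash_table: Dict[int, int], superblock: Dict[str, int],
            disk: Dict[int, bytes], meta_area: Dict[int, bytes]) -> int:
    """Replay every valid record, then reset the log; return the replay count."""
    replayed = 0
    for rec in log:
        if not rec["valid"]:
            continue                        # invalid records are ignored
        target = meta_area if rec["meta"] else disk
        target[rec["blknr"]] = rec["block"]
        rec["valid"] = False                # invalidate after replay
        hash_table.pop(rec["blknr"], None)  # delete the hash node
        replayed += 1
    superblock["prev_tx_end"] = superblock["log_start"]
    superblock["cur_tx_end"] = superblock["log_start"]
    return replayed


log = [{"blknr": 5, "valid": True, "meta": False, "block": b"data"},
       {"blknr": 9, "valid": False, "meta": True, "block": b"old"},
       {"blknr": 9, "valid": True, "meta": True, "block": b"inode"}]
hash_table = {5: 0, 9: 2}
superblock = {"log_start": 0, "prev_tx_end": 0, "cur_tx_end": 3}
disk: Dict[int, bytes] = {}
meta_area: Dict[int, bytes] = {}
first_pass = recover(log, hash_table, superblock, disk, meta_area)
second_pass = recover(log, hash_table, superblock, disk, meta_area)
```

Because each record is invalidated as soon as it has been replayed, running recovery a second time finds nothing left to do, which is useful if the machine crashes again partway through recovery.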
Those skilled in the art will readily understand that the foregoing are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. A method for improving journaling file system performance based on storage-class memory, characterized in that it comprises five operations: construction of the storage system, log writing, garbage collection, data write-back, and system recovery, wherein:
Storage system construction: a block of SCM is added to the original file system and attached to the memory bus, which it shares with main memory; the SCM is divided into a log area and a metadata area, where the log area records metadata logs and overwrite data logs and the metadata area persistently stores metadata;
Log writing: the append-write data of an in-memory transaction is written to the file system, and the metadata and overwrite data of the in-memory transaction are written to the log area;
Garbage collection: garbage collection is performed at regular intervals; if the log area runs short of space between intervals, garbage collection is also performed; that is, valid log tags at high addresses are swapped with invalid log tags at low addresses, so that all valid log records are gathered in one contiguous section of the log area starting from the log start position;
Data write-back: if log area space is still insufficient after garbage collection completes, the overwrite data cached in the SCM log area is written back to disk and the metadata is written back to the metadata area;
System recovery: after a system crash, the log records in the SCM are used to restore the system to a consistent state; after recovery, all log records in the SCM are invalidated.
2. The method of claim 1, characterized in that, in the storage system construction operation, three main storage structures are established in the SCM: the log superblock, the log record hash table, and the log records; wherein the log superblock records the first available log block, log capacity, remaining space, number of valid log records, log record start position, previous transaction end position, and current transaction end position; the log record hash table records the newest and second-newest log records corresponding to each data block; and a log record comprises a log block and a log tag, wherein the log block stores the buffer contents that a transaction commits to the log area, and the log tag describes the log record, including the disk block number corresponding to the log record, the log record validity flag, and the metadata flag bit.
3. The method of claim 2, characterized in that the log record hash table consists of three fields: the disk block number blknr, the newest log record scmlog, and the second-newest log record old_scmlog; each entry of the log record hash table records the newest log record scmlog and the second-newest log record old_scmlog corresponding to disk block blknr.
4. The method of any one of claims 1 to 3, characterized in that, in the log writing operation, an in-memory transaction is managed by two circular doubly linked lists: one is the append-write list, which manages the buffer contents of append-write data; the other is the overwrite and metadata list, which manages the buffer contents of overwrite data and metadata.
5. The method as claimed in claim 4, characterized in that the write-log operation specifically comprises:
(1.1) Submit a new in-memory transaction, write the data in the append-write list to the file system, and cache the transaction in the in-memory log buffer; once the file system data involved in this operation has been written, proceed to step (1.2);
(1.2) Examine the to-be-committed log blocks in the overwrite-and-metadata list, and look up in the log record hash table whether the file system block corresponding to the to-be-committed log block has a second-latest log record; if not, proceed to step (1.3); if so, proceed to step (1.4);
(1.3) Write the to-be-committed log block sequentially to a free position in the log area, and add the information of the to-be-committed log block to the log record hash table; proceed to step (1.5);
(1.4) Apply the XOR update method: find the bytes in which the to-be-committed log block B3 differs from its second-latest log record B1, i.e. compute P = B1 ⊕ B3, where ⊕ denotes XOR, and examine each byte of P. If a byte is 0, log block B3 and log block B1 agree in that byte and no update is made; if it is non-zero, B3 and B1 differ in that byte, and that byte of P is XORed with the corresponding byte of B1 to update log block B1. Mark the latest log record corresponding to B3 as the second-latest log record, and mark B1, once the XOR update is complete, as the latest log record; proceed to step (1.5);
(1.5) If all to-be-committed log blocks in the current in-memory transaction have been committed, proceed to step (1.6); otherwise, proceed to step (1.2);
(1.6) Set the position of the last log record as the current transaction end position; proceed to step (1.7);
(1.7) Mark the data blocks corresponding to the current in-memory transaction in the log buffer as invalid; proceed to step (1.8);
(1.8) Record the current transaction end position as the previous transaction end position; this indicates that the current in-memory transaction is complete and that the submission of the next in-memory transaction can be accepted.
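The byte-wise XOR update of step (1.4) can be sketched as follows. The function name `xor_update` and the returned count of rewritten bytes are assumptions added for illustration; the sketch also assumes B1 and B3 have the same length.

```python
def xor_update(b1: bytearray, b3: bytes) -> int:
    """Rewrite b1 in place so that it equals b3, touching only differing bytes.

    Returns the number of bytes actually written.
    """
    if len(b1) != len(b3):
        raise ValueError("log blocks must have the same length")
    written = 0
    for i in range(len(b1)):
        p = b1[i] ^ b3[i]      # one byte of P = B1 xor B3
        if p != 0:             # zero means the byte is unchanged, so skip it
            b1[i] ^= p         # B1 xor P yields the corresponding byte of B3
            written += 1
    return written
```

For example, updating `bytearray(b"hello world")` toward `b"hellO worlD"` writes only two bytes, which is the point of the technique on storage-class memory: unchanged bytes of the older record are left alone.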
6. The method as claimed in claim 4, characterized in that step (1.3) covers two cases: first, no information for the to-be-committed log block is found in the log record hash table, which means the block is being updated for the first time, and it is marked as the latest log record; second, the log record hash table contains a latest log record but no second-latest log record, in which case the existing latest log record becomes the second-latest log record and the to-be-committed log block is marked as the latest log record.
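The two cases of step (1.3), together with the fall-through to the XOR path of step (1.4), can be sketched as a single dispatch function. Everything except the field names scmlog and old_scmlog is hypothetical: `log_area` is a Python list standing in for the log area, record locations are list indices, and all versions of a block are assumed to have equal length.

```python
from typing import Dict, List, Optional

log_area: List[bytearray] = []                      # stand-in for the log area
log_hash: Dict[int, Dict[str, Optional[int]]] = {}  # blknr -> {"scmlog", "old_scmlog"}


def commit_block(blknr: int, block: bytes) -> str:
    """Commit one log block and report which path was taken."""
    entry = log_hash.get(blknr)
    if entry is None:
        # Case 1: first update of this block; append it and mark it latest.
        log_area.append(bytearray(block))
        log_hash[blknr] = {"scmlog": len(log_area) - 1, "old_scmlog": None}
        return "first-update"
    if entry["old_scmlog"] is None:
        # Case 2: a latest record exists but no second-latest one.
        log_area.append(bytearray(block))
        entry["old_scmlog"] = entry["scmlog"]
        entry["scmlog"] = len(log_area) - 1
        return "second-update"
    # Both records exist: XOR-update the second-latest record in place,
    # then swap the roles of the two records.
    slot = log_area[entry["old_scmlog"]]
    for i in range(len(slot)):
        p = slot[i] ^ block[i]
        if p:
            slot[i] ^= p
    entry["scmlog"], entry["old_scmlog"] = entry["old_scmlog"], entry["scmlog"]
    return "xor-update"
```

Under these assumptions a block never occupies more than two slots in the log area: from the third update onward the older slot is overwritten in place and the two pointers trade places.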
7. The method as claimed in claim 1 or 2, characterized in that the specific steps of the garbage collection operation are as follows:
(2.1) Set up a pointer first and a pointer last, where first points to the first log record and last points to the last log record; determine whether first and last coincide; if they coincide, proceed to step (2.8); otherwise, proceed to step (2.2);
(2.2) Check whether the log record pointed to by first is invalid; if it is valid, advance first to the next log record and proceed to step (2.3); if it is invalid, proceed to step (2.4);
(2.3) Determine whether first and last coincide; if they coincide, proceed to step (2.8); otherwise, proceed to step (2.2);
(2.4) Check whether the log record pointed to by last is invalid; if it is valid, proceed to step (2.5); otherwise, proceed to step (2.6);
(2.5) Write the log record pointed to by last into the log record pointed to by first, and mark the log record pointed to by first as valid; let blknr be the disk block number corresponding to the log record pointed to by first, look up the log record hash table, and update the latest log record corresponding to disk block blknr to the log record pointed to by last; proceed to step (2.7);
(2.6) Move last to the previous log record; proceed to step (2.3);
(2.7) Advance first to the next log record; proceed to step (2.2);
(2.8) Mark the position of the first pointer as the current transaction end position; proceed to step (2.9);
(2.9) Release the contiguous space of invalid log records, completing the garbage collection operation.
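The two-pointer compaction in steps (2.1) to (2.9) can be sketched roughly as below. This is not a literal transcription of the claim. The sketch assumes that the source slot is marked invalid once its record has been moved, that the hash table stores slot indices and is repointed at the slot now holding the moved record, and that the end position is one past the last valid record. The names `LogRecord`, `garbage_collect` and `latest` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LogRecord:
    blknr: int
    data: bytes
    valid: bool = True


def garbage_collect(log: List[LogRecord], latest: Dict[int, int]) -> int:
    """Move valid records from the tail into invalid slots near the head.

    `latest` maps a disk block number to the slot index of its latest record.
    Returns the new end position of the log.
    """
    if not log:
        return 0
    first, last = 0, len(log) - 1
    while first < last:
        if log[first].valid:
            first += 1                      # slot in use, keep scanning forward
        elif not log[last].valid:
            last -= 1                       # nothing to move, step backward
        else:
            moved = log[last]
            log[first] = LogRecord(moved.blknr, moved.data, True)
            moved.valid = False             # assumption: the source slot is now free
            if latest.get(moved.blknr) == last:
                latest[moved.blknr] = first  # repoint the hash entry at the new slot
            first += 1
    end = first + 1 if log[first].valid else first
    del log[end:]                           # release the contiguous invalid tail
    return end
```

Records may change order during compaction, since a record from the tail fills the first free slot, so the hash table rather than log position is what identifies the latest record of a block.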
8. The method as claimed in claim 1 or 2, characterized in that the specific steps of the data write-back operation are as follows:
(3.1) Starting from the first log record, check whether the log record is valid; if it is invalid, proceed to step (3.2); otherwise, proceed to step (3.3);
(3.2) Check whether the next log record is valid; if it is invalid, proceed to step (3.2); otherwise, proceed to step (3.3);
(3.3) Examine the metadata flag of this log record; if it is metadata, proceed to step (3.4); otherwise, proceed to step (3.5);
(3.4) Write the metadata back to the metadata area; proceed to step (3.6);
(3.5) Write the overwrite data block back to disk; proceed to step (3.6);
(3.6) Let blknr be the disk block number corresponding to the data block written back; look up the log record hash table and mark the log record corresponding to disk block blknr as invalid; delete the hash node corresponding to disk block blknr; proceed to step (3.7);
(3.7) Repeat step (3.2) until the data blocks corresponding to all valid log records have been written back to disk and the metadata blocks have been written back to the metadata area.
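The write-back scan of steps (3.1) to (3.7) can be sketched with dictionaries standing in for the disk and the metadata area. The names `LogRecord`, `write_back`, `disk` and `meta_area` are hypothetical, and the sketch makes the simplifying assumption that each disk block has at most one valid record in the log.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LogRecord:
    blknr: int
    data: bytes
    is_metadata: bool = False
    valid: bool = True


def write_back(log: List[LogRecord], log_hash: Dict[int, int],
               disk: Dict[int, bytes], meta_area: Dict[int, bytes]) -> int:
    """Flush every valid record to its home location; return how many were flushed."""
    flushed = 0
    for rec in log:
        if not rec.valid:
            continue                         # skip invalid records
        target = meta_area if rec.is_metadata else disk
        target[rec.blknr] = rec.data         # metadata and data go to different areas
        rec.valid = False                    # the record is no longer needed in the log
        log_hash.pop(rec.blknr, None)        # drop the hash node for this block
        flushed += 1
    return flushed
```

After the scan every record is invalid, so a subsequent garbage collection pass can release the whole log area.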
9. The method as claimed in claim 1 or 2, characterized in that the specific steps of the system recovery operation are as follows:
(4.1) Starting from the first log record, check whether the log record is valid; if it is invalid, proceed to step (4.2); otherwise, proceed to step (4.3);
(4.2) Check whether the next log record is valid; if it is invalid, proceed to step (4.2); otherwise, proceed to step (4.3);
(4.3) Let blknr be the disk block number corresponding to the current log record; write the current log record back to the corresponding position on disk and mark the current log record as invalid; proceed to step (4.4);
(4.4) Look up the log record hash table and delete the hash node corresponding to disk block blknr; proceed to step (4.5);
(4.5) Repeat steps (4.2) and (4.4) until all data log records have been restored to disk and all metadata log records have been restored to the metadata area; proceed to step (4.6);
(4.6) Set both the previous transaction end marker and the current transaction end marker to the initial position of the log.
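Recovery in steps (4.1) to (4.6) replays the valid records and then resets both end markers. A minimal sketch follows; the names `Rec`, `Journal`, `recover`, `prev_end` and `cur_end` are hypothetical, position 0 is assumed to be the initial position of the log, and dictionaries stand in for the disk and the metadata area.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rec:
    blknr: int
    data: bytes
    is_metadata: bool = False
    valid: bool = True


@dataclass
class Journal:
    records: List[Rec] = field(default_factory=list)
    prev_end: int = 0   # previous transaction end marker
    cur_end: int = 0    # current transaction end marker


def recover(j: Journal, log_hash: Dict[int, int],
            disk: Dict[int, bytes], meta_area: Dict[int, bytes]) -> int:
    """Replay valid records after a crash and reset the journal markers."""
    replayed = 0
    for rec in j.records:
        if not rec.valid:
            continue
        (meta_area if rec.is_metadata else disk)[rec.blknr] = rec.data
        rec.valid = False                # a replayed record is never replayed again
        log_hash.pop(rec.blknr, None)
        replayed += 1
    j.prev_end = j.cur_end = 0           # both markers return to the start of the log
    return replayed
```

Because each record is invalidated as soon as it is replayed, running `recover` a second time on the same journal replays nothing, which is convenient if recovery itself is interrupted.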
CN201510621004.6A 2015-09-25 2015-09-25 A kind of log file system performance improvement method based on storage level memory Active CN105335098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510621004.6A CN105335098B (en) 2015-09-25 2015-09-25 A kind of log file system performance improvement method based on storage level memory

Publications (2)

Publication Number Publication Date
CN105335098A true CN105335098A (en) 2016-02-17
CN105335098B CN105335098B (en) 2019-03-26

Family

ID=55285679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510621004.6A Active CN105335098B (en) 2015-09-25 2015-09-25 A kind of log file system performance improvement method based on storage level memory

Country Status (1)

Country Link
CN (1) CN105335098B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105786410A (en) * 2016-03-01 2016-07-20 深圳市瑞驰信息技术有限公司 Method for increasing processing speed of data storage system and data storage system
CN106202307A (en) * 2016-07-01 2016-12-07 百势软件(北京)有限公司 A kind of batch log preservation method and device
CN107220342A (en) * 2017-05-26 2017-09-29 郑州云海信息技术有限公司 The control method and system of a kind of distributed data base
CN107291924A (en) * 2017-06-29 2017-10-24 深信服科技股份有限公司 A kind of synchronous replication log control method and system for disaster recovery and backup systems
CN107479824A (en) * 2016-06-08 2017-12-15 捷鼎国际股份有限公司 Redundancy magnetic disc array system and its data storage method
CN107908370A (en) * 2017-11-30 2018-04-13 新华三技术有限公司 Date storage method and device
CN108121789A (en) * 2017-12-19 2018-06-05 苏州精濑光电有限公司 A kind of blog management method and system
CN108334277A (en) * 2017-05-10 2018-07-27 中兴通讯股份有限公司 A kind of daily record write-in and synchronous method, device, system, computer storage media
CN108829345A (en) * 2018-05-25 2018-11-16 华为技术有限公司 The data processing method and terminal device of journal file
WO2018233603A1 (en) * 2017-06-20 2018-12-27 707 Limited Method of evidencing existence of digital documents and system therefor, and tag chain blockchain system
CN109669632A (en) * 2018-12-10 2019-04-23 浪潮电子信息产业股份有限公司 Metadata wiring method, device and medium based on distributed objects storage system
WO2019137322A1 (en) * 2018-01-09 2019-07-18 阿里巴巴集团控股有限公司 Method and device for data processing, and computer device
CN110968269A (en) * 2019-11-18 2020-04-07 华中科技大学 SCM and SSD-based key value storage system and read-write request processing method
CN111104254A (en) * 2019-11-29 2020-05-05 北京浪潮数据技术有限公司 Storage system data flashing method, device, equipment and readable storage medium
CN111414320A (en) * 2020-02-20 2020-07-14 上海交通大学 Method and system for constructing disk cache based on nonvolatile memory of log file system
CN111587428A (en) * 2017-11-13 2020-08-25 维卡艾欧有限公司 Metadata journaling in distributed storage systems
CN111708481A (en) * 2020-04-24 2020-09-25 浙江大学 Solid State Disk (SSD) double-area wear leveling method based on super block
CN114281762A (en) * 2022-03-02 2022-04-05 苏州浪潮智能科技有限公司 Log storage acceleration method, device, equipment and medium
CN114579529A (en) * 2022-05-07 2022-06-03 深圳市杉岩数据技术有限公司 Local storage method and system based on redirection and log mixing
CN115048046A (en) * 2022-05-26 2022-09-13 北京华昱卓程软件有限公司 Log file system and data management method
CN118503015A (en) * 2024-07-16 2024-08-16 吉林大学 Method for maintaining data consistency of heterogeneous storage system

Citations (3)

Publication number Priority date Publication date Assignee Title
US20030061537A1 (en) * 2001-07-16 2003-03-27 Cha Sang K. Parallelized redo-only logging and recovery for highly available main memory database systems
CN1454349A (en) * 2000-06-07 2003-11-05 处理存储器有限公司 A method and system for highly-parallel logging and recovery operation in main-memory transaction processing systems
CN102024021A (en) * 2010-11-04 2011-04-20 曙光信息产业(北京)有限公司 Method for logging metadata in logical file system

Non-Patent Citations (4)

Title
LAM C H. et al.: "Storage class memory", 《PROCEEDINGS OF THE 10TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY》 *
RU FANG et al.: "High Performance Database Logging using Storage Class Memory", 《IEEE》 *
WU X et al.: "SCMFS: A file system for storage class memory", 《PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING》 *
MAO Wei et al.: "A survey of storage technologies based on phase change memory", 《Chinese Journal of Computers》 *

Cited By (35)

Publication number Priority date Publication date Assignee Title
CN105786410A (en) * 2016-03-01 2016-07-20 深圳市瑞驰信息技术有限公司 Method for increasing processing speed of data storage system and data storage system
CN107479824A (en) * 2016-06-08 2017-12-15 捷鼎国际股份有限公司 Redundancy magnetic disc array system and its data storage method
CN107479824B (en) * 2016-06-08 2020-03-06 宜鼎国际股份有限公司 Redundant disk array system and data storage method thereof
CN106202307A (en) * 2016-07-01 2016-12-07 百势软件(北京)有限公司 A kind of batch log preservation method and device
CN108334277A (en) * 2017-05-10 2018-07-27 中兴通讯股份有限公司 A kind of daily record write-in and synchronous method, device, system, computer storage media
CN108334277B (en) * 2017-05-10 2019-06-28 中兴通讯股份有限公司 A kind of log write-in and synchronous method, device, system, computer storage medium
CN107220342A (en) * 2017-05-26 2017-09-29 郑州云海信息技术有限公司 The control method and system of a kind of distributed data base
WO2018233603A1 (en) * 2017-06-20 2018-12-27 707 Limited Method of evidencing existence of digital documents and system therefor, and tag chain blockchain system
US11177940B2 (en) 2017-06-20 2021-11-16 707 Limited Method of evidencing existence of digital documents and a system therefor
CN110771093A (en) * 2017-06-20 2020-02-07 707 有限公司 Method and system for proving existence of digital document and label chain block chain system
CN110771093B (en) * 2017-06-20 2023-01-10 707 有限公司 Method and system for proving existence of digital document
CN107291924B (en) * 2017-06-29 2020-08-14 深信服科技股份有限公司 Synchronous log replication control method and system for disaster recovery backup system
CN107291924A (en) * 2017-06-29 2017-10-24 深信服科技股份有限公司 A kind of synchronous replication log control method and system for disaster recovery and backup systems
CN111587428B (en) * 2017-11-13 2023-12-19 维卡艾欧有限公司 Metadata journaling in distributed storage systems
CN111587428A (en) * 2017-11-13 2020-08-25 维卡艾欧有限公司 Metadata journaling in distributed storage systems
CN107908370B (en) * 2017-11-30 2021-07-06 新华三技术有限公司 Data storage method and device
CN107908370A (en) * 2017-11-30 2018-04-13 新华三技术有限公司 Date storage method and device
CN108121789A (en) * 2017-12-19 2018-06-05 苏州精濑光电有限公司 A kind of blog management method and system
CN108121789B (en) * 2017-12-19 2020-06-30 苏州精濑光电有限公司 Log management method and system
WO2019137322A1 (en) * 2018-01-09 2019-07-18 阿里巴巴集团控股有限公司 Method and device for data processing, and computer device
US11294592B2 (en) 2018-01-09 2022-04-05 Alibaba Group Holding Limited Method and device for data processing, and computer device
CN108829345A (en) * 2018-05-25 2018-11-16 华为技术有限公司 The data processing method and terminal device of journal file
CN108829345B (en) * 2018-05-25 2020-02-21 华为技术有限公司 Data processing method of log file and terminal equipment
CN109669632A (en) * 2018-12-10 2019-04-23 浪潮电子信息产业股份有限公司 Metadata wiring method, device and medium based on distributed objects storage system
CN110968269A (en) * 2019-11-18 2020-04-07 华中科技大学 SCM and SSD-based key value storage system and read-write request processing method
CN111104254A (en) * 2019-11-29 2020-05-05 北京浪潮数据技术有限公司 Storage system data flashing method, device, equipment and readable storage medium
CN111414320B (en) * 2020-02-20 2023-06-06 上海交通大学 Method and system for constructing disk cache based on nonvolatile memory of log file system
CN111414320A (en) * 2020-02-20 2020-07-14 上海交通大学 Method and system for constructing disk cache based on nonvolatile memory of log file system
CN111708481A (en) * 2020-04-24 2020-09-25 浙江大学 Solid State Disk (SSD) double-area wear leveling method based on super block
CN114281762A (en) * 2022-03-02 2022-04-05 苏州浪潮智能科技有限公司 Log storage acceleration method, device, equipment and medium
CN114281762B (en) * 2022-03-02 2022-06-03 苏州浪潮智能科技有限公司 Log storage acceleration method, device, equipment and medium
CN114579529A (en) * 2022-05-07 2022-06-03 深圳市杉岩数据技术有限公司 Local storage method and system based on redirection and log mixing
CN114579529B (en) * 2022-05-07 2022-08-05 深圳市杉岩数据技术有限公司 Local storage method and system based on redirection and log mixing
CN115048046A (en) * 2022-05-26 2022-09-13 北京华昱卓程软件有限公司 Log file system and data management method
CN118503015A (en) * 2024-07-16 2024-08-16 吉林大学 Method for maintaining data consistency of heterogeneous storage system

Also Published As

Publication number Publication date
CN105335098B (en) 2019-03-26

Similar Documents

Publication Publication Date Title
CN105335098A (en) Storage-class memory based method for improving performance of log file system
CN106548789B (en) Method and apparatus for operating stacked tile type magnetic recording equipment
CN102163175B (en) Hybrid address mapping method based on locality analysis
CN102981963B (en) A kind of implementation method of flash translation layer (FTL) of solid-state disk
CN107329696B (en) A kind of method and system guaranteeing data corruption consistency
CN104881371A (en) Persistent internal memory transaction processing cache management method and device
CN102541757B (en) Write cache method, cache synchronization method and device
CN110119425A (en) Solid state drive, distributed data-storage system and the method using key assignments storage
US20080288713A1 (en) Flash-aware storage optimized for mobile and embedded dbms on nand flash memory
CN103106047A (en) Storage system based on object and storage method thereof
CN103631536B (en) A kind of method utilizing the invalid data of SSD to optimize RAID5/6 write performance
CN107784121A (en) Lowercase optimization method of log file system based on nonvolatile memory
CN109710541B (en) Optimization method for Greedy garbage collection of NAND Flash main control chip
CN102696010A (en) Apparatus, system, and method for caching data on a solid-state storage device
CN105955664B (en) A kind of reading/writing method of watt record conversion layer based on segment structure
CN109815165A (en) System and method for storing and processing Efficient Compression cache line
CN105005535A (en) Distributed flash memory transaction processing method
CN106815152A (en) A kind of method for optimizing page level flash translation layer (FTL)
CN105138286A (en) Method for mixed utilization of SSD and SMR hard disks in disk file system
CN110309233A (en) Method, apparatus, server and the storage medium of data storage
CN103049224A (en) Method, device and system for importing data into physical tape
CN103996412A (en) Power-fail protection method applied to intelligent-card nonvolatile memories
US20220129420A1 (en) Method for facilitating recovery from crash of solid-state storage device, method of data synchronization, computer system, and solid-state storage device
CN104461932B (en) Directory cache management method for big data application
CN100504800C (en) Method for snapshot of magnetic disc

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant