WO2022259493A1 - Journal log control system, journal log control method, and journal log control program - Google Patents

Journal log control system, journal log control method, and journal log control program

Info

Publication number
WO2022259493A1
Authority
WO
WIPO (PCT)
Prior art keywords
log
data
write
journal
journal log
Prior art date
Application number
PCT/JP2021/022211
Other languages
English (en)
Japanese (ja)
Inventor
孝治 佐藤
Original Assignee
日本電信電話株式会社
Priority date
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2021/022211
Publication of WO2022259493A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots

Definitions

  • The present invention relates to a journal log control system, a journal log control method, and a journal log control program.
  • Persistent memory is fast, non-volatile, and byte-addressable.
  • Examples of persistent memory technologies include Phase Change Memory (PCM), Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM), and Resistive Random Access Memory (ReRAM).
  • Examples of persistent memory products include Non-Volatile Dual In-line Memory Module (NVDIMM) and Intel(R) Optane(TM) Persistent Memory (Non-Patent Document 7).
  • Conventional file systems are designed on the premise of using a hard disk drive (hereinafter referred to as HDD) or a solid state drive (hereinafter referred to as SSD), and are not suited to persistent memory.
  • file systems for persistent memory have been proposed.
  • Examples of file systems for persistent memory include PMFS (see Non-Patent Document 1), BPFS (see Non-Patent Document 2), NOVA (see Non-Patent Document 3), Shortcut-JFS (see Non-Patent Document 4), Ext4 (see Non-Patent Document 5), XFS (see Non-Patent Document 6), and the like.
  • the file system for persistent memory mentioned above uses one or more of journaling, copy-on-write, and log structuring to ensure the consistency of file system data and metadata.
  • Journaling writes updates to the journal log before updating the data. After the update has been written to the journal log, a process called checkpoint is executed to reflect the update in the file system itself. In the event of a failure, the journal log can be used to restore the file system to a consistent state.
  • With copy-on-write, the block to be updated is copied to a newly allocated block, and the copy is updated. Since file system contents are never overwritten in place, the integrity of the file system can always be ensured.
  • When a file system for persistent memory writes to persistent memory and the cache of the Central Processing Unit (hereinafter referred to as CPU; the cache is hereinafter referred to as the CPU cache) uses the write-back method, executing an instruction that writes data to persistent memory only writes the data to the CPU cache, not to persistent memory. If the server fails while data has been written to the CPU cache but not yet to persistent memory, that data is lost.
  • A file system for persistent memory therefore either executes an instruction to flush the CPU cache after writing, writes directly to persistent memory, or relies on a mechanism that automatically flushes the CPU cache on failure, such as Enhanced Asynchronous DRAM Refresh (hereinafter, eADR; see Non-Patent Document 8).
  • These file systems use journaling, copy-on-write, log structuring, or a combination of these techniques.
  • With journaling, each update must be written twice: once to the journal log and once when it is reflected in the file system itself. With copy-on-write and log structuring, updates must cascade from the updated data block up to the file system root. In either case, the amount of writing increases and the write performance of the file system decreases.
  • The journal log control system of the present invention controls a journal log in a system in which data, metadata, and a journal log, which is the update history of the data and the metadata, are stored in persistent memory. It comprises a journal log control unit that controls the execution of the following processing:
  • journaling processing, which stores the update history of the data and metadata associated with a write request in the journal log when the write request is made;
  • commit processing, which confirms the update history stored in the journal log;
  • immediate checkpoint processing, which reflects the confirmed update history stored in the journal log in the data and the metadata; and
  • delayed checkpoint processing, which delays the immediate checkpoint processing.
  • According to the present invention, journaling ensures the consistency of both file system data and metadata in the event of a failure, while the amount of writes associated with journaling and the number of flush instruction and memory barrier instruction executions are reduced, which improves the write performance of the file system.
  • FIG. 1 is a diagram showing an example of the configuration of the entire system to which the present invention is applied in the first embodiment.
  • FIG. 2 is a diagram illustrating an example of a journal log configuration according to the first embodiment.
  • FIG. 3 is a diagram showing an example of the configuration of log entries according to the first embodiment.
  • FIG. 4 is a diagram showing an example of the structure of a data write log entry according to the first embodiment.
  • FIG. 5 is a diagram illustrating an example of the configuration of a log page end log entry according to the first embodiment.
  • FIG. 6 is a flowchart illustrating an example of write processing operations in the first embodiment.
  • FIG. 7 is a flowchart illustrating an example of journaling processing operations in the first embodiment.
  • FIG. 8 is a flowchart illustrating an example of the operation of data write log entry addition processing according to the first embodiment.
  • FIG. 9 is a flow chart showing an example of the operation of the data append write processing according to the first embodiment.
  • FIG. 10 is a flowchart illustrating an example of the operation of log entry area allocation processing according to the first embodiment.
  • FIG. 11 is a flowchart illustrating an example of the operation of log page end log entry addition processing according to the first embodiment.
  • FIG. 12 is a flowchart illustrating an example of commit processing operations in the first embodiment.
  • FIG. 13 is a flowchart illustrating an example of checkpoint processing operations in the first embodiment.
  • FIG. 14 is a flow chart showing an example of the operation of checkpoint processing for data write log entries in the first embodiment.
  • FIG. 15 is a flowchart illustrating an example of the operation of immediate checkpoint processing of data write log entries in the first embodiment.
  • FIG. 16 is a flow chart showing an example of operation of delayed checkpoint processing of data write log entries in the first embodiment.
  • FIG. 17 is a flowchart illustrating an example of operation of insertion pointer update processing according to the first embodiment.
  • FIG. 1 is a diagram showing an example of the configuration of the entire system to which the present invention is applied in the first embodiment.
  • a system 1000 embodying the present invention has a file system 1100 and persistent memory 1200 .
  • the file system 1100 is implemented, for example, by the CPU executing processes according to a given program.
  • the journal log control system of the present invention includes at least the journal log control unit 1110 .
  • the file system 1100 is composed of a journal log control unit 1110 and a free space management unit 1120.
  • The journal log control unit 1110 executes journaling processing, which writes updates to the journal log in response to user requests such as file writes and file attribute changes; commit processing, which makes the update contents in the journal log permanent and confirms them; and checkpoint processing, which reflects the update contents in the file system body.
  • The free space management unit 1120 executes allocation processing and release processing for the persistent memory 1200.
  • The area of the persistent memory 1200 is divided into fixed-size units called pages (the size is hereinafter referred to as the page size) and managed in those units.
  • The free space management unit 1120 allocates and releases areas in the persistent memory 1200 in units of pages. Note that FIG. 1 shows only the components necessary for explaining the first embodiment of the present invention, and the file system 1100 may have other components.
  • In conventional journaling file systems for HDDs and SSDs, it is common to statically allocate a contiguous area on the HDD or SSD as the journal log when the file system is created.
  • Here, pages in the persistent memory area are dynamically allocated and released, and the journal log is configured as a linked list of pages.
  • Each page that constitutes the journal log (hereinafter referred to as a log page) holds log entries indicating update contents.
  • Alternatively, a specific area of persistent memory may be statically allocated as the journal log when the file system is created, as in a conventional journaling file system on an HDD or SSD.
  • FIG. 2 is a diagram showing an example of the structure of a journal log in the first embodiment.
  • the journal log 2000 consists of two log page lists 2100, 2200.
  • The journal log control unit 1110 alternates between the log page lists 2100 and 2200 for each update process, such as a file write or a file attribute change. For example, it uses the log page list 2100 in the first write process and the log page list 2200 in the next write process.
  • In FIG. 2, the log page list 2100 shows the state used in the previous update process, and the log page list 2200 shows the state used in the currently executing update process.
  • Log page lists 2100 and 2200 are linked lists of log pages.
  • the log page list 2100 is made up of log pages 2110, 2120, and 2130
  • the log page list 2200 is made up of log pages 2210, 2220, and 2230.
  • The address of the next log page is held at the end of each log page other than the last one, that is, at the end of the log pages 2110 and 2120 in the log page list 2100 and of the log pages 2210 and 2220 in the log page list 2200.
  • At the end of the last log page of each list, a null indicating that no next log page exists is held.
  • the log page holds log entries that describe updates such as file writes and file attribute changes.
  • the journal log 2000 manages log page lists 2100 and 2200 using start pointers 2300 and 2400, end pointer 2500 and insertion pointer 2600.
  • The start pointers 2300 and 2400 and the end pointer 2500 are held in persistent memory.
  • The insertion pointer 2600 may be held in volatile memory.
  • The initial values of the start pointers 2300 and 2400, the end pointer 2500, and the insertion pointer 2600 are null. When log entries are added to the journal log 2000, log pages are dynamically allocated as needed, as described above.
  • The start pointer 2300 indicates the head of the log page list 2100.
  • The start pointer 2400 indicates the head of the log page list 2200.
  • The end pointer 2500 indicates the position in the log page list up to which log entries have been committed.
  • In FIG. 2, the log entries 2111, 2112, 2121, 2122, and 2131 between the start pointer 2300 and the end pointer 2500 are log entries for which commit processing has been completed.
  • When an update process starts and a log entry associated with it is added to the journal log 2000, the log entry is written at the position of the insertion pointer 2600, and the insertion pointer 2600 is updated to point to the position following that log entry. If the log page at the position of the insertion pointer 2600 does not have enough free space to hold the log entry, the journal log control unit 1110 requests the free space management unit 1120 to allocate a free page and obtains a new log page. The journal log control unit 1110 then stores the address of the new log page at the end of the current log page, which joins the new log page to the log page list, and moves the insertion pointer 2600 to the new log page.
  • In FIG. 2, the log entries 2211, 2212, 2221, 2222, and 2231 between the start pointer 2400 and the insertion pointer 2600 are log entries that have been added to the journal log 2000 but have not yet been committed.
  • The journal log control unit 1110 then executes commit processing.
  • The journal log control unit 1110 executes a flush instruction and a memory barrier instruction for the log page area between the start pointer 2400 and the insertion pointer 2600, where the log entries were written.
  • Next, the end pointer 2500 is updated to the position indicated by the insertion pointer 2600, and a flush instruction and a memory barrier instruction are executed for the area holding the end pointer 2500.
  • As a result, the log entries 2111, 2112, 2121, 2122, and 2131 held in the log page list 2100 become invalid, and the log entries 2211, 2212, 2221, 2222, and 2231 held in the log page list 2200 become valid as committed log entries.
  • At this point, the journal log control unit 1110 may set the start pointer 2300 to null, execute a flush instruction and a memory barrier instruction for the area holding the start pointer 2300, and request the free space management unit 1120 to release all log pages in the log page list 2100.
  • The journal log control unit 1110 executes checkpoint processing and reflects the contents of the committed log entries in the journal log 2000 in the main body of the file system.
  • The journal log control unit 1110 updates the insertion pointer 2600 to the position indicated by the start pointer of the log page list that was not used in the update process, and ends the update process.
  • In FIG. 2, the insertion pointer 2600 is updated to the position indicated by the start pointer 2300.
  • In this way, the journal log control unit 1110 alternates between the log page lists 2100 and 2200 in the journal log 2000 for each update process. Even when the checkpoint processing of an update process completes and its log entries become unnecessary, the end pointer 2500 update, flush instruction, and memory barrier instruction that would invalidate those log entries are not executed within that update process. Instead, the end pointer 2500 update, flush instruction, and memory barrier instruction executed in the commit processing of the next update process both invalidate the log entries of the earlier update process and commit the log entries of the next one. This reduces the number of end pointer 2500 updates, flush instructions, and memory barrier instructions executed.
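The alternation described above can be sketched in Python as a toy model. This is only an illustration under stated assumptions, not the patented implementation: log pages are flattened into Python lists, the class name `JournalLogModel`, its methods, and the `(list index, entry index)` encoding of the end pointer are all hypothetical, and flush plus memory barrier pairs are merely counted rather than issued.

```python
from typing import List, Optional, Tuple


class JournalLogModel:
    """Toy model of journal log 2000 with two alternating log page lists."""

    def __init__(self) -> None:
        self.lists: List[List[str]] = [[], []]  # stand-ins for lists 2100 and 2200
        self.end_pointer: Optional[Tuple[int, int]] = None  # persistent in the text
        self.active = 0       # index of the list used by the current update
        self.flush_count = 0  # number of flush + memory barrier pairs

    def add_entry(self, entry: str) -> None:
        # Journaling: the insertion pointer is modelled as the end of the list.
        self.lists[self.active].append(entry)

    def commit(self) -> None:
        # Flush + barrier for the newly written log entries.
        self.flush_count += 1
        # Moving the end pointer commits the new entries and, at the same
        # time, invalidates the entries of the previous update.
        self.end_pointer = (self.active, len(self.lists[self.active]))
        self.flush_count += 1  # flush + barrier for the end pointer itself

    def committed_entries(self) -> List[str]:
        if self.end_pointer is None:
            return []
        index, position = self.end_pointer
        return self.lists[index][:position]

    def finish_update(self) -> None:
        # After checkpointing: switch lists without touching the end pointer.
        self.active = 1 - self.active
        self.lists[self.active].clear()  # stands in for reusing or releasing pages


journal = JournalLogModel()
journal.add_entry("write #1")
journal.commit()
journal.finish_update()
journal.add_entry("write #2")
before = journal.committed_entries()  # still the entries of the first update
journal.commit()
after = journal.committed_entries()   # now the entries of the second update
```

In this model each update costs two flush and barrier pairs, because no separate end pointer update is spent on invalidating the previous update's entries.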
  • FIG. 3 is a diagram showing an example of the configuration of log entries in the first embodiment.
  • a log entry 3000 is composed of a header portion 3100 and a data portion 3200 .
  • the header portion 3100 holds the type of log entry (hereinafter referred to as log entry type).
  • log entry types include a data write log entry, which is a log entry for writing data in a file, and a metadata update log entry, which is a log entry for updating metadata.
  • Other log entry types may be prepared.
  • Header portion 3100 may also hold other information about the log entry.
  • the data part 3200 may have different information and sizes depending on the log entry type held in the header part 3100 of the log entry. Also, the data section 3200 may not exist.
  • FIG. 4 is a diagram showing an example of the structure of a data write log entry in the first embodiment.
  • One data write log entry 4000 is used for each page of write data.
  • Embodiments can also be configured to use one data write log entry 4000 for multiple consecutive pages in the write data.
  • a header portion 4100 of the data write log entry 4000 holds data write log entry identification information 4110 that is information indicating that the log entry type is a data write log entry.
  • The write data is not held in the data part 4200 itself. A write data page 4300 is allocated separately from the data write log entry 4000 and holds the write data.
  • The data part 4200 of the data write log entry 4000 holds the following: the address 4210 of the write data page 4300 that holds the write data (hereinafter referred to as the write data page address); the offset 4220 of the write data from the beginning of the file (hereinafter referred to as the file offset); the size 4230 of the write data (hereinafter referred to as the write data size); and the address 4240 of the position, in the metadata page to be processed, that holds the reflection destination data page address (hereinafter referred to as the reflection destination data page address holding address). Here, the data page to which the write data is reflected in the file system is referred to as the reflection destination data page, and its address as the reflection destination data page address.
  • Data write log entry 4000 may hold other information.
  • FIG. 5 is a diagram showing an example of the configuration of a log page end log entry in the first embodiment.
  • the header portion 5100 of the log page end log entry 5000 holds log page end log entry identification information 5110 which is information indicating that the log entry type is the log page end log entry.
  • The log page end log entry 5000 does not have a data part.
  • the description of the structure and content of the metadata update log entry is omitted.
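As a rough illustration of the log entry layouts of FIGS. 3 to 5, the following Python sketch models a log entry as a header holding the log entry type plus an optional data part. The class and field names and the numeric type codes are assumptions chosen for readability; the text does not specify an encoding.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LogEntryType(Enum):
    # Numeric values are hypothetical; the text only names the types.
    DATA_WRITE = 1
    METADATA_UPDATE = 2
    LOG_PAGE_END = 3


@dataclass(frozen=True)
class DataWritePart:
    """Data part 4200 of a data write log entry 4000."""
    write_data_page_address: int       # 4210
    file_offset: int                   # 4220
    write_data_size: int               # 4230
    dest_page_address_holder: int      # 4240: where the destination page address is kept


@dataclass(frozen=True)
class LogEntry:
    entry_type: LogEntryType                 # header portion 3100
    data: Optional[DataWritePart] = None     # data part 3200, may be absent


def make_data_write_entry(page_address: int, file_offset: int,
                          size: int, holder_address: int) -> LogEntry:
    part = DataWritePart(page_address, file_offset, size, holder_address)
    return LogEntry(LogEntryType.DATA_WRITE, part)


def make_log_page_end_entry() -> LogEntry:
    # The log page end log entry has a header only.
    return LogEntry(LogEntryType.LOG_PAGE_END)


entry = make_data_write_entry(0x1000, 8192, 100, 0x2000)
end_entry = make_log_page_end_entry()
```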
  • the write processing in the first embodiment consists of journaling processing, commit processing, and checkpoint processing.
  • An outline of the write process is described below.
  • In journaling processing, data write log entries are used to hold write data in the journal log.
  • Metadata update log entries are also used to hold updates to file system metadata in the journal log.
  • In commit processing, the log entries associated with the write process are made permanent to confirm the write process.
  • In checkpoint processing, the contents of the log entries held in the journal log are reflected in the file system. If the data written to the file has not reached the end of the write data page of the data write log entry, delayed checkpoint processing is executed to postpone reflecting that data write log entry in the file system. Delayed checkpoint processing saves the position of the data write log entry.
  • In the journaling processing of a subsequent write process, if delayed checkpoint processing of a data write log entry was executed in the previous write process and the range of write data of the subsequent write process is contiguous with the end of the range of write data whose checkpoint processing was delayed, the write data of the subsequent write process is written to the write data page of that data write log entry, and a new data write log entry that holds the same write data page address is added to the journal log. If, on the other hand, the range of write data of the subsequent write process is not contiguous with the end of that range, immediate checkpoint processing is executed for the delayed data write log entry, and then the journaling processing of the subsequent write process is executed.
  • Write processing also requires journaling processing, commit processing, and checkpoint processing of metadata update log entries. Since the description of the structure and contents of the metadata is omitted, the details of the journaling processing, commit processing, and checkpoint processing of metadata update log entries are omitted in the following description.
  • FIG. 6 is a flow chart showing an example of write processing operations in the first embodiment.
  • The journal log control unit 1110 receives the write data, the size of the entire write data (hereinafter, the total write size), and the position within the file at which to write the write data (hereinafter, the write position).
  • The journal log control unit 1110 executes journaling processing (step S6001). Journaling processing will be described later.
  • The journal log control unit 1110 executes commit processing (step S6002). Commit processing will be described later.
  • The journal log control unit 1110 executes checkpoint processing (step S6003). Checkpoint processing will be described later.
  • FIG. 7 is a flowchart illustrating an example of journaling processing operations in the first embodiment.
  • the journal log control unit 1110 transitions to step S7002 if the position in the journal log of the delayed data write log entry is saved, otherwise transitions to step S7007 (step S7001).
  • In step S7002, the journal log control unit 1110 acquires the delayed data write log entry from its saved position in the journal log. If the write range of the write process is contiguous with the end of the range of write data held by the delayed data write log entry, that is, if the write position equals the sum of the file offset and the write data size of the delayed data write log entry, the process transitions to step S7003; otherwise, it transitions to step S7006 (step S7002).
  • If the write range is contiguous with the end of the range of write data held by the delayed data write log entry, the journal log control unit 1110 executes data append write processing (step S7003).
  • the data append write process will be described later.
  • journal log control unit 1110 adds the data size added in step S7003 to the write position (step S7004), and subtracts the added data size from the total write size (step S7005).
  • If the write range is not contiguous with the end of the range of write data held by the delayed data write log entry, the journal log control unit 1110 sets the delayed checkpoint flag to false and executes checkpoint processing of the delayed data write log entry (step S7006).
  • the delayed checkpoint flag is a flag that indicates whether to delay the checkpoint processing of the data write log entry. Checkpointing of data write log entries will be described later.
  • the journal log control unit 1110 executes the processing from step S7007 to step S7011 for each data page within the write range after the write position. If the total write size is 0, the journal log control unit 1110 proceeds to step S7012; otherwise, to step S7008 (step S7007).
  • the journal log control unit 1110 executes metadata update log entry addition processing for metadata holding the address of the data page at the write position (step S7008).
  • the metadata update log entry addition process is assumed to update metadata accompanying a change in the address of the data page at the write position. As described above, the details of the metadata update log entry addition process, including whether or not the addition process is necessary, will not be described.
  • the journal log control unit 1110 executes data write log entry addition processing for the data page at the write position (step S7009).
  • the data write log entry addition process will be described later.
  • journal log control unit 1110 adds the data size written in step S7009 to the write position (step S7010), and subtracts the data size from the total write size (step S7011).
  • the journal log control unit 1110 executes metadata update log entry addition processing for the metadata of the entire file (step S7012).
  • This metadata update log entry addition processing is assumed to update the file update time. As described above, the details of the metadata update log entry addition processing are omitted.
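The control flow of FIG. 7 can be sketched as a function that only plans the steps instead of performing them. This is a minimal sketch under assumptions: the 4096-byte page size, the function name `plan_journaling`, the action labels, and the `(file offset, write data size)` tuple describing the delayed data write log entry are all hypothetical, and the metadata update log entries of step S7008 are folded into the `new_entry` action.

```python
from typing import List, Optional, Tuple

PAGE_SIZE = 4096  # assumed page size


def plan_journaling(write_position: int, total_write_size: int,
                    delayed: Optional[Tuple[int, int]] = None,
                    page_size: int = PAGE_SIZE) -> List[Tuple[str, int, int]]:
    """Return the (action, position, size) steps that FIG. 7 would take."""
    steps: List[Tuple[str, int, int]] = []
    if delayed is not None:                                   # S7001
        d_offset, d_size = delayed
        if write_position == d_offset + d_size:               # S7002: contiguous
            room = page_size - (d_offset % page_size + d_size)
            appended = min(room, total_write_size)
            steps.append(("append", write_position, appended))      # S7003
            write_position += appended                        # S7004
            total_write_size -= appended                      # S7005
        else:
            steps.append(("checkpoint_delayed", d_offset, d_size))  # S7006
    while total_write_size > 0:                               # S7007
        size = min(page_size - write_position % page_size, total_write_size)
        steps.append(("new_entry", write_position, size))     # S7008, S7009
        write_position += size                                # S7010
        total_write_size -= size                              # S7011
    steps.append(("update_file_metadata", write_position, 0)) # S7012
    return steps


contiguous = plan_journaling(100, 200, delayed=(0, 100))
scattered = plan_journaling(5000, 10, delayed=(0, 100))
spanning = plan_journaling(4000, 200)
```

The three calls cover a write that continues a delayed entry, a write elsewhere in the file that forces the delayed entry to be checkpointed first, and a write that crosses a page boundary and is therefore split into two data write log entries.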
  • FIG. 8 is a flowchart illustrating an example of the operation of data write log entry addition processing according to the first embodiment.
  • The journal log control unit 1110 executes log entry area allocation processing to secure free space in the journal log for the data write log entry to be added (step S8001). The log entry area allocation processing will be described later.
  • In steps S8002 to S8005, the journal log control unit 1110 obtains the information to be held in the data part of the data write log entry.
  • The journal log control unit 1110 requests the free space management unit 1120 to allocate a free page and sets the address of the acquired free page as the write data page address of the data write log entry (step S8002).
  • The journal log control unit 1110 searches the metadata of the file to be written, finds the metadata page that holds the address of the data page corresponding to the write position, and sets the address of the position within that metadata page where the data page address is held as the reflection destination data page address holding address of the data write log entry (step S8003).
  • The journal log control unit 1110 sets the write position as the file offset of the data write log entry (step S8004).
  • The journal log control unit 1110 subtracts from the page size the remainder obtained by dividing the file offset set in step S8004 by the page size, takes the minimum of that value and the total write size, and sets the result as the write data size of the data write log entry (step S8005).
  • The journal log control unit 1110 writes write data of the write data size into the write data page of the data write log entry, starting at the position given by the remainder calculated in step S8005, and executes a flush instruction for the written area (step S8006).
  • The journal log control unit 1110 creates a data write log entry from the data write log entry identification information, which indicates that the log entry type is a data write log entry, together with the write data page address, the reflection destination data page address holding address, the file offset, and the write data size obtained above, and writes the data write log entry at the position indicated by the insertion pointer 2600 (step S8007).
  • the journal log control unit 1110 sets the insertion pointer 2600 to the position next to the data write log entry written in step S8007 (step S8008).
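The size calculation and page write of FIG. 8 might look as follows in Python. This is a sketch under assumptions rather than the actual implementation: a `bytearray` stands in for the separately allocated write data page, a plain list stands in for the journal log, the metadata lookup of step S8003 and the flush instruction are left out, and the names `DataWriteEntry` and `add_data_write_entry` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 4096  # assumed page size


@dataclass
class DataWriteEntry:
    write_data_page: bytearray  # stands in for the write data page address
    file_offset: int
    write_data_size: int


def add_data_write_entry(data: bytes, write_position: int,
                         journal: List[DataWriteEntry],
                         page_size: int = PAGE_SIZE) -> int:
    """Journal the first page-bounded chunk of data; return the bytes consumed."""
    page = bytearray(page_size)                   # S8002: newly allocated page
    in_page = write_position % page_size          # remainder used in S8005, S8006
    size = min(page_size - in_page, len(data))    # S8005: write data size
    page[in_page:in_page + size] = data[:size]    # S8006: write into the page
    journal.append(DataWriteEntry(page, write_position, size))  # S8007, S8008
    return size


journal: List[DataWriteEntry] = []
consumed = add_data_write_entry(b"x" * 200, 4000, journal)
```

Because the write starts 4000 bytes into a 4096-byte page, only 96 of the 200 bytes fit; the caller would journal the remainder with a further data write log entry for the next page.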
  • FIG. 9 is a flow chart showing an example of the operation of the data append write processing according to the first embodiment.
  • The journal log control unit 1110 executes log entry area allocation processing to secure free space in the journal log for the data write log entry to be added (step S9001).
  • The log entry area allocation processing will be described later.
  • In steps S9002 to S9005, the journal log control unit 1110 obtains the information to be held in the data part of the data write log entry.
  • The journal log control unit 1110 sets the write data page address of the delayed data write log entry acquired in step S7002 as the write data page address of the data write log entry for the data append write (step S9002).
  • The journal log control unit 1110 sets the reflection destination data page address holding address of the delayed data write log entry acquired in step S7002 as the reflection destination data page address holding address of the data write log entry for the data append write (step S9003).
  • The journal log control unit 1110 sets the file offset of the delayed data write log entry acquired in step S7002 as the file offset of the data write log entry for the data append write (step S9004).
  • The journal log control unit 1110 subtracts from the page size the sum of the remainder obtained by dividing the file offset of the delayed data write log entry acquired in step S7002 by the page size and the write data size of the delayed data write log entry, and takes the minimum of that value and the current total write size. It then sets the sum of the write data size of the delayed data write log entry and that minimum as the write data size of the data write log entry for the data append write (step S9005).
  • The journal log control unit 1110 writes the write data into the write data page indicated by the write data page address of the data write log entry for the data append write, starting at the position given by the sum of the remainder obtained by dividing the file offset of the delayed data write log entry by the page size and the write data size of the delayed data write log entry. The amount written is the write data size of the data write log entry for the data append write minus the write data size of the delayed data write log entry. The journal log control unit 1110 then executes a flush instruction for the written area (step S9006).
  • The journal log control unit 1110 creates a data write log entry for the data append write from the data write log entry identification information, which indicates that the log entry type is a data write log entry, together with the write data page address, the reflection destination data page address holding address, the file offset, and the write data size obtained above, and writes it at the position indicated by the insertion pointer 2600 (step S9007).
  • Finally, the journal log control unit 1110 sets the insertion pointer 2600 to the position following the data write log entry for the data append write written in step S9007 (step S9008).
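The append case of FIG. 9 can be illustrated in the same style. Again this is only a sketch under assumptions: the `bytearray` write data page, the list used as the journal log, and the names `DataWriteEntry` and `append_write` are hypothetical, and the flush instruction of step S9006 is not modelled.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 4096  # assumed page size


@dataclass
class DataWriteEntry:
    write_data_page: bytearray  # stands in for the write data page address
    file_offset: int
    write_data_size: int


def append_write(delayed: DataWriteEntry, data: bytes,
                 journal: List[DataWriteEntry],
                 page_size: int = PAGE_SIZE) -> int:
    """Append data to the delayed entry's page; return the bytes consumed."""
    # Position in the page just past the data that is already journaled.
    used = delayed.file_offset % page_size + delayed.write_data_size
    appended = min(page_size - used, len(data))                    # S9005
    delayed.write_data_page[used:used + appended] = data[:appended]  # S9006
    # The new entry reuses the page and file offset (S9002 to S9004) and
    # records the combined size, so it covers the old data plus the new data.
    journal.append(DataWriteEntry(delayed.write_data_page,
                                  delayed.file_offset,
                                  delayed.write_data_size + appended))  # S9007
    return appended


page = bytearray(PAGE_SIZE)
page[10:14] = b"abcd"
first = DataWriteEntry(page, 10, 4)
journal: List[DataWriteEntry] = [first]
consumed = append_write(first, b"efgh", journal)
```

No new write data page is allocated here; the second log entry points at the same page as the first, which is where the reduction in written data comes from.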
  • FIG. 10 is a flowchart illustrating an example of the operation of log entry area allocation processing according to the first embodiment.
  • the journal log control unit 1110 transitions to step S10007 if the insertion pointer is null, otherwise transitions to step S10002 (step S10001).
  • If the log page indicated by the insertion pointer 2600 has enough free area to hold the log entry to be added, the journal log control unit 1110 reserves the log entry area and ends the process. Otherwise, the journal log control unit 1110 transitions to step S10003 (step S10002).
  • The journal log control unit 1110 requests the free space management unit 1120 to allocate a free page and obtains the free page (step S10003).
  • The journal log control unit 1110 writes the address of the free page acquired in step S10003 to the end of the log page at the position indicated by the insertion pointer 2600, thereby joining the free page to the log page list of the journal log (step S10004).
  • the journal log control unit 1110 also executes log page end log entry addition processing (step S10005). The log page end log entry addition processing will be described later.
  • the journal log control unit 1110 sets the insertion pointer 2600 to the top position of the free page acquired in step S10003 (step S10006).
  • The journal log control unit 1110 requests the free space management unit 1120 to allocate a free page and acquires the free page (step S10007).
  • The journal log control unit 1110 sets the head pointer, which indicates the head of the log page list used in the current update process, to the head position of the free page acquired in step S10007, and executes a flush instruction and a memory barrier instruction for the area holding the head pointer (step S10008).
  • The journal log control unit 1110 sets the insertion pointer 2600 to the head position of the free page acquired in step S10007 (step S10009).
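The allocation flow of steps S10001 to S10009 can be sketched in Python as follows. This is only an illustrative model, not the implementation described in the specification: `LogPage`, `JournalLog`, `PAGE_CAPACITY`, `ensure_entry_area`, and the `trace` list that stands in for flush and memory barrier instructions are all hypothetical names. The sketch also assumes that the area for the log page end log entry and the next-page address is excluded from `PAGE_CAPACITY`.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PAGE_CAPACITY = 64  # hypothetical usable bytes per log page (trailer excluded)


@dataclass
class LogPage:
    next_page: Optional["LogPage"] = None  # address held at the end of the page
    ended: bool = False                    # True once a log page end entry is added


@dataclass
class JournalLog:
    head: Optional[LogPage] = None         # head pointer of the log page list
    insert_page: Optional[LogPage] = None  # page of the insertion pointer (None = null)
    insert_off: int = 0                    # offset of the insertion pointer in that page
    trace: List[str] = field(default_factory=list)  # stand-in for flush/barrier calls


def ensure_entry_area(log: JournalLog, size: int) -> None:
    """Make sure `size` bytes are available at the insertion pointer."""
    if log.insert_page is None:                    # S10001: insertion pointer is null
        page = LogPage()                           # S10007: obtain a free page
        log.head = page                            # S10008: set the head pointer
        log.trace.append("flush+barrier head")     #         and persist it
        log.insert_page, log.insert_off = page, 0  # S10009
        return
    if log.insert_off + size <= PAGE_CAPACITY:     # S10002: current page has room
        return
    page = LogPage()                               # S10003: obtain a free page
    log.insert_page.next_page = page               # S10004: link it into the list
    log.insert_page.ended = True                   # S10005: add log page end entry
    log.insert_page, log.insert_off = page, 0      # S10006
```

In this model the caller writes the log entry at the insertion pointer and advances `insert_off` itself, mirroring how steps S9007 and S9008 are separate from the allocation step.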
  • FIG. 11 is a flow chart showing an example of the operation of log page end log entry addition processing according to the first embodiment.
  • The journal log control unit 1110 creates a log page end log entry using the log page end log entry identification information, which indicates that the log entry type is a log page end log entry, and writes the log page end log entry at the position indicated by the insertion pointer 2600 (step S11001).
  • FIG. 12 is a flowchart illustrating an example of commit processing operations in the first embodiment.
  • the journal log control unit 1110 sets a log entry pointer indicating a log entry in the journal log to the position of the head pointer of the log page list used in the journaling process of the write process (step S12001).
  • journal log control unit 1110 proceeds to step S12005, otherwise proceeds to step S12003 (step S12002).
  • The journal log control unit 1110 executes a flush instruction for the entire log page indicated by the log entry pointer (step S12003).
  • The journal log control unit 1110 sets the log entry pointer to the head position of the next log page in the log page list, whose address is held at the end of the log page indicated by the log entry pointer (step S12004).
  • The journal log control unit 1110 executes a flush instruction for the area from the log entry pointer to the insertion pointer 2600, and then executes a memory barrier instruction (step S12005).
  • the journal log control unit 1110 sets the end point pointer 2500 to the position of the insertion pointer 2600, and executes the flush command and memory barrier command for the area holding the end point pointer 2500 (step S12006).
  • The journal log control unit 1110 sets the log entry pointer to the position indicated by the head pointer of the log page list not used in the write process (step S12007).
  • journal log control unit 1110 ends commit processing. Otherwise, the journal log control unit 1110 transitions to step S12009 (step S12008).
  • The journal log control unit 1110 saves the value of the log entry pointer, and then sets the log entry pointer to the head position of the next log page in the log page list, whose address is held at the end of the log page indicated by the log entry pointer (step S12009).
  • the journal log control unit 1110 requests the free space management unit 1120 to release the log page indicated by the value of the log entry pointer saved in step S12009, and releases the log page (step S12010).
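The commit flow of FIG. 12 can be modeled roughly as below. This is a sketch under stated assumptions, not the specification's implementation. The branch conditions of steps S12002 and S12008 are not given in the text above, so the sketch assumes that S12002 tests whether the log entry pointer has reached the log page containing the insertion pointer, and that S12008 tests whether the log entry pointer is null. `Page`, `commit`, and the returned `trace` list (a stand-in for flush, memory barrier, and release operations) are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Page:
    name: str
    next_page: Optional["Page"] = None  # address held at the end of the log page


def commit(active_head: Page, insert_page: Page, insert_off: int,
           stale_head: Optional[Page]) -> Tuple[Tuple[Page, int], List[str]]:
    trace: List[str] = []
    page = active_head                                   # S12001
    while page is not insert_page:                       # S12002 (assumed condition)
        trace.append(f"flush {page.name} [whole page]")  # S12003
        page = page.next_page                            # S12004
    trace.append(f"flush {page.name} [0:{insert_off}]")  # S12005: up to insertion pointer
    trace.append("barrier")
    end_point = (insert_page, insert_off)                # S12006: end point := insertion pointer
    trace.append("flush end_point")
    trace.append("barrier")
    page = stale_head                                    # S12007: the list not used by the write
    while page is not None:                              # S12008 (assumed condition)
        released, page = page, page.next_page            # S12009: save pointer, then advance
        trace.append(f"release {released.name}")         # S12010
    return end_point, trace
```

The ordering matters in this model: the log entries are flushed and fenced before the end point pointer is updated, so the end point pointer never covers entries that have not yet been persisted.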
  • FIG. 13 is a flowchart illustrating an example of checkpoint processing operations in the first embodiment.
  • The journal log control unit 1110 executes the processing of steps S13001 to S13008 on each committed log entry, that is, each entry between the start point pointer, which indicates the head position of the log page list used in the update process, and the end point pointer 2500. First, the journal log control unit 1110 sets the log entry pointer, which indicates the log entry to be processed in the journal log, to the position indicated by the start point pointer of the log page list used in the update process (step S13001).
  • journal log control unit 1110 proceeds to step S13009; otherwise, to step S13003 (step S13002).
  • The journal log control unit 1110 proceeds to step S13004 if the log entry type of the log entry indicated by the log entry pointer is a data write log entry; otherwise, it proceeds to step S13006 (step S13003).
  • The journal log control unit 1110 sets the delayed checkpoint flag to true and executes checkpoint processing for the data write log entry (step S13004). Checkpoint processing of data write log entries will be described later.
  • The journal log control unit 1110 adds the log entry size corresponding to the log entry type of the log entry indicated by the log entry pointer to the log entry pointer (step S13005).
  • The journal log control unit 1110 proceeds to step S13007 if the log entry type of the log entry indicated by the log entry pointer is a metadata update log entry; otherwise, it proceeds to step S13008 (step S13006).
  • the journal log control unit 1110 executes checkpoint processing for the metadata update log entry (step S13007). As described above, the detailed description of the metadata update log entry checkpoint process is omitted.
  • The journal log control unit 1110 sets the log entry pointer to the head position of the next log page in the log page list, whose address is held at the end of the log page indicated by the log entry pointer (step S13008).
  • The log entry type is assumed to be one of data write log entry, metadata update log entry, and log page end log entry.
  • journal log control unit 1110 executes insert pointer update processing (step S13009). Insert pointer update processing will be described later.
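The dispatch loop of FIG. 13 can be sketched as follows. This is a simplified, hypothetical model: log pages are lists of entry-type strings, and positions are `(page index, entry index)` pairs rather than byte addresses, so step S13005 becomes an index increment instead of adding a log entry size. The exit condition of step S13002 is not stated in the text above; the sketch assumes the loop ends when the log entry pointer reaches the end point pointer. It also assumes the pointer advances after a metadata update log entry in the same way as after a data write log entry.

```python
from typing import List, Tuple


def checkpoint_committed(pages: List[List[str]],
                         end_point: Tuple[int, int]) -> List[tuple]:
    """Return the sequence of checkpoint actions for the committed entries."""
    calls: List[tuple] = []
    page_i, entry_i = 0, 0                               # S13001: start point pointer
    while (page_i, entry_i) != end_point:                # S13002 (assumed condition)
        kind = pages[page_i][entry_i]
        if kind == "data_write":                         # S13003
            # S13004: delayed checkpoint flag is set to true for this call
            calls.append(("data", page_i, entry_i))
            entry_i += 1                                 # S13005
        elif kind == "metadata_update":                  # S13006
            calls.append(("metadata", page_i, entry_i))  # S13007
            entry_i += 1                                 # assumed: advance as in S13005
        else:                                            # log page end log entry
            page_i, entry_i = page_i + 1, 0              # S13008: head of next log page
    calls.append(("update_insert_pointer",))             # S13009
    return calls
```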
  • FIG. 14 is a flow chart showing an example of the operation of checkpoint processing for data write log entries in the first embodiment.
  • The journal log control unit 1110 proceeds to step S14002 if the delayed checkpoint flag is false or the write has reached the end of the write data page, that is, if the sum of the remainder obtained by dividing the in-file offset of the data write log entry by the page size and the write data size of the data write log entry equals the page size; otherwise, it proceeds to step S14003 (step S14001).
  • If the delayed checkpoint flag is false, or the write has reached the end of the write data page (the sum of the remainder obtained by dividing the in-file offset of the data write log entry by the page size and the write data size of the data write log entry equals the page size), the journal log control unit 1110 executes immediate checkpoint processing of the data write log entry (step S14002). Immediate checkpoint processing of data write log entries is described below.
  • If the delayed checkpoint flag is true and the write has not reached the end of the write data page (the sum of the remainder obtained by dividing the in-file offset of the data write log entry by the page size and the write data size of the data write log entry is smaller than the page size), the journal log control unit 1110 executes delayed checkpoint processing of the data write log entry (step S14003). Delayed checkpoint processing of data write log entries is described below.
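The decision made in step S14001 reduces to a small piece of arithmetic, illustrated below. The function names and the 4096-byte page size are assumptions made for the example and do not come from the specification.

```python
PAGE_SIZE = 4096  # assumed page size for illustration


def reaches_page_end(offset_in_file: int, write_size: int,
                     page_size: int = PAGE_SIZE) -> bool:
    """True if the write ends exactly at the end of its write data page."""
    return offset_in_file % page_size + write_size == page_size


def choose_checkpoint(delayed_flag: bool, offset_in_file: int, write_size: int,
                      page_size: int = PAGE_SIZE) -> str:
    """Select the checkpoint variant as in step S14001."""
    if not delayed_flag or reaches_page_end(offset_in_file, write_size, page_size):
        return "immediate"  # step S14002
    return "delayed"        # step S14003
```

For example, a 100-byte write at in-file offset 8192 starts at position 0 of its page and ends at position 100, so with the flag set to true it is delayed. A 4000-byte write at offset 8288 starts at position 96 and ends at 4096, so it is checkpointed immediately.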
  • Immediate checkpointing of data write log entries can reduce the amount of data written associated with checkpointing compared to checkpointing in conventional journaling.
  • Specifically, when the write data size of the data write log entry is less than a predetermined threshold, the write data held in the write data page of the data write log entry is written into the data page indicated by the reflection destination data page address held at the position of the reflection destination data page address holding address of the data write log entry, at the position given by the remainder obtained by dividing the in-file offset of the data write log entry by the page size, and a flush instruction is executed for the written area.
  • Otherwise, the areas of the write data page of the data write log entry outside the range holding the write data are filled with data from the data page indicated by the reflection destination data page address held at the position of the reflection destination data page address holding address, and a flush instruction is executed for the written areas. Furthermore, the write data page address of the data write log entry is written at the position of the reflection destination data page address holding address of the data write log entry, and a flush instruction is executed for the written area.
  • the predetermined threshold is, for example, 1/2 of the page size.
  • FIG. 15 is a flow chart showing an example of the operation of immediate checkpoint processing of data write log entries in the first embodiment.
  • The journal log control unit 1110 proceeds to step S15011 if the reflection destination data page address held at the position of the reflection destination data page address holding address of the data write log entry is null, that is, if the reflection destination data page does not exist; otherwise, it proceeds to step S15002 (step S15001).
  • The journal log control unit 1110 proceeds to step S15003 if the write data size of the data write log entry is less than a predetermined threshold; otherwise, it proceeds to step S15004 (step S15002).
  • When the write data size of the data write log entry is less than the predetermined threshold, the journal log control unit 1110 copies the write data held in the write data page indicated by the write data page address of the data write log entry into the reflection destination data page indicated by the reflection destination data page address held at the position of the reflection destination data page address holding address of the data write log entry. The data is written at the position given by the remainder obtained by dividing the in-file offset of the data write log entry by the page size, for a length equal to the write data size of the data write log entry, and a flush instruction is executed for the written area (step S15003).
  • The journal log control unit 1110 proceeds to step S15005 if the write does not start at the beginning of the write data page of the data write log entry, that is, if the remainder obtained by dividing the in-file offset of the data write log entry by the page size is not 0; otherwise, it proceeds to step S15006 (step S15004).
  • When the write does not start at the beginning of the write data page of the data write log entry, that is, when the remainder obtained by dividing the in-file offset of the data write log entry by the page size is not 0, the journal log control unit 1110 copies data from the beginning of the reflection destination data page, indicated by the reflection destination data page address held at the position of the reflection destination data page address holding address of the data write log entry, to the beginning of the write data page indicated by the write data page address of the data write log entry. The length copied equals the remainder obtained by dividing the in-file offset of the data write log entry by the page size. The journal log control unit 1110 then executes a flush instruction for the written area (step S15005).
  • The journal log control unit 1110 proceeds to step S15007 if the write has not reached the end of the write data page of the data write log entry, that is, if the sum of the remainder obtained by dividing the in-file offset of the data write log entry by the page size and the write data size of the data write log entry is smaller than the page size; otherwise, it proceeds to step S15008 (step S15006).
  • When the write has not reached the end of the write data page of the data write log entry, that is, when the sum of the remainder obtained by dividing the in-file offset of the data write log entry by the page size and the write data size of the data write log entry is smaller than the page size, the journal log control unit 1110 copies the data in the reflection destination data page, indicated by the reflection destination data page address held at the position of the reflection destination data page address holding address of the data write log entry, from the position given by that sum to the end of the reflection destination data page. The data is copied to the same position in the write data page indicated by the write data page address of the data write log entry, and a flush instruction is executed for the written area (step S15007).
  • the journal log control unit 1110 saves the reflection destination data page address held at the position of the reflection destination data page address holding address of the data write log entry (step S15008).
  • The journal log control unit 1110 writes the write data page address of the data write log entry to the position of the reflection destination data page address holding address of the data write log entry, and executes a flush instruction for the written area (step S15009).
  • The journal log control unit 1110 requests the free space management unit 1120 to release the reflection destination data page indicated by the reflection destination data page address saved in step S15008, and releases the reflection destination data page (step S15010).
  • The journal log control unit 1110 proceeds to step S15012 if the write does not start at the beginning of the write data page of the data write log entry, that is, if the remainder obtained by dividing the in-file offset of the data write log entry by the page size is not 0; otherwise, it proceeds to step S15013 (step S15011).
  • When the write does not start at the beginning of the write data page of the data write log entry, that is, when the remainder obtained by dividing the in-file offset of the data write log entry by the page size is not 0, the journal log control unit 1110 writes zeros into the write data page indicated by the write data page address of the data write log entry, from the beginning of the write data page for a length equal to that remainder, and executes a flush instruction for the written area (step S15012).
  • The journal log control unit 1110 proceeds to step S15014 if the write has not reached the end of the write data page of the data write log entry, that is, if the sum of the remainder obtained by dividing the in-file offset of the data write log entry by the page size and the write data size of the data write log entry is smaller than the page size; otherwise, it proceeds to step S15015 (step S15013).
  • When the write has not reached the end of the write data page of the data write log entry, that is, when the sum of the remainder obtained by dividing the in-file offset of the data write log entry by the page size and the write data size of the data write log entry is smaller than the page size, the journal log control unit 1110 writes zeros into the write data page indicated by the write data page address of the data write log entry, from the position given by that sum to the end of the write data page, and executes a flush instruction for the written area (step S15014).
  • The journal log control unit 1110 writes the write data page address of the data write log entry to the position of the reflection destination data page address holding address of the data write log entry, and executes a flush instruction for the written area (step S15015).
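The three outcomes of FIG. 15 (copy a small write into the existing page, complete the write data page and swap it in, or zero-fill and install the write data page when no destination page exists) can be modeled with byte arrays. This is a hypothetical sketch: pages are `bytearray` objects, `page_table[slot]` stands in for the location given by the reflection destination data page address holding address, flush instructions are omitted, and the tiny `PAGE_SIZE` is chosen only to keep the example readable. The threshold of half the page size follows the example given in the description.

```python
from typing import Dict, Optional, Tuple

PAGE_SIZE = 16               # tiny page size, for illustration only
THRESHOLD = PAGE_SIZE // 2   # example threshold: 1/2 of the page size


def immediate_checkpoint(page_table: Dict[int, Optional[bytearray]], slot: int,
                         write_page: bytearray, offset_in_file: int,
                         size: int) -> Tuple[str, Optional[bytearray]]:
    """Return (mode, page to release); flush instructions are not modeled."""
    start = offset_in_file % PAGE_SIZE
    end = start + size
    dest = page_table[slot]
    if dest is None:                                # S15001: no destination page
        write_page[:start] = bytes(start)           # S15012: zero-fill the head
        write_page[end:] = bytes(PAGE_SIZE - end)   # S15014: zero-fill the tail
        page_table[slot] = write_page               # S15015: install the write data page
        return "install", None
    if size < THRESHOLD:                            # S15002
        dest[start:end] = write_page[start:end]     # S15003: copy the small write
        return "copy", None
    write_page[:start] = dest[:start]               # S15005: bring over the head
    write_page[end:] = dest[end:]                   # S15007: bring over the tail
    page_table[slot] = write_page                   # S15009: swap page addresses
    return "swap", dest                             # S15008/S15010: old page to release
```

The swap path copies only the bytes outside the written range, which is why it writes less than copying the write data itself once the write covers half the page or more.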
  • FIG. 16 is a flow chart showing an example of operation of delayed checkpoint processing of data write log entries in the first embodiment.
  • the journal log control unit 1110 saves the position of the data write log entry indicated by the log entry pointer as the position of the delayed data write log entry (step S16001).
  • the area that holds the location of the delayed data write log entry indicated by the log entry pointer may be volatile memory.
  • FIG. 17 is a flowchart illustrating an example of operation of insert pointer update processing according to the first embodiment.
  • The journal log control unit 1110 sets the insertion pointer 2600 to the position indicated by the start point pointer of the log page list other than the one currently indicated by the insertion pointer 2600 (step S17001). For example, if the insertion pointer 2600 points to the log page list 2100, the insertion pointer 2600 is set to the position of the start point pointer 2400.
  • The persistent memory 1200 stores data, metadata, and journal logs, which are update histories of data and metadata. Further, when a write request is received, the journal log control unit 1110 controls the execution of journaling processing that stores the update history of data and metadata associated with the write request in the journal log, commit processing that finalizes the update history stored in the journal log, and checkpoint processing that executes either immediate checkpoint processing, which reflects the finalized update history stored in the journal log in the data or metadata, or delayed checkpoint processing, which delays the immediate checkpoint processing.
  • In the delayed checkpoint processing, when the write data has not reached the end of the write data page of the data write log entry, checkpoint processing can be delayed until a write to the data write log entry reaches the end of the write data page. For this reason, according to the first embodiment, the checkpoint processing for multiple write processes can be performed collectively. By ensuring consistency and reducing the amount of writing associated with journaling, the write performance of the file system can be improved.
  • In the immediate checkpoint processing, if the write data size is less than a predetermined threshold, the journal log control unit 1110 copies the write data in the log entry in the journal log to the file; if the write data size is equal to or greater than the predetermined threshold, the page storing the write data in the log entry in the journal log is used as the page storing the file data. As a result, the amount of writing from the journal log to the file can be reduced.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A persistent memory (1200) stores data, metadata, and a journal log that is an update history of the data and metadata. When a write request is issued, a journal log control unit (1110) controls the execution of a journaling process that stores, in the journal log, the update history of data and metadata associated with the write request; a commit process that finalizes the update history stored in the journal log; and a checkpoint process that executes either an immediate checkpoint process, which reflects the finalized update history in the journal log in the data and metadata, or a delayed checkpoint process, which delays the immediate checkpoint process.
PCT/JP2021/022211 2021-06-10 2021-06-10 Journal log control system, journal log control method, and journal log control program WO2022259493A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/022211 WO2022259493A1 (fr) 2021-06-10 2021-06-10 Journal log control system, journal log control method, and journal log control program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/022211 WO2022259493A1 (fr) 2021-06-10 2021-06-10 Journal log control system, journal log control method, and journal log control program

Publications (1)

Publication Number Publication Date
WO2022259493A1 true WO2022259493A1 (fr) 2022-12-15

Family

ID=84425078

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/022211 WO2022259493A1 (fr) 2021-06-10 2021-06-10 Journal log control system, journal log control method, and journal log control program

Country Status (1)

Country Link
WO (1) WO2022259493A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002108640A (ja) * 2000-10-03 2002-04-12 Shinjo Keiei Kenkyusho:Kk デュープレックスシステム、シングルプロセッサシステム、及びサブボード
JP2013531835A (ja) * 2010-05-17 2013-08-08 テクニッシュ ウニヴェルジテート ミュンヘン ハイブリッドoltp及びolap高性能データベースシステム
JP2019003288A (ja) * 2017-06-12 2019-01-10 日本電信電話株式会社 ジャーナルログ制御システム、ジャーナルログ制御方法およびジャーナルログ制御プログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21945167

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21945167

Country of ref document: EP

Kind code of ref document: A1