WO2017113213A1 - Access request processing method, apparatus, and computer system - Google Patents

Access request processing method, apparatus, and computer system

Info

Publication number
WO2017113213A1
WO2017113213A1 (PCT/CN2015/099933)
Authority
WO
WIPO (PCT)
Prior art keywords
data
cache page
log
target cache
target
Prior art date
Application number
PCT/CN2015/099933
Other languages
English (en)
French (fr)
Inventor
徐君
于群
王元钢
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP15911842.1A priority Critical patent/EP3376394B1/en
Priority to CN201580085444.2A priority patent/CN108431783B/zh
Priority to PCT/CN2015/099933 priority patent/WO2017113213A1/zh
Publication of WO2017113213A1 publication Critical patent/WO2017113213A1/zh
Priority to US16/021,555 priority patent/US10649897B2/en
Priority to US16/855,129 priority patent/US11301379B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • G06F12/0238Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0804Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1471Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/1608Error detection by comparing the output signals of redundant hardware
    • G06F11/1612Error detection by comparing the output signals of redundant hardware where the redundant component is persistent storage
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0868Data transfer between cache memory and other subsystems, e.g. storage devices or host systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1415Saving, restoring, recovering or retrying at system level
    • G06F11/1441Resetting or repowering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1032Reliability improvement, data loss prevention, degraded operation etc
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72Details relating to flash memory management
    • G06F2212/7203Temporary buffering, e.g. using volatile buffer or dedicated buffer blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72Details relating to flash memory management
    • G06F2212/7207Details relating to flash memory management management of metadata or control data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72Details relating to flash memory management
    • G06F2212/7209Validity control, e.g. using flags, time stamps or sequence numbers

Definitions

  • the present invention relates to the field of storage technologies, and in particular, to an access request processing method, apparatus, and computer system.
  • In a storage system, write-ahead logging (WAL) is usually used to maintain data consistency. In this approach, all data written to the storage system is first written to a log file on an external storage device (for example, a disk), and the old data is then updated according to the log file. When the system suffers a power failure or crash, the data can be restored from the log to ensure data consistency.
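The WAL discipline described above can be sketched in a few lines. This is an illustrative toy: the function names, the JSON log format, and the in-memory dict store are all invented for the example, not taken from the application.

```python
# Minimal write-ahead-logging sketch (illustrative only).
import json
import os

def wal_write(log_path, store, key, value):
    # 1. Append the intended update to the log and flush it to stable storage.
    with open(log_path, "a") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())
    # 2. Only then apply the update to the data itself.
    store[key] = value

def wal_recover(log_path, store):
    # After a crash, replay the log so the store reflects every logged write.
    if not os.path.exists(log_path):
        return store
    with open(log_path) as log:
        for line in log:
            rec = json.loads(line)
            store[rec["key"]] = rec["value"]
    return store
```

Because the log entry is durable before the data is modified, an interrupted update can always be redone from the log, which is the consistency guarantee WAL provides.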
  • NVM Non-Volatile Memory
  • NVM-based Storage Class Memory (SCM) is non-volatile and provides a new way to protect data consistency in storage systems.
  • an SCM block can act as both a cache block and a log block.
  • a block is the basic unit of memory space. Usually, a block can be 4k in size.
  • Each block has three state pairs: frozen/normal, dirty/clean, up-to-date/out-of-date. Frozen is used to indicate that the block is a log block, that is, the data in the block can be used as a log. Normal is used to indicate that the block is a cache block, that is, the block is used as a cache. Dirty is used to indicate that the data stored in the block has been modified.
  • Clean is used to indicate that the data stored in this block has not been modified.
  • Up-to-date is used to indicate that the data stored in the block is the latest version.
  • Out-of-date is used to indicate that the data stored in the block is an old version.
  • In the process of updating data, a block is first allocated for the data in memory, and the state of the block is recorded as (normal, clean, up-to-date). When data is written to the block, the state of the block is updated to (normal, dirty, up-to-date). A block in the (normal, dirty, up-to-date) state can be read or written directly; that is, data can be read from, or written to, a block in this state.
  • When the write operation is completed, the memory block is used as a log block at transaction commit.
  • At commit, the state of the memory block is modified to (frozen, dirty, up-to-date).
  • Subsequently, the state of the memory block is modified to (frozen, dirty, out-of-date).
  • Memory blocks in the (frozen, dirty, out-of-date) state can be written back to disk.
  • the memory block becomes a free block and can be used for new write operations.
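The block state transitions just described form a small state machine. The sketch below follows the state names in the text; the transition function names and the final "free" marker are invented for illustration.

```python
# State machine for an SCM block (illustrative sketch; state names follow
# the text, function names are invented).
def allocate_block():
    # A freshly allocated cache block.
    return ("normal", "clean", "up-to-date")

def write_data(state):
    # Writing data to a cache block marks it dirty.
    assert state[0] == "normal"
    return ("normal", "dirty", "up-to-date")

def commit_transaction(state):
    # At transaction commit the block is frozen and serves as a log block.
    assert state == ("normal", "dirty", "up-to-date")
    return ("frozen", "dirty", "up-to-date")

def supersede(state):
    # The logged data is later superseded by a newer version.
    assert state == ("frozen", "dirty", "up-to-date")
    return ("frozen", "dirty", "out-of-date")

def write_back(state):
    # (frozen, dirty, out-of-date) blocks may be written back to disk,
    # after which the block is free for new write operations.
    assert state == ("frozen", "dirty", "out-of-date")
    return "free"
```

Tracking and persisting these per-block state tuples is exactly the bookkeeping the application identifies as the source of system overhead.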
  • Compared with the WAL approach, the method of using the SCM as both cache space and log space reduces data write operations, but the state of each block must be maintained, which imposes a large overhead on the system. Moreover, this method must update data at block granularity. When the updated data is smaller than one block, write amplification results: the data actually written to disk exceeds the data that needs to be written.
  • An access request processing method, apparatus and computer system provided in the embodiments of the present application can reduce system overhead on the basis of protecting data consistency.
  • the present application provides an access request processing method.
  • the method can be performed by a computer system.
  • the computer system includes a processor and a non-volatile memory (NVM).
  • NVM non-volatile memory
  • When the processor receives a write request carrying a file identifier, a buffer pointer, and the size of the data to be written, the processor may identify the access location according to the file identifier carried in the write request.
  • The buffer pointer points to the buffer storing the data to be written; the data to be written is modification data for the target file to be accessed by the write request; and the access location indicates the starting address at which the write request writes data in the target file.
  • the processor may determine the target cache page according to the access location, the size of the data to be written, and the size of the cache page.
  • The target cache page is a memory page in memory used to cache the file data, in the target file, that is to be modified by the data to be written.
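As an illustration of how the target cache pages might be derived from the access location, the size of the data to be written, and the cache page size: the page size of 4096 bytes and all function names here are assumptions for the example, not values specified by the application.

```python
PAGE_SIZE = 4096  # assumed cache-page size; the application does not fix one

def target_pages(access_location, data_size, page_size=PAGE_SIZE):
    """Return (page_index, offset_in_page, length_in_page) for each
    cache page touched by a write of data_size bytes at access_location."""
    pages = []
    pos = access_location
    end = access_location + data_size
    while pos < end:
        page = pos // page_size           # which cache page is touched
        off = pos % page_size             # where the write starts in that page
        length = min(page_size - off, end - pos)  # bytes landing in that page
        pages.append((page, off, length))
        pos += length
    return pages
```

For example, a 200-byte write starting at offset 4000 spans two pages: 96 bytes at the end of page 0 and 104 bytes at the start of page 1.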
  • the processor inserts a new data node in a log chain of the target cache page.
  • Each of the data nodes in the log chain of the target cache page contains information of the modified data of the target cache page in one modification process.
  • The inserted data node includes information of a log data slice of the target cache page, where the information of the log data slice includes the log data slice itself or the storage address of the log data slice in the NVM.
  • The log data slice is the modified data of the target cache page, and is at least a part of the data to be written, obtained from the buffer according to the buffer pointer.
  • According to the access request processing method provided by this application, when the processor needs to modify the data of a file, it does not directly write the modified data into the target cache page of the file; instead, it writes the modified data into the storage space of the NVM and records the information of each modification of the target cache page in the form of a log chain.
  • Since the NVM is non-volatile and the log chain in the NVM stores the written data, the modified data of the target cache page across multiple modification processes can be recorded in chronological order. This makes it convenient to identify the version relationship of the log data slices and ensures consistency between the stored data and the written data.
  • Compared with the prior-art manner of maintaining data consistency by maintaining different states of each memory block, the access request processing method provided by the present application has a smaller impact on system overhead.
  • the system overhead of the computer system can be reduced during the access request processing.
  • Further, the access request processing method provided by the present application can support modifications to a file that are smaller than a page, and the manner of modification is more flexible.
  • A write success message may be returned to the application.
  • the write success message is used to indicate that the data to be written has been successfully written into the storage device.
  • The processor may specifically determine the log chain of the target cache page stored in the NVM according to at least one of the following fields in the cache page structure of the target cache page: "log head", "log tail", "logs", and "log dirty".
  • The "log head" field points to the first address of the log chain of the target cache page.
  • The "log tail" field points to the first address of the last data node in the log chain of the target cache page.
  • The "logs" field indicates the number of data nodes in the log chain of the target cache page, and the "log dirty" field indicates whether the target cache page is synchronized with the log data slices indicated by the data nodes in its log chain.
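A minimal sketch of a cache page structure holding these four fields might look as follows. The Python identifiers and default values are invented; the application only names the fields.

```python
# Sketch of the per-cache-page control structure (illustrative only).
from dataclasses import dataclass
from typing import Optional

@dataclass
class CachePageStructure:
    log_head: Optional[int] = None  # "log head": NVM address of the first data node in the log chain
    log_tail: Optional[int] = None  # "log tail": NVM address of the last data node in the log chain
    logs: int = 0                   # "logs": number of data nodes in the log chain
    log_dirty: bool = False         # "log dirty": page not yet synchronized with its log data slices

    def has_log_chain(self):
        # A page without a log chain has no head pointer yet.
        return self.log_head is not None
```

Checking `has_log_chain()` corresponds to the processor deciding whether a log chain already exists for the target cache page or must first be created in the NVM.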
  • The processor may create a log chain for the target cache page in the NVM. A data node can then be inserted into the newly created log chain, and the information of the log data slice of the target cache page can be recorded in the inserted data node.
  • During the operation of inserting a new data node into the log chain of the target cache page, the processor may insert the new data node at the tail or the head of the log chain of the target cache page.
  • In this case, the log chain of the target cache page includes at least two data nodes that are sequentially linked according to the update order of the target cache page. By inserting new data nodes in this order, the log data slices in the different data nodes of the log chain can be linked according to the order, from old to new, of the updated versions of the target cache page.
  • Thus, different updated versions of the target cache page can be identified based on the order of the data nodes in the log chain of the target cache page. In the process of reading data, valid data can be determined according to the log data slices in the different data nodes of the log chain of the same cache page, ensuring the correctness of the read data.
  • The processor may further obtain an updated target cache page according to the information of at least one log data slice recorded in the log chain of the target cache page, and store the data of the updated target cache page in an external storage device of the computer system.
  • file data in the disk can be updated to maintain data consistency.
  • the log data slice stored in the NVM is updated to a corresponding cache page, and the disk is updated according to the updated cache page.
  • Compared with the existing write-ahead logging (WAL) method and the copy-on-write (CoW) method for maintaining data consistency, this method of merging modified data into the target cache page and writing the merged target cache page to disk can reduce the system's write amplification.
  • WAL write ahead log
  • Obtaining the updated target cache page according to the information of the at least one log data slice recorded in the log chain of the target cache page includes: determining valid data in the log chain of the target cache page according to the recorded information of the at least one log data slice, and updating the valid data into the target cache page to obtain the updated target cache page.
  • the valid data is the latest modified data of the target cache page.
  • the processor may determine valid data in the log chain according to an update order of each data node in a log chain of the target cache page and an in-page location information of a log data slice.
  • The in-page location information of a log data slice can be obtained according to two pieces of information in the data node: the "intra-page offset" and the "data length".
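Assuming that data nodes are kept in update order and that later log data slices supersede earlier ones where their byte ranges overlap, determining the valid data and producing the updated page can be sketched as follows (nodes are represented here as simple (offset, data) tuples; the representation is invented for the example):

```python
def merge_log_chain(page, nodes):
    """Apply log data slices to a copy of the cache page in update order.

    page  : bytes of the original cache page
    nodes : list of (intra_page_offset, log_data) tuples, oldest first

    Later slices overwrite earlier ones where ranges overlap, so the result
    reflects only the valid (latest) data for every byte. Illustrative sketch.
    """
    buf = bytearray(page)
    for offset, data in nodes:
        buf[offset:offset + len(data)] = data
    return bytes(buf)
```

Because the chain is replayed oldest-to-newest, overlap resolution falls out of the ordering itself, with no per-slice version counters needed.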
  • The processor may further reclaim the log chain of the target cache page. Thereby, the storage space of the NVM can be reclaimed, and system resources are saved.
  • the information of the log data piece further includes: in-page location information of the log data slice and address information of a neighboring data node of the inserted data node.
  • the in-page location of the log data slice refers to the location of the log data slice in the target cache page, and the in-page location information of the log data slice may include information such as intra-page offset, log data length, and the like.
  • the intra-page offset is used to indicate a starting position of the log data slice in the target cache page
  • the log data length is used to indicate a length of the log data slice.
  • The address information of the adjacent data nodes of the inserted data node may be obtained according to the "previous log address" and "next log address" fields in the data node.
  • The "previous log address" is used to indicate the starting address of the previous data node in the NVM.
  • The "next log address" is used to indicate the starting address of the next data node in the NVM.
  • the present application provides yet another method of processing an access request, which method can also be performed by a computer system.
  • the computer system includes a processor and a non-volatile memory (NVM).
  • NVM non-volatile memory
  • The processor may acquire an access location according to the file identifier, where the access location indicates the starting address at which the read request reads data in the target file. Further, the processor may determine the target cache page, and the location information of the data to be read in the target cache page, according to the access location, the size of the data to be read, and the size of the cache page.
  • The target cache page is a memory page in memory used to cache the file data, in the target file, that is to be read.
  • The processor may obtain an updated target cache page according to the target cache page and the information of at least one log data slice in the log chain of the target cache page.
  • The log chain of the target cache page includes information of at least one log data slice; each log data slice is the modified data of the target cache page in one modification process, and the information of a log data slice includes the log data slice itself or the storage address of the log data slice in the NVM.
  • the processor may read data from the updated target cache page according to location information of data to be read in the target cache page.
  • In the access request processing method provided by the present application, since the modified data of a cache page is stored in the NVM by means of a log chain, data modifications smaller than page granularity can be supported.
  • In responding to the read request, the latest modified version of the target cache page to be accessed can be obtained according to the log data slices in the data nodes of the log chain of the target cache page, thereby ensuring the accuracy of the read data.
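The read path can be sketched as follows: apply the log data slices to a copy of the page in update order, then copy out the requested range. This is an illustrative toy, with nodes again represented as (offset, data) tuples invented for the example.

```python
def read_from_page(page, log_nodes, read_offset, read_size):
    """Serve a read through the log chain (illustrative sketch).

    page       : bytes of the cached page as last synchronized
    log_nodes  : list of (intra_page_offset, log_data) tuples, oldest first
    Applies each slice in update order so the copy reflects the latest
    version, then returns the requested byte range.
    """
    buf = bytearray(page)
    for offset, data in log_nodes:
        buf[offset:offset + len(data)] = data
    return bytes(buf[read_offset:read_offset + read_size])
```

Note that the page itself is not modified here; only the returned copy reflects the merged, latest version, mirroring the idea that merging into the real cache page can be deferred.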
  • Obtaining the updated target cache page according to the target cache page and the information of the at least one log data slice in the log chain of the target cache page includes: determining valid data in the log chain of the target cache page according to the information of the at least one log data slice recorded in the log chain, and updating the valid data into the target cache page to obtain the updated target cache page.
  • the valid data is the latest modified data of the target cache page.
  • the processor may determine valid data in the log chain according to an update order of each data node in a log chain of the target cache page and an in-page location information of a log data slice.
  • The in-page location information of a log data slice can be obtained according to two pieces of information in the data node: the "intra-page offset" and the "data length".
  • The processor may specifically determine the log chain of the target cache page stored in the NVM according to at least one of the following fields in the cache page structure of the target cache page: "log head", "log tail", "logs", and "log dirty".
  • The "log head" field points to the first address of the log chain of the target cache page.
  • The "log tail" field points to the first address of the last data node in the log chain of the target cache page.
  • The "logs" field indicates the number of data nodes in the log chain of the target cache page, and the "log dirty" field indicates whether the target cache page is synchronized with the log data slices indicated by the data nodes in its log chain.
  • the information of the log data piece further includes: in-page location information of the log data slice and address information of a neighboring data node of the inserted data node.
  • the in-page location of the log data slice refers to the location of the log data slice in the target cache page, and the in-page location information of the log data slice may include information such as intra-page offset, log data length, and the like.
  • the in-page offset is used to indicate a starting position of the log data slice in the target cache page
  • the log data length is used to indicate the length of the log data slice.
  • The address information of the adjacent data nodes of the inserted data node may be obtained according to the "previous log address" and "next log address" fields in the data node.
  • The "previous log address" is used to indicate the starting address of the previous data node in the NVM.
  • The "next log address" is used to indicate the starting address of the next data node in the NVM.
  • The present application provides a computer system including a non-volatile memory (NVM) and a processor coupled to the NVM, the processor being configured to perform the method described in the first aspect and the various possible implementations of the first aspect.
  • The present application provides a computer system including a non-volatile memory (NVM) and a processor coupled to the NVM, the processor being configured to perform the method described in the second aspect and the various possible implementations of the second aspect.
  • NVM non-volatile memory
  • The present application provides an access request processing apparatus applied to a computer system, where the computer system includes a non-volatile memory (NVM), and the access request processing apparatus includes modules configured to perform the method described in the first aspect and the various possible implementations of the first aspect.
  • NVM non-volatile memory
  • The present application provides an access request processing apparatus applied to a computer system, where the computer system includes a non-volatile memory (NVM), and the access request processing apparatus includes modules configured to perform the method described in the second aspect and the various possible implementations of the second aspect.
  • NVM non-volatile memory
  • The present application provides a computer program product comprising a computer-readable storage medium storing program code, where the program code comprises instructions for performing the method of at least one of the first aspect and the second aspect.
  • FIG. 1 is a schematic structural diagram of a computer system according to an embodiment of the present invention.
  • FIG. 2 is a schematic signaling diagram of a computer system according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method for processing an access request according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram of data processing according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a cache page structure and a log chain structure according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of a data merging method according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of signaling of another computer system according to an embodiment of the present disclosure.
  • FIG. 8 is a flowchart of still another method for processing an access request according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an access request processing apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of still another access request processing apparatus according to an embodiment of the present invention.
  • FIG. 1 is a schematic structural diagram of a computer system according to an embodiment of the present invention.
  • The computer system architecture shown in FIG. 1 is a hybrid-memory computer system architecture.
  • DRAM dynamic random access memory
  • PCM phase change memory
  • the computer system 100 can include a central processing unit (CPU) 105, a north bridge chip 110, a south bridge chip 115, a dynamic random access memory DRAM 120, a phase change memory PCM 125, and a magnetic disk 130.
  • CPU central processing unit
  • DRAM dynamic random access memory
  • PCM phase change memory
  • the central processing unit (CPU) 105 is the core of the computer system, and the CPU 105 can invoke different software programs in the computer system 100 to implement different functions.
  • the CPU 105 can implement access to the DRAM 120, the PCM 125, and the disk 130.
  • the central processing unit (CPU) 105 is just one example of a processor.
  • the processor may be an Application Specific Integrated Circuit (ASIC) or one or more integrated circuits configured to implement embodiments of the present invention.
  • ASIC Application Specific Integrated Circuit
  • The north bridge chip 110 is typically used to process high-speed signals in the computer system 100; specifically, the north bridge chip 110 can be used to handle communication among the CPU, the memory, and the south bridge chip.
  • the north bridge chip 110 is connected to the CPU 105 via a front side bus.
  • the north bridge chip 110 is connected to the DRAM 120 and the PCM 125 via a memory bus. In this manner, both DRAM 120 and PCM 125 are coupled to the memory bus and communicate with CPU 105 via Northbridge chip 110.
  • the north bridge chip 110 can be integrated with the CPU 105.
  • the south bridge chip 115 is responsible for communication between the CPU 105 and external devices.
  • The CPU 105 and the south bridge chip 115 can communicate via a communication bus, such as a Peripheral Component Interconnect Express (PCI-E) bus or a Direct Media Interface (DMI) bus, so that the CPU 105 can control the devices connected to the south bridge chip 115.
  • PCI-E Peripheral Component Interconnect Express
  • DMI Direct Media Interface
  • Such devices include Peripheral Component Interconnect (PCI) interface devices, Universal Serial Bus (USB) interface devices, and Serial Advanced Technology Attachment (SATA) interface devices.
  • PCI Peripheral Component Interconnect
  • USB Universal Serial Bus
  • SATA Serial Advanced Technology Attachment
  • the south bridge chip 115 can be connected to the magnetic disk 130 through a Serial ATA (Serial Advanced Technology Attachment, SATA) interface, so that the CPU 105 can communicate with the magnetic disk 130 through the south bridge chip 115 to implement control of the magnetic disk 130.
  • SATA Serial Advanced Technology Attachment
  • the south bridge chip includes, but is not limited to, an integrated south bridge, such as a platform controller hub (PCH).
  • PCH platform controller hub
  • a dynamic random access memory (DRAM) 120 is connected to the north bridge chip 110 through a memory bus.
  • the DRAM 120 can implement communication with the CPU 105 through the north bridge chip 110.
  • The CPU 105 is capable of accessing the DRAM 120 at high speed and performing a read or write operation on any memory cell in the DRAM 120.
  • the DRAM 120 has the advantage of fast access speed, so DRAM is usually used as main memory.
  • DRAM 120 is typically used to store various operating software, input and output data, and information exchanged with external memory in the operating system. However, DRAM 120 is volatile and the information in DRAM 120 will no longer be saved when the computer is powered off.
  • Those skilled in the art know that DRAM is a kind of volatile memory.
  • other random access memory (RAM) can be used as the memory of the computer system.
  • SRAM static random access memory
  • PCM 125 is a new type of non-volatile memory (Non-Volatile Memory, NVM).
  • NVM Non-Volatile Memory
  • The PCM 125 and the DRAM 120 are collectively used as the memory of the computer system 100. Since the new NVM is byte-addressable and writes data to the non-volatile medium in units of bits, it can be used as memory. Compared with the DRAM 120, the PCM 125 is non-volatile and can therefore better preserve data.
  • a nonvolatile memory that can be used as a memory can be referred to as a storage class memory (SCM).
  • SCM storage class memory
  • The SCM may also include resistive random access memory (RRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), or other non-volatile memories; the specific type of the SCM is not limited in the embodiments of the present invention.
  • RRAM Resistive Random Access Memory
  • MRAM Magnetic Random Access Memory
  • FRAM Ferroelectric Random Access Memory
  • the magnetic disk 130 can be connected to the south bridge chip 115 through an interface such as a Serial Advanced Technology Attachment (SATA) interface or a Small Computer System Interface (SCSI).
  • Disk 130 is used to store data for use as an external storage device of computer system 100.
  • a storage medium as an external storage device needs to have a non-volatile characteristic. When the computer is powered off, the data stored in the external storage is still saved. Moreover, the external storage capacity is large.
  • In addition to the magnetic disk 130, the external storage device may be a solid state drive (SSD), a hard disk drive (HDD), an optical disc, a storage array, or another non-volatile storage device capable of storing data.
  • the computer system shown in Figure 1 is merely an example of a computer system.
  • the CPU 105 can be connected to the memory through the north bridge chip, and the DRAM 120 and the PCM 125 can pass through the double data rate (DDR) bus and the CPU 105. Communicate.
  • The CPU 105 may also be connected to the magnetic disk 130 without passing through the south bridge chip.
  • the CPU 105 can connect to the disk 130 through a Host Bus Adaptor (HBA).
  • the connection form between the internal devices of the specific computer system is not limited, as long as it is a computer system including a non-volatile memory (NVM).
  • the computer system described in the embodiment of the present invention is a computer system including a persistent memory (PM).
  • FIG. 2 is a schematic signaling diagram of the computer system 100 according to an embodiment of the present invention, and FIG. 3 is a flowchart of a method for processing an access request according to an embodiment of the present invention. For convenience of description, FIG. 2 only illustrates the devices involved in access request processing in the computer system 100 shown in FIG. 1.
  • FIG. 3 illustrates, as an example, how computer system 100 processes a write request. It should be noted that the CPU 105 processes the access request by calling data processing logic (not shown in FIG. 2). It will be appreciated that the data processing logic may be a program that implements the access request processing method of the embodiments of the present invention.
  • a file system is a software structure in the operating system that is responsible for managing and storing file information.
  • a file system is a system that organizes and allocates space for file storage devices, is responsible for file storage, and protects and retrieves stored files.
  • a file system consists of three parts: the interface to the file system, a collection of software for manipulating and managing files, and the file data and attributes.
  • the file name can be the full path name of the file, which is a logical description of the location information of the target file on the disk.
  • the file name of the target file can be: D:\FILE\file1.
  • the operating system allocates a file handle for the target file accessed by the process, and maintains an array of file handles inside the process.
  • the file handle can be represented by a number, for example, the file handle can be: fd0, fd1, or fd2.
  • a pointer to the file description information is stored in the file handle array.
  • the file description information contains pointers to information such as the file directory structure, metadata, and access location.
  • the file directory structure is used to describe the logical location of the file.
  • the file directory structure is stored in memory, and the process can locate the location of the target file through the file directory structure. Metadata is data used to describe file data.
  • Metadata includes information about the organization, data fields, and relationships of file data.
  • the access location is used to indicate the starting location of the current access to the process.
  • the access location can be a logical location.
  • the access location information can be 0 to represent access from the start address of the file.
  • the process can also set the file access location via a system call based on the file handle.
  • the access location may be any access location set by the system call, and in the case of sequential read and write, the currently accessed access location is the end location of the previous visit.
  • In the process of reading/writing the target file, the process can find the description information of the target file in the file handle array maintained by the process according to the file handle of the target file, and then find the metadata and the access location of the file through the description information, thereby implementing a read operation or a write operation on the target file.
  • the file handle is a file identifier that identifies the target file in the process of reading/writing the target file by the current process.
  • the file identifier may also be a file descriptor other than the file handle, which is not limited herein, as long as the process can identify the target file through the file identifier and find the description information of the target file.
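The handle-to-description lookup described above can be pictured as an array indexed by the handle number. The following is a minimal illustrative sketch; the struct fields, the array size, and the function names are assumptions for illustration, not the structures of the embodiment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process file-handle bookkeeping: the file handle
 * (fd0, fd1, fd2, ...) indexes an array whose entries point to the
 * file description information. Field names are illustrative. */
struct file_description {
    void     *dir_entry;   /* pointer into the in-memory file directory structure */
    void     *metadata;    /* pointer to the file's metadata                      */
    uint64_t  access_pos;  /* current access location (byte offset in the file)  */
};

#define MAX_FDS 1024

static struct file_description *fd_table[MAX_FDS]; /* per-process handle array */

/* Resolve a file handle to its file description information. */
static struct file_description *lookup_fd(int fd)
{
    if (fd < 0 || fd >= MAX_FDS)
        return NULL;        /* invalid handle */
    return fd_table[fd];
}
```

With such a table, opening a file amounts to filling a free `fd_table` slot, and every subsequent read/write resolves the handle in O(1).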
  • Memory (such as DRAM 120 and PCM 125 in FIG. 1) can be used by the operating system to store running software, input and output data, and information exchanged with the external storage. Therefore, when accessing the target file, the operating system running in the computer system 100 first loads the file data of the target file from the disk 130 into memory.
  • a memory page in memory for caching file data may be referred to as a cache page.
  • The following takes the case in which file data of a target file is loaded into the DRAM 120 as an example. As shown in FIG. 2, both "cache page 1" and "cache page 2" in the DRAM 120 are used to cache the file data 210 of "file 1"; therefore, "cache page 1" and "cache page 2" are both cache pages of "file 1" stored in the disk 130. It can be understood that the file data cached in different cache pages is different.
  • the size of a page is 4k bytes, that is, one page has 4096 bytes.
  • the size of a cache page is also usually 4K bytes.
  • the size of the cache page can also be set to 8 KB or 16 KB, and the size of the cache page is not limited here.
  • the process of processing the access request mainly relates to a process of performing a write operation or a read operation on the target file according to the write request and the read request of the target file after the target file is opened.
  • the CPU 105 can call the data processing logic 1051 to process the write request 200.
  • The write request 200 carries the file identifier, a buffer pointer, and the size of the data to be written.
  • the file identifier is a file handle allocated for the target file when the process opens the target file to be accessed by the write request 200.
  • the process identifies the file description information of the target file according to the file identifier.
  • the buffer area pointer is used to point to a buffer area in which data to be written is cached.
  • the buffer area may be a section of memory space partitioned in DRAM 120 or PCM 125.
  • the size of the data to be written is the length of the buffer area in which the data to be written is cached. For example, the size of the data to be written may be 100 bytes.
  • the CPU 105 acquires an access location based on the file identifier.
  • the access location is used to indicate a start address of the write request 200 to write data in the target file.
  • the CPU 105 may use the file identifier carried in the write request 200 as an index to find the description information of the target file through the file handle array maintained by the process, and The access location in the target file to be accessed by the write request 200 is found in the description information of the target file.
  • the access location is a starting address of the data written by the write request 200 in the target file.
  • the access location may be a logical access location.
  • the access location can be the 89th byte of the first file.
  • In step 310, the CPU 105 determines the N target cache pages and the log data slice log i (x, y) corresponding to each target cache page OCP i among the N target cache pages according to the access location, the size of the data to be written, and the size of a cache page. Specifically, the CPU 105 may calculate the logical page numbers of the target cache pages to be accessed by the write request 200 according to the access location, the size of the data to be written, and the size of the cache page, so that the target cache pages to be accessed by the write request 200 can be determined based on the calculated logical page numbers.
  • a cache page is a memory page in memory that is used to cache file data. Therefore, in the embodiment of the present invention, the target cache page is a memory page in the memory for buffering the file data modified by the data to be written in the target file.
  • The size of the data to be written is 212 bytes; that is, the write request 200 writes 212 bytes into the first file starting from the 89th byte of the first file.
  • a cache page size of 100 bytes is taken as an example for description.
  • Bytes 0-99 of the first file constitute page 1 (P1) of the first file, bytes 100-199 of the first file constitute page 2 (P2), bytes 200-299 constitute page 3 (P3), bytes 300-399 constitute page 4 (P4), and so on.
  • The CPU 105 can calculate, according to the access location, the size of the data to be written, and the size of the cache page, that the write request is to access page 1 to page 4 of the first file; that is, pages 1 to 4 of the first file are determined as the target cache pages, and the value of i ranges from 1 to 4.
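The page calculation above is a simple division on byte offsets. As a minimal sketch (0-based page numbers internally, while the description above numbers pages from 1; names are illustrative assumptions):

```c
#include <stdint.h>

/* Sketch of deriving the N target cache pages from the access location,
 * the size of the data to be written, and the cache page size. */
static void target_pages(uint64_t pos, uint64_t len, uint64_t page_size,
                         uint64_t *first, uint64_t *n)
{
    *first = pos / page_size;                    /* first logical page touched */
    uint64_t last = (pos + len - 1) / page_size; /* last logical page touched  */
    *n = last - *first + 1;                      /* N target cache pages       */
}
```

For the worked example (access location 89, 212 bytes to write, 100-byte pages), this yields the first logical page 0 (page 1 in the 1-based numbering above) and N = 4.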
  • The CPU 105 may determine the four pieces of data to be written to page 1 through page 4 of the first file: log 1 (89, 11), log 2 (0, 100), log 3 (0, 100), and log 4 (0, 1). Specifically, the CPU 105 determines the log data slice log 1 (89, 11) to be written to page 1, the log data slice log 2 (0, 100) to be written to page 2, the log data slice log 3 (0, 100) to be written to page 3, and the log data slice log 4 (0, 1) to be written to page 4. Here, log 1 (89, 11) indicates the 11 bytes starting from the 89th byte of page 1, log 2 (0, 100) indicates the 100 bytes starting from the 0th byte of page 2, log 3 (0, 100) indicates the 100 bytes starting from the 0th byte of page 3, and log 4 (0, 1) indicates the 1 byte starting from the 0th byte of page 4.
  • A log data slice refers to the set of data to be written to one target cache page; in other words, a log data slice is the modified data of that target cache page.
  • the CPU 105 determines the location information of the log data slice log i (x, y) corresponding to each target cache page OCP i .
  • Specifically, the CPU 105 may divide the data to be written cached in the buffer into four parts according to the size of the data slice to be written to each target cache page, thereby obtaining the location information of the log data slice corresponding to each target cache page.
  • The location information of a log data slice refers to the location, in the buffer pointed to by the buffer pointer carried in the write request 200, of the data to be written to each target cache page. For example, the CPU 105 can divide the data to be written cached in the buffer into four parts according to the information of the data slices to be written to the four pages: buf 1 (0, 11), buf 2 (11, 100), buf 3 (111, 100), and buf 4 (211, 1), thereby obtaining the location information of each log data slice.
  • buf 1 (0, 11) indicates that the data of log 1 (89, 11) is the 11 bytes starting from the 0th byte in the buffer; buf 2 (11, 100) indicates that the data of log 2 (0, 100) is the 100 bytes starting from the 11th byte in the buffer; buf 3 (111, 100) indicates that the data of log 3 (0, 100) is the 100 bytes starting from the 111th byte in the buffer; and buf 4 (211, 1) indicates that the data of log 4 (0, 1) is the 1 byte starting from the 211st byte in the buffer.
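The split of one write into per-page slices log i (x, y) and matching buffer segments buf i can be sketched as a single loop over the write range. This is an illustrative implementation under assumed names, not the embodiment's code:

```c
#include <stdint.h>

/* Split a write (pos, len) into per-page log data slices.
 * x[i]    = intra-page offset of slice i,
 * y[i]    = length of slice i,
 * boff[i] = offset of slice i inside the write buffer.
 * Returns the number of slices (N). */
static int split_write(uint64_t pos, uint64_t len, uint64_t page_size,
                       uint64_t x[], uint64_t y[], uint64_t boff[])
{
    int n = 0;
    uint64_t buf_off = 0;
    while (len > 0) {
        uint64_t in_page   = pos % page_size;         /* x: start within the page */
        uint64_t room      = page_size - in_page;     /* bytes left on this page  */
        uint64_t slice_len = len < room ? len : room; /* y: slice length          */

        x[n] = in_page;
        y[n] = slice_len;
        boff[n] = buf_off;   /* buf_i: where the slice sits in the buffer */
        n++;

        pos     += slice_len;
        buf_off += slice_len;
        len     -= slice_len;
    }
    return n;
}
```

For pos = 89, len = 212, page_size = 100, this reproduces the four slices of the example: log 1 (89, 11)/buf 1 (0, 11), log 2 (0, 100)/buf 2 (11, 100), log 3 (0, 100)/buf 3 (111, 100), and log 4 (0, 1)/buf 4 (211, 1).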
  • In practical applications, the target cache page to be accessed by the write request 200 may be one cache page or multiple cache pages; that is, the value of N may be any integer not less than 1. In other words, the data to be written carried in an access request may be written to only one page, or may need to be written to multiple pages. The above description takes writing to multiple target cache pages as an example.
  • When the access location is the 89th byte of the first file and the size of one page is 100 bytes, if the size of the data to be written is 5 bytes, the CPU 105 needs to write 5 bytes into the first file starting from the 89th byte of the first file according to the write request 200. In this case, the CPU 105 can calculate, according to the access location, the size of the data to be written, and the size of the cache page, that the log data slice written to page 1 is log 1 (89, 5). Further, the CPU 105 can obtain the location information of the log data slice written to the target cache page: buf(0, 5), that is, the 5 bytes starting from the 0th byte in the buffer pointed to by the buffer pointer carried in the write request 200.
  • In step 320, the CPU 105 determines whether the PCM 125 stores a log chain of the target cache page OCP i, where the log chain of the target cache page OCP i is used to record information of at least one modification of the target cache page OCP i. If the log chain of the target cache page OCP i is not stored in the PCM 125, the method proceeds to step 325; otherwise, the method proceeds to step 330.
  • the CPU 105 can also obtain the metadata information of the target file according to the file identifier carried in the write request 200.
  • the metadata information of the target file includes cache page structure information of the target file.
  • The CPU 105 may obtain the cache page structures of the N target cache pages from the metadata information of the target file. Further, it can determine, based on the information recorded in a cache page structure, whether the log chain of the target cache page OCP i is stored in the PCM 125.
  • FIG. 5 is a schematic diagram of a cache page structure and a log chain structure according to an embodiment of the present invention.
  • a cache page may be cached in the DRAM 120, and a log chain of the cache page may be stored in the PCM 125.
  • The cache page structures of a plurality of cache pages are illustrated by the cache page structure 405 in FIG. 5.
  • the cache page structure is used to describe the metadata information of the cache page.
  • each cache page has a corresponding cache page structure, and each cache page structure further includes information of a log chain log chain of the cache page. Specifically, the following fields are maintained in each cache page structure.
  • Log head: points to the first address of the log chain of the cache page, where the log chain is stored in the PCM 125. The first address of the log chain of the cache page may include the inode of the file to which the cache page belongs and the logical page number of the cache page, where the inode of the file is used to determine the file to which the cache page belongs, and the logical page number is used to determine the cache page.
  • Log tail: points to the first address of the last data node in the log chain of the cache page.
  • the log chain of each cache page consists of data nodes dynamically generated during at least one modification of the cache page.
  • a data node is used to record information of a log data slice of the cache page in a modification process, wherein the log data slice refers to the modified data of the cache page during a modification process.
  • Each data node includes a data field storing a log data piece, a pointer field storing other data node addresses, and other information fields, and the other information fields may be used to store other information such as the address of the data node.
  • Dirty: indicates whether there is dirty data in the cache page. In other words, "dirty" indicates whether the cache page is synchronized with the file data on the disk. For example, when the "dirty" indication bit is 1, there is dirty data in the cache page, meaning that the data in the cache page is inconsistent with the file data on the disk; when the "dirty" indication bit is 0, the data in the cache page is consistent with the file data on the disk.
  • Log dirty: indicates whether the cache page is synchronized with the log data slices indicated by the data nodes of the cache page's log chain. For example, when the "log dirty" field is 1, the log data slices indicated by the data nodes of the cache page's log chain contain new data, and the data in the data nodes is inconsistent with the data in the cache page; when the "log dirty" field is 0, the log data slices indicated by the data nodes of the cache page's log chain are consistent with the data in the cache page. In other words, when the "log dirty" field is 1, the log data slices indicated by the data nodes of the cache page's log chain have not yet been updated into the cache page; when the "log dirty" field is 0, they have been updated into the cache page.
  • In practical applications, the CPU 105 may obtain the cache page structure of the target cache page from the metadata information of the target file, so that the CPU 105 can determine, according to the "log head" indication bit or the "log tail" indication bit in the cache page structure of the target cache page, whether a log chain of the target cache page is stored in the PCM 125. Specifically, when the CPU 105 determines that the "log head" or "log tail" in the cache page structure of the target cache page OCP i is empty, it can determine that the target cache page OCP i has not been modified and has no log chain; when the "log head" field in the cache page structure of the target cache page OCP i is not empty, the CPU 105 can find the log chain of the target cache page OCP i according to the address pointer recorded in the "log head" field.
  • For example, when the log head in the cache page structure of the first cache page is empty, it indicates that the first cache page has no log chain; when the log head in the cache page structure of the first cache page contains an address, it indicates that the log chain of the first cache page is stored in the PCM 125.
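The cache page structure fields described above, and the emptiness check on the log head, can be sketched as follows. The layout and names are illustrative assumptions; the embodiment does not fix an exact in-memory layout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sketch of the cache page structure fields described above. */
struct cache_page_struct {
    void    *log_head;   /* first address of the page's log chain in PCM, or NULL */
    void    *log_tail;   /* first address of the last data node, or NULL          */
    unsigned dirty : 1;      /* cache page out of sync with file data on disk     */
    unsigned log_dirty : 1;  /* log data slices not yet applied to the cache page */
};

/* A target cache page has a log chain stored in PCM iff its "log head"
 * (equivalently "log tail") field is non-empty. */
static bool has_log_chain(const struct cache_page_struct *cps)
{
    return cps->log_head != NULL;
}
```

Step 320 then reduces to `has_log_chain(...)`: false routes to step 325 (create a log chain), true routes to step 330 (insert a data node).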
  • In step 325, the CPU 105 creates a log chain for the target cache page OCP i in the PCM 125. Specifically, when the CPU 105 determines in step 320, according to the information in the cache page structure of the target cache page, that the log chain of the target cache page OCP i is not stored in the PCM 125, the CPU 105 may create a log chain for the target cache page in the PCM 125.
  • the physical space can be allocated in the PCM 125 according to the size of the data to be written, and the data structure of the log chain is initialized in the allocated physical space.
  • the PCM 125 stores a log chain of each updated cache page.
  • each updated cache page has a log chain.
  • the log chain is used to record at least one modification of the cache page.
  • a global log chain structure may be created for the file system, as shown in 410 of FIG.
  • the global log chain structure 410 contains a log chain of multiple cache pages.
  • the log chain of each cache page can be seen as a node or sub-log chain in the global log chain.
  • the log chain structure 410 can include control information 4100 and a log chain for each cache page.
  • The control information 4100 includes a global log head (Global log head) pointer and a global log tail (Global log tail) pointer.
  • the Global log head pointer is used to point to the header of the log chain of the first cache page in the global log chain structure 410. Specifically, the global log head pointer is used to point to the first address of the global log chain structure in the PCM 125.
  • the Global log tail pointer is used to point to the first address of the log chain of the last cache page in the global log chain structure 410.
  • the first address of the log chain of the cache page is the cache page address shown in FIG. 5, and the cache page address may include the inode of the file to which the cache page belongs and the logical page number of the cache page. .
  • the log chain of each cache page is composed of data nodes formed during at least one modification of the cache page.
  • the data node contains information such as log data slices and pointers to other data nodes.
  • the information of the log data slice may include a storage address of a log data slice or a log data slice in the PCM 125.
  • It should be noted that, after the log chain of a cache page is newly created, the global log tail pointer in the control information of the global log chain needs to point to the first address of the log chain structure of the newly created cache page. In this way, the log chains of newly created cache pages are mounted in the global log chain of the file system in order of creation time, so that, during the recovery process after the computer system fails, the data written to the computer system can be recovered according to the log chain of each cache page in the global log chain, thereby maintaining data consistency and facilitating system management.
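Mounting newly created per-page log chains at the tail of the global log chain, as described above, is an ordinary tail-append on a singly linked list. A minimal sketch under assumed names (the real chain lives in PCM and carries more fields):

```c
#include <stddef.h>

/* One cache page's log chain, viewed as a node of the global log chain. */
struct page_log_chain {
    struct page_log_chain *next_page; /* next modified cache page's chain */
    /* ... cache page address (file inode + logical page number), data nodes ... */
};

/* Global log chain control information. */
struct global_log_ctrl {
    struct page_log_chain *global_log_head; /* first per-page chain */
    struct page_log_chain *global_log_tail; /* last per-page chain  */
};

/* Mount a newly created page log chain at the tail (creation-time order). */
static void append_page_chain(struct global_log_ctrl *g, struct page_log_chain *c)
{
    c->next_page = NULL;
    if (g->global_log_tail == NULL) {       /* first chain in the system */
        g->global_log_head = c;
        g->global_log_tail = c;
    } else {
        g->global_log_tail->next_page = c;  /* mount after the current tail */
        g->global_log_tail = c;
    }
}
```

Because chains are appended in creation order, recovery can walk from `global_log_head` and replay every page's modifications in a well-defined sequence.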
  • the log chain 4105 of the first cache page includes a first data node 41051 and a second data node 41052.
  • The first data node 41051 and the second data node 41052 may each include information of a log data slice, pointers to other data nodes, logical position information of the log data slice, and the like. It can be understood that one log data slice is obtained each time the first cache page is modified.
  • The log data slices produced while modifying the first cache page may be recorded in the log chain 4105 of the first cache page. For example, the first log data slice obtained during the first modification of the first cache page may be stored in the first data node 41051, and the second log data slice obtained during the second modification of the first cache page may be stored in the second data node 41052.
  • the "Log Data Slice” field in the log chain is used to record information of the current modified data of the cache page.
  • the "log data slice” is used to indicate the first modification data of the first cache page
  • the "log data slice” in the second data node 41052 is used to indicate the second modification data of the first cache page.
  • In one case, the modified data can be directly recorded in the "log data slice" portion; in another case, the modified data can be stored in other storage space in the PCM 125, and the address of the modified data is then recorded in the "log data slice" portion. The manner of storing data in the log chain structure is not limited, as long as the multiple pieces of modified data of the cache page can be found according to the log chain.
  • each log data piece may be sequentially recorded in the modified order of the target cache page.
  • each data node in the log chain contains information of pointers to other data nodes.
  • a pointer to another data node may include the following fields: a previous log address, a next log address, and the like.
  • the "previous log address” is used to indicate the address of the previous data node.
  • the "previous log address” is used to indicate the starting address of the previous data node in the SCM. For example, as shown in FIG.
  • the "previous log address” can point to the "intra-page bias” in the log data slice in the previous data node. Move field.
  • The previous data node is the data node inserted immediately before this data node, and indicates the information of the data modified in the previous modification of the target cache page.
  • the "next log address” is used to indicate the address of the next data node.
  • the specific "next log address” is used to indicate the starting address of the next data node in the SCM. For example, as shown in FIG. 5, the "next log address” may point to the "intra page offset” field in the log data slice in the next data node.
  • the next data node is the next inserted data node of the data node, and is used to indicate the information of the modified data of the target cache page in the next modification process.
  • the "previous log address" field in the first data node in the log chain structure of a cache page is empty.
  • the first data node 41051 in FIG. 4 is the first data node in the log chain 4105 of the first cache page. Therefore, in the first data node 41051, the "previous log address" field is empty.
  • the "next log address” field in the last data node in the log chain of a cached page is empty.
  • In other words, when the "next log address" field in a data node is empty, it indicates that the data node records the last modification of the cache page corresponding to the data node.
  • In order to record specific information of the log data slice in the target cache page, each data node further includes in-page position information of the log data slice.
  • The in-page position information of the log data slice may include information such as the intra-page offset and the log data length. The in-page position of a log data slice refers to the location of the log data slice in the target cache page.
  • the “intra-page offset” is used to indicate the starting position of the log data slice in the cache page.
  • the "log data length” is used to indicate the length information of the log data slice.
  • In order to establish a link between the log chain of a cache page and the cache page itself, in the log chain of each cache page, the first data node further includes "cache page address" information. The "cache page address" information may include a file inode and a logical page number, where the file inode is used to indicate the file to which the log chain belongs, and the logical page number is used to indicate the cache page to which the log chain belongs.
  • the "cache page address" field in the first data node 41051 of the first cache page includes the inode of the file to which the first cache page belongs and the logical page number of the first cache page.
  • In practical applications, the first data node of each cache page further includes "next page" pointer information, where the "next page" pointer is used to point to the first data node in the log chain of the next modified cache page in the file system. According to the "next page" pointer, the log chain of the next modified cache page in the file system can be found.
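Collecting the fields described above, one data node of a page's log chain can be sketched as a single struct. The layout is an illustrative assumption (in the embodiment the slice may be stored inline or by its PCM address, and the page-address fields are only meaningful in the first node of a chain):

```c
#include <stdint.h>

/* Illustrative sketch of one data node in a cache page's log chain. */
struct data_node {
    /* information of the log data slice */
    void     *log_data;        /* the slice itself, or its address in PCM  */
    uint32_t  intra_page_off;  /* start position of the slice in the page  */
    uint32_t  log_data_len;    /* length of the slice                      */

    /* pointers to other data nodes */
    struct data_node *prev_log;  /* previous modification; NULL in first node */
    struct data_node *next_log;  /* next modification; NULL in last node      */

    /* only meaningful in the first data node of a page's chain */
    uint64_t  file_inode;        /* file to which the chain belongs           */
    uint64_t  logical_page_no;   /* cache page to which the chain belongs     */
    struct data_node *next_page; /* first node of the next modified page      */
};
```

For the worked example, the first data node of the first cache page would carry `intra_page_off = 89` and `log_data_len = 11` for the slice log 1 (89, 11).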
  • In step 330, the CPU 105 inserts a data node into the log chain of the target cache page OCP i, the inserted data node containing the information of the log data slice log i (x, y).
  • After the CPU 105 creates the log chain for the target cache page OCP i in step 325, the method may proceed to step 330, so that the CPU 105 inserts a data node into the created log chain and records the information of the log data slice of this modification. When the log chain of the target cache page OCP i already exists, the CPU 105 may insert a data node into the existing log chain and record the information of the log data slice of this modification.
  • the information of the log data slice may specifically include a storage address of the log data slice or the log data slice in the PCM 125.
  • the information of the log data piece may also include information such as a pointer to other data nodes and a location of the log data piece. For example, after the CPU 105 creates a log chain structure for the first cache page, the information of the first log data slice may be recorded in the log chain structure of the first cache page.
  • The data of the first log data slice may be directly recorded in the "log data slice" field in the first data node 41051, or the storage address of the first log data slice in the PCM 125 may be recorded in the "log data slice" field in the first data node 41051; this is not limited herein.
  • information such as the position and length of the first log data piece and the pointer to other data nodes may also be recorded in the first data node 41051.
  • Specifically, the inode of the first file and the logical page number of the first cache page may be recorded in the "cache page address" field of the first data node 41051, 89 may be recorded in the "intra-page offset" field of the first data node 41051, 11 may be recorded in the "log data length" field, and either the data of buf 1 (0, 11) may be recorded in the "log data slice" field or the storage address of the data slice log 1 (89, 11) in the PCM 125 may be recorded in the "log data slice" field.
  • each data node in the log chain 4105 of the first cache page can also be dynamically generated and inserted.
  • The first data node 41051 is the first data node of the first cache page, so when the first data node 41051 is created, the "previous log address" and the "next log address" in the first data node 41051 are empty. After the second data node 41052 is created, the pointer of the "next log address" in the first data node 41051 may be updated so that it points to the start address of the second data node 41052 in the PCM 125.
  • the log tail pointer in the first cache page structure needs to be updated to point to the start address of the second data node 41052 in the PCM 125.
  • Specifically, the pointer of the "next log address" in the first data node 41051 can be pointed to the "intra-page offset" field in the second data node 41052, and the log tail pointer in the first cache page structure is updated to point to the "intra-page offset" field in the second data node 41052, where the "intra-page offset" is used to indicate the location, in the first cache page, of the log data slice of the second data node 41052.
  • When the CPU 105 determines in step 320 that the log chain of the target cache page OCP i is stored in the PCM 125, in this step the CPU 105 may insert a data node at the tail of the existing log chain of the target cache page OCP i and record the information of the log data slice of this modification. For example, when the CPU 105 determines in step 320 that the log tail field in the first cache page structure is not empty, it may determine that the log chain structure of the first cache page is stored in the PCM 125. In other words, when the log tail field in the first cache page structure is not empty, it indicates that the first cache page was modified before this modification.
  • the CPU 105 can find the last data node in the log chain 4105 of the first cache page based on the log tail field in the first cache page structure.
  • The last data node in the log chain of the first cache page stores the information of the most recent modification; in other words, it stores the latest modified version of the first cache page.
  • In this case, the CPU 105 may append a new data node after the last data node and store the information of the data slice log i (x, y) in the appended new data node. Taking the case in which the last data node in the log chain of the first cache page is the first data node 41051 and the new data node is the second data node 41052 as an example, the CPU 105 may store the information of the data slice log i (x, y) in the second data node 41052.
  • the information of the data slice log i (x, y) may include a data slice log i (x, y), a log data length, an intra-page offset, and pointer information to other data nodes.
  • Further, the pointer of the "next log address" in the first data node 41051, which is pointed to by the log tail pointer of the cache page structure, may be pointed to the start address of the second data node 41052 in the PCM 125.
  • The manner provided in the embodiment of the present invention of sequentially recording the modified data of the target cache page in the log chain in modification order makes it easy to identify the different updated versions of the target cache page through the order of the data nodes in the log chain. In practical applications, when inserting data nodes sequentially, the data nodes may be inserted in order from the head to the tail of the log chain, or in order from the tail to the head. The specific insertion order is not limited, as long as the update order of the target cache page can be identified according to the data nodes in the log chain.
  • After the data to be written has been written into the PCM 125, the CPU 105 may return a write success message to the application. The write success message indicates that the data to be written has been successfully written into the storage device, thereby reducing the processing latency of the access request.
  • In the access request processing method provided in this embodiment, when the CPU 105 needs to modify the data of a file according to an access request, the CPU 105 does not write the modified data directly into the target cache page of the file; instead, it writes the modified data into the space of the PCM 125 and records the information of each piece of modified data of the target cache page in the form of a log chain. Because the PCM 125 is non-volatile, and the write data is stored in the PCM 125 using the log chain recording method, the modified data produced by multiple modifications of the target cache page is recorded in chronological order. This makes it easy to identify the version relationship of the log data slices and ensures consistency between the stored data and the written data.
  • Compared with the prior art, which maintains data consistency by maintaining different states for each memory block, the access request processing method provided in FIG. 3 incurs less overhead, because state maintenance is expensive for the system relative to the write-update process itself. Therefore, the computer system 100 provided by the present invention has a small system overhead during access request processing.
  • In addition, because the size of a log data slice in this embodiment of the present invention can be smaller than a page, file modifications at a granularity finer than a page can be supported, making the modification manner more flexible.
  • In this embodiment, in some cases a merge operation needs to be performed to update the log data slices in the PCM 125 into the corresponding cache pages of the DRAM 120. For example, when the updated cache page in the DRAM 120 is to be written into the disk 130 to update the file data in the disk, the log data slices in the PCM 125 must first be merged into the cache page. For another example, when the computer system is recovered after a failure, the data can be written back and restored according to the log chains in the PCM 125 so that the written data is not lost and data consistency is maintained; in this case the log data slices in the PCM 125 also need to be updated into the cache pages of the DRAM 120, and the updated cache pages then written into the disk 130. The specific conditions that trigger the merge operation are not limited here. The process of updating log data slices into a cache page of the DRAM 120 is shown in FIG. 6.
  • FIG. 6 shows a data merging method according to an embodiment of the present invention. It can be understood that, for each cache page having a log chain, the merging operation can be performed according to the method shown in FIG. 6. For convenience of description, the log chain of any one target cache page OCP_i described in FIG. 3 is still taken as an example.
  • In step 600, the CPU 105 determines the valid data in the log chain of the target cache page OCP_i.
  • the valid data is the latest modified data of the target cache page.
  • Specifically, the CPU 105 may determine the valid data in the log chain of the target cache page OCP_i according to the information of the at least one log data slice recorded in the data nodes of the log chain of the target cache page OCP_i.
  • Specifically, the CPU 105 may determine the valid data in the log chain according to the update order of the data nodes in the log chain of the target cache page OCP_i and the in-page location information of the log data slices. The data nodes in the log chain are obtained sequentially in the order of the modification times of the cache page, and the in-page position of a log data slice can be obtained from the "intra-page offset" and "data length" fields in the data node.
  • In one case, the in-page positions of the log data slices in the data nodes of the log chain do not overlap. In this case, the CPU 105 determines that the log data slices in all the data nodes of the log chain are valid data.
  • For example, if the address of the log data slice in the first data node 41051 is the 30th-50th bytes and the address of the log data slice in the second data node 41052 is the 60th-80th bytes, the two slices do not overlap, and the CPU 105 determines that both the log data slice of the first data node and the log data slice of the second data node are valid data.
  • In another case, the in-page positions of the log data slices in the data nodes of the log chain overlap. For the overlapping portion, the CPU 105 determines that the data of the overlapping portion contained in the later-generated data node in the log chain is valid data; for the non-overlapping portions of the at least two log data slices, the CPU 105 determines that the data of each non-overlapping portion is valid data. In other words, the CPU 105 determines that all of the data in the later-generated data node, together with the non-overlapping portion of the earlier-generated data node, is valid data. For example, take the log chain of the first cache page in FIG. 4 as an example.
  • If the log data slice in the first data node starts at the 30th byte and overlaps, from the 50th byte onward, the later-generated log data slice in the second data node, which ends at the 90th byte, the CPU 105 determines that the 30th-49th bytes in the first log data slice and the 50th-90th bytes in the second log data slice are valid data.
  • The CPU 105 then updates the valid data into the target cache page OCP_i to obtain the updated target cache page OCP_i′.
  • Specifically, the CPU 105 may replace the data at the same in-page positions in the target cache page OCP_i with the determined valid data in the log chain. For example, if the CPU 105 determines in step 600 that the valid data in the log chain of the first cache page occupies the 30th-90th bytes, the CPU 105 may replace the data of the 30th-90th bytes in the first cache page with that valid data, thereby obtaining the updated first cache page.
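The valid-data rule and the replacement step above can be sketched by applying the log data slices to the page buffer in data-node (chronological) order: because a later slice overwrites any earlier bytes it overlaps, the overlap rule falls out naturally. This is a minimal sketch under assumed names; the 100-byte page and the slice addresses in the usage assertions are illustrative.

```python
def merge_log_chain(page: bytearray, nodes):
    """Apply log data slices to a cache page in node (chronological) order.

    `nodes` is a list of (intra_page_offset, slice_bytes) pairs, ordered
    from the earliest to the latest modification. Applying them in this
    order means the overlapping portion of an earlier slice is
    overwritten by the later slice, matching the valid-data rule above.
    """
    for offset, data in nodes:
        page[offset:offset + len(data)] = data
    return page
```

For example, with a first slice covering the 30th-70th bytes and a later slice covering the 50th-90th bytes, the merged page keeps the 30th-49th bytes of the first slice and all of the second slice.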
  • In practical applications, when the storage space of the PCM 125 needs to be reclaimed, or when the file data in the disk needs to be updated, the CPU 105 may update the log data slices stored in the PCM 125 into the corresponding cache pages according to the merging method shown in FIG. 6, and then update the file data in the disk according to the merged cache pages.
  • After the log data slices of a cache page have been merged into the cache page, the CPU 105 can delete the log chain of the cache page to release the storage space of the PCM 125, saving system resources.
  • In practical applications, the CPU 105 may determine whether the data of each cache page needs to be flushed back to the disk 130 through the "dirty" field in the cache page structure of the cache page. Taking one cache page as an example, when the "dirty" field is 1, the CPU 105 determines that the data of the cache page needs to be flushed back to the disk 130; when the "dirty" field is 0, the CPU 105 determines that the data of the cache page does not need to be flushed back to the disk 130.
  • When the CPU 105 determines that the data of the cache page needs to be flushed back to the disk 130, the CPU 105 further needs to determine, according to the "log dirty" field in the cache page structure of the cache page, whether the log data slices in the PCM 125 need to be updated into the cache page. For example, when the "log dirty" field is 1, it indicates that the PCM 125 contains newly modified data of the cache page, and the CPU 105 needs to first update the log data slices in the PCM 125 into the cache page in the DRAM 120 and then flush the data of the updated cache page back to the disk 130.
  • When the "log dirty" field is 0, it indicates that the log data slices in the log chain of the cache page have already been updated into the cache page and that the PCM 125 does not contain newly modified data of the cache page; in this case, the CPU 105 can directly flush the data of the cache page back to the disk 130.
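The flush-back decision described by the "dirty" and "log dirty" fields can be sketched as follows. All structure and variable names here are assumptions for illustration; the PCM, DRAM, and disk are modeled as plain Python containers.

```python
def flush_cache_page(page_struct, pcm, dram_page, disk):
    """Decide whether and how to flush a cache page back to disk.

    page_struct: dict with 'page_id', 'dirty', 'log_dirty' fields.
    pcm: maps page_id -> ordered list of (offset, slice_bytes).
    dram_page: bytearray holding the cache page in DRAM.
    disk: dict standing in for the external storage device.
    """
    if not page_struct["dirty"]:
        return "no flush needed"            # page already matches the disk
    if page_struct["log_dirty"]:
        # PCM holds newer modifications: merge the log data slices into
        # the DRAM cache page before flushing.
        for offset, data in pcm.get(page_struct["page_id"], []):
            dram_page[offset:offset + len(data)] = data
        page_struct["log_dirty"] = 0
    disk[page_struct["page_id"]] = bytes(dram_page)   # flush back to disk
    page_struct["dirty"] = 0
    return "flushed"
```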
  • After the data to be written is written into the PCM 125, a write success message can be returned to the application. Because a log data slice in the log chain need not be page-granularity modified data, the access request processing method of the present invention can support small-granularity modifications of a file. Moreover, in this embodiment of the present invention, after the data is written into the PCM 125, the modified data in the PCM 125 is not immediately written into the disk 130; instead, when certain conditions are met, the log data slices stored in the PCM 125 are merged into the corresponding cache pages, and the file data in the disk is then updated according to the merged cache pages. Compared with the existing write-ahead logging (WAL) and copy-on-write methods of maintaining data consistency, this method of delaying the merging of data and writing the merged data to the disk can reduce the write amplification of the system.
  • In addition, when the computer system recovers after a failure such as a power outage or crash, the data can be written back and restored according to the log chains in the PCM 125, ensuring that the written data is not lost and thus maintaining data consistency.
  • Specifically, the CPU 105 can sequentially perform data recovery on each cache page that has a log chain according to the global log head pointer in the global log chain control information in the PCM 125. For each such cache page, the CPU 105 can traverse each log data slice in the log chain of the cache page, determine the valid data in the log chain according to the method shown in FIG. 6, update the valid data into the cache page, and then write the updated cache page data into the disk 130. In this way, it is guaranteed that the written data is not lost.
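The recovery walk described above can be sketched as follows: starting from the global log head, visit each per-page log chain, merge its slices into the page, and write the page back to disk. The dict-based chain structure and all names are assumptions for illustration.

```python
def recover(global_log_head, disk):
    """Post-failure recovery sketch: walk the global log chain, merge
    each page's log data slices into the page in chronological order,
    then write the restored page back to disk."""
    chain = global_log_head
    while chain is not None:
        # Start from the on-disk copy of the page (zeros if absent).
        page = bytearray(disk.get(chain["page_id"], bytes(chain["page_size"])))
        for offset, data in chain["nodes"]:       # chronological order
            page[offset:offset + len(data)] = data
        disk[chain["page_id"]] = bytes(page)       # written data survives
        chain = chain["next_chain"]
    return disk
```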
  • FIG. 2 and FIG. 3 describe the access request processing method provided by the embodiments of the present invention from the perspective of writing data. The following further describes the access request processing method provided by the embodiments of the present invention from the perspective of reading data.
  • FIG. 7 is a schematic diagram of still another signaling process of the computer system 100 according to an embodiment of the present invention; it illustrates the signaling interaction of the devices in the computer system 100 shown in FIG. 1.
  • FIG. 8 is a flowchart of still another access request processing method according to an embodiment of the present invention; it illustrates the process from the perspective of reading data.
  • The method shown in FIG. 8 can be implemented by the CPU 105 in the computer system 100 shown in FIG. 7 by calling the data processing logic 1051. How data is read from the computer system 100 provided by this embodiment of the present invention is described in detail below with reference to FIGS. 7 and 8. As shown in FIG. 8, the method can include the following steps.
  • the CPU 105 receives the read request 700.
  • the CPU 105 can call the data processing logic 1051 to process the read request 700.
  • the read request carries a file identifier and a size of data to be read.
  • The file identifier carried in the read request 700 may be a file handle of the target file to be accessed, or may be another file descriptor other than a file handle. This is not limited here, as long as the process can identify the target file through the file identifier and find the description information of the target file.
  • In this embodiment, the file identifier carried in the read request 700 may be the file identifier of a second file in the disk 130.
  • It should be noted that the terms "first file" and "second file" in the embodiments of the present invention merely distinguish the files accessed in different access processes and do not limit the specific files. In this way, the first file and the second file may be the same file or different files.
  • In step 805, the CPU 105 obtains an access location according to the file identifier.
  • the access location is used to indicate a start address of data to be read by the read request in the target file.
  • the access location can be a logical access location.
  • The manner of obtaining the access location according to the file identifier carried in the read request 700 is similar to step 305; for details, refer to the description of step 305.
  • In step 810, the CPU 105 determines M target cache pages and the location information of the data to be read in each target cache page OCP_j according to the access location, the size of the data to be read, and the size of a cache page, where j ranges from 1 to M and M is an integer not less than 1.
  • In practical applications, the size of a page is typically 4 KB.
  • the manner in which the CPU 105 determines the M target cache pages is similar to the manner in which the CPU 105 determines the N target cache pages in step 310. For details, refer to the description of step 310.
  • Specifically, the CPU 105 may determine, according to the access location and the size of the data to be read, the M target cache pages and the location information of the data to be read in each target cache page OCP_j.
  • The target cache page OCP_j is a file page of the second file cached in the DRAM. Taking a cache page size of 100 bytes as an example, the CPU 105 may determine that the target cache pages are the second cache page p2 of the second file (containing the 100th-199th bytes of the second file), the third cache page p3 (containing the 200th-299th bytes of the second file), and the fourth cache page p4 (containing the 300th-399th bytes of the second file). Further, the CPU 105 can determine that the location information of the data to be read by the read request 700 is p2(50, 49), p3(0, 100), and p4(0, 61), respectively.
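The mapping from (access location, size) to per-page location information in step 810 can be sketched generically as below. The function name and the figures used in the assertions are illustrative assumptions; they deliberately do not reproduce the document's own example numbers.

```python
def locate_target_pages(access_location, size, page_size=100):
    """Map a byte range of a file onto its cache pages.

    Returns, for each target cache page, a tuple
    (page index, intra-page offset, length) covering the range
    [access_location, access_location + size).
    """
    result = []
    pos = access_location
    end = access_location + size
    while pos < end:
        page = pos // page_size                 # which cache page
        offset = pos % page_size                # start offset within it
        length = min(page_size - offset, end - pos)
        result.append((page, offset, length))
        pos += length
    return result
```

For instance, with 100-byte pages, reading 210 bytes starting at byte 150 spans three pages, with a partial range on the first and last page.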
  • In practical applications, the target cache page to be accessed by the read request 700 may be one cache page or multiple cache pages; that is, the value of M may be any integer not less than 1.
  • For convenience of description, the determined M target cache pages and the location information of the data to be read in each target cache page OCP_j may be collectively referred to as the information 705 of the data to be read.
  • the CPU 105 can read the data buffered in the DRAM 120 based on the determined information 705 of the data to be read.
  • The CPU 105 may then perform the following operations for each target cache page OCP_j. It can be understood that, in practical applications, when the CPU 105 determines only one target cache page in step 810, the CPU 105 performs the following operations only on that one target cache page. When the CPU 105 determines multiple target cache pages in step 810, that is, when the CPU 105 determines according to the read request 700 that the data to be read needs to be read from multiple target cache pages, the CPU 105 may perform the following operations on each target cache page separately. For clarity, the following describes the operations for one target cache page as an example.
  • In step 815, the CPU 105 determines whether the PCM 125 stores a log chain of the target cache page OCP_j, where the log chain of the target cache page OCP_j is used to record information of at least one log data slice of the target cache page OCP_j. As described above, the log chain of the target cache page includes at least one data node, each data node contains the information of one log data slice, and each log data slice is the modified data of the target cache page in one modification process.
  • If the PCM 125 does not store the log chain of the target cache page OCP_j, the method proceeds to step 820; otherwise, the method proceeds to step 825.
  • In practical applications, the CPU 105 may obtain the cache page structures of the M target cache pages from the metadata information of the target file and then determine, according to the information recorded in each cache page structure, whether the log chain structure of the target cache page OCP_j among the M target cache pages is stored in the PCM 125. For the cache page structure and the log chain structure, refer to the description of FIG. 5. The manner of determining, according to the cache page structure of the target cache page OCP_j, whether the log chain of the target cache page OCP_j is stored in the PCM 125 is similar to step 320 in FIG. 3; for details, refer to the description of step 320.
  • In step 820, the CPU 105 reads the data in the target cache page OCP_j from the DRAM according to the location information of the data to be read in the target cache page OCP_j. As described in step 320, for any target cache page, the CPU 105 can determine whether the log chain of the target cache page is stored in the PCM 125 according to the "log head" or "log tail" field in the cache page structure of the target cache page.
  • When the CPU 105 determines, according to the cache page structure of the target cache page OCP_j, that the PCM 125 does not store the log chain of the target cache page OCP_j, it indicates that the data of the target cache page OCP_j has not been modified, so the CPU 105 can directly read the data in the target cache page OCP_j from the DRAM according to the location of the data to be read. As shown in FIG. 7, the CPU 105 can obtain the read data 720 from the cache page in the DRAM 120.
  • In step 825, the CPU 105 obtains the updated target cache page OCP_j′ according to the target cache page OCP_j and the information of the at least one log data slice in the log chain of the target cache page OCP_j.
  • When the CPU 105 determines, according to the cache page structure of the target cache page OCP_j, that the PCM 125 stores the log chain of the target cache page OCP_j, it indicates that the data of the target cache page OCP_j has been modified, so the CPU 105 needs to update the data stored in the log chain in the PCM 125 into the target cache page in the DRAM.
  • Specifically, the log data slices 215 in the log chain of the target cache page OCP_j may be merged into the target cache page to obtain the updated target cache page OCP_j′.
  • Specifically, the CPU 105 determines the valid data in the log chain of the target cache page OCP_j, where the valid data is the latest modified data of the cache page, and then updates the valid data into the target cache page OCP_j to obtain the updated target cache page OCP_j′. For the specific data merging method, refer to the description of FIG. 6.
  • In step 830, the CPU 105 reads the data from the updated target cache page OCP_j′ according to the location information of the data to be read in the target cache page OCP_j.
  • the location information of the data to be read refers to the logical location of the data to be read in the target cache page.
  • For example, the CPU 105 may read the data of the 15th-50th bytes from the updated first cache page. As shown in FIG. 7, the CPU 105 can obtain the read data 720 from the updated target cache page in the DRAM 120.
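Steps 815-830 of the read path can be sketched together: if the page has a log chain in the PCM, merge its slices into a copy of the cache page first (step 825), then read the requested byte range (step 830). All container and parameter names here are illustrative assumptions.

```python
def read_data(dram_pages, pcm_log_chains, page_id, offset, length):
    """Read `length` bytes at intra-page `offset` from a cache page,
    first merging any log data slices the PCM holds for that page.

    dram_pages: maps page_id -> page bytes cached in DRAM.
    pcm_log_chains: maps page_id -> ordered (offset, slice_bytes) list.
    """
    page = bytearray(dram_pages[page_id])
    # Step 825: merge log data slices (chronological order, later wins).
    for slice_offset, data in pcm_log_chains.get(page_id, []):
        page[slice_offset:slice_offset + len(data)] = data
    # Step 830: read from the updated page.
    return bytes(page[offset:offset + length])
```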
  • In addition, when the data to be read is not in the DRAM 120, the operating system first loads the data to be read from the disk into the DRAM 120 and then reads the data from the cache page of the DRAM, so that the reading speed can be improved.
  • Because the to-be-modified data of a cache page is stored in the PCM 125 by means of a log chain, the access request processing method provided by this embodiment of the present invention can support data modifications at a granularity smaller than a page. Moreover, when data is read, the latest cache page data can be obtained according to the log data slices in the data nodes of the log chain, thereby ensuring the accuracy of the read data.
  • FIG. 9 is a schematic structural diagram of an access request processing apparatus according to an embodiment of the present invention.
  • The apparatus can be applied to a computer system including a non-volatile memory (NVM); for example, the apparatus can be applied to the computer system shown in FIG. 1.
  • the access request processing device 90 may include the following modules.
  • the receiving module 900 is configured to receive a write request.
  • The write request carries a file identifier, a buffer area pointer, and the size of the data to be written, where the buffer area pointer points to the buffer area in which the data to be written is cached, and the data to be written is modification data for the target file to be accessed by the write request.
  • the obtaining module 905 is configured to obtain an access location according to the file identifier.
  • the access location indicates a start address at which the write request writes data in the target file.
  • the determining module 910 is configured to determine a target cache page according to the access location, the size of the data to be written, and the size of the cache page.
  • the target cache page is a memory page in the memory for buffering file data modified by the data to be written in the target file.
  • the determining module 910 is further configured to determine a log chain in which the target cache page is stored in the NVM.
  • the log chain of the target cache page includes at least one data node, wherein each data node includes information of modified data of the target cache page in a modification process.
  • the inserting module 915 is configured to insert a new data node in a log chain of the target cache page.
  • the inserted data node includes information of a log data slice of the target cache page.
  • the log data piece is modified data of the target cache page, and the log data piece is at least a part of data to be written obtained from the buffer area according to the buffer area pointer.
  • The information of the log data slice includes the log data slice itself or the storage address of the log data slice in the NVM.
  • The information of the log data slice may further include: an offset of the log data slice in the target cache page, the length of the log data slice, and address information of the data nodes adjacent to the inserted data node.
  • Specifically, the inserting module 915 may insert the new data node at the tail or the head of the log chain of the target cache page. After the new data node is inserted, the log chain of the target cache page includes at least two data nodes that are sequentially linked according to the update order of the target cache page.
  • the access request processing device 90 may further include an update module 920 and a storage module 925.
  • the update module 920 is configured to obtain the updated target cache page according to the information of the at least one log data slice recorded in the log chain of the target cache page.
  • the storage module 925 is configured to store data of the updated target cache page in an external storage device of the computer system.
  • the update module 920 may determine valid data in a log chain of the target cache page according to information of at least one log data slice recorded in a log chain of the target cache page, and update the valid data to the The target cache page is obtained to obtain the updated target cache page.
  • the valid data is the latest modified data of the target cache page.
  • the access request processing device 90 may further include a recycling module 930.
  • the reclaiming module 930 is configured to reclaim the log chain of the target cache page after storing the updated data of the target cache page in the external storage device of the computer system 100.
  • For details about the access request processing device 90 provided by this embodiment of the present invention, refer to the access request processing method described in the foregoing embodiments.
  • FIG. 10 is a schematic structural diagram of an access request processing apparatus according to an embodiment of the present invention.
  • The apparatus can be applied to a computer system including a non-volatile memory (NVM); for example, the apparatus can be applied to the computer system shown in FIG. 1.
  • the access request processing device 10A may include the following modules.
  • the receiving module 1000 is configured to receive a read request.
  • the read request carries a file identifier and a size of data to be read.
  • the obtaining module 1005 is configured to obtain an access location according to the file identifier, where the access location indicates a start address of the read request to read data in the target file.
  • the determining module 1010 is configured to determine location information of the target cache page and the data to be read in the target cache page according to the access location, the size of the data to be read, and the size of the cache page.
  • the target cache page is a memory page in the memory for buffering file data modified by the data to be written in the target file.
  • the determining module 1010 is further configured to determine a log chain in which the target cache page is stored in the NVM.
  • the log chain of the target cache page contains information of at least one log data slice.
  • Each log data slice is modified data of the target cache page in one modification process.
  • The information of the log data slice includes the log data slice itself or the storage address of the log data slice in the NVM.
  • the update module 1015 is configured to obtain the updated target cache page according to the information of the target cache page and the at least one log data piece in the log chain of the target cache page. Specifically, the update module 1015 may determine valid data in a log chain of the target cache page according to information of at least one log data slice recorded in a log chain of the target cache page, and update the valid data to the The target cache page is obtained to obtain the updated target cache page.
  • the valid data is the latest modified data of the target cache page.
  • the reading module 1020 is configured to read data from the updated target cache page according to location information of data to be read in the target cache page.
  • The information of the log data slice may further include: an offset of the log data slice in the target cache page, the length of the log data slice, and address information of the data nodes adjacent to the inserted data node.
  • An embodiment of the present invention further provides a computer program product for implementing the access request processing method, including a computer-readable storage medium storing program code, where the program code includes instructions for executing the method procedure described in any one of the foregoing method embodiments.
  • A person of ordinary skill in the art can understand that the foregoing storage medium includes any non-transitory machine-readable medium capable of storing program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a random access memory (RAM), a solid state disk (SSD), or another non-volatile memory.


Abstract

This application discloses an access request processing method and device, and a computer system. The computer system includes a processor and a non-volatile memory (NVM). In the computer system, when the processor receives a write request, the processor may determine a target cache page according to the write request. After determining that a log chain of the target cache page is stored in the NVM, the processor inserts a new data node into the log chain of the target cache page. The inserted data node contains information about a log data slice of the target cache page, where the information about the log data slice includes the log data slice itself or the storage address of the log data slice in the NVM, and the log data slice is at least a part of the to-be-written data to be written by the write request. The computer system provided by this application can reduce system overhead while protecting data consistency.

Description

Access Request Processing Method and Device, and Computer System

Technical Field
The present invention relates to the field of storage technologies, and in particular, to an access request processing method and device, and a computer system.
Background
In storage systems, write-ahead logging (WAL) is usually used to maintain data consistency. In this approach, all data written to the storage system is first written into a log file on an external storage device (for example, a disk), and the old data is subsequently updated according to the log file. When a failure such as a power outage or crash occurs, the data can be restored from the log, ensuring data consistency. With the development of next-generation non-volatile memory (NVM) technologies, because next-generation NVM features fast read/write speeds and byte addressability, it can be used as system memory. Such storage class memory (SCM) using NVM as the medium is non-volatile and provides a new way to protect data consistency in storage systems.
In one prior-art method of implementing data consistency based on SCM, the cache and the log share the storage space of the SCM. In this implementation, an SCM block can serve as either a cache block or a log block. In the SCM, a block is the basic unit of memory space; typically, the size of a block may be 4 KB. Each block has three state pairs: frozen/normal, dirty/clean, and up-to-date/out-of-date. "Frozen" indicates that the block is a log block, that is, the data in the block can be used as a log. "Normal" indicates that the block is a cache block, that is, the block is used as a cache. "Dirty" indicates that the data stored in the block has been modified. "Clean" indicates that the data stored in the block has not been modified. "Up-to-date" indicates that the data stored in the block is the latest version. "Out-of-date" indicates that the data stored in the block is an old version. When updating data, a block is first allocated in memory for the data, and the state of the block is recorded as (normal, clean, up-to-date). After the data is written into the block, the state of the block is updated to (normal, dirty, up-to-date). A block in the (normal, dirty, up-to-date) state can be read or written directly; that is, data can be read from or written to such a block directly. When the write operation completes and the transaction is committed, the memory block is used as a log block, and its state is changed to (frozen, dirty, up-to-date). When new data is subsequently written to the memory block, its state is changed to (frozen, dirty, out-of-date). A memory block in the (frozen, dirty, out-of-date) state can be written back to the disk. After being written back, the memory block becomes a free block and can be used by new write operations. Compared with the WAL approach, this method of using the SCM as both cache space and log space reduces data write operations, but it must maintain the states of all blocks, imposing a large overhead on the system. Moreover, this method must update data at block granularity; when the updated data is smaller than a block, write amplification occurs, so that the data actually written to the disk is more than the data that needs to be written.
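The prior-art block lifecycle above can be sketched as a small state machine. The state names follow the text; the transition function names (`allocate`, `commit`, and so on) are assumptions made for illustration.

```python
FREE = None  # a block with no state assigned is a free block

def allocate():
    """Allocate a block in memory for an update."""
    return ("normal", "clean", "up-to-date")

def write(state):
    """Data is written into the block."""
    return ("normal", "dirty", "up-to-date")

def commit(state):
    """Transaction commit: the block becomes a log block."""
    return ("frozen", "dirty", "up-to-date")

def overwrite(state):
    """New data for the same location is written elsewhere."""
    return ("frozen", "dirty", "out-of-date")

def write_back(state):
    """Only (frozen, dirty, out-of-date) blocks may be flushed to disk;
    afterwards the block is free again."""
    assert state == ("frozen", "dirty", "out-of-date")
    return FREE
```

The cost the text points out is visible here: every block carries this state triple, and every operation must update it, which is the state-maintenance overhead the invention avoids.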
Summary
The access request processing method and device and the computer system provided in the embodiments of this application can reduce system overhead while protecting data consistency.
According to a first aspect, this application provides an access request processing method. The method may be performed by a computer system. The computer system includes a processor and a non-volatile memory (NVM). In the computer system, when the processor receives a write request carrying a file identifier, a buffer area pointer, and the size of to-be-written data, the processor may obtain an access location according to the file identifier carried in the write request. The buffer area pointer points to a buffer area caching the to-be-written data, the to-be-written data is modification data for a target file to be accessed by the write request, and the access location indicates a start address at which the write request writes data into the target file. Further, the processor may determine a target cache page according to the access location, the size of the to-be-written data, and the size of a cache page. The target cache page is a memory page in memory used to cache the file data in the target file to be modified by the to-be-written data. After determining that a log chain of the target cache page is stored in the NVM, the processor inserts a new data node into the log chain of the target cache page. Each data node in the log chain of the target cache page contains information about modified data of the target cache page in one modification process. The inserted data node contains information about a log data slice of the target cache page, where the information about the log data slice includes the log data slice itself or the storage address of the log data slice in the NVM. The log data slice is modified data of the target cache page and is at least a part of the to-be-written data obtained from the buffer area according to the buffer area pointer.
According to the access request processing method provided in this application, when the processor needs to modify the data of a file according to an access request, the processor does not write the modified data directly into the target cache page of the file; instead, it writes the modified data into the storage space of the NVM and records the information of each modification of the target cache page in the form of a log chain. Because the NVM is non-volatile, and write data is stored in the NVM using the log chain recording method, the modified data produced by multiple modifications of the target cache page can be recorded in chronological order, making it easy to identify the version relationship of the log data slices and ensuring consistency between the stored data and the written data. Compared with the prior-art way of maintaining data consistency by maintaining different states of memory blocks, the access request processing method provided in this application reduces the system overhead of the computer system during access request processing, because state maintenance has a large impact on system overhead relative to the write-update process. Furthermore, because a log data slice can be smaller than a page, the access request processing method provided in this application can support file modifications at a granularity finer than a page, making the modification manner more flexible.
Further, in the access request processing method provided in this application, after writing the to-be-written data into the NVM according to the write request, the processor may return a write success message to the application. The write success message indicates that the to-be-written data has been successfully written into the storage device. This reduces the processing latency of the access request.
In a possible implementation, in the process of determining that the log chain of the target cache page is stored in the NVM, the processor may specifically determine this according to at least one of the following fields in the cache page structure of the target cache page: "log head", "log tail", "logs", and "log dirty". The "log head" field points to the start address of the log chain of the target cache page; the "log tail" field points to the start address of the last data node in the log chain of the target cache page; the "logs" field indicates the number of data nodes in the log chain of the target cache page; and the "log dirty" field indicates whether the target cache page is synchronized with the log data slices indicated by the data nodes of its log chain.
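The cache page structure fields named above can be sketched as a small record. This is an illustrative sketch: the field semantics follow the text, while the Python types and the helper method are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CachePageStruct:
    """Sketch of the per-cache-page metadata described in the text."""
    log_head: Optional[int] = None  # address of the first data node in the log chain
    log_tail: Optional[int] = None  # address of the last data node
    logs: int = 0                   # number of data nodes in the log chain
    log_dirty: bool = False         # page out of sync with its log data slices?
    dirty: bool = False             # page out of sync with the disk?

    def has_log_chain(self) -> bool:
        # A non-empty "log tail" (equivalently "log head") indicates
        # that the NVM stores a log chain for this page.
        return self.log_tail is not None
```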
In a possible implementation, when the processor determines that the NVM does not store a log chain of the target cache page, the processor may create a log chain for the target cache page in the NVM. A data node can then be inserted into the newly created log chain, and the information of the log data slice of the target cache page can be recorded in the inserted data node.
In a possible implementation, when performing the operation of inserting a new data node into the log chain of the target cache page, the processor may insert the new data node at the tail or the head of the log chain of the target cache page. After the new data node is inserted, the log chain of the target cache page contains at least two data nodes linked sequentially according to the update order of the target cache page. Inserting new data nodes in such an order enables the log data slices in different data nodes of the log chain of the target cache page to be linked according to the chronological order of the updated versions of the target cache page. Thus, different updated versions of the target cache page can be identified according to the order of the data nodes in the log chain of the target cache page. When reading data, valid data can be determined according to the log data slices in the different data nodes of the log chain of the same cache page, ensuring the correctness of the read data.
In a possible implementation, the processor may further obtain an updated target cache page according to the information of at least one log data slice recorded in the log chain of the target cache page, and store the data of the updated target cache page in an external storage device of the computer system. In this way, the file data in the disk can be updated and data consistency maintained. Moreover, in the access request processing method provided in this application, after data is written into the NVM, the modified data in the NVM does not need to be written into the external storage of the computer system immediately. Instead, when the memory space of the NVM needs to be reclaimed or during recovery of the computer system, the log data slices stored in the NVM are updated into the corresponding cache pages, and the file data in the disk is updated according to the updated cache pages. Compared with the existing write-ahead logging (WAL) and copy-on-write methods of maintaining data consistency, this method of delaying the merging of modified data into the target cache page and then writing the merged target cache page to the disk can reduce the write amplification of the system.
In a possible implementation, obtaining the updated target cache page according to the information of at least one log data slice recorded in the log chain of the target cache page includes: determining valid data in the log chain of the target cache page according to the information of the at least one log data slice recorded in the log chain of the target cache page, and updating the valid data into the target cache page to obtain the updated target cache page. The valid data is the latest modified data of the target cache page. For example, the processor may determine the valid data in the log chain according to the update order of the data nodes in the log chain of the target cache page and the in-page location information of the log data slices, where the in-page location information of a log data slice can be obtained from the "intra-page offset" and "data length" fields in the data node.

In a possible implementation, after storing the data of the updated target cache page in the external storage device of the computer system, the processor may further reclaim the log chain of the target cache page, thereby reclaiming the storage space of the NVM and saving system resources.
In a possible implementation, the information of the log data slice further includes the in-page location information of the log data slice and the address information of the data nodes adjacent to the inserted data node. The in-page location of a log data slice is the position of the log data slice within the target cache page, and the in-page location information may include the intra-page offset, the log data length, and similar information. The intra-page offset indicates the start position of the log data slice in the target cache page, and the log data length indicates the length of the log data slice. The address information of the adjacent data nodes of the inserted data node can be obtained from the "previous log address" and "next log address" fields in the data node. The "previous log address" indicates the start address of the previous data node in the NVM, and the "next log address" indicates the start address of the next data node in the NVM. According to the information of the log data slices, the latest modified data of the target cache page can be determined and the order of the data nodes can be obtained, so that different updated versions of the target cache page can be determined according to the information recorded in the data nodes of the log chain of the target cache page.
According to a second aspect, this application provides another access request processing method, which may also be performed by a computer system. The computer system includes a processor and a non-volatile memory (NVM). In the computer system, after the processor receives a read request carrying a file identifier and the size of to-be-read data, the processor may obtain an access location according to the file identifier, where the access location indicates a start address at which the read request reads data from a target file. Further, the processor may determine a target cache page and the location information of the to-be-read data in the target cache page according to the access location, the size of the to-be-read data, and the size of a cache page. The target cache page is a memory page in memory used to cache the file data in the target file to be modified by to-be-written data. After determining that a log chain of the target cache page is stored in the NVM, the processor may obtain an updated target cache page according to the target cache page and the information of at least one log data slice in the log chain of the target cache page. The log chain of the target cache page contains information of at least one log data slice, each log data slice is modified data of the target cache page in one modification process, and the information of a log data slice includes the log data slice itself or the storage address of the log data slice in the NVM. The processor may then read the data from the updated target cache page according to the location information of the to-be-read data in the target cache page.
本申请提供的访问请求处理方法,由于缓存页的待修改数据都通过log chain的方式存储于所述NVM中,因此可以支持比页的粒度更小的数据修改。在处理读请求的过程中,可以根据所述读请求待访问的目标缓存页的log chain中的数据节点中的log数据片获得所述目标缓存页的最新修改版本,从而能够保证读取的数据的准确性。
在一种可能的实现方式中,所述根据所述目标缓存页以及所述目标缓存页的log chain中的至少一个log数据片的信息获得更新后的目标缓存页包括:根据所述目标缓存页的log chain中记录的至少一个log数据片的信息确定所述目标缓存页的log chain中的有效数据,并将所述有效数据更新到所述目标缓存页中,以获得所述更新后的目标缓存页。其中,所述有效数据为所述目标缓存页的最新修改数据。例如,所述处理器可以根据所述目标缓存页的log chain中各数据节点的更新顺序以及log数据片的页内位置信息来确定所述log chain中的有效数据。其中,log数据片的页内位置信息可以根据数据节点中的“页内偏移”和“数据长度”两个信息获得。
在一种可能的实现方式中,在确定所述NVM中存储有所述目标缓存页的日志链的过程中,所述处理器具体可以根据所述目标缓存页的缓存页结构中的下述字段中的至少一个字段来确定所述NVM中存储有所述目标缓存页的日志链:“log head”、“log tail”、“logs”以及“log dirty”。其中,所述“log head”字段用于指向所述目标缓存页的log chain的首地址,所述“log tail”字段用于指向所述目标缓存页的log chain中最后一个数据节点的首地址,所述logs字段用于指示所述缓存页的log chain中的数据节点的数量,所述“log dirty”用于指示所述目标缓存页与所述目标缓存页的log chain的数据节点指示的log数据片是否同步。
在又一种可能的实现方式中,所述log数据片的信息还包括:所述log数据片的页内位置信息以及所述插入的数据节点的相邻数据节点的地址信息。其中,所述log数据片的页内位置是指log数据片在目标缓存页中的位置,所述log数据片的页内位置信息可以包括:页面内偏移、log数据长度等信息。其中,所述页面内偏移用于指示所述log数据片在所述目标缓存页中的起始位置,所述 log数据长度用于指示所述log数据片的长度。所述插入的数据节点的相邻数据节点的地址信息可以根据所述数据节点中的“上一个log地址”以及“下一个log地址”的信息获得。所述“上一个log地址”用于指示上一个数据节点在所述NVM中的起始地址,所述“下一个log地址”用于指示下一个数据节点在所述NVM中的起始地址。根据所述log数据片的信息能够确定目标缓存页的最新修改数据,并能够获得各个数据节点的先后顺序,从而能够根据所述目标缓存页的log chain中的各个数据节点中记录的信息确定所述目标缓存页的不同更新版本。
第三方面,本申请提供了一种计算机系统,所述计算机系统包括非易失性内存NVM以及与所述NVM连接的处理器,所述处理器用于执行上述第一方面以及第一方面的各种可能的实现方式中所述的方法。
第四方面,本申请提供了一种计算机系统,所述计算机系统包括非易失性内存(NVM)以及与所述NVM连接的处理器,所述处理器用于执行上述第二方面以及第二方面的各种可能的实现方式中所述的方法。
第五方面,本申请提供了一种访问请求处理装置,所述访问请求处理装置应用于计算机系统中,所述计算机系统包括非易失性内存(NVM),所述访问请求处理装置包括用于执行上述第一方面以及第一方面的各种可能的实现方式中所述的方法的模块。
第六方面,本申请提供了一种访问请求处理装置,所述访问请求处理装置应用于计算机系统中,所述计算机系统包括非易失性内存(NVM),所述访问请求处理装置包括用于执行上述第二方面以及第二方面的各种可能的实现方式中所述的方法的模块。
第七方面,本申请提供了一种计算机程序产品,包括存储了程序代码的计算机可读存储介质,所述程序代码包括的指令用于执行上述第一方面及第二方面中的至少一种方法。
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍。显而易见地,下面描述中的附图仅仅是本发明的一些实施例。
图1为本发明实施例提供的一种计算机系统架构示意图;
图2为本发明实施例提供的一种计算机系统的信令示意图;
图3为本发明实施例提供的一种访问请求处理方法流程图;
图4为本发明实施例提供的一种数据处理示意图;
图5为本发明实施例提供的一种缓存页结构及日志链结构的示意图;
图6为本发明实施例提供的一种数据合并方法流程图;
图7为本发明实施例提供的又一种计算机系统的信令示意图;
图8为本发明实施例提供的又一种访问请求处理方法流程图;
图9为本发明实施例提供的一种访问请求处理装置的结构示意图;
图10为本发明实施例提供的又一种访问请求处理装置的结构示意图。
具体实施方式
为了使本技术领域的人员更好地理解本发明方案,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行描述。显然,所描述的实施例仅仅是本发明一部分的实施例,而不是全部的实施例。
图1为本发明实施例提供的一种计算机系统架构示意图。图1所示的计算机系统架构是一种混合内存的计算机系统架构。在图1所示的计算机系统架构中,动态随机存储器(Dynamic Random Access Memory,DRAM)和相变存储器(Phase Change Memory,PCM)均作为内存使用。如图1所示,计算机系统100可以包括:中央处理器(Central Processing Unit,CPU)105、北桥芯片110、南桥芯片115、动态随机存储器DRAM 120、相变存储器PCM 125以及磁盘130。
中央处理器(CPU)105是计算机系统的核心,CPU 105可以调用计算机系统100中不同的软件程序实现不同的功能。例如,CPU 105能够实现对DRAM 120、PCM 125以及磁盘130的访问。可以理解的是,在本发明实施例中,中央处理器(CPU)105仅仅是处理器的一个示例。除了中央处理器(CPU)105外,处理器还可以是其他专用集成电路(Application Specific Integrated Circuit,ASIC),或者是被配置成实施本发明实施例的一个或多个集成电路。
北桥芯片110通常用来处理计算机系统100中的高速信号,具体的, 北桥芯片110可以用于处理CPU、内存以及南桥芯片之间的通信。北桥芯片110通过前端总线与CPU 105连接。北桥芯片110通过内存总线与DRAM120和PCM125连接。根据这种方式,DRAM120和PCM 125均与内存总线连接,通过北桥芯片110与CPU 105通信。本领域技术人员可以理解的是,北桥芯片110可以与CPU 105集成在一起。
南桥芯片115负责CPU 105与外部设备之间的通信。CPU 105与南桥芯片115可以通过高速外围组件互联(Peripheral Component Interconnect Express,PCI-E)总线或直接媒体接口(Direct Media Interface,DMI)总线等通信总线进行通信,以实现CPU 105对外设部件互连标准(Peripheral Component Interconnect,PCI)接口设备、通用串行总线(Universal Serial Bus,USB)接口设备及串行ATA(Serial Advanced Technology Attachment,SATA)接口设备等设备的控制。例如,南桥芯片115可以通过串行ATA(Serial Advanced Technology Attachment,SATA)接口连接磁盘130,从而,CPU 105可以通过南桥芯片115与磁盘130进行通信,实现对磁盘130的控制。在本发明实施例中,南桥芯片包括但不限于集成南桥,例如平台控制中枢(Platform Controller Hub,PCH)。
动态随机存储器(Dynamic Random Access Memory,DRAM)120通过内存总线与北桥芯片110连接。DRAM 120可以通过北桥芯片110实现与CPU 105之间的通信。CPU 105能够高速访问DRAM 120,对DRAM 120中的任一存储单元进行读或写操作。DRAM 120具有访问速度快的优点,因此通常DRAM作为主内存使用。通常DRAM 120用来存放操作系统中各种正在运行的软件、输入和输出数据以及与外存交换的信息等。然而,DRAM 120是易失性的,当计算机关闭电源后,DRAM 120中的信息将不再保存。本领域技术人员知道,DRAM是易失性内存(volatile memory)的一种,实际应用中还可以采用其他的随机存储器(Random Access Memory,RAM)作为计算机系统的内存。例如,还可以采用静态随机存储器(Static Random Access Memory,SRAM)作为计算机系统的内存。
PCM 125是一种新型的非易失性存储器(Non-Volatile Memory,NVM)。在本发明实施例中,PCM 125与DRAM 120共同作为计算机系统100 的内存。由于新型NVM能够按字节(Byte)寻址,将数据以位(bit)为单位写入非易失性存储器中,因而能够作为内存使用。与DRAM 120相比,由于PCM125具有非易失性的特点,从而能够更好地保存数据。在本发明实施例中,可以将能够作为内存使用的非易失性存储器称为存储级内存(Storage Class Memory,SCM)。需要说明的是,在本发明实施例中,图1中所示的PCM 125仅仅只是SCM的一种示例。除了PCM外,SCM还可以包括:阻变存储器(Resistive Random Access Memory,RRAM)、磁性存储器(Magnetic Random Access Memory,MRAM)或铁电式存储器(Ferroelectric Random Access Memory,FRAM)等其他的新型非易失性存储器,在此不对本发明实施例中的SCM的具体类型进行限定。
磁盘130可以通过串行ATA(Serial Advanced Technology Attachment,SATA)接口、小型计算机系统接口(Small Computer System Interface,SCSI)等接口与南桥芯片115连接。磁盘130用于存储数据,作为计算机系统100的外存设备使用。通常,作为外存设备的存储介质需要具有非易失性的特点,当计算机关闭电源后,存储于外存的数据仍然会被保存。并且,外存的存储容量较大。可以理解的是,磁盘130仅仅是外存设备的一种示例,作为外存设备的存储器除了可以是磁盘130外,还可以是固态硬盘(Solid State Drives,SSD)、机械硬盘(Hard Disk Drive,HDD)、光盘、存储阵列等其他能够存储数据的非易失性的存储设备。
可以理解的是,图1所示的计算机系统仅仅是计算机系统的一种示例。实际应用中,随着计算机技术的发展,新一代的计算机系统中,CPU 105可以不通过北桥芯片与内存连接,DRAM 120和PCM 125可以通过双倍速率(Double Data Rate,DDR)总线与CPU 105进行通信。并且,CPU 105也可以不通过南桥芯片来连接磁盘130。例如,CPU 105可以通过主机总线适配卡(Host Bus Adaptor,HBA)连接磁盘130。在本发明实施例中,并不对具体的计算机系统内部器件之间的连接形式进行限定,只要是包括非易失性内存(non-volatile memory,NVM)的计算机系统即可。换一种表达方式,本发明实施例中所述的计算机系统为包括永久性内存(persistent memory,PM)的计算机系统。
在图1所示的计算机系统中,为了确保即使在计算机系统100出现掉电、宕机或软件故障等情况下,写入的数据也不丢失,保护数据的一致性,在 本发明实施例中,可以通过在SCM中建立的日志链(log chain)保持数据的一致性。下面将结合图2和图3对图1所示的计算机系统100如何在保持数据一致性的基础上减少系统开销进行详细介绍。图2为本发明实施例提供的计算机系统100的信令示意图,图3为本发明实施例提供的一种访问请求处理方法。并且为了描述方便,图2仅将图1所示的计算机系统100中在访问请求处理过程中涉及的器件进行了图示。图3以计算机系统100处理写请求为例进行图示。需要说明的是,CPU 105在对访问请求进行处理的过程中,均是通过调用数据处理逻辑(图2中未示出)来实现的。可以理解的是数据处理逻辑可以是实现本发明实施例的请求处理方法的程序。
本领域技术人员可以知道,文件系统是操作系统中负责管理和存储文件信息的软件结构。从系统角度来看,文件系统是对文件存储设备的空间进行组织和分配,负责文件存储并对存入的文件进行保护和检索的系统。文件系统由三部分组成:文件系统的接口、对文件操纵和管理的软件集合、文件数据和属性。当进程读文件或写文件时,操作系统会先根据文件名打开进程要访问的目标文件,然后再根据接收的读请求或写请求对打开的目标文件进行读操作或写操作。其中,文件名可以是文件的全路径名,是目标文件在磁盘中的位置信息的逻辑描述。例如目标文件的文件名可以为:D:\FILE\file1。在通过文件的全路径名打开文件的过程中,需要按照文件的全路径逐层查找,不停的读磁盘,并在内存创建相应的数据结构以表示目标文件的目录结构。在打开文件后,根据读请求或写请求访问目标文件的过程中,如果还按照目标文件的全路径名对文件进行读操作或写操作,会涉及频繁读磁盘或写磁盘,过程比较复杂,处理时间也会比较长。因此,实际应用中,在操作系统打开目标文件的过程中,操作系统会为该进程访问的目标文件分配一个文件句柄,并在进程内部维护了一个文件句柄数组。其中,文件句柄可以通过数字来表示,例如,文件句柄可以为:fd 0、fd1或fd 2。文件句柄数组中存储有指向文件描述信息的指针。文件描述信息中包含有指向文件目录结构、元数据(metadata)以及访问位置等信息的指针。文件目录结构用于描述文件的逻辑位置,文件目录结构存储于内存中,进程通过文件目录结构能够定位目标文件的位置。元数据是用于描述文件数据的数据,具体的,元数据中包含有文件数据的组织、数据域及其关系的信息。访问位置用于指示进程当前访问的起始位置。 访问位置可以是逻辑上的一个位置。通常,访问位置信息可以为0,用于代表从文件的起始地址开始访问。在打开文件的过程中也可以通过系统调用设置文件访问位置为除0之外的其他位置。在访问文件(读/写文件)过程中,进程也可以根据文件句柄通过系统调用设置文件访问位置。本领域人员知道,在随机读写的情况下,访问位置可以是通过系统调用设置的任意一个访问位置,在顺序读写的情况下,当前访问的访问位置即为上一次访问的结束位置。在对目标文件进行读操作或写操作的过程中,进程可以根据该目标文件的文件句柄在进程维护的文件句柄数组中找到目标文件描述信息。在文件描述信息中查找到文件的元数据及访问位置等信息,从而实现对目标文件的读操作或写操作。可以理解的是,文件句柄是当前进程在读/写目标文件的过程中识别目标文件的一种文件标识。在本发明实施例中,文件标识还可以是除文件句柄之外的其他文件描述符,在此不做限定,只要能够使进程通过文件标识识别出目标文件并找到目标文件的描述信息即可。
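上述文件句柄数组与文件描述信息之间的对应关系,可以用下面的示意代码粗略表示(Python,非专利原文;其中的变量名与字段均为说明用途的假设,仅体现“句柄作为下标、指向文件描述信息”的数据结构关系):

```python
# 示意代码(假设性示例): 进程内文件句柄数组与文件描述信息的简化表示。
# 文件句柄(fd)作为数组下标, 数组元素是指向文件描述信息的"指针",
# 文件描述信息中包含元数据(metadata)与访问位置(access_pos)等信息。
file_descriptions = []          # 文件描述信息表
fd_table = []                   # 进程内部维护的文件句柄数组

def open_file(path):
    desc = {"path": path, "metadata": {}, "access_pos": 0}   # 访问位置默认为0
    file_descriptions.append(desc)
    fd_table.append(len(file_descriptions) - 1)              # 存储指向描述信息的下标
    return len(fd_table) - 1                                 # 返回文件句柄, 如 fd 0、fd 1

fd = open_file(r"D:\FILE\file1")
assert fd == 0
assert file_descriptions[fd_table[fd]]["access_pos"] == 0
```

打开文件返回句柄后,进程即可凭句柄在文件句柄数组中查到文件描述信息,进而获得元数据与访问位置。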
如前所述,由于内存具有访问速度快的优点,内存(例如图1中的DRAM 120和PCM 125)可以用来存放操作系统中各种正在运行的软件、输入和输出数据以及与外存交换的信息。因此,当访问目标文件时,运行于计算机系统100中的操作系统会先将待访问的目标文件的文件数据从磁盘130加载到内存中。在本发明实施例中,可以将内存中用于缓存文件数据的内存页称为缓存页。为了描述方便,下面以将目标文件的文件数据加载到DRAM 120中为例进行描述。例如,如图2中所示,DRAM 120中的“缓存页1”和“缓存页2”均用于缓存“文件1”的文件数据210,因此,“缓存页1”和“缓存页2”均为磁盘130中存储的“文件1”的缓存页。可以理解的,不同的缓存页中缓存的文件数据不同。通常,一个页(page)的大小为4k个字节(Byte),即一个页有4096个字节。在本发明实施例中,一个缓存页的大小通常也为4K字节。实际应用中,还可以将缓存页的大小设置为8KB或16KB,在此不对缓存页的大小进行限定。
本发明实施例提供的对访问请求进行处理的过程主要涉及在目标文件被打开后根据目标文件的写请求以及读请求分别对目标文件进行写操作或读操作的过程。如图2以及图3中的步骤300所示,当CPU 105接收到写请求200时,CPU 105可以调用数据处理逻辑1051处理所述写请求200。写请求200中 携带有文件标识、缓存区指针以及待写入数据的大小。所述文件标识为进程打开所述写请求200要访问的目标文件时为所述目标文件分配的文件句柄。进程根据所述文件标识能够找到所述目标文件的文件描述信息。所述缓存区指针用于指向缓存待写入数据的缓存区。本领域技术人员可以知道,缓存区可以是在DRAM120或PCM 125中划分出的一段存储空间。所述待写入数据的大小即为缓存所述待写入数据的缓存区的长度。例如,所述待写入数据的大小可以为100个字节。
在步骤305中,CPU 105根据所述文件标识获取访问位置。其中,所述访问位置用于指示所述写请求200在目标文件中写入数据的起始地址。在本发明实施例中,当CPU 105接收写请求200后,CPU 105可以以所述写请求200中携带的文件标识为索引,通过进程维护的文件句柄数组找到目标文件的描述信息,并在所述目标文件的描述信息中找到所述写请求200要访问的所述目标文件中的访问位置。其中,所述访问位置为所述写请求200在所述目标文件中写入数据的起始地址。在本发明实施例中,所述访问位置可以是一个逻辑的访问位置。例如,访问位置可以是第一文件的第89个字节。
在步骤310中,CPU 105根据所述访问位置、所述待写入数据的大小以及缓存页的大小确定N个目标缓存页以及与所述N个目标缓存页中的目标缓存页OCPi对应的log数据片logi(x,y)。其中,i的取值从1到N,所述N为不小于1的整数,x表示该log数据片相对于该目标缓存页的起始偏移,y表示该log数据片的长度。例如,log数据片为log1(10,30),则表示该log数据片的起始偏移为第1目标缓存页的第10个字节,且该log数据片的长度为30个字节。在本步骤中,当CPU 105获得写请求200的访问位置之后,CPU 105可以根据所述访问位置、所述待写入数据的大小以及缓存页的大小计算所述写请求200要访问的目标缓存页的逻辑页号,从而能够根据计算出的逻辑页号确定所述写请求200要访问的目标缓存页。如前所述,缓存页是内存中用于缓存文件数据的内存页。因此,在本发明实施例中,目标缓存页是内存中用于缓存所述目标文件中被所述待写入数据修改的文件数据的内存页。
例如,若所述访问位置为访问第一文件的第89个字节,所述待写入数据的大小为212个字节,也就是说,所述写请求200要从第一文件的第89个字节开始向第一文件中写入212个字节。为了描述方便,以一个缓存页的大小为 100个字节为例进行描述。根据这种方式,第一文件的第0-99个字节构成第一文件的第1页p1,第一文件的第100-199个字节构成第一文件的第2页p2,第一文件的第200-299个字节构成第一文件的第3页p3,第一文件的第300-399个字节构成第一文件的第4页p4,依次类推。因此,CPU 105可以根据访问位置、待写入数据大小以及缓存页的大小计算所述写请求要访问第一文件的第1页至第4页,也就是说,第一文件的第1页至第4页被确定为目标缓存页,i的取值为1~4。
进一步的,CPU 105可以确定分别写入第一文件的第1页至第4页中的4个数据片:log1(89,11)、log2(0,100)、log3(0,100)以及log4(0,1)。具体的,CPU 105可以确定待写入第1页的log数据片log1(89,11),待写入第2页的log数据片log2(0,100),待写入第3页的log数据片log3(0,100)以及待写入第4页的log数据片log4(0,1)。其中,log1(89,11)用于表示从第1页的第89个字节的位置开始的11个字节,log2(0,100)用于表示从第2页的第0个字节的位置开始的100个字节,log3(0,100)用于表示从第3页的第0个字节的位置开始的100个字节,log4(0,1)用于表示从第4页的第0个字节的位置开始的1个字节。在本发明实施例中,log数据片是指待写入每一个目标缓存页的数据的集合。换一种表达方式,log数据片是每一个目标缓存页的修改数据。
在步骤315中,CPU 105确定每一个目标缓存页OCPi对应的log数据片logi(x,y)的位置信息。在CPU 105获得每一个目标缓存页OCPi对应的log数据片logi(x,y)后,进一步的,CPU 105还可以根据待写入每一个目标缓存页中的数据片的大小将缓存区中缓存的待写入数据划分为4个部分,从而可以获得与每一个目标缓存页对应的log数据片的位置信息。其中,log数据片的位置信息是指的待写入每一个目标缓存页的数据在写请求200中携带的缓存区指针指向的缓存区中的位置。例如,如图4所示,CPU 105能够根据待写入4个页面的数据片的信息将缓存区中缓存的待写入数据划分为4个部分:buf1(0,11)、buf2(11,100)、buf3(111,100)和buf4(211,1),从而得到各个log数据片的位置信息。其中,buf1(0,11)用于表示log1(89,11)的数据为缓存区中从第0个字节开始的11个字节,buf2(11,100)用于表示log2(0,100)的数据为缓存区中从第11个字节开始的100个字节,buf3(111,100)用于表示log3(0,100)的数据为缓存区中从第111个字节开始的100个字节,buf4(211,1)用于表示log4(0,1)的数据为缓存区中从第 211个字节开始的1个字节。
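上述按缓存页划分log数据片、并确定各log数据片在缓存区中位置的计算过程,可以用如下示意代码表示(Python,非专利原文;函数名`split_write`为说明用途的假设):

```python
# 示意代码(假设性示例): 按访问位置、待写入数据大小与缓存页大小,
# 计算各目标缓存页的log数据片 logi(x, y) 及其在缓存区中的起始偏移。
def split_write(access_pos, size, page_size):
    slices = []                 # 每项: (逻辑页号, 页内偏移x, 长度y, 缓存区偏移)
    buf_off = 0
    pos, end = access_pos, access_pos + size
    while pos < end:
        page_no = pos // page_size + 1          # 逻辑页号从1计起, 与正文示例一致
        in_page = pos % page_size               # 页内偏移
        length = min(page_size - in_page, end - pos)
        slices.append((page_no, in_page, length, buf_off))
        buf_off += length
        pos += length
    return slices

# 正文示例: 从第一文件的第89个字节开始写入212个字节, 缓存页大小为100字节
print(split_write(89, 212, 100))
# → [(1, 89, 11, 0), (2, 0, 100, 11), (3, 0, 100, 111), (4, 0, 1, 211)]
```

其输出与正文中的log1(89,11)/buf1(0,11)、log2(0,100)/buf2(11,100)、log3(0,100)/buf3(111,100)、log4(0,1)/buf4(211,1)一一对应。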
可以理解的是,实际应用中,写请求200要访问的目标缓存页可以是一个缓存页,也可以是多个缓存页,也就是说,N的值可以是不小于1的整数。换一种表达方式,一个访问请求中携带的待写入数据可以是只写入一个页面中的数据,也可以是需要写入多个页面中的数据。上面是以写入多个目标缓存页为例进行说明。在另一种情形下,以访问位置为第一文件的第89个字节,且一个页面的大小为100个字节为例,若待写入数据的大小为5个字节,也就是说,CPU 105需要根据写请求200从第一文件的第89个字节开始向第一文件中写入5个字节。在这种情况下,CPU 105只会涉及对第一文件的第1页进行修改,即N=1。从而,CPU 105可以根据所述访问位置、所述待写入数据的大小以及缓存页的大小计算写入第一页的log数据片为logp1(89,5)。进一步的,CPU 105可以获得写入该目标缓存页的log数据片的位置信息:buf(0,5),该位置信息为待写入数据在写请求200中携带的缓存区指针指向的缓存区中的位置。
在步骤320中,CPU 105判断所述PCM 125中是否存储有目标缓存页OCPi的日志链log chain,所述目标缓存页OCPi的log chain用于记录目标缓存页OCPi的至少一次修改数据信息。当所述PCM 125中未存储有目标缓存页OCPi的log chain结构时,该方法进入步骤325,否则该方法进入步骤330。本领域技术人员可以知道,当CPU 105接收写请求200后,CPU 105还可以根据写请求200中携带的文件标识获得所述目标文件的元数据信息。在本发明实施例中,目标文件的元数据信息中包含有所述目标文件的缓存页结构信息。当CPU 105在步骤310中确定出所述写请求200待访问的所述N个目标缓存页后,CPU 105可以从所述目标文件的元数据信息中获得所述N个目标缓存页的缓存页结构。进而能够根据缓存页结构中记录的信息判断所述PCM 125中是否存储有目标缓存页OCPi的log chain。
图5为本发明实施例提供的一种缓存页结构及日志链结构的示意图。在本发明实施例中,如图2所示,缓存页可以缓存于所述DRAM 120中,缓存页的log chain可以存储于PCM 125中。如图5所示,缓存页结构405中图示了多个缓存页的缓存页结构。例如,图5中所示的第一缓存页结构、第二缓存页结构以及第N缓存页结构,其中所述N为大于2的整数。本领域技术人员可以知 道,缓存页结构用于描述缓存页的元数据信息,例如,缓存页结构可以用于描述缓存页的偏移位置、大小以及是否加锁等信息。在本发明实施例中,每个缓存页都有相应的缓存页结构,每个缓存页结构中还包含有该缓存页的日志链log chain的信息。具体的,每个缓存页结构中维护有下述字段。
log head日志头,用于指向所述缓存页的日志链(log chain)的首地址,其中,所述日志链存储于所述PCM 125中,所述缓存页的log chain的首地址中可以包括所述缓存页所属的文件的inode以及所述缓存页的逻辑页号,其中,文件的inode用于确定所述缓存页所属的文件,所述逻辑页号用于确定缓存页。
log tail日志尾,用于指向所述缓存页的log chain中最后一个数据节点的首地址。在本发明实施例中,每一个缓存页的日志链由对所述缓存页的至少一次修改过程中动态生成的数据节点组成。一个数据节点用于记录该缓存页在一次修改过程中的一个log数据片的信息,其中,log数据片是指该缓存页在一次修改过程中的修改数据。每一个数据节点中包含有存储log数据片的数据域、存储其他数据节点地址的指针域及其他信息域,所述其他信息域可以用于存储该数据节点的地址等其他信息。
logs日志,用于指示所述缓存页的log chain中的数据节点的数量。
dirty脏,用于指示缓存页中是否有脏数据。换一种表达方式,“dirty”用于指示缓存页与磁盘中的文件数据是否同步。例如,当“dirty”指示位为1时,表示缓存页中存在脏数据,表示缓存页中的数据与磁盘中的文件数据不一致。当“dirty”指示位为0时,表示缓存页中的数据与磁盘中的文件数据一致。
log dirty日志脏,用于指示缓存页与该缓存页的日志链log chain的数据节点指示的log数据片是否同步。例如,当“log dirty”字段为1时,表示该缓存页的log chain的数据节点指示的log数据片中有新数据,数据节点中的数据与缓存页中的数据不一致。当“log dirty”字段为0时,表示该缓存页的log chain的数据节点指示的log数据片与缓存页中的数据一致。换一种表达方式,当“log dirty”字段为1时,表示该缓存页的log chain的数据节点指示的log数据片尚未被更新到缓存页中。当“log dirty”字段为0时,表示该缓存页的log chain的数据节点指示的log数据片已经被更新到缓存页中。
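上述缓存页结构中与日志链相关的各字段,可以用下面的示意代码表示(Python,非专利原文;类名与字段名为说明用途的假设,字段含义对应正文所述的log head、log tail、logs、dirty与log dirty):

```python
# 示意代码(假设性示例): 缓存页结构中与log chain相关字段的一种可能表示。
from dataclasses import dataclass
from typing import Optional

@dataclass
class CachePageStruct:
    log_head: Optional[int] = None   # log head: 指向该缓存页log chain的首地址, 为空表示无log chain
    log_tail: Optional[int] = None   # log tail: 指向log chain中最后一个数据节点的首地址
    logs: int = 0                    # logs: log chain中数据节点的数量
    dirty: int = 0                   # dirty: 缓存页与磁盘文件数据是否同步(1表示有脏数据)
    log_dirty: int = 0               # log dirty: log数据片是否已更新到缓存页(1表示尚未更新)

    def has_log_chain(self) -> bool:
        # 正文: 可根据"log head"/"log tail"是否为空, 或"logs"是否为0来判断
        return self.log_head is not None

page = CachePageStruct()
assert not page.has_log_chain()          # 各字段为空/为0: 该缓存页尚未被修改过
page.log_head, page.logs = 0x1000, 1
assert page.has_log_chain()              # log head非空: PCM中存储有该缓存页的log chain
```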
在本发明实施例中,当CPU 105确定了目标缓存页后,以一个目标 缓存页为例,CPU 105可以从所述目标文件的元数据信息中获得所述目标缓存页的缓存页结构,从而CPU 105能够根据所述目标缓存页的缓存页结构中的“log head”指示位或“log tail”指示位判断所述PCM 125中是否存储有所述目标缓存页的日志链(log chain)。具体的,当CPU 105在所述目标缓存页OCPi的缓存页结构中确定“log head”或“log tail”为空时可以确定该目标缓存页OCPi尚未被修改过,该目标缓存页没有log chain。当CPU 105在所述目标缓存页OCPi的缓存页结构中确定“log head”或“log tail”中包含有地址时,说明该目标缓存页已经被修改过,且CPU 105可以根据log head字段中记录的地址指针找到所述目标缓存页OCPi的log chain。以图5中所示的第一缓存页为目标缓存页为例,当第一缓存页结构中的log head为空时,说明所述第一缓存页没有log chain。当第一缓存页结构中的log head中包含有地址时,说明所述PCM 125中存储有所述第一缓存页的log chain。可以理解的是,实际应用中,也可以通过所述目标缓存页的缓存页结构中的“logs”或“log dirty”等字段来判断该目标缓存页是否有log chain。例如,当logs为0时,说明该目标缓存页没有log chain。当“logs”不为0时,说明该目标缓存页有log chain。本发明实施例中不对判断所述PCM 125中是否存储有所述目标缓存页的日志链(log chain)的具体方式进行限定。
在步骤325中,CPU 105在所述PCM 125中为所述目标缓存页OCPi创建log chain。当CPU 105在步骤320中根据所述目标缓存页的缓存页结构中的信息判断所述PCM 125中未存储有目标缓存页OCPi的log chain时,CPU 105可以在所述PCM 125中为所述目标缓存页创建log chain。在为目标缓存页创建log chain时,可以根据待写入数据的大小在所述PCM 125中分配物理空间,并在该分配的物理空间中初始化log chain的数据结构。
在本发明实施例中,所述PCM 125中存储有每一个被更新的缓存页的log chain。换一种表达方式,每一个被更新的缓存页都有一个log chain。log chain用于记录缓存页的至少一次修改信息。如图5所示,在实际应用中,可以为文件系统创建一个全局日志链(global log chain)结构,具体可以如图5中的410所示。该全局日志链结构410包含多个缓存页的日志链。每个缓存页的日志链可以看成是该全局日志链中的一个节点或者子日志链。日志链结构410中可以包括控制信息4100以及各缓存页的日志链。其中控制信息4100中包括全局日志 头(global log head)指针以及全局日志尾(global log tail)指针。Global log head指针用于指向全局日志链结构410中的第一个缓存页的日志链的首部。具体的,global log head指针用于指向所述PCM 125中的全局日志链结构的首地址。Global log tail指针用于指向全局日志链结构410中最后一个缓存页的日志链的首地址。在本发明实施例中,缓存页的日志链的首地址即为图5中所示的缓存页地址,缓存页地址可以包括所述缓存页所属的文件的inode以及所述缓存页的逻辑页号。
如图5所示,每一个缓存页的日志链由对所述缓存页的至少一次修改过程中形成的数据节点组成。数据节点中包含有log数据片的信息以及指向其他数据节点的指针等信息。所述log数据片的信息可以包括log数据片或log数据片在PCM 125中的存储地址。实际应用中,当CPU 105为某个缓存页创建log chain之后,需要将全局日志链的控制信息中的global log tail指针指向所述新创建的缓存页的log chain结构的首地址,根据这种方式,可以将所述新创建的缓存页的log chain按照创建时间的先后顺序挂载在该文件系统的全局log chain中,从而在计算机系统出现故障后恢复过程中能够根据全局log chain中的各个缓存页的log chain对写入计算机系统中的数据进行恢复,从而能够保持数据的一致性,且便于系统管理。
为了描述方便,以图5中的第一缓存页的日志链4105为例对每一个缓存页的日志链的结构进行具体描述。如图5所示,第一缓存页的日志链4105包括第一数据节点41051以及第二数据节点41052。在第一数据节点41051和第二数据节点41052中均可以包括log数据片的信息、指向其他数据节点的指针以及log数据片的逻辑位置信息等。可以理解的是,对第一缓存页修改一次即可得到一个log数据片。在本发明实施例中,可以将对第一缓存页修改过程中的log数据片记录于第一缓存页的log chain 4105中。例如,可以将在所述第一缓存页的第一次修改过程中获得的第一log数据片存储于第一数据节点41051中,将在所述第一缓存页的第二次修改过程中获得的第二log数据片存储于第二数据节点41052中。
在本发明实施例中,log chain中的“Log数据片”字段用于记录所述缓存页的本次修改数据的信息。例如,如图5中所示,第一数据节点41051中的 “log数据片”用于指示第一缓存页的第一次修改数据,第二数据节点41052中的“log数据片”用于指示第一缓存页的第二次修改数据。实际应用中,在一种情形下,可以在“log数据片”部分直接记录修改的数据,在另一种情形下,也可以将修改的数据存储于PCM 125中的其他存储空间,然后在“log数据片”部分记录修改的数据的地址。在本发明实施例中不对log chain结构中的数据存储方式进行限定,只要能够根据log chain查找到缓存页的多次修改数据即可。
在将log数据片记录于log chain的过程中,可以按照对目标缓存页的修改顺序依次记录各个log数据片。在本发明实施例中,为了记录对目标缓存页的修改顺序,在log chain中的每个数据节点中均包含有指向其他数据节点的指针的信息。指向其他数据节点的指针可以包括如下字段:上一个log地址、下一个log地址等字段。其中,“上一个log地址”用于指示上一个数据节点的地址。具体的,“上一个log地址”用于指示上一个数据节点在SCM中的起始地址。例如,如图5所示,由于“页面内偏移”字段是数据节点中的首个字段,因此,“上一个log地址”可以指向上一个数据节点中的log数据片中的“页面内偏移”字段。所述上一个数据节点为本数据节点的前一次插入的数据节点,用于指示目标缓存页在前一次修改过程中的修改数据的信息。“下一个log地址”用于指示下一个数据节点的地址。具体的,“下一个log地址”用于指示下一个数据节点在SCM中的起始地址。例如,如图5所示,“下一个log地址”可以指向下一个数据节点中的log数据片中的“页面内偏移”字段。所述下一个数据节点为本数据节点的下一次插入的数据节点,用于指示目标缓存页在下一次修改过程中的修改数据的信息。可以理解的是,一个缓存页的日志链结构中的首个数据节点中的“上一个log地址”字段为空。例如,图5中的第一数据节点41051为第一缓存页的log chain 4105中的首个数据节点。因此,在第一数据节点41051中,“上一个log地址”字段为空。类似的,一个缓存页的日志链中的最后一个数据节点中的“下一个log地址”字段为空。当一个数据节点中的“下一个log地址”字段为空时说明该数据节点为与该数据节点对应的缓存页的最后一次修改。
在本发明实施例中,为了记录log数据片在目标缓存页中的具体信息,每个数据节点中还包含有log数据片的页内位置信息。log数据片的页内位置信息可以包括:页面内偏移、log数据长度等信息。log数据片的页内位置是指log 数据片在目标缓存页中的位置。具体的,“页面内偏移”用于指示该log数据片在缓存页中的起始位置。“log数据长度”用于指示该log数据片的长度信息。
在本发明实施例中,为了建立缓存页的log chain与缓存页的联系,在每个缓存页的log chain中,该缓存页的第一个数据节点中还包括“缓存页地址”信息,其中,“缓存页地址”信息可以包括文件inode以及逻辑页号。其中,文件inode用于指示该log chain所属的文件,逻辑页号用于指示该log chain所属的缓存页。如图5所示,第一缓存页的第一个数据节点41051中的“缓存页地址”字段中包括有第一缓存页所属的文件的inode和第一缓存页的逻辑页号。进一步的,为了建立多个缓存页的log chain之间的联系,在每个缓存页的第一个数据节点中还包含有指示下一个页面的指针信息,“下一个页面”用于指示文件系统中下一个被修改的文件的缓存页的log chain中的第一个数据节点。根据“下一个页面”的指针能够找到该文件系统中下一个被修改的缓存页的log chain。
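综合上述描述,log chain中一个数据节点所包含的各项信息可以用如下示意代码表示(Python,非专利原文;类名与字段名为说明用途的假设,分别对应正文所述的页面内偏移、log数据长度、log数据片、上一个log地址、下一个log地址与缓存页地址):

```python
# 示意代码(假设性示例): log chain中一个数据节点的可能结构。
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class LogDataNode:
    in_page_offset: int                      # 页面内偏移: log数据片在缓存页中的起始位置
    log_data_len: int                        # log数据长度
    log_data: bytes                          # log数据片(或其在NVM中的存储地址)
    prev: Optional["LogDataNode"] = None     # 上一个log地址(首个数据节点为空)
    next: Optional["LogDataNode"] = None     # 下一个log地址(最后一个数据节点为空)
    cache_page_addr: Optional[Tuple[int, int]] = None  # 首个节点记录(文件inode, 逻辑页号)

# 首个数据节点额外记录"缓存页地址", 此处的inode值1001为假设
n1 = LogDataNode(89, 11, b"x" * 11, cache_page_addr=(1001, 1))
assert n1.prev is None and n1.next is None   # 链中唯一节点: 前后指针均为空
```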
在步骤330中,CPU 105在为所述目标缓存页OCPi的log chain中插入数据节点,所述插入的数据节点中包含有log数据片logi(x,y)的信息。在本发明实施例中,一种情形下,当CPU 105在所述PCM 125中为所述目标缓存页OCPi创建log chain之后,该方法可以进入步骤330,以便CPU 105在创建的log chain中插入数据节点并记录本次修改过程中的log数据片的信息。在另一种情形下,当CPU 105在步骤320判断所述PCM 125中存储有目标缓存页OCPi的log chain时,该方法可以进入步骤330,CPU 105可以在目标缓存页OCPi已有的log chain中插入数据节点并记录本次修改过程中的log数据片的信息。其中,log数据片的信息具体可以包括log数据片或log数据片在所述PCM 125中的存储地址。log数据片的信息还可以包括指向其他数据节点的指针以及log数据片的位置等信息。例如,当CPU 105为第一缓存页创建了log chain结构后,可以在第一缓存页的log chain结构中记录第一log数据片的信息。具体的,可以直接将第一log数据片中的数据记录在第一数据节点41051中的“log数据片”字段中,也可以将第一log数据片在PCM 125中的存储地址记录在第一数据节点41051中的“log数据片”字段中,在此不做限定。并且,还可以在第一数据节点41051中记录第一log数据片的位置、长度以及指向其他数据节点的指针等信息。
为了描述清楚,以前述的待写入第1页的log数据片为log1(89,11),且第1页为图5所示的第一缓存页为例。当CPU 105为第一缓存页创建log chain 4105后,可以在第一数据节点41051的“缓存页地址”字段记录所述第一文件的inode以及该第一缓存页的逻辑页号,并在第一数据节点41051的“页面内偏移”字段中记录89,在“log数据长度”中记录11,在“log数据片”字段中记录buf1(0,11)的数据或者在“log数据片”字段中记录数据片log1(89,11)在PCM 125中的存储地址。
本领域技术人员可以知道,由于链表结构中的各个节点是可以在系统运行过程中动态生成并插入的,因此,第一缓存页的log chain 4105中的各个数据节点也是可以动态生成并插入的。当有新的数据节点生成时,需要相应更新该链表中的已有数据节点中的指向其他数据节点的指针,同时也需要更新缓存页结构中的log tail指针。例如,第一数据节点41051为第一缓存页的第一个数据节点,因此在创建第一数据节点41051时,第一数据节点41051中的“上一个log地址”以及“下一个log地址”为空。当系统运行过程中,动态生成第二数据节点41052后,可以根据第二数据节点41052更新第一数据节点41051中的“下一个log地址”的指针,将第一数据节点41051中的“下一个log地址”的指针指向第二数据节点41052在PCM 125中的起始地址。并且,还需要将第一缓存页结构中的log tail指针更新为指向第二数据节点41052在PCM 125中的起始地址。具体的,由于“页面内偏移”字段为第二数据节点41052的首个字段,因此,可以将第一数据节点41051中的“下一个log地址”的指针指向第二数据节点41052中“页面内偏移”字段,并将第一缓存页结构中的log tail指针更新为指向第二数据节点41052中“页面内偏移”字段。其中,“页面内偏移”用于指示第二数据节点41052的log数据片在第一缓存页中的位置。
在本发明实施例中,当CPU 105在步骤320判断所述PCM 125中存储有目标缓存页OCPi的log chain时,在本步骤中,CPU 105可以在目标缓存页OCPi已有的log chain的尾部插入数据节点并记录本次修改过程中的log数据片的信息。例如,当在步骤320中CPU 105确定第一缓存页结构中的log tail字段不为空时,可以确定所述PCM 125中存储有所述第一缓存页的log chain结构。换一种表达方式,当第一缓存页结构中的log tail字段不为空时,说明在本次修改之前第一缓存页已经被修改过。在这种情况下,在本步骤中,CPU 105可以根据第一缓存页结构中的log tail字段查找到第一缓存页的log chain 4105中的最后一个数据节点。在本发明实施例中,第一缓存页的最后一个数据节点中存储有距离当前时间最近一次修改的数据信息,或者说第一缓存页的最后一个数据节点中存储有第一缓存页的最后一个修改版本。当CPU 105找到第一缓存页的最后一个数据节点后,CPU 105可以在所述最后一个数据节点之后附加一个新的数据节点,在附加的所述新的数据节点中存储有所述数据片logi(x,y)的信息。以第一缓存页的log chain中最后一个数据节点为第一数据节点41051,新修改的数据为数据片logi(x,y),且新的数据节点为第二数据节点41052为例,CPU 105可以在第二数据节点41052中存储数据片logi(x,y)的信息。其中,所述数据片logi(x,y)的信息可以包括数据片logi(x,y)、log数据长度、页面内偏移及指向其他数据节点的指针信息。并且,可以将所述缓存页的log tail指针指向的第一数据节点41051中的“下一个log地址”的指针指向第二数据节点41052在PCM 125中的起始地址。
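上述在log chain尾部插入新数据节点并更新相关指针的步骤,可以用如下示意代码表示(Python,非专利原文;这里用简化的Node/Page结构代替正文的数据节点与缓存页结构,仅用于说明指针的更新顺序):

```python
# 示意代码(假设性示例): 在目标缓存页的log chain尾部插入新的数据节点,
# 并同步更新原尾节点的"下一个log地址"指针与缓存页结构中的log tail指针。
class Node:
    def __init__(self, off, data):
        self.off, self.data = off, data      # 页面内偏移与log数据片
        self.prev = self.next = None         # 上一个/下一个log地址

class Page:
    def __init__(self):
        self.log_head = self.log_tail = None
        self.logs = 0                        # log chain中数据节点的数量

def append_node(page, node):
    if page.log_tail is None:                # log chain为空: 新节点即首个数据节点
        page.log_head = node
    else:                                    # 原尾节点的"下一个log地址"指向新节点
        page.log_tail.next = node
        node.prev = page.log_tail
    page.log_tail = node                     # 更新缓存页结构中的log tail指针
    page.logs += 1

p = Page()
append_node(p, Node(89, b"a" * 11))          # 第一次修改
append_node(p, Node(0, b"b" * 100))          # 第二次修改
assert p.logs == 2 and p.log_head.next is p.log_tail and p.log_tail.prev is p.log_head
```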
可以理解的是,本发明实施例中提供的这种将目标缓存页的修改数据按照修改顺序依次记录在log chain中的方式,便于通过log chain中的数据节点的先后顺序识别出所述目标缓存页的不同更新版本。实际应用中,在依次插入数据节点的过程中,除了依次在log chain的尾部按照从前往后的顺序插入数据节点外,也可以在log chain的头部按照从后往前的顺序依次插入数据节点。在本发明实施例中不对具体的插入顺序进行限定,只要能够按照log chain中的数据节点识别出目标缓存页的更新顺序即可。
在本发明实施例中,CPU 105根据写请求200将待写入的数据205写入PCM 125中(如图2所示)后,可以向应用响应写入成功消息,所述写入成功消息用于指示待写入数据已经成功被写入存储设备中,从而可以减少访问请求的处理时延。
在本发明实施例中,当CPU 105根据访问请求需要对某个文件的数据进行修改时,CPU 105并没有直接将修改的数据写入该文件的目标缓存页中,而是将修改的数据写入PCM 125空间中,并采用log chain的形式记录目标缓存页的每一次修改数据的信息。由于PCM 125具有非易失性,并且在PCM 125中采用了log chain的记录方式来存储写入数据,能够将目标缓存页在多次修改过 程中的修改数据按照时间先后顺序记录下来,从而便于识别log数据片的版本关系,保证了存储的数据与写入数据之间的一致性。在读取数据的过程中,能够根据同一个缓存页的log数据片的写入时间确定有效数据,保证读取数据的正确性。图3提供的访问请求处理方法与现有技术中通过维护内存块的不同状态来保持数据一致性的方式相比,由于状态维护相对于写更新的过程来说对系统的开销较大,因此,本发明提供的计算机系统100在访问请求处理过程中的系统开销较小。并且,由于本发明实施例中log数据片的大小可以比页(page)更小,因此,可以支持对比页更小粒度的文件修改,修改方式更加灵活。
如图2所示,在本发明实施例提供的计算机系统100中,在将待写入数据205以图3所示的方法写入NVM(例如,PCM 125)后,在一些情况下可以触发合并操作,将PCM 125中的log数据片更新到DRAM 120的缓存页中。例如,一种情形下,为了节省系统存储空间,需要及时回收PCM 125的存储空间,并回收缓存页的log chain。在这种情况下,需要先将PCM 125中的log数据片更新到DRAM 120中,然后将DRAM 120中更新后的缓存页写入磁盘130中,以更新磁盘中的文件数据。另一种情形下,当计算机系统在处理写数据的过程中遇到故障时,在计算机系统被重新启动后,可以根据所述PCM 125中的log chain进行数据写回恢复,保证写入的数据不丢失,以保持数据的一致性。在这种情形下,需要先将PCM 125中的log数据片更新到DRAM 120的缓存页中,再将更新后的缓存页写入磁盘130中。又一种情形下,当读数据时,也需要根据PCM 125中的log chain将log数据片更新到目标缓存页,从而能够读取正确的数据。在本发明实施例中,不对具体触发合并操作的情况进行限定。下面将结合图6对图2中所示的将log数据片更新到DRAM 120的缓存页中的过程进行描述。图6为本发明实施例提供的一种数据合并方法,可以理解的是,对每一个有log chain的缓存页都可以按照图6所示的方法进行合并操作。为了描述方便,仍然以图3中所述的任意一个目标缓存页OCPi的log chain为例进行描述。
在步骤600中,CPU 105确定所述目标缓存页OCPi的log chain中的有效数据。在本发明实施例中,所述有效数据为所述目标缓存页的最新修改数据。具体的,CPU 105可以根据所述目标缓存页OCPi的至少一个数据节点中记录的log数据片的信息确定所述目标缓存页的log chain中的有效数据。CPU 105可以根据所述目标缓存页OCPi的log chain中各数据节点的更新顺序以及log数据片的页内位置信息来确定所述log chain中的有效数据。log chain中的各数据节点均是依据对缓存页的修改时间的先后顺序依次获得的,根据这种方式,处于log chain尾部的数据节点的获得时间晚于处于log chain头部的数据节点的获得时间。log数据片的页内位置可以根据数据节点中的“页内偏移”和“数据长度”这两个信息获得。
在具体进行数据片合并的过程中,可能出现以下两种情形。在第一种情形下,log chain中各数据节点中的log数据片的页内位置均没有重叠。在这种情形下,CPU 105可以确定所述log chain中各个数据节点中的log数据片均为有效数据。以图5中所示的第一缓存页的log chain为例。如图5所示,第一缓存页的log chain中有两个数据节点:第一数据节点41051和第二数据节点41052,且第二数据节点的生成时间晚于第一数据节点。若第一数据节点41051中的log数据片的地址为第30-50字节,第二数据节点41052中的log数据片的地址为第60-80字节。在这种情形下,CPU 105确定第一数据节点的log数据片和第二数据节点中的log数据片均为有效数据。
在第二种情形下,log chain中各数据节点中的log数据片的页内位置有重叠。在这种情形下,对于有重叠的至少两个log数据片,CPU 105确定所述log chain中较晚生成的数据节点中包含的重叠部分的数据为有效数据,并且,CPU 105分别确定所述至少两个log数据片中的非重叠部分的数据均为有效数据。换一种表达方式,在有重叠时,CPU 105确定较晚生成的数据节点中的全部数据以及较早生成的数据节点中的非重叠部分的数据为有效数据。例如,以图5中的第一缓存页的log chain为例,若第一数据节点41051中的log数据片的地址为第30-70字节,第二数据节点41052中的log数据片的地址为第50-90字节,则CPU 105确定第一log数据片中的第30-49字节和第二log数据片中的第50-90字节为有效数据。
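上述两种情形下确定有效数据并将其更新到目标缓存页的过程,可以用如下示意代码表示(Python,非专利原文;函数名`merge_log_chain`为说明用途的假设):按数据节点的更新顺序把log数据片依次回放到缓存页上,重叠部分自然以较晚生成的数据节点为准。

```python
# 示意代码(假设性示例): 按更新顺序回放log数据片, 较晚写入的数据覆盖重叠部分,
# 回放完成后缓存页上留下的即为log chain中的"有效数据"。
def merge_log_chain(page: bytearray, nodes):
    # nodes按数据节点的更新顺序排列, 每项为(页内偏移, log数据片)
    for off, data in nodes:
        page[off:off + len(data)] = data
    return page

# 正文示例: 第一log数据片为第30-70字节, 第二(较晚)log数据片为第50-90字节
page = bytearray(b"." * 100)
merged = merge_log_chain(page, [
    (30, b"A" * 41),   # 第一数据节点: 第30-70字节
    (50, b"B" * 41),   # 第二数据节点(较晚生成): 第50-90字节
])
assert merged[30:50] == b"A" * 20     # 第一log数据片仅第30-49字节(非重叠部分)有效
assert merged[50:91] == b"B" * 41     # 第二log数据片的第50-90字节全部有效
```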
在步骤605中,将所述有效数据更新到所述目标缓存页OCPi中,以获得所述更新后的目标缓存页OCPi’。具体的,CPU 105可以用确定的log chain中的有效数据替换所述目标缓存页OCPi中与所述有效数据的位置相同的数据。例如,若在步骤600中,CPU 105确定第一缓存页的log chain中的有效数据的 地址为第30-90字节,则CPU 105可以将确定所述log chain中的第30-90字节的有效数据替换所述第一缓存页中第30-90字节的数据,从而得到更新后的第一缓存页。
实际应用中,为了及时回收PCM 125的存储空间,并回收缓存页的log chain,当设定的条件满足时,CPU 105可以按照如图6所示的合并方法将所述PCM 125中存储的log数据片更新到对应的缓存页,并进一步根据合并后的缓存页更新磁盘中的文件数据。例如,当PCM 125中的存储空间低于预设阈值或达到设定的时间时,CPU 105可以按照如图6所示的合并方法将所述PCM 125中存储的log数据片更新到对应的缓存页,并进一步根据合并后的缓存页更新磁盘中的文件数据。在将更新后的缓存页的数据写到磁盘130后,CPU 105可以删除所述缓存页的log chain,以释放PCM 125的存储空间,节省系统资源。
实际应用中,在需要回收log chain占用的存储空间时,CPU 105具体可以通过每一个缓存页的缓存页结构中的“dirty”字段确定是否需要将该缓存页的数据刷回磁盘130中。以一个缓存页为例,当“dirty”字段为1时,CPU 105确定需要将该缓存页的数据刷回磁盘130中,当“dirty”字段为0时,CPU 105确定不需要将该缓存页的数据刷回磁盘130中。在CPU 105确定需要将该缓存页的数据刷回磁盘130中时,CPU 105还需要进一步根据该缓存页的缓存页结构中的“log dirty”字段判断是否需要将PCM 125中的log数据片更新到该缓存页中。例如,当“log dirty”字段为1时,说明PCM 125中包含有该缓存页的新修改数据,CPU 105需要先将PCM 125中的log数据片更新到DRAM 120中的缓存页中后,再将更新后的缓存页中的数据刷回磁盘130中。当“log dirty”字段为0时,说明该缓存页的log chain中的log数据片已经被更新到该缓存页中,PCM 125中未包含该缓存页的新修改数据,CPU 105可以直接将该缓存页中的数据刷回磁盘130中。
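上述根据“dirty”与“log dirty”字段决定刷盘路径的判断逻辑,可以用如下示意代码表示(Python,非专利原文;函数名与返回值为说明用途的假设):

```python
# 示意代码(假设性示例): 回收log chain时, 根据缓存页结构中的"dirty"与
# "log dirty"字段决定是否需要先合并log数据片、再把缓存页刷回磁盘。
def flush_decision(dirty, log_dirty):
    if not dirty:
        return "skip"                 # dirty为0: 缓存页与磁盘一致, 无需刷回
    if log_dirty:
        return "merge_then_flush"     # log dirty为1: 先将log数据片更新到缓存页, 再刷回磁盘
    return "flush_only"               # log dirty为0: 直接将缓存页数据刷回磁盘

assert flush_decision(0, 0) == "skip"
assert flush_decision(1, 1) == "merge_then_flush"
assert flush_decision(1, 0) == "flush_only"
```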
可以理解的是,在本发明实施例中,由于在将待写入数据205写入PCM 125中后,就可以向应用返回写入成功消息。由于log chain中的log数据片并不是以页为粒度的修改数据,因此,采用本发明的访问请求处理方法能够支持对文件的小粒度的修改。并且,在本发明实施例中,将数据写入PCM 125中后,并未立即将PCM 125中的修改数据写入磁盘130中,而是在满足一定条件时, 才将所述PCM 125中存储的log数据片更新到对应的缓存页,并根据合并后的缓存页更新磁盘中的文件数据。这种延迟合并数据及将合并后的数据写入磁盘的方式,与现有的保持数据一致性的写前日志(write ahead log,WAL)方式以及写前拷贝(copy on write)方式相比,能够减少系统的写放大。
如前所述,本发明实施例中,在计算机系统出现故障被重新启动后,也可以根据所述PCM 125中的log chain进行数据写回恢复,保证写入的数据不丢失,以保持数据的一致性。具体的,在计算机系统发生故障并重新启动后,CPU 105可以根据PCM 125中的全局log chain控制信息中的全局log head指针,依次对每一个有log chain的缓存页进行数据恢复。具体的,对于任意一个有log chain的缓存页,CPU 105能够遍历该缓存页的log chain中的每一个log数据片,并按照图6所示的方法确定该log chain中的有效数据,将确定的有效数据更新到缓存页中。然后再将更新后的缓存页的数据写入到磁盘130中。通过这种方式能够保证写入的数据不丢失。
图2-图3从写数据的过程对本发明实施例提供的访问请求处理方法进行了描述。下面将从读数据的过程对本发明实施例提供的访问请求处理方法进行进一步的说明。图7为本发明实施例提供的计算机系统100的又一种信令示意图,图7对图1所示的计算机系统100中各个器件在读数据过程中的信令交互过程进行了图示。图8为本发明实施例提供的又一种访问请求处理方法的流程图,图8以读取数据为例进行图示。图8所示的方法可以由图7所示的计算机系统100中CPU 105通过调用数据处理逻辑1051来实现。下面将结合图7和图8对如何从本发明实施例提供的计算机系统100中读取数据进行详细的描述。如图8所示,该方法可以包括下述步骤。
在步骤800中,CPU 105接收读请求700。如图7所示,当CPU 105接收到读请求700时,CPU 105可以调用数据处理逻辑1051处理所述读请求700。所述读请求中携带有文件标识以及待读取的数据的大小。具体对文件标识的描述可以参见前面的描述。在图8所示的实施例中,读请求700中携带的文件标识可以是待访问的目标文件的文件句柄,还可以是除文件句柄之外的其他文件描述符。在此不做限定,只要能够使进程通过文件标识识别出目标文件并找到目标文件的描述信息即可。例如,读请求700中携带的文件标识可以为磁盘130中第二文件 的文件标识。需要说明的是,本发明实施例中的第一文件和第二文件仅仅是不同访问流程中访问的文件的区分,不对具体的文件进行限定。根据这种方式,第一文件和第二文件可以是相同的文件也可以是不同的文件。
在步骤805中,CPU 105根据所述文件标识获取访问位置。其中,所述访问位置用于指示所述读请求在所述目标文件中要读取的数据的起始地址。所述访问位置可以是一个逻辑的访问位置。在本步骤中,CPU 105如何根据所述读请求700中携带的文件标识获取访问位置的描述与步骤305类似,具体可以参见步骤305的描述。
在步骤810中,CPU 105根据所述访问位置、所述待读取的数据的大小以及所述缓存页的大小确定M个目标缓存页以及在所述M个目标缓存页中的目标缓存页OCPj中待读取的数据的位置信息。其中,j的取值从1到M,所述M为不小于1的整数。如前所述,一个页(page)的大小通常为4k个字节(Byte)。该步骤中,CPU 105确定M个目标缓存页的方式与在步骤310中CPU 105确定N个目标缓存页的方式类似,具体可以参见步骤310的描述。
进一步的,在本步骤中,CPU 105具体可以根据访问位置、所述待读取的数据的大小以及所述缓存页的大小确定在所述M个目标缓存页中的目标缓存页OCPj中待读取的数据的位置信息。为了描述方便,以所述目标缓存页OCPj是所述DRAM中缓存的所述第二文件的文件页,且以一个缓存页的大小为100个字节为例。若在步骤805中确定的访问位置为第二文件的第150个字节,待读取数据的大小为210个字节,则CPU 105可以确定目标缓存页为第二文件的第二缓存页p2(包含第二文件的第100-199字节)、第三缓存页p3(包含第二文件的第200-299字节)和第四缓存页p4(包含第二文件的第300-399字节)。并且,CPU 105能够确定出所述读请求700待读取的数据的位置信息分别为p2(50,50)、p3(0,100)和p4(0,60)。其中,p2(50,50)用于表示从第二缓存页中的第50个字节的位置开始的50个字节,p3(0,100)用于表示从第三缓存页中的第0个字节的位置开始的100个字节,p4(0,60)用于表示从第四缓存页中的第0个字节的位置开始的60个字节。可以理解的是,读请求700要访问的目标缓存页可以是一个缓存页,也可以是多个缓存页,也就是说,M的值可以是不小于1的整数。为了描述方便,在本发明实施例中,可以将确定的M个目标缓存页以及目标缓存页OCPj中待读取的数据的位置信息称为待读取的数据的信息705。如图7所示,在CPU 105确定待读取的数据的位置信息705之后,CPU 105可以根据确定的待读取的数据的信息705读取DRAM 120中缓存的数据。
在图8所示的实施例中,当获得读请求700对各个目标缓存页OCPj中待读取的数据的位置信息后,CPU 105可以分别对每一个目标缓存页OCPj执行下述操作。可以理解的是,实际应用中,当在步骤810中CPU 105只确定有一个目标缓存页时,CPU 105可以只对确定的一个目标缓存页执行下述操作。当在步骤810中CPU 105确定出多个目标缓存页时,也就是说,当CPU 105根据读请求700确定待读取的数据需要分别从多个目标缓存页读取时,CPU 105可以对每一个目标缓存页分别执行下述操作。为了描述清楚,下面将以对一个目标缓存页的操作方法为例进行描述。
在步骤815中,CPU 105判断所述PCM 125中是否存储有目标缓存页OCPj的log chain,所述目标缓存页OCPj的log chain用于记录目标缓存页OCPj的至少一个log数据片的信息。在本发明实施例中,所述目标缓存页的log chain中包含有至少一个数据节点,其中,每一个数据节点中包含有一个log数据片的信息,每个log数据片为所述目标缓存页在一次修改过程中的修改数据。当所述PCM 125中未存储有目标缓存页OCPj的log chain时,该方法进入步骤820,否则该方法进入步骤825。实际应用中,当CPU 105在步骤810中确定出所述读请求700待访问的所述M个目标缓存页后,CPU 105可以从所述目标文件的元数据信息中获得所述M个目标缓存页的缓存页结构。进而能够根据缓存页结构中记录的信息判断所述PCM 125中是否存储有所述M个目标缓存页中的目标缓存页OCPj的log chain结构。缓存页结构以及log chain结构可以参见图5所示,关于缓存页结构的描述以及如何根据目标缓存页OCPj的缓存页结构判断所述PCM125中是否存储有所述目标缓存页OCPj的log chain的方式与图3中的步骤320类似,具体可以参见步骤320的描述。
在步骤820中,CPU 105根据所述目标缓存页OCPj中待读取的数据的位置信息从所述DRAM中读取所述目标缓存页OCPj的数据。如步骤320中所述,对于任意一个目标缓存页,CPU 105能够根据该目标缓存页的缓存页结构中的“log head”或“log tail”判断所述PCM 125中是否存储有所述目标缓存页的log  chain。当在步骤815中,CPU 105根据所述目标缓存页OCPj的缓存页结构确定所述PCM 125中未存储有所述目标缓存页OCPj的log chain时,说明所述目标缓存页OCPj的数据未被修改过,因此,CPU 105可以根据所述待读取数据的位置直接从所述DRAM中读取所述目标缓存页OCPj中的数据。如图7所示,CPU 105可以从DRAM 120中的缓存页中获取读取的数据720。
在步骤825中,CPU 105根据所述目标缓存页OCPj以及所述log chain中至少一个log数据片的信息获得更新后的目标缓存页OCPj'。当在步骤815中,CPU 105根据所述目标缓存页OCPj的缓存页结构确定所述PCM 125中存储有所述目标缓存页OCPj的log chain时,说明所述目标缓存页OCPj的数据已被修改过,因此,CPU 105需要将存储于PCM 125中的log chain中的数据更新至DRAM中的目标缓存页中。具体的,可以将目标缓存页OCPj的log chain中的log数据片215合并至目标缓存页中,以得到更新后的目标缓存页OCPj'。在数据合并过程中,CPU 105确定所述目标缓存页OCPj的log chain中的有效数据。在本发明实施例中,有效数据是指所述缓存页的最新修改数据。在获得所述目标缓存页OCPj的log chain中的有效数据之后,CPU 105将所述有效数据更新到所述目标缓存页OCPj中,以获得更新后的目标缓存页OCPj'。具体的数据合并方法可以参见图6的描述。
在步骤830中,CPU 105根据所述目标缓存页OCPj中待读取的数据的位置信息从所述更新后的目标缓存页OCPj’中读取数据。可以理解的是,待读取数据的位置信息是指待读取数据在目标缓存页中的逻辑位置。当在步骤825中,CPU 105根据所述PCM 125中存储的所述目标缓存页OCPj的log chain中的数据更新所述目标缓存页OCPj,获得更新后的目标缓存页OCPj’后,CPU 105能够根据在步骤810中确定的目标缓存页OCPj中待读取的数据的位置信息从所述更新后的目标缓存页OCPj’中读取数据。例如,若在步骤810中确定在第一缓存页中读取的数据的位置信息为第15-50个字节,则在本步骤中,CPU 105可以从所述更新后的第一缓存页中读取第15-50个字节的数据。如图7所示,CPU 105可以从DRAM 120中更新后的目标缓存页中获取读取的数据720。
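步骤815至步骤830所述的读路径流程,可以用如下示意代码表示(Python,非专利原文;函数名`read_from_page`为说明用途的假设):若目标缓存页存在log chain,先回放有效数据得到更新后的目标缓存页,再按待读取数据的位置信息读取。

```python
# 示意代码(假设性示例): 读请求处理流程骨架。
# log_nodes非空表示NVM中存储有该缓存页的log chain, 需先合并再读取;
# log_nodes为空表示该缓存页未被修改过, 直接从缓存页读取。
def read_from_page(page: bytes, log_nodes, read_off, read_len):
    if log_nodes:                                  # NVM中存有该缓存页的log chain
        buf = bytearray(page)
        for off, data in log_nodes:                # 按更新顺序回放log数据片
            buf[off:off + len(data)] = data
        page = bytes(buf)                          # 得到更新后的目标缓存页
    return page[read_off:read_off + read_len]      # 按待读取数据的位置信息读取

page = b"0" * 100
nodes = [(15, b"X" * 10)]                          # 假设的一次修改: 第15-24字节
assert read_from_page(page, nodes, 15, 10) == b"X" * 10   # 读到最新修改数据
assert read_from_page(page, [], 15, 10) == b"0" * 10      # 无log chain: 直接读缓存页
```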
本领域技术人员可以知道,实际应用中,当DRAM 120中没有待读取的数据时,操作系统会先从磁盘中将待读取的数据加载到DRAM 120中,然后从所述DRAM的缓存页中读取数据,从而能够提高读取速度。
从图8描述的读数据过程可以看出,本发明实施例提供的访问请求处理方法,由于将缓存页的待修改数据都通过log chain的方式存储于PCM 125中,因此可以支持比页的粒度更小的数据修改。当需要读取缓存页中的数据时,可以根据log chain中的数据节点中的log数据片获得读取数据时最新的缓存页数据,从而能够保证读取的数据的准确性。
图9为本发明实施例提供的一种访问请求处理装置的结构示意图。该装置可以应用于包含有非易失性内存NVM的计算机系统中,例如,该装置可以应用于如图1所示的计算机系统中。如图9所示,所述访问请求处理装置90可以包括下述模块。
接收模块900,用于接收写请求。所述写请求中携带有文件标识、缓存区指针以及待写入数据的大小,其中,所述缓存区指针用于指向缓存待写入数据的缓存区,所述待写入数据为用于修改所述写请求要访问的目标文件的修改数据。
获取模块905,用于根据所述文件标识获取访问位置。所述访问位置指示所述写请求在所述目标文件中写入数据的起始地址。
确定模块910,用于根据所述访问位置、所述待写入数据的大小以及缓存页的大小确定目标缓存页。其中,所述目标缓存页是内存中用于缓存所述目标文件中被所述待写入数据修改的文件数据的内存页。
所述确定模块910,还用于确定所述NVM中存储有所述目标缓存页的日志链(log chain)。所述目标缓存页的log chain中包含有至少一个数据节点,其中,每一个数据节点中包含有所述目标缓存页在一次修改过程中的修改数据的信息。
插入模块915,用于在所述目标缓存页的log chain中插入新的数据节点。所述插入的数据节点中包含有所述目标缓存页的log数据片的信息。其中,所述log数据片为所述目标缓存页的修改数据,所述log数据片是根据所述缓存区指针从缓存区获得的至少一部分待写入数据。所述log数据片的信息包括所述log数据片或者所述log数据片在所述NVM中的存储地址。
在本发明实施例中,所述log数据片的信息还包括:所述log数据片 在所述目标缓存页中的偏移量、所述log数据片的长度以及所述插入的数据节点的相邻数据节点的地址信息。
具体的,插入模块915在执行在所述目标缓存页的log chain中插入新的数据节点的操作中,插入模块915具体可以在所述目标缓存页的log chain的尾部或头部插入新的数据节点。其中,在插入所述新的数据节点后,所述目标缓存页的log chain中包含有根据所述目标缓存页的更新顺序依次链接的至少两个数据节点。
实际应用中,访问请求处理装置90还可以包括更新模块920和存储模块925。所述更新模块920用于根据所述目标缓存页的log chain中记录的至少一个log数据片的信息获得更新后的目标缓存页。所述存储模块925用于将所述更新后的目标缓存页的数据存储于所述计算机系统的外存设备中。具体的,更新模块920可以根据所述目标缓存页的log chain中记录的至少一个log数据片的信息确定所述目标缓存页的log chain中的有效数据,并将所述有效数据更新到所述目标缓存页中,以获得所述更新后的目标缓存页。其中,所述有效数据为所述目标缓存页的最新修改数据。
进一步的,访问请求处理装置90还可以包括回收模块930。回收模块930用于在将所述更新后的目标缓存页的数据存储于所述计算机系统100的外存设备中之后,回收所述目标缓存页的log chain。
本发明实施例所提供的访问请求处理装置90可以参见前述实施例描述的访问请求处理方法。各个模块功能的详细描述可分别参见前述实施例中对图2-6的描述,在此不再赘述。
图10为本发明实施例提供的一种访问请求处理装置的结构示意图。该装置可以应用于包含有非易失性内存NVM的计算机系统中,例如,该装置可以应用于如图1所示的计算机系统中。如图10所示,所述访问请求处理装置10A可以包括下述模块。
接收模块1000用于接收读请求。其中,所述读请求中携带有文件标识以及待读取的数据的大小。
获取模块1005用于根据所述文件标识获取访问位置,所述访问位置指示所述读请求在目标文件中读取数据的起始地址。
确定模块1010用于根据所述访问位置、所述待读取的数据的大小以及缓存页的大小确定目标缓存页以及所述目标缓存页中待读取的数据的位置信息。其中,所述目标缓存页是内存中用于缓存所述目标文件中待读取的文件数据的内存页。
所述确定模块1010还用于确定所述NVM中存储有所述目标缓存页的日志链(log chain)。所述目标缓存页的log chain中包含有至少一个log数据片的信息。每个log数据片为所述目标缓存页在一次修改过程中的修改数据。所述log数据片的信息包括所述log数据片或者所述log数据片在所述NVM中的存储地址。
更新模块1015用于根据所述目标缓存页以及所述目标缓存页的log chain中的至少一个log数据片的信息获得更新后的目标缓存页。具体的,更新模块1015可以根据所述目标缓存页的log chain中记录的至少一个log数据片的信息确定所述目标缓存页的log chain中的有效数据,并将所述有效数据更新到所述目标缓存页中,以获得所述更新后的目标缓存页。其中,所述有效数据为所述目标缓存页的最新修改数据。
读取模块1020用于根据所述目标缓存页中待读取的数据的位置信息从所述更新后的目标缓存页中读取数据。
在本发明实施例中,所述log数据片的信息还可以包括:所述log数据片在所述目标缓存页中的偏移量、所述log数据片的长度以及所述插入的数据节点的相邻数据节点的地址信息。具体对于log chain以及log数据片的信息的描述可以参见前述的实施例。
本发明实施例所提供的装置10A可以参见前述实施例描述的访问请求处理方法,具体的,各个模块功能的详细描述可参见前述实施例中对图7-8的描述,在此不再赘述。
本发明实施例还提供一种实现访问请求处理方法的计算机程序产品,包括存储了程序代码的计算机可读存储介质,所述程序代码包括的指令用于执行前述任意一个方法实施例所述的方法流程。本领域普通技术人员可以理解,前述的存储介质包括:U盘、移动硬盘、磁碟、光盘、随机存储器(Random-Access Memory,RAM)、固态硬盘(Solid State Disk,SSD)或者其他非易失性存储器 (non-volatile memory)等各种可以存储程序代码的非短暂性的(non-transitory)机器可读介质。
需要说明的是,本申请所提供的实施例仅仅是示意性的。所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。在本发明实施例、权利要求以及附图中揭示的特征可以独立存在也可以组合存在。在本发明实施例中以硬件形式描述的特征可以通过软件来执行,反之亦然。在此不做限定。

Claims (27)

  1. 一种访问请求处理方法,所述方法由计算机系统执行,其中,所述计算机系统包括处理器和非易失性内存NVM,其特征在于,包括:
    接收写请求,所述写请求中携带有文件标识、缓存区指针以及待写入数据的大小,其中,所述缓存区指针用于指向缓存待写入数据的缓存区,所述待写入数据为所述写请求要访问的目标文件的修改数据;
    根据所述文件标识获取访问位置,所述访问位置指示所述写请求在所述目标文件中写入数据的起始地址;
    根据所述访问位置、所述待写入数据的大小以及缓存页的大小确定目标缓存页,所述目标缓存页是内存中用于缓存所述目标文件中被所述待写入数据修改的文件数据的内存页;
    确定所述NVM中存储有所述目标缓存页的日志链log chain,所述目标缓存页的log chain中包含有至少一个数据节点,其中,每一个数据节点中包含有所述目标缓存页在一次修改过程中的修改数据的信息;
    在所述目标缓存页的log chain中插入新的数据节点,所述插入的数据节点中包含有所述目标缓存页的log数据片的信息,其中,所述log数据片为所述目标缓存页的修改数据,所述log数据片是根据所述缓存区指针从缓存区获得的至少一部分待写入数据,所述log数据片的信息包括所述log数据片或者所述log数据片在所述NVM中的存储地址。
  2. 根据权利要求1所述的方法,其特征在于,所述在所述目标缓存页的log chain中插入新的数据节点包括:
    在所述目标缓存页的log chain的尾部或头部插入新的数据节点,其中,在插入所述新的数据节点后,所述目标缓存页的log chain中包含有根据所述目标缓存页的更新顺序依次链接的至少两个数据节点。
  3. 根据权利要求1或2所述的方法,其特征在于,还包括:
    根据所述目标缓存页的log chain中记录的至少一个log数据片的信息获得更新后的目标缓存页;
    将所述更新后的目标缓存页的数据存储于所述计算机系统的外存设备中。
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述目标缓存页的log chain中记录的至少一个log数据片的信息获得更新后的目标缓存页包括:
    根据所述目标缓存页的log chain中记录的至少一个log数据片的信息确定所述目标缓存页的log chain中的有效数据,其中,所述有效数据为所述目标缓存页的最新修改数据;
    将所述有效数据更新到所述目标缓存页中,以获得所述更新后的目标缓存页。
  5. 根据权利要求3-4任意一项所述的方法,其特征在于,还包括:
    在将所述更新后的目标缓存页的数据存储于所述计算机系统的外存设备中之后,回收所述目标缓存页的log chain。
  6. 根据权利要求1-5任意一项所述的方法,其特征在于,所述log数据片的信息还包括:所述log数据片在所述目标缓存页中的偏移量、所述log数据片的长度以及所述插入的数据节点的相邻数据节点的地址信息。
  7. 一种访问请求处理方法,所述方法由计算机系统中的处理器执行,其中,所述计算机系统包括所述处理器和非易失性内存NVM,其特征在于,包括:
    接收读请求,所述读请求中携带有文件标识以及待读取的数据的大小;
    根据所述文件标识获取访问位置,所述访问位置指示所述读请求在目标文件中读取数据的起始地址;
    根据所述访问位置、所述待读取的数据的大小以及缓存页的大小确定目标缓存页以及所述目标缓存页中待读取的数据的位置信息,其中,所述目标缓存页是内存中用于缓存所述目标文件中待读取的文件数据的内存页;
    确定所述NVM中存储有所述目标缓存页的日志链log chain,所述目标缓存页的log chain中包含有至少一个log数据片的信息,每个log数据片为所述目标缓存页在一次修改过程中的修改数据,所述log数据片的信息包括所述log数据片或者所述log数据片在所述NVM中的存储地址;
    根据所述目标缓存页以及所述目标缓存页的log chain中的至少一个log数据片的信息获得更新后的目标缓存页;
    根据所述目标缓存页中待读取的数据的位置信息从所述更新后的目标缓存页中读取数据。
  8. 根据权利要求7所述的方法,其特征在于,所述根据所述目标缓存页以及所述目标缓存页的log chain中的至少一个log数据片的信息获得更新后的目标缓存页包括:
    根据所述目标缓存页的log chain中记录的至少一个log数据片的信息确定所述目标缓存页的log chain中的有效数据,其中,所述有效数据为所述目标缓存页的最新修改数据;
    将所述有效数据更新到所述目标缓存页中,以获得所述更新后的目标缓存页。
  9. 根据权利要求7或8所述的方法,其特征在于:所述log数据片的信息还包括:所述log数据片在所述目标缓存页中的偏移量、所述log数据片的长度以及所述插入的数据节点的相邻数据节点的地址信息。
  10. 一种计算机系统,所述计算机系统包括非易失性内存NVM以及与所述NVM连接的处理器,其特征在于,所述处理器用于:
    接收写请求,所述写请求中携带有文件标识、缓存区指针以及待写入数据的大小,其中,所述缓存区指针用于指向缓存待写入数据的缓存区,所述待写入数据为所述写请求要访问的目标文件的修改数据;
    根据所述文件标识获取访问位置,所述访问位置指示所述写请求在所述目标文件中写入数据的起始地址;
    根据所述访问位置、所述待写入数据的大小以及缓存页的大小确定目标缓存页,所述目标缓存页是内存中用于缓存所述目标文件中被所述待写入数据修改的文件数据的内存页;
    确定所述NVM中存储有所述目标缓存页的日志链log chain,所述目标缓存页 的log chain中包含有至少一个数据节点,其中,每一个数据节点中包含有所述目标缓存页在一次修改过程中的修改数据的信息;
    在所述目标缓存页的log chain中插入新的数据节点,其中,所述插入的数据节点中包含有所述目标缓存页的log数据片的信息,所述log数据片为所述目标缓存页的修改数据,所述log数据片是根据所述缓存区指针从缓存区获得的至少一部分待写入数据,所述log数据片的信息包括所述log数据片或者所述log数据片在所述NVM中的存储地址。
  11. 根据权利要求10所述的计算机系统,其特征在于,所述处理器具体用于:
    在所述目标缓存页的log chain的尾部或头部插入新的数据节点,其中,在插入所述新的数据节点后,所述目标缓存页的log chain中包含有根据所述目标缓存页的更新顺序依次链接的至少两个数据节点。
  12. 根据权利要求10或11所述的计算机系统,其特征在于,所述处理器还用于:
    根据所述目标缓存页的log chain中记录的至少一个log数据片的信息获得更新后的目标缓存页;
    将所述更新后的目标缓存页的数据存储于所述计算机系统的外存设备中。
  13. 根据权利要求12所述的计算机系统,其特征在于,所述处理器具体用于:
    根据所述目标缓存页的log chain中记录的至少一个log数据片的信息确定所述目标缓存页的log chain中的有效数据,其中,所述有效数据为所述目标缓存页的最新修改数据;
    将所述有效数据更新到所述目标缓存页中,以获得所述更新后的目标缓存页。
  14. 根据权利要求12-13任意一项所述的计算机系统,其特征在于,所述处理器还用于:
    在将所述更新后的目标缓存页的数据存储于所述计算机系统的外存设备中之后,回收所述目标缓存页的log chain。
  15. 根据权利要求10-14任意一项所述的计算机系统,其特征在于:所述log数据片的信息还包括:所述log数据片在所述目标缓存页中的偏移量、所述log数据片的长度以及所述插入的数据节点的相邻数据节点的地址信息。
  16. 一种计算机系统,所述计算机系统包括非易失性内存NVM以及与所述NVM连接的处理器,其特征在于,所述处理器用于:
    接收读请求,所述读请求中携带有文件标识以及待读取的数据的大小;
    根据所述文件标识获取访问位置,所述访问位置指示所述读请求在目标文件中读取数据的起始地址;
    根据所述访问位置、所述待读取的数据的大小以及缓存页的大小确定目标缓存页以及所述目标缓存页中待读取的数据的位置信息,其中,所述目标缓存页是内存中用于缓存所述目标文件中待读取的文件数据的内存页;
    确定所述NVM中存储有所述目标缓存页的日志链log chain,所述目标缓存页的log chain中包含有至少一个log数据片的信息,每个log数据片为所述目标缓存页在一次修改过程中的修改数据,所述log数据片的信息包括所述log数据片或者所述log数据片在所述NVM中的存储地址;
    根据所述目标缓存页以及所述目标缓存页的log chain中的至少一个log数据片的信息获得更新后的目标缓存页;
    根据所述目标缓存页中待读取的数据的位置信息从所述更新后的目标缓存页中读取数据。
  17. The computer system according to claim 16, wherein the processor is specifically configured to:
    determine valid data in the log chain of the target cache page according to the information about the at least one log data slice recorded in the log chain of the target cache page, wherein the valid data is the latest modified data of the target cache page;
    update the valid data to the target cache page to obtain the updated target cache page.
  18. The computer system according to claim 16 or 17, wherein the information about the log data slice further comprises: an offset of the log data slice within the target cache page, a length of the log data slice, and address information of a data node adjacent to the inserted data node.
  19. An access request processing apparatus, wherein the access request processing apparatus is applied to a computer system, the computer system comprises a non-volatile memory (NVM), and the access request processing apparatus comprises:
    a receiving module, configured to receive a write request, wherein the write request carries a file identifier, a buffer pointer, and a size of to-be-written data, the buffer pointer points to a buffer caching the to-be-written data, and the to-be-written data is modified data of a target file to be accessed by the write request;
    an obtaining module, configured to obtain an access location according to the file identifier, wherein the access location indicates a start address at which the write request writes data into the target file;
    a determining module, configured to determine a target cache page according to the access location, the size of the to-be-written data, and a cache page size, wherein the target cache page is a memory page, in memory, for caching file data of the target file that is modified by the to-be-written data;
    wherein the determining module is further configured to determine that a log chain of the target cache page is stored in the NVM, wherein the log chain of the target cache page comprises at least one data node, and each data node comprises information about modified data of the target cache page in one modification process; and
    an inserting module, configured to insert a new data node into the log chain of the target cache page, wherein the inserted data node comprises information about a log data slice of the target cache page, the log data slice is modified data of the target cache page, the log data slice is at least a part of the to-be-written data obtained from the buffer according to the buffer pointer, and the information about the log data slice comprises the log data slice or a storage address of the log data slice in the NVM.
  20. The apparatus according to claim 19, wherein the inserting module is specifically configured to:
    insert the new data node at a tail or a head of the log chain of the target cache page, wherein after the new data node is inserted, the log chain of the target cache page comprises at least two data nodes that are sequentially linked according to an update order of the target cache page.
  21. The apparatus according to claim 19 or 20, further comprising:
    an updating module, configured to obtain an updated target cache page according to the information about the at least one log data slice recorded in the log chain of the target cache page; and
    a storage module, configured to store data of the updated target cache page in an external storage device of the computer system.
  22. The apparatus according to claim 21, wherein the updating module is specifically configured to:
    determine valid data in the log chain of the target cache page according to the information about the at least one log data slice recorded in the log chain of the target cache page, wherein the valid data is the latest modified data of the target cache page;
    update the valid data to the target cache page to obtain the updated target cache page.
  23. The apparatus according to any one of claims 21 to 22, further comprising:
    a reclaiming module, configured to reclaim the log chain of the target cache page after the data of the updated target cache page is stored in the external storage device of the computer system.
  24. The apparatus according to any one of claims 19 to 23, wherein the information about the log data slice further comprises: an offset of the log data slice within the target cache page, a length of the log data slice, and address information of a data node adjacent to the inserted data node.
  25. An access request processing apparatus, wherein the access request processing apparatus is applied to a computer system, the computer system comprises a non-volatile memory (NVM), and the access request processing apparatus comprises:
    a receiving module, configured to receive a read request, wherein the read request carries a file identifier and a size of to-be-read data;
    an obtaining module, configured to obtain an access location according to the file identifier, wherein the access location indicates a start address at which the read request reads data from a target file;
    a determining module, configured to determine, according to the access location, the size of the to-be-read data, and a cache page size, a target cache page and location information of the to-be-read data in the target cache page, wherein the target cache page is a memory page, in memory, for caching file data of the target file that is modified by the to-be-written data;
    wherein the determining module is further configured to determine that a log chain of the target cache page is stored in the NVM, wherein the log chain of the target cache page comprises information about at least one log data slice, each log data slice is modified data of the target cache page in one modification process, and the information about the log data slice comprises the log data slice or a storage address of the log data slice in the NVM;
    an updating module, configured to obtain an updated target cache page according to the target cache page and the information about the at least one log data slice in the log chain of the target cache page; and
    a reading module, configured to read data from the updated target cache page according to the location information of the to-be-read data in the target cache page.
  26. The apparatus according to claim 25, wherein the updating module is specifically configured to:
    determine valid data in the log chain of the target cache page according to the information about the at least one log data slice recorded in the log chain of the target cache page, wherein the valid data is the latest modified data of the target cache page;
    update the valid data to the target cache page to obtain the updated target cache page.
  27. The apparatus according to claim 25 or 26, wherein the information about the log data slice further comprises: an offset of the log data slice within the target cache page, a length of the log data slice, and address information of a data node adjacent to the inserted data node.
PCT/CN2015/099933 2015-12-30 2015-12-30 Access request processing method and apparatus, and computer system WO2017113213A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP15911842.1A EP3376394B1 (en) 2015-12-30 2015-12-30 Method and device for processing access request, and computer system
CN201580085444.2A CN108431783B (zh) 2015-12-30 2015-12-30 Access request processing method and apparatus, and computer system
PCT/CN2015/099933 WO2017113213A1 (zh) 2015-12-30 2015-12-30 Access request processing method and apparatus, and computer system
US16/021,555 US10649897B2 (en) 2015-12-30 2018-06-28 Access request processing method and apparatus, and computer device
US16/855,129 US11301379B2 (en) 2015-12-30 2020-04-22 Access request processing method and apparatus, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/099933 WO2017113213A1 (zh) 2015-12-30 2015-12-30 Access request processing method and apparatus, and computer system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/021,555 Continuation US10649897B2 (en) 2015-12-30 2018-06-28 Access request processing method and apparatus, and computer device

Publications (1)

Publication Number Publication Date
WO2017113213A1 true WO2017113213A1 (zh) 2017-07-06

Family

ID=59224246

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/099933 WO2017113213A1 (zh) 2015-12-30 2015-12-30 Access request processing method and apparatus, and computer system

Country Status (4)

Country Link
US (2) US10649897B2 (zh)
EP (1) EP3376394B1 (zh)
CN (1) CN108431783B (zh)
WO (1) WO2017113213A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144425A (zh) * 2018-09-07 2019-01-04 Zhengzhou Yunhai Information Technology Co., Ltd. Metadata storage method, apparatus, and device, and computer-readable storage medium

Families Citing this family (11)

Publication number Priority date Publication date Assignee Title
US10740231B2 (en) 2018-11-20 2020-08-11 Western Digital Technologies, Inc. Data access in data storage device including storage class memory
CN111385335B (zh) * 2018-12-29 2023-04-07 Guangzhou Baiguoyuan Information Technology Co., Ltd. Data request processing method, apparatus, device, and storage medium
SG10201913065QA (en) * 2019-12-23 2021-07-29 Sensetime Int Pte Ltd Data processing method and apparatus, and edge device
US11526415B2 (en) 2020-04-22 2022-12-13 StreamSets, Inc. Progressive error handling
US11249921B2 (en) * 2020-05-06 2022-02-15 Western Digital Technologies, Inc. Page modification encoding and caching
CN112148736B (zh) * 2020-09-23 2024-03-12 Douyin Vision Co., Ltd. Method, device, and storage medium for caching data
CN113553346B (zh) * 2021-07-22 2022-08-16 The 15th Research Institute of China Electronics Technology Group Corporation Integrated processing, forwarding, and storage method and system for large-scale real-time data streams
CN113553300B (zh) * 2021-07-27 2024-05-24 Beijing Zitiao Network Technology Co., Ltd. File processing method and apparatus, readable medium, and electronic device
CN114510198B (zh) * 2022-02-16 2023-06-30 Beijing CEC Huada Electronic Design Co., Ltd. Method for improving NVM erase/write efficiency
CN115858421B (zh) * 2023-03-01 2023-05-23 Inspur Electronic Information Industry Co., Ltd. Cache management method, apparatus, and device, readable storage medium, and server
CN117539460A (zh) * 2023-10-30 2024-02-09 Zhejiang Gongqi Information Technology Co., Ltd. Dataset editing and processing method and system

Citations (3)

Publication number Priority date Publication date Assignee Title
US20090049234A1 (en) * 2007-08-14 2009-02-19 Samsung Electronics Co., Ltd. Solid state memory (ssm), computer system including an ssm, and method of operating an ssm
CN101903866A (zh) * 2007-11-21 2010-12-01 Violin Memory Inc. Method and system for storage of data in non-volatile media
CN103955528A (zh) * 2014-05-09 2014-07-30 Beijing Huaxin Boyan Technology Co., Ltd. Method for writing file data, method for reading file data, and apparatus

Family Cites Families (13)

Publication number Priority date Publication date Assignee Title
US5893140A (en) * 1996-08-14 1999-04-06 Emc Corporation File server having a file system cache and protocol for truly safe asynchronous writes
GB0123415D0 (en) 2001-09-28 2001-11-21 Memquest Ltd Method of writing data to non-volatile memory
US20060184719A1 (en) 2005-02-16 2006-08-17 Sinclair Alan W Direct data file storage implementation techniques in flash memories
JP2006338734A (ja) * 2005-05-31 2006-12-14 Hitachi Global Storage Technologies Netherlands Bv Data storage device and error recovery method
EP2111583A4 (en) 2008-02-29 2010-06-02 Toshiba Kk MEMORY SYSTEM
WO2011090500A1 (en) 2010-01-19 2011-07-28 Rether Networks Inc. Random write optimization techniques for flash disks
US10949415B2 (en) * 2011-03-31 2021-03-16 International Business Machines Corporation Logging system using persistent memory
US9274937B2 (en) 2011-12-22 2016-03-01 Longitude Enterprise Flash S.A.R.L. Systems, methods, and interfaces for vector input/output operations
CN103838676B (zh) 2012-11-22 2017-10-17 Huawei Technologies Co., Ltd. Data storage system, data storage method, and PCM bridge
EP3033682A4 (en) * 2013-08-14 2017-04-05 Skyera, LLC Address translation for a non-volatile memory storage device
US9342457B2 (en) * 2014-03-11 2016-05-17 Amazon Technologies, Inc. Dynamically modifying durability properties for individual data volumes
CN105159818B (zh) * 2015-08-28 2018-01-02 Northeastern University Log recovery method in in-memory data management and simulation system thereof
EP3385846B1 (en) * 2015-12-30 2020-02-12 Huawei Technologies Co., Ltd. Method and device for processing access request, and computer system

Non-Patent Citations (1)

Title
See also references of EP3376394A4 *

Also Published As

Publication number Publication date
CN108431783B (zh) 2020-09-18
EP3376394A1 (en) 2018-09-19
CN108431783A (zh) 2018-08-21
EP3376394A4 (en) 2018-12-26
EP3376394B1 (en) 2022-09-28
US20180307602A1 (en) 2018-10-25
US11301379B2 (en) 2022-04-12
US10649897B2 (en) 2020-05-12
US20200250091A1 (en) 2020-08-06


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 15911842; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2015911842; Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)