WO2020069654A1 - Method for handling a snapshot creation request and associated storage device - Google Patents

Method for handling a snapshot creation request and associated storage device

Info

Publication number
WO2020069654A1
WO2020069654A1 (PCT/CN2019/107808)
Authority
WO
WIPO (PCT)
Prior art keywords
storage device
transaction
file system
snapshot
future
Prior art date
Application number
PCT/CN2019/107808
Other languages
English (en)
Inventor
Mandar Govind NANIVADEKAR
Vaiapuri RAMASUBRAMANIAM
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201980065309.XA (published as CN112805949B)
Publication of WO2020069654A1

Classifications

    • H04L67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H04L67/1095 - Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H04L67/568 - Storing data temporarily at an intermediate stage, e.g. caching
    • G06F11/1448 - Management of the data involved in backup or backup restore
    • G06F11/1458 - Management of the backup or restore process
    • G06F12/0866 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, for peripheral storage systems, e.g. disk cache
    • G06F16/128 - Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion
    • G06F3/0613 - Improving I/O performance in relation to throughput
    • G06F3/0659 - Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G06F3/065 - Replication mechanisms
    • G06F3/067 - Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F2201/84 - Using snapshots, i.e. a logical point-in-time copy of the data
    • G06F2212/261 - Storage comprising a plurality of storage devices

Definitions

  • the present subject matter described herein, in general, relates to the field of storage technologies and to data replication methods in a storage system. More particularly, the present subject matter relates to a snapshot handling and processing method applied to storage technologies employing data replication.
  • a remote disaster recovery center may be established to implement remote data backup, so as to ensure that original data is not lost or damaged after a disaster (such as a fire or an earthquake) occurs, and to ensure that a key service resumes within an allowed time range, for example within 5 to 10 seconds, thereby reducing the loss caused by the disaster as much as possible.
  • Current disaster recovery systems may include a two-data-center (2DC) disaster recovery system, where disaster recovery is implemented by establishing two data centers interconnected by a dedicated network: an active data center undertakes the service of a user, and a passive data center is deployed to take over the service of the active data center, without data loss or service interruption, when a disaster occurs in the active data center.
  • the active and the passive data centers may be located at respective sites A and B, separated by a distance of up to 200 km.
  • a synchronous replication environment may be employed where the data volumes at the two data centers at the respective sites are mirrored to each other.
  • a file system in the active data center at site A exports a Read-Write File System to a client, and in case of failure of the active data center at site A, the mirrored file system in the passive data center at site B will be made a Read-Write File System and will start serving the client operations.
  • the file system architecture is based on objects. Each file serves as an object and each file system is a collection of objects.
  • storage devices at the two data centers process the client services simultaneously, where the storage devices at the site A and the site B are always kept in a synchronized state.
  • the input/output (IO) operations which originate from a host file system in the client device are received at the storage device at site A, and the same IO operations are cloned in real time at the storage device at site B.
  • the host file system is basically the source file system mounted on the client device.
  • the IO operations are successfully acknowledged back to the client only after successful processing of the IO operations at both the storage devices at site A and site B, where the processing includes at least a write operation to a non-volatile cache of both the storage devices at site A and site B.
  • the data centers operating in the synchronous replication environment support creation of snapshots.
  • a snapshot may be understood as a copy of a data set, and the copy includes an image of corresponding data at a particular time point (a start time point of copying) , or the snapshot may be understood as a duplicate of a data set at a time point.
  • the snapshot creation operation is provided both to mirrored data volumes at site A and at site B. Data in the snapshots is completely the same, so that the mirrored data volumes may roll back to a state of a time point (for example, a snapshot rollback of the mirrored data volume means that data at the two locations needs to roll back to the same state) .
  • the snapshots finally taken at the two locations may be inconsistent without a good cooperation mechanism. As such, all the IO transactions which are part of the snapshot at site A should also be part of the corresponding snapshot at site B.
  • the input/output (I/O) transactions from the client are blocked when a snapshot creation is triggered at both the data centers at site A and site B.
  • all the IO transactions are flushed to the respective non-volatile memories of the storage devices at site A and site B.
  • for site A and site B, which may be separated by a distance of up to 200 km, a typical latency of 2-5 ms may be experienced for an IO transaction to be processed at site A with respect to site B.
  • completing the taking of snapshots therefore involves a relatively long latency. For a synchronous replication environment, blocking the IO transactions from the client for up to 5 ms whenever a snapshot is to be created may significantly degrade the file system performance at the data centers.
  • This summary is provided to introduce concepts related to method and system for handling snapshot creation requests in a synchronous replication environment, and the related storage devices in a dedicated communication network in the system.
  • an aspect of the present invention is to provide a method of handling a snapshot creation request at a source file system in a first storage device.
  • the first storage device is in communication with a second storage device in a communication network.
  • the source file system in the first storage device is in a synchronous relationship with a destination file system in the second storage device in the communication network.
  • the synchronous relationship herein implies that the source file system and the destination file system are in a synchronized state in real-time in a synchronous replication environment where any client input/output (IO) transaction received at the source file system in the first storage device is also cloned in real-time to the destination file system.
  • the method comprises determining a throughput of successful input/output (IO) transactions acknowledged both by the source file system and the destination file system, and determining a future I/O transaction marker based on the throughput as determined.
  • the future I/O transaction marker is indicative of the sequence number of a future I/O transaction.
  • the future IO transaction, when successfully cached in a primary cache of the first storage device, is succeeded with a snapshot creation associated with a snapshot creation request pending with the source file system.
  • the method comprises creating the snapshot associated with the snapshot creation request pending at the source file system on successfully caching the future I/O transaction in the primary cache, wherein during the creating of the snapshot, an IO transaction received after the future IO transaction is cached in a secondary cache of the first storage device.
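The source-side method above can be sketched in Python as follows. The class, its method names, and the throughput-based lead-time heuristic (`MARKER_LEAD_SECONDS`) are illustrative assumptions; the document does not prescribe a concrete formula for choosing the future I/O transaction marker.

```python
import time


class SourceFileSystem:
    """Illustrative sketch of the source-side snapshot handling described
    above. Names and the lead-time heuristic are assumptions."""

    MARKER_LEAD_SECONDS = 0.005  # assumed lead time covering site-to-site latency

    def __init__(self):
        self.next_seq = 0        # sequence number of the next incoming IO
        self.acked = 0           # IOs acknowledged by both file systems
        self.started = time.monotonic()
        self.primary_cache = []
        self.secondary_cache = []
        self.snapshot_marker = None
        self.snapshots = []

    def acknowledge(self, n=1):
        """Record IO transactions acknowledged by both source and destination."""
        self.acked += n

    def throughput(self):
        """Successful dual-acknowledged IO transactions per second."""
        elapsed = max(time.monotonic() - self.started, 1e-9)
        return self.acked / elapsed

    def request_snapshot(self):
        """Pick a future IO sequence number; the snapshot is taken once the
        IO carrying that sequence number is cached. The marker would also be
        sent to the second storage device."""
        lead = max(1, int(self.throughput() * self.MARKER_LEAD_SECONDS))
        self.snapshot_marker = self.next_seq + lead
        return self.snapshot_marker

    def cache_io(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        if self.snapshot_marker is not None and seq > self.snapshot_marker:
            # IOs arriving after the marked future transaction go to the
            # secondary cache while the snapshot is being created.
            self.secondary_cache.append((seq, payload))
        else:
            self.primary_cache.append((seq, payload))
            if seq == self.snapshot_marker:
                self._create_snapshot()
        return seq

    def _create_snapshot(self):
        self.snapshots.append(list(self.primary_cache))
        self.snapshot_marker = None
        # after creation, drain the secondary cache into the primary cache
        self.primary_cache.extend(self.secondary_cache)
        self.secondary_cache.clear()
```

Note that client IO is never blocked here: transactions keep arriving during snapshot creation and are merely routed to the secondary cache.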
  • another aspect of the present invention is to provide a method of handling a snapshot creation request at a destination file system in a second storage device.
  • the second storage device is in communication with a first storage device in a communication network.
  • the destination file system in the second storage device is in a synchronous relationship with the source file system in the first storage device in the communication network.
  • the method comprises receiving a future input/output (IO) transaction marker from the first storage device, wherein the future IO transaction marker is determined at the first storage device based on a throughput of successful IO transactions acknowledged both by the source file system and the destination file system.
  • the future I/O transaction marker is indicative of the sequence number of a future I/O transaction.
  • the future IO transaction, when successfully cached in a primary cache of the second storage device, is succeeded with a snapshot creation associated with a snapshot creation request pending with the destination file system.
  • the method comprises creating a snapshot associated with the snapshot creation request pending at the destination file system on successfully caching the future I/O transaction in the primary cache of the second storage device, wherein during the creating of the snapshot, an IO transaction received after the future IO transaction is cached in a secondary cache of the second storage device.
  • another aspect of the present invention is to provide a first storage device storing a source file system having a synchronous replication relationship with a destination file system in a second storage device, the first storage device and the second storage device in communication with each other in a communication network.
  • the first storage device comprises a first storage controller, a primary cache, and a secondary cache.
  • the first storage controller is configured to determine a throughput of successful input/output (IO) transactions acknowledged both by the source file system and the destination file system, and to determine a future I/O transaction marker based on the throughput as determined, wherein the future I/O transaction marker is indicative of the sequence number of a future I/O transaction.
  • the future IO transaction, when successfully cached in the primary cache of the first storage device, is succeeded with a snapshot creation associated with a snapshot creation request pending with the source file system.
  • the first storage controller is configured to create the snapshot associated with the snapshot creation request pending at the source file system on successfully caching the future I/O transaction in the primary cache, wherein during the creating of the snapshot, an IO transaction received after the future IO transaction is cached in the secondary cache of the first storage device.
  • yet another aspect of the present invention is to provide a second storage device storing a destination file system having a synchronous replication relationship with a source file system in a first storage device, the first storage device and the second storage device in communication with each other in a communication network.
  • the second storage device comprises a second storage controller, a primary cache, and a secondary cache.
  • the second storage controller is configured to receive a future input/output (IO) transaction marker from the first storage device, wherein the future IO transaction marker is determined at the first storage device based on a throughput of successful IO transactions acknowledged both by the source file system and the destination file system.
  • the future I/O transaction marker is indicative of the sequence number of a future I/O transaction.
  • the future IO transaction, when successfully cached in a primary cache of the second storage device, is succeeded with a snapshot creation associated with a snapshot creation request pending with the destination file system.
  • the second storage controller is configured to create a snapshot associated with the snapshot creation request pending at the destination file system on successfully caching the future I/O transaction in the primary cache, wherein during the creating of the snapshot, an IO transaction received after the future IO transaction is cached in the secondary cache of the second storage device.
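The destination-side behaviour can be sketched similarly. The class and method names are illustrative assumptions, and sequence numbers are assumed to be assigned by the source device and carried with each cloned IO transaction.

```python
class DestinationFileSystem:
    """Illustrative sketch of the destination-side handling: create the
    snapshot once the IO carrying the received marker's sequence number is
    cached, diverting later IOs to the secondary cache in the meantime."""

    def __init__(self):
        self.primary_cache = []
        self.secondary_cache = []
        self.snapshot_marker = None   # received from the first storage device
        self.snapshots = []

    def receive_marker(self, marker):
        """Accept the future IO transaction marker sent by the source device."""
        self.snapshot_marker = marker

    def cache_io(self, seq, payload):
        """Cache a cloned IO transaction by its source-assigned sequence number."""
        if self.snapshot_marker is not None and seq > self.snapshot_marker:
            # belongs to the next snapshot; held in the secondary cache
            self.secondary_cache.append((seq, payload))
        else:
            self.primary_cache.append((seq, payload))
            if seq == self.snapshot_marker:
                self._create_snapshot()

    def _create_snapshot(self):
        self.snapshots.append(list(self.primary_cache))
        self.snapshot_marker = None
        self.primary_cache.extend(self.secondary_cache)
        self.secondary_cache.clear()
```

Because the cutoff is a pre-agreed sequence number rather than a blocking barrier, an IO that overtakes the marker transaction on the wire is still assigned to the correct snapshot.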
  • Fig. 1A is a schematic diagram of an application scenario according to an embodiment of the present invention.
  • Fig. 1B is a schematic diagram of file system according to an embodiment of the present invention.
  • Fig. 2 illustrates a schematic structure of a storage device according to an embodiment of the present invention.
  • Fig. 3A illustrates a schematic representation of a sequence of client IO transactions handled along with a snapshot creation request by a file system in a storage device, according to the related art.
  • Fig. 3B illustrates a schematic representation of a sequence of client IO transactions handled along with a snapshot creation request by respective file systems in two storage devices in a synchronous replication environment, according to an embodiment of the present invention.
  • Fig. 4 illustrates a schematic structure of a storage device according to an embodiment of the present invention.
  • Fig. 5 illustrates a schematic structure of a storage device according to an embodiment of the present invention.
  • Fig. 6 illustrates a method of handling a snapshot creation request at a source file system in a first storage device according to an embodiment of the present invention.
  • Fig. 7 illustrates a method of handling a snapshot creation request at a destination file system in a second storage device according to an embodiment of the present invention.
  • the invention can be implemented in numerous ways, as a process, an apparatus, a system, a computer-readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or electronic communication links.
  • these implementations, or any other form that the invention may take, may be referred to as techniques.
  • the order of the steps of disclosed processes may be altered within the scope of the invention.
  • the terms “plurality” and “a plurality” as used herein may include, for example, “multiple” or “two or more” .
  • the terms “plurality” or “a plurality” may be used throughout the specification to describe two or more components, devices, elements, units, parameters, or the like. Unless explicitly stated, the method embodiments described herein are not constrained to a particular order or sequence. Additionally, some of the described method embodiments or elements thereof can occur or be performed simultaneously, at the same point in time, or concurrently.
  • the various embodiments of the present invention are described in the context of the following exemplary systems and methods.
  • FIG. 1 illustrates a schematic diagram of a typical application scenario of a two-data-center (2DC) disaster recovery system, in accordance with an embodiment of the present invention.
  • a 2DC disaster recovery system shown in Fig. 1 includes at least one host 100 and two data centers located at site 10 and site 12 respectively.
  • the two data centers may be connected to each other in a dedicated communication network.
  • the dedicated communication network may include, for example, a fiber or network cable in a star-shaped networking manner.
  • the two data centers may perform data transmission with each other by using the IP (Internet Protocol) protocol or an FC (Fiber Channel) protocol.
  • the host 100 may communicate with the site 10 and/or the site 12 based on the Small Computer System Interface (SCSI) protocol or based on the Internet Small Computer System Interface (iSCSI) protocol, which is not limited herein. Further, the host 100 may access data from the data centers at site 10 and site 12 using NFS (Network File System Protocol) protocol or CIFS (Common Internet File System Protocol) protocol but is not limited herein.
  • the host 100 may include any computing device at the client end, which may also be referred to as the ‘client device 100’ .
  • client device 100 may include a server, a desktop computer or an application server, or any other similar device known in current technologies.
  • An operating system and other application programs may be installed in the client device 100.
  • the client device 100 includes a file system 14, hereafter referred to as a ‘host file system 14’ , as illustrated in Fig. 1B.
  • the host file system 14 is stored/backed up in storage arrays at the two data centers at site 10 and site 12.
  • An input/output (IO) transaction originating from the host file system 14 in the client device 100 may also be referred to as a client IO transaction and may include multiple operations corresponding to the host file system 14 which may include, for e.g., but are not limited to, a read/write operation, an updating operation, a creation operation, a deleting operation and like operations.
  • Each client IO transaction may be of the same or a different size.
  • Each client IO transaction has a Time Point (TP) value associated therewith.
  • the TP value of the respective IO transaction may be maintained by the host file system 14 and is sent along with the client IO transaction when the respective IO transaction is to be written in a corresponding file system of the storage arrays at the site 10 and the site 12.
  • the TP value is associated with a snapshot and is used to distinguish a snapshot from other snapshots. Whenever a snapshot command, also referred to as a snapshot creation request, is issued by the host file system 14 to the corresponding file system of the storage arrays at the site 10 and the site 12, the TP value is incremented. Therefore, the TP value is a direct indicator of a sequence of the snapshot creation requests issued by the host file system 14 in the client device 100.
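The TP bookkeeping described above might look like the following minimal sketch; the class and field names are illustrative assumptions, not taken from the document.

```python
class HostFileSystem:
    """Illustrative sketch of Time Point (TP) bookkeeping on the client side."""

    def __init__(self):
        self.tp = 0  # current Time Point value

    def tag_io(self, payload):
        """Every outgoing client IO transaction carries the current TP value,
        identifying the snapshot it belongs to."""
        return {"tp": self.tp, "payload": payload}

    def issue_snapshot_request(self):
        """A snapshot creation request increments the TP, so the TP value
        directly reflects the sequence of snapshot requests issued so far."""
        request = {"snapshot_after_tp": self.tp}
        self.tp += 1
        return request
```

With this tagging, the storage arrays can tell from the TP alone which snapshot an incoming IO transaction belongs to.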
  • based on the TP value, the snapshot to which the respective IO transaction, or the group of IO transactions, belongs may be identified.
  • the host 100 writes data from the host file system to the 2DC at the site 10 and the site 12.
  • the 2DC at the site 10 and the site 12 may keep, using a synchronous replication technology, data stored at the site 10 and the site 12 synchronized in real time.
  • when the host file system 14 writes data, i.e., the client IO transaction, to the site 10, the data center at the site 10 may simultaneously back up the data to the data center at the site 12.
  • the host file system 14 may perform double-write on data, i.e., double-write the client IO transactions where the client IO is sent to both the data centers at the site 10 and the site 12 simultaneously in the synchronous replication environment.
  • Both the data centers at the site 10 and the site 12 process the client IO transactions received from the host file system 14 in the client device 100.
  • when the IO transaction is successfully processed at the data center at the site 12, the same is acknowledged to the data center at the site 10. Only after receiving an acknowledgment from the data center at the site 12, the data center at the site 10 sends an acknowledgment to the client device 100.
  • when the host file system 14 writes data to the corresponding file systems in the data centers at the site 10 and the site 12, the host file system 14 receives a write success response from the data center at the site 10 only when the write data is successfully cached at both the site 10 and the site 12.
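The double-write acknowledgment rule above can be sketched as a single function; the function name and the boolean cache interfaces are illustrative assumptions.

```python
def double_write(io, cache_at_site_10, cache_at_site_12):
    """Synchronous-replication write path sketched from the description
    above: the IO is double-written to both sites, and the client receives
    a write-success response only when the IO is successfully cached at
    BOTH the site 10 and the site 12. Cache interfaces are illustrative
    callables returning True on a successful cache commit."""
    ok_10 = cache_at_site_10(io)   # IO processed at the site 10
    ok_12 = cache_at_site_12(io)   # cloned IO processed at the site 12
    # acknowledged back to the client only after both sites succeed
    return ok_10 and ok_12
```

A failure at either site withholds the acknowledgment, which is what keeps the two data volumes in a synchronized state.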
  • the data center at the site 10 may include a first storage device 102 and the data center at the site 12 may include a second storage device 104.
  • the first storage device 102 and the second storage device 104 may be storage devices such as a storage array or a server known in current technologies.
  • the first storage device 102 and the second storage device 104 may include a storage area network (SAN) array or may include a network attached storage (NAS) array.
  • a specific form of a storage device in each data center is not limited in this embodiment of the present disclosure. It should be noted that all the methods in the embodiments of the present disclosure may be performed by the storage devices at the site 10 and the site 12. In the application scenario shown in FIG. 1, the distance between the first storage device 102 and the second storage device 104 may be up to 200 km.
  • the first storage device 102 and the second storage device 104 may be in the same city or in different cities, as long as the synchronous replication of the data between the first storage device 102 and the second storage device 104 can be maintained.
  • the first storage device 102 and the second storage device 104 form respective storage space at the site 10 and the site 12 for storing the host file system 14 of the client device 100.
  • the storage space thus formed may include a respective file system corresponding to the host file system 14 in the client device 100.
  • Fig. 1B illustrates a source file system 16 in the first storage device 102 and a destination file system 18 in the second storage device 104, where the source file system 16 and the destination file system 18 form the storage space for the replicated data corresponding to the host file system 14 in the client device 100.
  • the source file system 16 and the destination file system 18 are said to be in a synchronous replication relationship with each other as they are kept synchronized in real time.
  • the data volume may be a file system or a logical storage space formed by mapping a physical storage space.
  • the data volume may be a logical unit number (LUN) , or a file system, and the data volumes may be multiple.
  • the 2DC system employed supports creation of snapshots. Due to the nature of the synchronous replication relationship between the source file system 16 in the first storage device 102 and the destination file system 18 in the second storage device 104, a snapshot creation request received from the host file system 14 in the client device 100 is applied both to the source file system 16 and the destination file system 18. ‘Snapshot’ as used herein should be understood as different from the replication of the file systems described in the present disclosure. Data replication may be performed by using the snapshot view; however, in the context of the present invention, data replication at the second storage device 104 may be performed using other known operations. A snapshot creation request may be received from the host 100 to create a copy of the file system at the first storage device 102 and the second storage device 104.
  • a snapshot is a fully usable copy of a specified collection of data, where the copy includes an image of corresponding data at a time point (a time point at which the copy begins) .
  • the snapshot may be a duplicate of the data represented by the snapshot, or a replica of the data.
  • the first storage device 102 and the second storage device 104 continue receiving write operations, i.e., the incoming IO transactions sent by the host 100, even when a snapshot is being created.
  • a snapshot creation can happen at any time.
  • the snapshot creation request and the IO transactions are not related. While prior art storage solutions block the incoming IO transactions and, after creating the snapshot, unblock the incoming IO transactions, the storage devices in the present invention do not block the IO transactions. However, some of the IO transactions fall in the current snapshot and some of the IO transactions fall in the next snapshot.
  • the snapshot creation command is only sent to the source file system 16 from the host 100.
  • the corresponding snapshot creation command also has to be sent to the destination file system 18 at the site 12 from the source file system 16 at the site 10.
  • the destination file system 18 at the site 12 needs to interpret the snapshot creation command and create the snapshot only after all the IO transactions belonging to the earlier snapshot are written to its cache and not allow IO transactions belonging to the next snapshot to be written before the earlier snapshot is created. Therefore, it is important to ensure that the IO transactions which form part of the snapshot at the source file system 16 in the first storage device 102 also form part of the same snapshot at the destination file system 18 in the second storage device 104.
  • the IO transactions which form part of a snapshot creation request at the first storage device 102 consistently also form part of a snapshot creation request sent to the second storage device 104 so as not to affect the file system performance during the snapshot creation at the second storage device 104 and ensure that the snapshot creation is instantaneous.
  • the solutions implemented in the present invention ensure that the exact IO transactions are written at the source file system 16 in the first storage device 102 and the destination file system 18 in the second storage device 104 before a snapshot gets created at both sites 10 and 12 to avoid any inconsistency.
  • Fig. 3A illustrates an example situation where the incoming IO transactions which are a part of a snapshot corresponding to a snapshot creation request received at site 10 fall out of band when a snapshot creation request is received.
• ‘Out of band’ herein refers to those IO transactions whose time point (TP) is incremented due to a snapshot creation request.
  • TP time point
  • the IO transactions 7, 8 and 9 which fall after the snapshot creation request accordingly have a TP 1.
  • the six IO transactions are therefore part of a first snapshot, say, for example, snapshot 1, at the site 10.
• the six IO transactions also have to be a part of snapshot 1 at the site 12.
  • the IO transactions which are received at the sites 10, 12 are committed to a respective cache of the storage devices 102, 104.
• Now, while an IO transaction which is associated with TP 0 is in progress to the site 12, if it is delayed and reaches the site 12 after the snapshot creation request, it will fail to get committed to the cache at the second storage device 104 because its TP will get incremented out-of-band.
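For illustration only, the out-of-band failure described above may be sketched as follows; all class and method names are hypothetical and not part of the disclosure:

```python
# Hypothetical sketch: a destination cache that tags committed writes with the
# current time point (TP). Once a snapshot creation request increments the TP,
# a delayed write still carrying the old TP can no longer be committed
# consistently, i.e., it falls out of band.
class DestinationCache:
    def __init__(self):
        self.current_tp = 0
        self.committed = []  # list of (sequence_no, tp) pairs

    def apply_snapshot_request(self):
        # The snapshot boundary: later writes belong to the next TP.
        self.current_tp += 1

    def try_commit(self, sequence_no, tp):
        # A write tagged with an older TP arrived after the boundary;
        # committing it would mix it into the wrong snapshot, so it fails.
        if tp < self.current_tp:
            return False
        self.committed.append((sequence_no, tp))
        return True

cache = DestinationCache()
cache.try_commit(5, tp=0)         # in-band write, succeeds
cache.apply_snapshot_request()    # TP becomes 1
late = cache.try_commit(6, tp=0)  # delayed TP-0 write is now out of band
```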
  • FIG. 2 is a schematic structural diagram of a storage device 20 (for example, the storage device 102 and the storage device 104) according to an embodiment of the present disclosure.
  • the storage device 20 shown in FIG. 2 is a storage array.
  • the storage device 20 may include a storage controller 200 and a disk array 214, where the disk array 214 herein is configured to provide a storage space and may include a redundant array of independent disks (RAID) or a disk chassis including multiple disks.
  • RAID redundant array of independent disks
  • the disk 216 is configured to store data.
  • the disk array 214 is in communication connection with the controller 200 by using a communication protocol, for example, the SCSI Protocol, which is not limited herein.
  • the disk array 214 is merely an example of a storage in the storage system.
  • data may also be stored by using a storage, for example, a tape library.
  • the disk 216 is also merely an example of a memory for building the disk array 214.
  • the disk array 214 may also include a storage including a non-volatile storage medium such as a solid state disk (SSD) , a cabinet including multiple disks, or a server, which is not limited herein.
  • SSD solid state disk
  • the storage controller 200 is a “brain” of the storage device 20, and mainly includes a processor 202, a cache 204, a memory 206, a communications bus (a bus for short) 210, and a communications interface 212.
  • the processor 202, the cache 204, the memory 206, and the communications interface 212 communicate with each other by using the communications bus 210.
  • the communications interface 212 is configured to communicate with the host 100, the disk 216, or another storage device.
  • the memory 206 is configured to store a program 208.
  • the memory 206 may include a high-speed RAM memory, or may further include a non-volatile memory, for example, at least one magnetic disk storage. It is understandable that the memory 206 may be various non-transitory machine-readable media, such as a random access memory (RAM) , a magnetic disk, a hard disk drive, an optical disc, an SSD, or a non-volatile memory, that can store program code.
  • RAM random access memory
  • the program 208 may include program code, and the program code includes a computer operation instruction.
  • the cache 204 is a storage between the controller and the hard disk drive and has a capacity smaller than that of the hard disk drive but a speed faster than that of the hard disk drive.
  • the cache 204 is configured to temporarily store data, for example, the received IO transactions from the host 100, or another storage device and temporarily store data read from the disk 216, so as to improve performance and reliability of the array.
• the cache 204 may be various non-transitory machine-readable media, such as a RAM, a ROM, a flash memory, or an SSD, that can store data, which is not limited herein.
  • the processor 202 may be a central processing unit CPU or an application-specific integrated circuit ASIC (Application-Specific Integrated Circuit) or is configured as one or more integrated circuits that implement this embodiment of the present disclosure.
• An operating system and another software program are installed in the processor 202, and different software programs may be considered as different processing modules, and have different functions, such as processing an input/output (I/O) request for the disk 216, performing another processing on data in the disk 216, or modifying metadata saved in the storage device 20. Therefore, the storage controller 200 can implement various data management functions, such as an IO operation, a snapshot, a mirroring, and replication.
  • the processor 202 is configured to execute the program 208, and specifically, may perform relevant steps in the following method embodiments.
  • a first storage device 400 is illustrated where the first storage device 400 is configured to implement the handling method of the snapshot creation request in accordance with the embodiments of the present disclosure.
  • the first storage device 400 includes the first storage device 102 (shown in Fig. 1) and the storage device 20 (shown in Fig. 2) and is part of the synchronous replication environment including another storage device (for example the storage device 104) where the file system at both the storage devices form a synchronous replication relationship with each other.
  • the first storage device 400 includes a first storage controller 402 (for example the storage controller 200) , a primary cache 404, a secondary cache 406, a sending unit 408 and a receiving unit 410.
  • the first storage controller 402 implements the processing of the incoming IO transaction from the host 100 and creates the snapshot of the file system in accordance with the teachings of the present disclosure.
  • the source file system 16 is stored at the site 10 in the first storage device 400.
  • the snapshot created is a time point copy of the source file system 16.
  • the first storage controller 402 identifies a sequence number of each IO transaction received in the source file system 16 from the host 100, i.e., the host file system 14.
• the first storage controller 402 may employ a sequence no. generator 403 which may generate and assign a sequence no. to each IO transaction received at the source file system 16.
• the sequence no. generator 403 may generate and assign the sequence no. only when a snapshot is to be created or when a snapshot creation request is received at the source file system 16.
• although the sequence no. generator 403 is shown as a part of the first storage controller 402 in the embodiment shown in Fig. 4, the same should not be construed as limiting the present invention and other modifications may be possible.
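For illustration only, the sequence no. generator described above may be sketched as a monotonic counter; the names and record layout are assumptions, not part of the disclosure:

```python
import itertools

# Illustrative sketch of a per-file-system sequence number generator such as
# the generator 403 described above. It tags each incoming IO transaction
# with a monotonically increasing sequence number.
class SequenceGenerator:
    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def assign(self, io_transaction):
        # Attach the next sequence number to the IO transaction record.
        io_transaction["seq"] = next(self._counter)
        return io_transaction

gen = SequenceGenerator()
first = gen.assign({"payload": b"write-1"})
second = gen.assign({"payload": b"write-2"})
```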
• the first storage controller 402 determines a throughput of successful IO transactions acknowledged both by the source file system 16 in the first storage device 400 and a destination file system in a second storage device, where the source file system 16 and the destination file system (for example, the destination file system 18 of the second storage device 104) have a synchronous replication relationship.
  • the first storage device 400 has to take into consideration that the IO transaction is committed to the respective cache at both the first storage device 400 and the second storage device (for example the second storage device 104) before it sends an acknowledgment to the host 100.
• the throughput of the successful IO transactions being determined by the first storage controller 402 reflects the total number of successful IO transactions per second.
  • This number may also be referred to as write IO operations per second (IOPS) .
  • IOPS write IO operations per second
• an IOPS value of 40000 would indicate that 40000 write operations are being committed to the respective cache per second at both the source file system 16 and the destination file system 18.
• the throughput of the successful IO transactions may differ based on the storage solution, the number of file systems being exported, the current write workload, the size of the memory buffer (8k, 16k), and the latency induced by the communication path between the host 100 and the sites 10, 12.
  • the first storage controller 402 determines a future IO transaction marker.
  • the future IO transaction marker is indicative of the sequence number of a future IO transaction which when committed to the cache of the first storage device 400, enables the first storage controller 402 to create a snapshot associated with a pending snapshot creation request received at an earlier point in time from the host 100.
• for example, if the sequence number of the IO transaction at the time of the snapshot creation request is X and the determined throughput indicates that 200 further IO transactions can be committed, the future IO transaction marker may correspond to the X+200th IO transaction. The snapshot creation shall be initiated by the first storage controller 402 only after committing all the IO transactions before the X+200th IO transaction and the X+200th IO transaction itself to its cache.
• all the IO transactions which fall before the X+200th IO transaction form part of a first snapshot and, from the X+200th IO transaction onwards, all the IO transactions form part of a second snapshot.
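For illustration only, one way the future IO transaction marker might be derived from the measured throughput is sketched below; the scaling formula and the 5 ms window are assumptions chosen so that the marker lands at X+200, mirroring the example above:

```python
# Hedged sketch: derive the future IO transaction marker from the current
# sequence number, the measured write IOPS, and an assumed look-ahead window.
def future_marker(current_seq, write_iops, window_seconds):
    """Return the sequence number after which the snapshot is created."""
    return current_seq + int(write_iops * window_seconds)

X = 1000
# With 40000 write IOPS and a 5 ms window, the marker falls at X + 200.
marker = future_marker(X, write_iops=40000, window_seconds=0.005)
```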
  • the first storage controller 402 may determine a waiting time associated with the snapshot creation request pending with the source file system 16, and on lapse of the waiting time, the first storage controller 402 initiates the creation of the snapshot associated with the request pending.
  • This embodiment may apply to a situation when there is a sudden drop in incoming IO transactions from the host 100. So, the future IO transaction cannot be received and thus the snapshot may never be created. However, since the processing of the snapshot creation request has already been initiated in the source file system 16, the snapshot has to be created. In this case, a waiting time is determined or a pre-determined time-based boundary may be set, on the lapse of which the snapshot is created prior to receiving the future IO transaction.
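For illustration only, the waiting-time fallback described above may be sketched as follows; all names are hypothetical and the polling loop is an assumption, not part of the disclosure:

```python
import time

# Sketch of the waiting-time fallback: if the future IO transaction never
# arrives (e.g., the host suddenly stops sending writes), the snapshot is
# still created once the waiting time lapses.
def wait_for_marker(marker_seq, next_io, waiting_time):
    """Return 'marker' if the marker IO arrives in time, else 'timeout'."""
    deadline = time.monotonic() + waiting_time
    while time.monotonic() < deadline:
        seq = next_io()
        if seq is None:           # no incoming IO transactions at the moment
            time.sleep(0.001)
            continue
        if seq >= marker_seq:     # marker reached: snapshot can be created
            return "marker"
    return "timeout"              # waiting time lapsed: create snapshot anyway

# Host stops sending writes entirely: the fallback fires.
result = wait_for_marker(200, next_io=lambda: None, waiting_time=0.01)
```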
• the first storage controller 402 may re-determine the throughput of successful IO transactions acknowledged both by the source file system 16 and the destination file system 18 in case the number of incoming IO transactions from the host 100 in the source file system 16 reduces below a threshold. Based on the re-determined throughput of successful IO transactions, the future IO transaction marker may be determined again.
  • the first storage controller 402 re-determines the throughput of successful IO transactions in case the number of incoming IO transactions from the host 100 becomes so heavy, for example, that the IO transactions forming part of the first snapshot reach the destination file system 18 in the second storage device later than the snapshot creation request from the source file system 16 in the first storage device 400.
  • the primary cache 404 and the secondary cache 406 are associated with committing the IO transactions forming part of the first snapshot and the second snapshot, respectively.
  • the primary cache 404 and the secondary cache 406 form separate partitions of the cache (for example, cache 204 shown in Fig. 2) of the first storage device 400.
• alternatively, the primary cache 404 and the secondary cache 406 may form two separate caches.
  • the primary cache is configured to commit the IO transactions which are received from the host 100 prior to the IO transaction associated with the future IO transaction marker and the secondary cache is configured to commit the IO transactions which are received from the host 100 after the IO transaction associated with the future IO transaction marker until a snapshot is created.
  • all the IO transactions which fall up to the X+200th IO transaction are stored in the primary cache 404 and all the IO transactions which are received after X+200th IO transaction are stored in the secondary cache 406.
• the first storage controller 402 processes the IO transactions cached in the secondary cache 406 after creating the snapshot for the IO transactions cached in the primary cache 404.
  • the IO transactions stored in the secondary cache 406 are assigned an incremented TP value which is valid after the snapshot creation pertaining to the IO transactions present in the primary cache 404.
  • the IO transactions stored in the secondary cache 406 are flushed without any performance delay. This means that the IO transactions committed to the secondary cache 406 are assigned a next valid TP value as the snapshot pertaining to the IO transactions in the primary cache has been created. Thus, there is no performance delay as the IO transactions with the next valid TP value are already present in the cache to be processed.
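For illustration only, the primary/secondary cache split and the delay-free flush described above may be sketched as follows; the data structures and the marker value are assumptions, not part of the disclosure:

```python
# Illustrative sketch of the dual-cache behavior: writes before the marker
# land in the primary cache with the current TP; writes from the marker
# onwards are held in the secondary cache and, once the snapshot is created,
# are flushed immediately with the incremented TP, so there is no delay.
class DualCache:
    def __init__(self, marker_seq):
        self.marker_seq = marker_seq
        self.tp = 0
        self.primary = []    # (seq, tp) pairs committed before the marker
        self.secondary = []  # sequence numbers held until snapshot creation

    def commit(self, seq):
        if seq < self.marker_seq:
            self.primary.append((seq, self.tp))
        else:
            self.secondary.append(seq)

    def create_snapshot(self):
        # The snapshot covers the primary cache; held IO transactions are
        # then committed with the next valid TP value.
        self.tp += 1
        self.primary.extend((seq, self.tp) for seq in self.secondary)
        self.secondary.clear()

cache = DualCache(marker_seq=3)
for seq in (1, 2, 3, 4):
    cache.commit(seq)
cache.create_snapshot()
```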
  • the first storage device 400 may include the sending unit 408 configured to communicate the future IO transaction marker to the second storage device (for example the second storage device 104) .
• the future IO transaction marker may be communicated to the second storage device by supplementing information indicative of the future I/O transaction marker in a replicating IO transaction sent from the source file system in the first storage device to the destination file system in the second storage device.
  • the replicating IO transaction may be a subsequent IO transaction received after the snapshot creation request at the source file system 16 from the host 100.
  • the snapshot creation request is received after the IO transaction with the sequence number 8
  • the future IO transaction marker may be the IO with the sequence number 9.
  • the indicator corresponding to the future IO transaction marker is supplemented along with said replicating IO transaction.
  • Fig. 3B which shall be described later.
  • the future I/O transaction marker is communicated to the second storage device by sending a table from the first storage device 400 to the second storage device (for example the storage device 104) .
  • the table includes the future I/O transaction marker and a time stamp, i.e., a TP value, corresponding to the snapshot creation request, wherein the table is sent upon receiving multiple snapshot creation requests at the source file system 16 in the first storage device 400.
• the table may comprise a plurality of future IO transaction markers and corresponding time stamps, where each future IO transaction marker and the corresponding time stamp is associated with one of the multiple snapshot creation requests received at the source file system 16. Table 1 below illustrates one such example of the present embodiment.
  • the snapshot creation requests have been received at the source file system 16 but the future IO transaction marker is yet to be determined.
  • the table is sent with each replicating IO transaction from the source file system 16 to the destination file system 18.
  • the table is sent with each replicating IO transaction from the source file system 16 to the destination file system 18, only when there is a snapshot creation request pending.
  • Such table in one example, is sent to the destination file system 18 with a replicating IO transaction and subsequent replication IO transactions when a snapshot creation request comes in at the source file system 16.
  • the table need not be sent again by the sending unit 408. Further, in the example shown in Table 1 above, the last three rows may not be sent since they are placeholders for any future snapshot creation requests for which the future IO transaction marker has not been determined yet.
  • both source file system 16 and the destination file system 18 may maintain a copy of the table.
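For illustration only, the marker table exchanged between the sites may be sketched as follows; the field names, values, and placeholder convention are assumptions based on the description above, not the actual Table 1:

```python
# Hypothetical sketch of the marker table: one row per snapshot creation
# request, each carrying a future IO transaction marker and a TP time stamp.
# Rows whose marker is not yet determined are placeholders.
marker_table = [
    {"snapshot_request": 1, "future_marker": 1200, "tp": 1},
    {"snapshot_request": 2, "future_marker": 2400, "tp": 2},
    {"snapshot_request": 3, "future_marker": None, "tp": None},  # placeholder
]

def rows_to_send(table):
    # Placeholder rows need not be sent to the destination file system.
    return [row for row in table if row["future_marker"] is not None]

outgoing = rows_to_send(marker_table)
```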
  • the sending unit 408 may be further configured to communicate a revised or a determined future IO transaction marker in case the throughput of successful IO transactions is re-determined.
  • the sending unit 408 may be further configured to communicate the waiting time as determined or the pre-determined time-based boundary as set by the first storage controller 402 in case the number of incoming IO transactions from the host 100 at the source file system 16 falls below a certain threshold.
  • the first storage device 400 may include the receiving unit 410 configured to receive an acknowledgment from the second storage device (for example the second storage device 104) on successfully caching the incoming IO transactions at the second storage device.
• the IO transactions received at the second storage device are also referred to as replicating IO transactions.
• once the replicating IO transactions are successfully written to the cache of the second storage device, an acknowledgment is sent to the first storage device, where it is received by the receiving unit 410.
  • the receiving unit 410 is configured to receive a delivery delay communication from the second storage device (for example, the second storage device 104) , the delivery delay communication being associated with a delay in receiving the future IO transaction marker with respect to one or more intermediate IO transactions at the second storage device.
• One or more intermediate IO transactions used herein refers to those IO transactions which form part of any subsequent snapshot creation request pending with the source file system 16 in the first storage device. This embodiment refers to a situation when the incoming IO transactions from the host 100 suddenly increase and the future IO transaction marker determined by the first storage controller 402 is still in-flight to the destination file system 18 at the second storage device.
• the intermediate IO transactions, which form part of the next snapshot creation request, are received at the destination file system 18 prior to the future IO transaction marker.
• as a result, the acknowledgment from the destination file system 18 is delayed at the source file system 16.
  • a delivery delay communication may be received at the receiving unit 410 of the first storage device 400 from the second storage device.
  • the first storage controller 402 may re-determine the future IO transaction marker.
  • a second storage device 500 is illustrated where the second storage device 500 is configured to implement the handling method of the snapshot creation request in accordance with the embodiments of the present disclosure.
  • the second storage device 500 is in communication with the first storage device 400 in a communication network and stores the destination file system 18 which is in a synchronous replication relationship with the source file system 16 in the first storage device 400.
  • the second storage device 500 includes the second storage device 104 (shown in Fig. 1) and the storage device 20 (shown in Fig. 2) .
  • the second storage device 500 includes a second storage controller 502 (for example the storage controller 200) , a primary cache 504, a secondary cache 506, a sending unit 508 and a receiving unit 510.
  • the second storage controller 502 implements the processing of the incoming replicating IO transaction received from the host 100 or the source file system 16 and creates the snapshot in accordance with the teachings of the present disclosure.
  • the snapshot thus created is a time point copy of the destination file system 18.
  • the second storage controller 502 identifies a sequence number of each replicating IO transaction received in the destination file system 18.
• the second storage controller 502 may rely upon the sequence no. of the IO transaction identified at the first storage device 400, where it may have been determined using the sequence no. generator 403. Accordingly, in one embodiment, the sequence no. of the IO transaction may be sent to the second storage device 500 via the sending unit 408 of the first storage device 400 and subsequently received by the receiving unit 510 of the second storage device.
• the second storage controller may employ a sequence no. generator 503 (shown in dotted lines) which may generate and assign a sequence no. to each IO transaction received at the destination file system 18.
• the sequence no. generator 503 may generate and assign the sequence no. only when a snapshot is to be created or when a snapshot creation request is received at the destination file system 18.
• although the sequence no. generator 503 is shown as a part of the second storage controller 502 using dotted lines in the embodiment shown in Fig. 5, the same should not be construed as limiting the present invention and other modifications may be possible.
• the second storage controller 502 receives the future IO transaction marker from the first storage device 400, which determines the future IO transaction marker based on the throughput of successful IO transactions acknowledged both by the source file system 16 in the first storage device 400 and the destination file system 18 in the second storage device 500.
  • the second storage controller 502 may receive the future IO transaction marker via the receiving unit 510 which may, in turn, receive the future IO transaction marker from the sending unit 408 of the first storage device 400.
  • the receiving unit 510 may receive the future IO transaction marker in form of an indicator on a replication IO transaction and/or in the form of the table and communicate the same to the second storage controller 502.
• the future IO transaction marker is indicative of the sequence number of a future IO transaction which, when committed to the cache of the second storage device 500, enables the second storage controller 502 to create a snapshot associated with a snapshot creation request pending at the destination file system 18.
• the snapshot creation shall be initiated by the second storage controller 502 only after committing all the IO transactions before the X+200th IO transaction and the X+200th IO transaction itself to its cache.
• all the IO transactions which fall before the X+200th IO transaction form part of a first snapshot and, from the X+200th IO transaction onwards, all the IO transactions form part of a second snapshot.
• the IO transactions which form part of a snapshot at the site 10 and the replicating IO transactions which form part of the corresponding snapshot at the site 12 are exactly the same.
  • the snapshot creation at the site 10, i.e., at the source file system 16 in the first storage device 400 and the snapshot creation at the site 12, i.e., at the destination file system 18 in the second storage device 500 are asynchronous with respect to each other. This means that the snapshot creation at the site 10 in the first storage device 400 and the snapshot creation at the site 12 in the second storage device 500 may not happen at the same point in time.
  • the source file system 16 may create its own snapshot as soon as the future IO transaction associated with the future IO transaction marker is committed to its cache, and the destination file system 18 will create its own snapshot when the same future IO transaction is committed to its own cache but the instant will be different because of latency between the first storage device 400 at the site 10 and the second storage device 500 at the site 12.
  • the second storage controller 502 receives the waiting time associated with the snapshot creation request pending with the destination file system 18, from the first storage device 400.
  • the waiting time is determined by the first storage controller 402 and is communicated via its sending unit 408 to the receiving unit 510 of the second storage device 500.
• on the lapse of the waiting time, the second storage controller 502 initiates the creation of the snapshot associated with the request pending.
• the predetermined time-based boundary set by the first storage controller 402 may be communicated to the second storage controller 502 via the respective sending unit 408 and the respective receiving unit 510.
  • the snapshot is created prior to receiving the future IO transaction.
• the waiting time and the predetermined time-based boundary are communicated when there is a sudden drop in incoming IO transactions from the host 100, as described above.
• in case the first storage controller 402 re-determines the throughput of successful IO transactions and re-determines the future IO transaction marker, the re-determined marker is received via the receiving unit 510 of the second storage device 500, and the second storage controller 502 accordingly uses it to identify the future IO transaction which, when cached, is succeeded by the creation of the snapshot associated with the snapshot creation request pending with the destination file system 18 in the second storage device 500.
  • the primary cache 504 and the secondary cache 506 are associated with committing the IO transactions forming part of the first snapshot and the second snapshot, respectively.
  • the primary cache 504 and the secondary cache 506 form separate partitions of the cache (for example, cache 204 shown in Fig. 2) of the second storage device 500.
• alternatively, the primary cache 504 and the secondary cache 506 may form two separate caches.
• the primary cache 504 is configured to commit the replicating IO transactions which are received at the destination file system 18 prior to the future IO transaction associated with the future IO transaction marker, and the secondary cache 506 is configured to commit the IO transactions which are received at the destination file system 18 after the future IO transaction associated with the future IO transaction marker, until a snapshot is created.
  • all the IO transactions which fall up to the X+200th IO transaction are stored in the primary cache 504 and all the IO transactions which are received after X+200th IO transaction are stored in the secondary cache 506.
  • Fig. 3B illustrates an example of handling a snapshot creation request at the second storage device 500 based on a received future IO transaction marker from the first storage device 400.
• the example shown in Fig. 3B includes a schematic representation of a sequence of client IO transactions handled along with a snapshot creation request by respective file systems in the two storage devices in a synchronous replication environment, according to an embodiment of the present invention.
  • site A which may include, for example, the site 10 where the first storage device 400 is located
• site B which may include, for example, the site 12 where the second storage device 500 is located
  • the site A and the site B receive IO transactions sequentially numbered from 1 to 17, as depicted from a host (for example, the host 100) . Further, a snapshot creation request is received after the IO transaction with sequence no. 8, as depicted.
  • Such IO transactions are committed to a second block of cache indicated as ‘future IO stored in cache’ .
  • the second block of the cache may include, for example, the secondary cache 406.
• the above snapshot handling method applied to site A is also applied to site B. It is to be noted that the future IO transaction marker is communicated from the site A to the site B by piggybacking it on the subsequent replicating IO transaction with sequence number 9, which falls immediately after the snapshot creation request.
• the IO transactions which are in the second block indicated ‘future IO stored in cache’ are not actually committed to the cache because they fall after the future IO transaction marker.
  • these IO transactions can be immediately committed to the cache with an incremented TP. So even when they are not really being committed to cache, these future IO transactions are acknowledged back to the host as at some point they will be committed to cache and they will not be lost.
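For illustration only, the piggybacking of the marker on a replicating IO transaction, as in the Fig. 3B example with sequence number 9, may be sketched as follows; the message layout and field names are assumptions, not part of the disclosure:

```python
# Hypothetical sketch: build a replication message and, only while a snapshot
# creation request is pending, supplement it with the future IO transaction
# marker so the destination file system learns the snapshot boundary.
def make_replication_message(seq, payload, pending_marker=None):
    message = {"seq": seq, "payload": payload}
    if pending_marker is not None:
        # Piggyback the marker on this replicating IO transaction.
        message["future_marker"] = pending_marker
    return message

plain = make_replication_message(8, b"data-8")
# The request arrives after IO 8, so IO 9 carries the marker.
tagged = make_replication_message(9, b"data-9", pending_marker=9)
```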
  • the second storage controller 502 processes the IO transactions cached in the secondary cache 506 after creating the snapshot for the IO transactions cached in the primary cache 504. Once the snapshot is created, the IO transactions stored in the secondary cache 506 are flushed without any performance delay. On processing the IO transactions in the secondary cache 506, the sending unit 508 of the second storage device 500 sends an acknowledgment to the first storage device 400.
• the sending unit 508 is also configured to send the delivery delay communication to the first storage device 400 when there is a delay in receiving the future IO transaction marker with respect to the one or more intermediate IO transactions at the second storage device 500.
  • the storage systems 400 and 500 of the present disclosure may execute the methods of handling the snapshot creation requests as described above in the disclosed embodiments of the synchronous replication environment.
  • the embodiments of the storage systems 400 and 500 have been explained only in relation to the methods of handling the snapshot creation requests at the respective site 10 and the respective site 12. Details with respect to other functions of these units which are apparent to a person skilled in the art have not been described. It is understandable that the embodiments shown in Figs. 4 and 5 are merely exemplary.
• the division of the controller is merely logical function division and may be another division in actual implementation.
  • a plurality of modules or components may be combined or integrated into another device, or some features may be ignored or not performed.
  • the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some communications interfaces.
  • the indirect couplings or communication connections between the units and the controller may be implemented in electronic, mechanical, or other forms.
• Fig. 6 illustrates a method 600 of handling a snapshot creation request received at the source file system 16 in the first storage device 400, in accordance with an embodiment of the present invention.
  • the sequence number of each IO transaction received at the source file system 16 in the first storage device 400 from the host 100 is identified.
  • the sequence number of the IO transactions is generated and assigned to the respective IO transaction after receiving the snapshot creation request at the source file system 16 in the first storage device 400 from the host 100.
• at step 604, the throughput of successful IO transactions acknowledged both by the source file system 16 and the destination file system 18 is determined.
• the future I/O transaction marker is determined based on the throughput, wherein the future I/O transaction marker is indicative of the sequence number of the future I/O transaction which, upon being successfully cached in the primary cache 404 of the first storage device 400, is succeeded by a snapshot creation associated with a snapshot creation request pending with the source file system 16.
• the method 600 comprises determining a waiting time associated with the snapshot creation request pending with the source file system 16. On the lapse of the waiting time, the method further comprises initiating the creation of the snapshot associated with the snapshot creation request prior to receipt of the future IO transaction.
• the method 600 comprises re-determining the throughput of successful IO transactions acknowledged both by the source file system 16 and the destination file system 18. Based on the re-determined throughput of successful IO transactions, the future IO transaction marker is determined again.
  • the snapshot associated with the snapshot creation request pending at the source file system 16 is created.
  • an IO transaction received after the future IO transaction is cached in the secondary cache 406 of the first storage device 400.
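The behavior of the two bullets above, creating the snapshot once the marked future IO transaction is cached and parking later IOs in the secondary cache, can be sketched as follows; the class and method names are illustrative, while the cache numerals follow the reference signs 404/406:

```python
class FirstStorageDeviceSketch:
    """Route IOs to the primary cache up to the marked future IO
    transaction, trigger the pending snapshot when it is cached, and
    park later IOs in the secondary cache."""

    def __init__(self, future_marker):
        self.future_marker = future_marker
        self.primary_cache = []    # primary cache 404
        self.secondary_cache = []  # secondary cache 406
        self.snapshot_created = False

    def on_io(self, seq, data):
        if seq <= self.future_marker:
            self.primary_cache.append((seq, data))
            if seq == self.future_marker:
                # The future IO transaction is successfully cached:
                # create the snapshot for the pending request.
                self.snapshot_created = True
        else:
            # IO transactions received after the future IO transaction
            # are held in the secondary cache until the snapshot exists.
            self.secondary_cache.append((seq, data))
```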
• the method 600 further comprises communicating the future IO transaction marker, as determined, to the second storage device 500.
• the method 600 may further comprise receiving the delivery delay communication from the second storage device 500, based on which the method 600 comprises re-determining the future IO transaction marker.
• the method 600 further comprises processing the IO transactions cached in the secondary cache 406 of the first storage device upon creating the snapshot.
• the processing of the IO transactions in the secondary cache 406 may include committing the same IO transactions to the primary cache 404 with an incremented TP value.
• the method 600 may further comprise sending an acknowledgment to the host 100 upon successfully committing these IO transactions to the primary cache 404.
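The post-snapshot processing described above, committing the parked IO transactions to the primary cache with an incremented TP value and then acknowledging the host, might look like the following sketch; the tuple layout and callback are assumptions:

```python
def drain_secondary_cache(primary, secondary, tp_value, ack_host):
    """Commit every IO transaction parked in the secondary cache to the
    primary cache under an incremented TP value, acknowledging the host
    only after each successful commit. Returns the new TP value."""
    new_tp = tp_value + 1
    while secondary:
        seq, data = secondary.pop(0)
        primary.append((new_tp, seq, data))  # commit with incremented TP
        ack_host(seq)                        # ack only after the commit
    return new_tp
```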
  • the method 600 may include receiving an acknowledgment of processing of the same IO transactions at the second storage device 500 before sending an acknowledgment to the host 100.
  • the processing of the IO transactions at the first storage device 400 may include receiving an acknowledgment from the second storage device 500 regarding the processing of the same IO transactions.
• the method 600 may further comprise storing all the transaction log details corresponding to the IO transactions received after the future IO transaction associated with the IO transaction marker in an assigned memory area of the first storage device 400.
• the stored transaction log details of the IO transactions may be used to understand the state at the time the failure occurred, based on which a recovery mechanism for the node can be undertaken.
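A minimal sketch of the recovery aid described above, persisting transaction-log details only for IO transactions received after the marked future IO transaction; the JSON-lines record layout is an assumption for illustration:

```python
import json

def log_post_marker_io(log, seq, future_marker, payload_len):
    """Append a transaction-log record for an IO transaction received
    after the future IO transaction, so the state at the time of a
    failure can be reconstructed during node recovery."""
    if seq > future_marker:
        log.append(json.dumps(
            {"seq": seq, "marker": future_marker, "len": payload_len}))
        return True
    return False
```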
• Fig. 7 illustrates a method 700 of handling a snapshot creation request received at the destination file system 18 in the second storage device 500, in accordance with an embodiment of the present invention.
  • the sequence number of each IO transaction received at the destination file system 18 in the second storage device 500 is identified.
  • the sequence number of the IO transactions is received from the first storage device 400.
• the sequence number may be generated at the second storage device 500. Further, if generated there, the sequence number is assigned to the respective IO transaction only after the snapshot creation request is received at the destination file system 18 in the second storage device 500.
• the future IO transaction marker is received from the first storage device 400, where it is determined based on the throughput of successful IO transactions acknowledged by both the source file system 16 and the destination file system 18.
• the future IO transaction marker indicates the sequence number of a future IO transaction which, once successfully cached in the primary cache 504 of the second storage device 500, is followed by the creation of a snapshot associated with a snapshot creation request pending with the destination file system 18.
• a snapshot associated with the snapshot creation request pending at the destination file system 18 is created upon successfully caching the future IO transaction in the primary cache 504.
  • an IO transaction received after the future IO transaction is cached in the secondary cache 506 of the second storage device 500.
• the method 700 comprises receiving a waiting time, determined at the first storage device 400, associated with the snapshot creation request pending with the source file system 16. On lapse of the waiting time, the method 700 further comprises initiating creation of the snapshot associated with the snapshot creation request prior to receipt of the future IO transaction. In another embodiment, if the number of IO transactions received at the source file system 16 in the first storage device 400 falls below a threshold, the method 700 comprises receiving a future IO transaction marker based on a re-determined throughput of successful IO transactions at the first storage device 400.
• the method 700 may further comprise sending the delivery delay communication to the first storage device 400.
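The delivery delay communication from the second storage device could be driven by a watchdog like the following sketch; the window length and callback name are assumptions, not part of the disclosure:

```python
import time

class DestinationMarkerWatch:
    """On the second storage device: if the marked future IO transaction
    has not arrived within the expected window, send a delivery delay
    communication back to the first storage device so the marker can be
    re-determined."""

    def __init__(self, marker, window_s, send_delay_notice):
        self.marker = marker
        self.deadline = time.monotonic() + window_s
        self.send_delay_notice = send_delay_notice
        self.notified = False

    def on_io(self, seq):
        if seq >= self.marker:
            return "reached"  # marked future IO arrived; nothing to report
        if time.monotonic() >= self.deadline and not self.notified:
            self.send_delay_notice(self.marker)  # delivery delay communication
            self.notified = True
        return "waiting"
```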
• the method 700 further comprises processing the IO transactions cached in the secondary cache 506 of the second storage device 500 upon creating the snapshot.
• the processing of the IO transactions in the secondary cache 506 may include committing the same IO transactions to the primary cache 504 with an incremented TP value.
• the method 700 may further comprise sending an acknowledgment to the first storage device 400 upon successfully committing these IO transactions to the primary cache 504.
• the method 700 may further comprise storing all the transaction log details corresponding to the IO transactions received after the future IO transaction associated with the IO transaction marker in an assigned memory area of the second storage device 500.
• the stored transaction log details of the IO transactions may be used to understand the state at the time the failure occurred, based on which a recovery mechanism for the node can be undertaken.
  • the disclosed system and method may be implemented in other manners.
  • the described apparatus embodiment is merely exemplary.
• the unit division is merely logical function division and may be another division in actual implementation.
  • a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces.
  • the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
• When the functions are implemented in the form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or a part of the technical solutions, may be implemented in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for instructing a computer node (which may be a personal computer, a server, or a network node) to perform all or a part of the steps of the methods described in the embodiment of the present invention.
• the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
  • Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise.
  • devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries.

Abstract

Disclosed are methods and devices for handling a snapshot creation request at a source file system and a destination file system having a synchronous relationship with each other in a communication network. The method comprises determining a throughput of successful input/output (IO) transactions acknowledged by both the source file system and the destination file system, and determining a future IO transaction marker based on the throughput as determined. The future IO transaction marker indicates the sequence number of a future IO transaction which, once successfully cached in a primary cache of the first storage device, is followed by the creation of a snapshot associated with a snapshot creation request pending with the source file system.
PCT/CN2019/107808 2018-10-01 2019-09-25 Method for handling a snapshot creation request and related storage device WO2020069654A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201980065309.XA CN112805949B (zh) 2018-10-01 2019-09-25 Method for handling a snapshot creation request and storage device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201831037119 2018-10-01
IN201831037119 2018-10-01

Publications (1)

Publication Number Publication Date
WO2020069654A1 true WO2020069654A1 (fr) 2020-04-09

Family

ID=70054942

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/107808 WO2020069654A1 (fr) Method for handling a snapshot creation request and related storage device

Country Status (2)

Country Link
CN (1) CN112805949B (fr)
WO (1) WO2020069654A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113868027A (zh) * 2021-12-01 2021-12-31 云和恩墨(北京)信息技术有限公司 Data snapshot method and apparatus

Citations (4)

Publication number Priority date Publication date Assignee Title
US20120137098A1 (en) * 2010-11-29 2012-05-31 Huawei Technologies Co., Ltd. Virtual storage migration method, virtual storage migration system and virtual machine monitor
US20160205182A1 (en) * 2015-01-12 2016-07-14 Strato Scale Ltd. Synchronization of snapshots in a distributed storage system
US20160320978A1 (en) * 2015-05-01 2016-11-03 Nimble Storage Inc. Management of writable snapshots in a network storage device
US20170068469A1 (en) * 2015-09-03 2017-03-09 Microsoft Technology Licensing, Llc Remote Shared Virtual Disk Snapshot Creation

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
US7206911B2 (en) * 2004-02-25 2007-04-17 International Business Machines Corporation Method, system, and program for a system architecture for an arbitrary number of backup components
US7165131B2 (en) * 2004-04-27 2007-01-16 Intel Corporation Separating transactions into different virtual channels
US20070300013A1 (en) * 2006-06-21 2007-12-27 Manabu Kitamura Storage system having transaction monitoring capability
EP2486487B1 (fr) * 2009-10-07 2014-12-03 Hewlett Packard Development Company, L.P. Mise en antémémoire de point d'extrémité par une mémoire hôte à partir d'un protocole de notification
US20110252208A1 (en) * 2010-04-12 2011-10-13 Microsoft Corporation Express-full backup of a cluster shared virtual machine
US10430298B2 (en) * 2010-10-28 2019-10-01 Microsoft Technology Licensing, Llc Versatile in-memory database recovery using logical log records
CN104216806B (zh) * 2014-07-24 2016-04-06 上海英方软件股份有限公司 Method and device for capturing and transmitting serialized file system operation logs
CN104866245B (zh) * 2015-06-03 2018-09-14 马鞍山创久科技股份有限公司 Method and apparatus for synchronizing snapshots between a cache device and a storage system
US10001933B1 (en) * 2015-06-23 2018-06-19 Amazon Technologies, Inc. Offload pipeline for data copying
US9740566B2 (en) * 2015-07-31 2017-08-22 Netapp, Inc. Snapshot creation workflow
US10042719B1 (en) * 2015-09-22 2018-08-07 EMC IP Holding Company LLC Optimizing application data backup in SMB
WO2017092016A1 (fr) * 2015-12-03 2017-06-08 Huawei Technologies Co., Ltd. Procédé pour qu'un dispositif de stockage de source envoie un fichier source et un fichier cloné du fichier source vers un dispositif de stockage de secours, dispositif de stockage de source et dispositif de stockage de secours


Cited By (2)

Publication number Priority date Publication date Assignee Title
CN113868027A (zh) * 2021-12-01 2021-12-31 云和恩墨(北京)信息技术有限公司 Data snapshot method and apparatus
CN113868027B (zh) * 2021-12-01 2022-12-23 云和恩墨(北京)信息技术有限公司 Data snapshot method and apparatus

Also Published As

Publication number Publication date
CN112805949B (zh) 2022-08-09
CN112805949A (zh) 2021-05-14

Similar Documents

Publication Publication Date Title
US11836155B2 (en) File system operation handling during cutover and steady state
US11068350B2 (en) Reconciliation in sync replication
US11144211B2 (en) Low overhead resynchronization snapshot creation and utilization
US10191677B1 (en) Asynchronous splitting
US8738813B1 (en) Method and apparatus for round trip synchronous replication using SCSI reads
US8726066B1 (en) Journal based replication with enhance failover
US8595455B2 (en) Maintaining data consistency in mirrored cluster storage systems using bitmap write-intent logging
US7610318B2 (en) Autonomic infrastructure enablement for point in time copy consistency
US9256605B1 (en) Reading and writing to an unexposed device
US9081754B1 (en) Method and apparatus for cascaded replication using a multi splitter
US9003138B1 (en) Read signature command
US9128628B1 (en) Dynamic replication mode switching
US8335771B1 (en) Storage array snapshots for logged access replication in a continuous data protection system
JP4236049B2 (ja) Method, system, and program for providing a mirror copy of data
US10223007B1 (en) Predicting IO
US10152267B1 (en) Replication data pull
US9639295B1 (en) Method and apparatus for reducing splitter latency using parallel splitting
US9959174B2 (en) Storage checkpointing in a mirrored virtual machine system
US10372554B1 (en) Verification and restore of replicated data using a cloud storing chunks of data and a plurality of hashes
WO2020069654A1 (fr) Procédé de gestion de demande de création d'instantané et dispositif de stockage associé
US11263091B2 (en) Using inode entries to mirror data operations across data storage sites
US10956271B2 (en) Point-in-time copy on a remote system
US10210060B2 (en) Online NVM format upgrade in a data storage system operating with active and standby memory controllers
US11875060B2 (en) Replication techniques using a replication log
US10185503B1 (en) Consistency group fault tolerance

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19869563

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19869563

Country of ref document: EP

Kind code of ref document: A1