WO2023061557A1 - Memory controller and method for shared memory storage - Google Patents


Publication number
WO2023061557A1
Authority
WO
WIPO (PCT)
Application number
PCT/EP2021/078128
Other languages
French (fr)
Inventor
Itamar OFEK
Michael Hirsch
Assaf Natanzon
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority to CN202180100656.9A (publication CN117693743A)
Priority to PCT/EP2021/078128
Publication of WO2023061557A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/1827 Management specifically adapted to NAS
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems

Definitions

  • the present disclosure relates generally to the field of shared storage and data replication; and more specifically, to a memory controller, and a method for use in a shared memory storage system.
  • Shared storages such as Network-attached storages (NAS) are widely used as a convenient method for storing and sharing data files.
  • the network-attached storages store data received from multiple clients in a source site and thus are commonly also referred to as a NAS share.
  • This data is further stored as a backup data at a target site, such as a target share.
  • data backup is used to protect and recover data in an event of data loss in the source site. Examples of the event of data loss may include, but are not limited to, data corruption, hardware or software failure in the source site, accidental deletion of data, hacking, or malicious attack.
  • a separate backup storage or the target share is extensively used to store a backup of the data present in the source site.
  • the NAS share is constantly used by multiple clients for storing new or updated data.
  • Data replication solutions are required to store such data from the NAS share to the target share as a backup.
  • Some NAS manufacturers provide data replication solutions between their own storage devices, i.e., the NAS share and the target share are required to belong to the same product manufacturer or compatible manufacturers. Such solutions force users to use hardware and software products from the same manufacturer (or vendor) and lead to vendor lock-in, which is not desirable.
  • some conventional data replication solutions are based on continuous replication of snapshot differences through application programming interface (API).
  • the conventional solutions that are based on snapshot APIs require scanning the entire file system and can only detect that a whole file has changed.
  • the present disclosure provides a memory controller, and a method for use in a shared memory storage system.
  • the present disclosure provides a solution to the existing problem of unreliability and inefficiency in conventional data replication solutions for shared memory storage. The problem is compounded by the fact that existing systems depend on compatible vendor services at the source and target sites: a user is bound to employ hardware and software solutions from the same manufacturer (or vendor) in both the source and target shared storage systems, which increases the difficulty of solving this problem of unreliable data replication and data recovery.
  • An aim of the present disclosure is to provide a solution that at least partially overcomes the problems encountered in prior art and provides an improved data replication solution by creating an initial consistent sync point of an active shared storage with multiple clients, while introducing minimal latency in the data flow and requiring no additional shared content read. Additionally, the disclosed solution eliminates the vendor lock-in issue, as it does not depend on compatible vendor or manufacturer services at the source and target sites.
  • the present disclosure provides a memory controller.
  • the memory controller is configured to be used in a shared memory storage system comprising a source memory server and a target memory server for storing files, a file having meta data and data.
  • the memory controller being configured to: connect to the source memory server; initiate a session utilizing a high-precision clock for synchronization, the high-precision clock having a precision exceeding a timing threshold; receive a request related to the source memory server; enter the request in a journal file; initiate backup on the target memory server for the source memory server, wherein one or more files of the source memory server are copied to the target memory server, the copied files being backup files; analyze backup files and adapt journal file; and replay requests in journal file on target memory server.
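  • Purely as an illustrative sketch, and not the claimed implementation, the sequence of operations above (journal first, then copy, then replay) can be outlined as follows. All names are hypothetical and Python dicts stand in for the source and target memory servers:

```python
import time

# Illustrative sketch only: dicts stand in for the memory servers.
# Journaling starts BEFORE the backup copy, so writes that race with the
# copy are still captured and later replayed on the target.

def run_replication(source, incoming_writes):
    journal = []   # requests entered with high-precision timestamps
    target = {}

    # Session initiated: every request related to the source is journaled.
    for path, data in incoming_writes:
        journal.append((time.perf_counter(), "write", path, data))
        source[path] = data          # the write also reaches the source

    # Backup initiated: current source files are copied to the target;
    # the copied files become the backup files.
    target.update(source)

    # Requests in the journal are replayed on the target in timestamp order.
    for _, op, path, data in sorted(journal):
        if op == "write":
            target[path] = data
    return target

source = {"/a": b"old"}
final = run_replication(source, [("/a", b"new"), ("/b", b"x")])
assert final == source               # target converges to the source state
```

In this toy form the replay is redundant because the copy happens after all writes; its purpose is to show the mechanism by which writes concurrent with a long-running copy would still reach the target.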
  • the memory controller of the present disclosure provides an improved data replication solution for the shared memory storage system, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server and the target memory server to belong to the same product manufacturer or compatible manufacturers. This eliminates the problem of vendor lock-in. Further, as the memory controller is configured to synchronize file operations of multiple clients using the journal file, race conditions are avoided and any possibility of reading overwritten data is suppressed. Moreover, the data replication solution provided by the memory controller works in a completely distributed fashion as each client is responsible for its own journaling. Furthermore, by virtue of adapting the journal file, an initial sync point is established to bring the target memory server to a known state that is consistent with the state of the source memory server at the start of some sequence of incremental changes. Moreover, adapting the journal file does not require any additional shared content read. Hence, the memory controller provides a reliable and efficient data replication solution for the shared memory storage system.
  • the memory controller is further configured to enter a request in the journal file along with a time stamp of the request generated by the high-precision clock.
  • the timestamping of the requests in the journal file enables the memory controller to efficiently synchronize file operations for multiple clients, thereby avoiding the single-point-of-failure problems without introducing any noticeable latency in data flow.
  • the memory controller is further configured to analyze the backup files and adapt the journal file by: generating a map for the backup files, wherein each file of the source memory server is mapped to a file in the target memory server; determining if the meta data of a backup file has been changed, and if so, indicating the backup file as being an orphan file; determining for each orphan file which other backup files are affected by the change to the meta data of the orphan file and linking the orphan file to those other backup files in the map; and deleting the orphan files from the journal file.
  • the adapted journal file includes all the metadata that is of interest for journaling, which in turn establishes an initial sync point to bring the target memory server to a known state that is consistent with the state of the source memory server at the start of some sequence of incremental changes.
  • the memory controller is further configured to determine that a request in the journal file relates to a file that is not a backup file, determine whether there is a request in the journal file for generating the file, and, if not, read the file from the source memory server and copy the file to the target memory server prior to replaying the journal file.
  • the memory controller is further configured to replay the journal file by executing all requests in the journal file in order of the time stamps.
  • the memory controller ensures an improved data replication solution without the need to have a programmatic access to the source memory server and without the requirement to reread all data written to the source memory server, as the journal file includes a complete set of file operations and metadata required to replay them at a remote location.
  • the memory controller comprises a client controller.
  • the client controller is configured to: connect to the source memory server; initiate the session utilizing the high-precision clock for synchronization; receive the request related to the source memory server; enter the request in the journal file; and initiate the backup.
  • the file operations of one or more clients are efficiently tracked and coordinated so as to ensure that the files present at the source memory server are reliably backed up at the target memory server.
  • the client controller is further configured to execute a replicator sequencer.
  • By virtue of the replicator sequencer, the issue of initial journal sync of an active shared file system, such as the shared memory storage system, is resolved without any complete pause in production of IO operations.
  • the memory controller comprises a target controller.
  • the target controller is configured to analyze backup files and adapt journal file and replay requests in journal file on the target memory server.
  • the backup files created by the client controller are efficiently analyzed so as to ensure that all the files present at the source memory server are reliably backed up at the target memory server.
  • the target controller is further configured to execute a replicator recipient.
  • the data backup or replication process at the target site such as the target memory server, is completed successfully.
  • the present disclosure provides a method for use in a shared memory storage system comprising a source memory server and a target memory server for storing files, a file having meta data and data.
  • the method comprising: initiating a session utilizing a high-precision clock for synchronization, the high-precision clock having a precision exceeding a timing threshold; receiving a request related to the source memory server; entering the request in a journal file; initiating backup on the target memory server for the source memory server, wherein one or more files of the source memory server are copied to the target memory server, the copied files being backup files; analyzing backup files and adapting journal file; and replaying requests in journal file on the target memory server.
  • the method of the present disclosure provides an improved data replication solution for the shared memory storage system, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server and the target memory server to belong to the same product manufacturer or compatible manufacturers, which eliminates the problem of vendor lock-in. Further, as the method is configured to synchronize file operations of multiple clients using the journal file, race conditions are avoided and any possibility of reading overwritten data is suppressed. Moreover, the data replication solution provided by the method works in a completely distributed fashion as each client is responsible for its own journaling. Hence, the method provides a reliable and efficient data replication solution for the shared memory storage system.
  • the present disclosure provides a computer-readable media comprising instructions that, when loaded into and executed by a memory controller, enable the memory controller to execute the method of the aforementioned aspect.
  • the computer-readable media achieves all the advantages and effects of the respective method of the present disclosure.
  • FIG. 1A is a network environment diagram of a shared memory storage system, in accordance with an embodiment of the present disclosure;
  • FIG. 1B is a block diagram that illustrates various exemplary components of a memory controller, in accordance with an embodiment of the present disclosure;
  • FIG. 2 is a flowchart of a method for use in a shared memory storage system, in accordance with an embodiment of the present disclosure;
  • FIG. 3 is an exemplary sequence diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure; and
  • FIG. 4 is an exemplary timing diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure.
  • an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent.
  • a non-underlined number relates to an item identified by a line linking the non-underlined number to the item.
  • the non-underlined number is used to identify a general item at which the arrow is pointing.
  • FIG. 1A is a network environment diagram of a shared memory storage system, in accordance with an embodiment of the present disclosure.
  • the shared memory storage system 102 includes a source memory server 104, a memory controller 106, and a target memory server 108.
  • the memory controller 106 further includes a client controller 110 and a target controller 112.
  • the shared memory storage system 102 refers to a computer data storage system configured to store and share data files.
  • the shared memory storage system 102 provides faster data access, easier administration, and simple configuration. Further, the shared memory storage system 102 simultaneously stores new or updated data received from multiple clients with an intent to provide communication among them and avoid storing redundant copies of data files.
  • the shared memory storage system 102 comprises one or more source memory servers (e.g., the source memory server 104), one or more processors (e.g., the memory controller 106), and one or more target memory servers (e.g., the target memory server 108).
  • Examples of the shared memory storage system 102 include, but are not limited to, a network-attached-storage (NAS) system, a cloud server, a file storage system, a block storage system, an object storage system, or a combination thereof.
  • the source memory server 104 refers to a file-level computer data storage connected to a computer network, such as a low-latency communications network, which provides data access to a heterogeneous group of one or more clients.
  • the source memory server 104 includes suitable logic, circuitry, and/or interfaces that is configured to receive and store data from the one or more clients.
  • the source memory server 104 may also be referred to as a NAS share.
  • the source memory server 104 supports multiple file-service protocols, and may enable clients to share (i.e., receive or transmit) data files across different operating environments, such as UNIX or Windows.
  • the source memory server 104 may refer to a data center, which may include one or more hard disk drives, solid state drives or persistent memory modules operated as a logical storage, redundant storage containers, or Redundant Array of Inexpensive Disks (RAID).
  • the memory controller 106 refers to a central sequencer that inserts a synchronization point into its journal stream.
  • the memory controller 106 coordinates synchronization points (or sync points) with high accuracy to initiate backup of data files. Such synchronization points define dataset boundaries, where the datasets are journal segments that are coordinated among the multiple clients.
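  • The dataset boundaries described above can be pictured with a minimal, hypothetical sketch (names and record layout are illustrative, not the disclosed format): a time-ordered journal is cut into datasets at each synchronized sync-point timestamp, so the segment boundaries are consistent across all clients that share the same sync points.

```python
# Hypothetical sketch: a dataset-sync point cuts a client's time-ordered
# journal into datasets (journal segments) at each synchronized instant.

def cut_datasets(journal, sync_times):
    """Split a time-ordered journal into datasets at each sync timestamp."""
    datasets, current, bounds = [], [], sorted(sync_times)
    for ts, op in journal:
        while bounds and ts >= bounds[0]:
            datasets.append(current)     # close the dataset at the boundary
            current, bounds = [], bounds[1:]
        current.append((ts, op))
    datasets.append(current)             # the still-open trailing dataset
    return datasets

segments = cut_datasets([(1, "w1"), (2, "w2"), (4, "w3")], sync_times=[3])
assert segments == [[(1, "w1"), (2, "w2")], [(4, "w3")]]
```

Because every client cuts at the same synchronized timestamps, the resulting segments can be consolidated per dataset without re-reading shared content.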
  • the memory controller 106 includes suitable logic, circuitry, interfaces, and/or code that is configured to execute a memory controlling process in the shared memory storage system 102. Examples of implementation of the memory controller 106 may include, but are not limited to, a central sequencer, a central data processing device, a NAS file operations journal consolidation device, and the like. The various components of the memory controller 106 are explained in detail in FIG. 1B.
  • the memory controller 106 includes the client controller 110, and the target controller 112.
  • the client controller 110 refers to a source site replication agent, which records file operations in a timed interval and transmits the data as datasets to a replicator sequencer.
  • the client controller 110 may also be referred to as an IO splitter.
  • the client controller 110 includes suitable logic, circuitry, interfaces, and/or code that is configured to connect to the source memory server 104; initiate the session utilizing the high-precision clock for synchronization; receive the request related to the source memory server 104; enter the request in the journal file; and initiate the backup.
  • the target controller 112 refers to a target site replication agent, which executes a replicator recipient.
  • the target controller 112 includes suitable logic, circuitry, interfaces, and/or code that is configured to analyze the backup files, adapt journal file, and replay requests in journal file on the target memory server 108.
  • the target memory server 108 refers to a file-level computer data storage connected to a computer network, such as a low-latency communications network, which provides data backup for the data files stored in the source memory server 104.
  • the target memory server 108 includes suitable logic, circuitry, and/or interfaces that is configured to back up the source memory server 104.
  • the target memory server 108 may also be referred to as a target share.
  • the target memory server 108 is used to protect and recover data in an event of data loss in the source site (i.e., the source memory server 104). Examples of the event of data loss may include, but are not limited to, data corruption, hardware or software failure in the source site, accidental deletion of data, hacking, or malicious attack.
  • a separate backup storage or the target memory server 108 is extensively used to store a backup of the data present in the source memory server 104.
  • the target memory server 108 include, but are not limited to, a secondary data storage system, a cloud server, a network-attached-storage (NAS) system, a file storage system, a block storage system, an object storage system, or a combination thereof.
  • FIG. 1B is a block diagram that illustrates various exemplary components of a memory controller, in accordance with an embodiment of the present disclosure.
  • FIG. 1B is described in conjunction with elements from FIG. 1A.
  • the memory controller 106 includes the client controller 110, the target controller 112, a network interface 114, a local memory, such as a memory 116, a clock 118, and a control circuitry 120.
  • the memory 116 may store a journal file 122.
  • the network interface 114 includes a software or hardware interface that may be configured to establish communication among the source memory server 104, the memory controller 106, and the target memory server 108. Examples of the network interface 114 may include, but are not limited to, a computer port, a network socket, a network interface controller (NIC), and any other network interface device.
  • the memory 116 includes suitable logic, circuitry, and/or interfaces that may be configured to store machine code and/or instructions executable by the memory controller 106. Examples of implementation of the memory 116 may include, but are not limited to, Random Access Memory (RAM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), and/or CPU cache memory.
  • the clock 118 of the memory controller 106 may refer to a high precision clock which is used to synchronize the file operations of the memory controller 106.
  • the control circuitry 120 includes suitable logic circuitry that may be configured to send a plurality of dataset-sync messages to synchronize the journaling operations of one or more clients.
  • Examples of the control circuitry 120 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry.
  • the operations executed by the memory controller 106 may be executed and controlled by the control circuitry 120.
  • the journal file 122 of the memory controller 106 refers to a data structure of a journaling file system, which is a fault-resilient file system.
  • the journal file 122 ensures that the data has been restored to its pre-crash configuration. It also recovers unsaved data and stores it in the location where it would have gone if the computer had not crashed.
  • the journal file 122 is captured on each client at the individual file operation level, i.e., individual writes to files are captured. These writes can thus be replicated, thereby avoiding replicating large files when only a small update was applied by a user in the one or more clients.
  • the journal file 122 information is useful to synchronize the file operations received from the one or more clients and to avoid race conditions by the eradication of any possibility to reread overwritten data.
  • the memory controller 106 is configured to connect to the source memory server 104.
  • the memory controller 106 is configured to be operatively connected to the source memory server 104 via a wired or a wireless network using known protocols, including, but not limited to, LAN, WLAN, Internet Protocol (IP), and the like.
  • the memory controller 106 is directly connected to the source memory server 104 in a common network and no gateways are used or network hops added, which avoids the single-point-of-failure problems due to gateways and minimizes latency in the data flow.
  • the source memory server 104 provides scalable and shared storage for multiple clients and may act as a primary storage for storing data.
  • the memory controller 106 is further configured to initiate a session utilizing a high-precision clock 118 for synchronization, the high-precision clock 118 having a precision exceeding a timing threshold.
  • the timing threshold is defined to be finer than the operating system's clock granularity, better than microseconds (e.g., according to IEEE 1588), i.e., shorter than the time taken to handle an IO request.
  • the memory controller 106 ensures that the clocks of one or more clients are accurately synchronized to at least the resolution of the time taken for an IO operation.
  • the high-precision clock 118 of the memory controller 106 synchronizes an initial known consistent crash backup point for starting a continuous replication session.
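  • As a hypothetical illustration of the precision requirement above (function names and the threshold value are assumptions, not part of the disclosure), a controller could measure the clock's actual resolution before starting a session rather than assume it:

```python
import time

# Hypothetical sketch: before starting a session, verify that the clock's
# measured resolution is finer than the timing threshold (i.e., finer than
# the time taken to handle one IO request).

def clock_resolution(clock=time.perf_counter, samples=100):
    """Smallest observed non-zero tick of `clock`, in seconds."""
    best = float("inf")
    for _ in range(samples):
        t0 = clock()
        t1 = clock()
        while t1 == t0:                  # spin until the clock advances
            t1 = clock()
        best = min(best, t1 - t0)
    return best

def may_start_session(timing_threshold):
    # The high-precision clock must have precision exceeding the threshold.
    return clock_resolution() < timing_threshold
```

On most platforms time.perf_counter resolves to well under a microsecond, but the resolution is platform-dependent, which is why the sketch measures it instead of assuming it.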
  • the memory controller 106 is further configured to receive a request related to the source memory server 104.
  • the memory controller 106 receives the request to replicate the data files stored in the source memory server 104 to the target memory server 108.
  • the request is managed by the client controller 110.
  • the request includes the metadata (e.g., inodes) related to all the data files which are to be replicated.
  • the data files refer to the files which are written by one or more clients on the source memory server 104.
  • the memory controller 106 is further configured to enter the request in a journal file 122.
  • Each of the one or more clients is responsible for journaling its own file operations independently.
  • the journal file 122 is captured on each client at the individual file operation level, i.e., individual writes to files are captured. These writes can thus be replicated, thereby avoiding replicating large files when only a small incremental change was applied by the one or more clients.
  • the journal file 122 includes all the data that the one or more clients send to the source memory server 104. Thus, there is no need to reread the data from the source memory server 104 at a later time, avoiding any possibility of a race condition and avoiding any possibility of reading overwritten data. Further, journaling is initiated prior to the backup process and the journal file 122 is sent periodically by one or more clients to the memory controller 106 independent of the backup process.
  • the memory controller 106 is further configured to enter a request in the journal file 122 along with a time stamp of the request generated by the high-precision clock 118.
  • Each request in the journal file 122 is timestamped to record the time of each request.
  • Each request is entered in the journal file 122 in a sequence of time as it occurs.
  • the order of requests in the journal file 122 refers to the time series of file operations at each of the one or more clients, which enables the memory controller 106 to efficiently synchronize the data for multiple clients, avoid the single-point-of-failure problems, and reduce the latency in data flow.
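  • Since each client journals independently and each per-client journal is already time-sorted (the bullets above), the controller can produce one consistent time-ordered stream with a streaming k-way merge. A minimal, hypothetical sketch (entry layout is illustrative):

```python
import heapq

# Hypothetical sketch: each client journals its own requests independently,
# with entries stamped by the synchronized high-precision clock. Because
# each per-client journal is time-sorted, the controller can merge them
# into a single time-ordered stream without buffering whole journals.

client_a = [(1.000001, "write", "/f", b"A1"),
            (1.000005, "write", "/g", b"A2")]
client_b = [(1.000003, "write", "/f", b"B1")]

merged = list(heapq.merge(client_a, client_b))   # ordered by timestamp
assert [e[3] for e in merged] == [b"A1", b"B1", b"A2"]
```

The merge only works if client clocks are synchronized to finer than the inter-request spacing, which is exactly what the timing threshold guarantees.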
  • the memory controller 106 is further configured to initiate backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files.
  • the memory controller 106 initiates the backup process of copying one or more data files from a source site, such as the source memory server 104, to a target site, such as the target memory server 108.
  • the copied files at the target memory server 108 are referred to as the backup files.
  • the backup process is required to protect and recover data in an event of data loss in the source site (i.e., the source memory server 104).
  • Examples of the event of data loss may include, but are not limited to, data corruption, hardware or software failure in the source site, accidental deletion of data, hacking, or malicious attack.
  • a separate backup storage or the target memory server 108 is extensively used to store a backup of the data present in the source memory server 104.
  • the memory controller 106 comprises a client controller 110.
  • the client controller 110 is configured to: connect to the source memory server 104; initiate the session utilizing the high-precision clock 118 for synchronization; receive the request related to the source memory server 104; enter the request in the journal file 122; and initiate the backup.
  • the client controller 110 is further configured to execute a replicator sequencer.
  • the replicator sequencer refers to an agent that collects records of IO operations transmitted in the form of datasets by the one or more clients, processes them, and transmits the incremental changes in them.
  • the replicator sequencer solves the issue of initial journal sync of an active shared file system, such as the shared memory storage system 102, and does not require any complete pause in production of IO operations.
  • the memory controller 106 is further configured to analyze backup files and adapt journal file 122.
  • the memory controller 106 analyzes the backup files to create a translation table to map source metadata (e.g., inodes) to actual restored metadata and accordingly adapt the journal file 122.
  • the backup files refer to the files copied from the source memory server 104 to the target memory server 108. The analysis is done by the target controller 112.
  • the process of analyzing the backup files may refer to processing the backup files to capture all potential instances where small and incremental changes might have taken place, without the need to scan the whole backup file.
  • the memory controller 106 processes the data received in the form of the journal file 122 from the one or more clients. The processing is executed in the background without the one or more clients having to wait for the processing to finish.
  • journal file 122 is adapted on the basis of the result of analysis.
  • the adapting of journal file 122 refers to modifying the backup files only for the small incremental changes such that an initial sync point is established to bring the target memory server 108 to a known state that is consistent with the state of the source memory server 104 at the start of some sequence of incremental changes.
  • the initial sync point may refer to a point at which synchronization is achieved between the IO operations performed by one or more clients to store data files in the source memory server 104 and the journaling operation performed by the memory controller 106 to back up the files of the source memory server 104 at the target memory server 108.
  • adapting the journal file 122 does not require any additional shared content read.
  • the efficiency and reliability of the data replication solution in setting up a disaster recovery solution is improved.
  • the memory controller 106 is further configured to analyze the backup files and adapt the journal file 122 by: generating a map for the backup files, wherein each file of the source memory server 104 is mapped to a file in the target memory server 108; determining if the meta data of a backup file has been changed, and if so, indicating the backup file as being an orphan file; determining for each orphan file which other backup files are affected by the change to the meta data of the orphan file and linking the orphan file to those other backup files in the map; and deleting the orphan files from the journal file 122.
  • the map refers to an inode map that maps source inode to target inode. The map includes the name (file or directory) associated with the source inode.
  • the journal file 122 is used to compile a (create/delete)-related operation call graph (create, link, move, unlink, and delete operations) for each path at which the inode is found. This determines all the paths at which the inode should exist at the target site, i.e., the target memory server 108. Further, the inode is linked to all its existing paths (if any), and unlinked or deleted from the orphan directory.
  • the adapted journal file 122 includes all the inodes that are of interest and on which the journaling should apply.
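As an illustration, the map-generation and orphan-pruning steps described in the preceding bullets can be sketched as follows. This is a minimal sketch only, not the claimed implementation; the names `BackupFile`, `JournalEntry`, and `adapt_journal` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BackupFile:
    inode: int          # source inode number
    path: str           # path of the restored copy at the target
    meta_changed: bool  # metadata touched (renamed, hard-linked, deleted) during backup

@dataclass
class JournalEntry:
    op: str             # e.g. "write", "link", "unlink", "rename"
    inode: int

def adapt_journal(backups, journal):
    """Build the source-to-target map, mark backup files whose metadata
    changed as orphans, link each orphan to the other backup files it
    affects, and delete the orphans' entries from the journal."""
    # Map each source inode to its restored location at the target.
    inode_map = {b.inode: b.path for b in backups}
    # Backup files whose metadata changed during the backup are orphans.
    orphans = {b.inode for b in backups if b.meta_changed}
    # Link each orphan to every backup file sharing its inode (hard links).
    links = {o: [b.path for b in backups if b.inode == o] for o in orphans}
    # The adapted journal holds only the inodes on which journaling applies.
    adapted = [e for e in journal if e.inode not in orphans]
    return inode_map, links, adapted
```

The adapted list that `adapt_journal` returns corresponds to the "inodes that are of interest" on which the journaling should apply.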
  • the memory controller 106 is further configured to determine that a request in the journal file 122 relates to a file that is not a backup file, determine whether there is a request in the journal file 122 for generating the file, and if not, read the file from the source memory server 104 and copy the file to the target memory server 108 prior to replaying the journal file.
  • the inodes that existed prior to the backup and still exist in the backup after it is done are kept as they are in the journal file 122.
  • Such inodes refer to metadata that was not touched, i.e., not renamed, hard-linked, or deleted, during the journaling process.
  • the file is reread from the source site, i.e., the source memory server 104.
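The handling of a journal request that relates to a file absent from the backup can be sketched as follows. The names are hypothetical; `read_source` and `copy_target` stand in for the actual source-read and target-copy operations.

```python
def prepare_missing_files(journal, backup_inodes, read_source, copy_target):
    """For each journal request touching a file that is not among the
    backup files, check whether the journal itself creates that file;
    if not, reread it from the source and copy it to the target before
    the journal is replayed."""
    # Inodes that some journal request will create during replay.
    created = {req["inode"] for req in journal if req["op"] == "create"}
    fetched = []
    for req in journal:
        ino = req["inode"]
        if ino not in backup_inodes and ino not in created and ino not in fetched:
            copy_target(ino, read_source(ino))  # reread from the source site
            fetched.append(ino)
    return fetched
```

Files created by the journal itself are skipped, which avoids duplicate copies of backup files.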
  • the memory controller 106 is further configured to replay requests in journal file 122 on the target memory server 108.
  • the journal file 122 includes the requests from one or more clients, which comprise all the data associated with the file operations of the one or more clients. For efficient and reliable replication, these requests are replicated by the memory controller 106 on the target memory server 108 so as to ensure that no data of any client is lost.
  • the memory controller 106 ensures an improved data replication solution without the need to have a programmatic access to the source memory server 104 and without the requirement to reread all data written to the source memory server 104, as the journal file 122 includes a complete set of file operations and metadata required to replay them at a remote location.
  • the memory controller 106 is further configured to replay the journal file 122 by executing all requests in the journal file 122 in order of the time stamps.
  • the journal file 122 includes the requests from one or more clients, which comprise all the data associated with the file operations of the one or more clients.
  • all requests in the journal file 122 are replayed by the memory controller 106 in order of the time stamps at the target memory server 108 so as to ensure that the replicated data is in sync with the data present at source memory server 104.
  • the memory controller 106 further ensures an improved data replication solution without the need to have a programmatic access to the source memory server 104 and without the requirement to reread all data written to the source memory server 104, as the journal file 122 includes a complete set of file operations and metadata required to replay them at a remote location.
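The timestamp-ordered replay described above can be sketched as follows. The names `replay_journal` and `apply_request` are hypothetical and illustrative only.

```python
def replay_journal(journal, apply_request):
    """Execute every request in the journal on the target, strictly in
    order of the high-precision time stamps recorded with each request."""
    for req in sorted(journal, key=lambda r: r["ts"]):
        apply_request(req)

def demo():
    # A tiny in-memory "target": because replay is timestamp-ordered,
    # the later write must win even if it arrives first in the journal.
    target = {}
    journal = [
        {"ts": 20, "inode": 1, "data": "new"},
        {"ts": 10, "inode": 1, "data": "old"},
    ]
    replay_journal(journal, lambda r: target.__setitem__(r["inode"], r["data"]))
    return target
```

Ordering by the high-precision time stamps is what keeps the replicated data in sync with the data present at the source, even when requests from multiple clients interleave.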
  • the memory controller 106 comprises a target controller 112.
  • the target controller 112 is configured to analyze backup files and adapt journal file 122 and replay requests in journal file 122 on the target memory server 108. Hence, it is the responsibility of the target controller 112 to efficiently analyze the backup files created by the client controller 110 so as to ensure that all the files present at the source memory server 104 are reliably backed up at the target memory server 108.
  • the target controller 112 is further configured to execute a replicator recipient.
  • the replicator recipient refers to a replicator receiver at the target site, such as the target memory server 108.
  • the replicator recipient replays all the requests of the journal file 122 transmitted by the target controller 112 and successfully completes the data backup or replication process at the target site, such as the target memory server 108.
  • the memory controller 106 of the present disclosure provides an improved data replication solution for the shared memory storage system 102, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server 104 and the target memory server 108 to belong to the same product manufacturer or compatible manufacturers. This eliminates the problem of vendor lock-in. Further, as the memory controller 106 is configured to synchronize file operations of multiple clients using the journal file 122, race conditions are avoided and any possibility of reading overwritten data is suppressed. Moreover, the data replication solution provided by the memory controller 106 works in a completely distributed fashion as each client is responsible for its own journaling.
  • by virtue of adapting the journal file 122, an initial sync point is established to bring the target memory server 108 to a known state that is consistent with the state of the source memory server 104 at the start of some sequence of incremental changes.
  • adapting the journal file 122 does not require any additional shared content read.
  • the memory controller 106 provides a reliable and efficient data replication solution for the shared memory storage system 102, which spans all the changes and captures all the potential instances of incremental changes in the journal file 122.
  • FIG. 2 is a flowchart for a method for use in a shared memory storage system, in accordance with an embodiment of the present disclosure.
  • With reference to FIG. 2, there is shown a method 200.
  • FIG. 2 is described in conjunction with elements of FIG. 1A and 1B.
  • the method 200 is for use in the shared memory storage system 102 described, for example, in FIG. 1A.
  • the method 200 includes steps 202 to 212.
  • the method 200 is executed by the memory controller 106 described, for example, in FIG. 1A and 1B.
  • the method 200 comprises initiating a session utilizing a high-precision clock 118 for synchronization, the high-precision clock 118 having a precision exceeding a timing threshold.
  • the timing threshold is defined to be finer than the operating system's clock granularity, i.e., better than microseconds (e.g., according to IEEE 1588), and in particular better than the time taken to handle an IO request.
  • the method 200 ensures that the clocks of one or more clients are accurately synchronized to at least the resolution of the time taken for an IO operation.
  • the high-precision clock 118 of the memory controller 106 synchronizes an initial known consistent crash backup point for starting a continuous replication session.
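A rough sketch of checking that a local clock satisfies the timing threshold is given below. Python's monotonic clock is used here only as a stand-in for an IEEE 1588-synchronized clock, and the function names are hypothetical.

```python
import time

def measured_tick_ns(samples=200):
    """Estimate the smallest observable tick of the local monotonic clock."""
    best = None
    for _ in range(samples):
        t0 = time.monotonic_ns()
        t1 = time.monotonic_ns()
        while t1 == t0:              # spin until the clock advances
            t1 = time.monotonic_ns()
        d = t1 - t0
        best = d if best is None or d < best else best
    return best

def clock_qualifies(tick_ns, io_handling_ns):
    """The clock exceeds the timing threshold when its tick is finer than
    both a microsecond and the time taken to handle an IO request."""
    return tick_ns < min(io_handling_ns, 1_000)
```

With such a check, one or more clients can verify that their clocks resolve individual IO operations before joining the synchronization session.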
  • the method 200 further comprises receiving a request related to the source memory server 104.
  • the memory controller 106 receives the request to replicate the data files stored in the source memory server 104 to the target memory server 108.
  • the request is managed by the client controller 110.
  • the request includes the metadata (e.g., inodes) related to all the data files which are to be replicated.
  • the data files refer to the files which are written by one or more clients on the source memory server 104.
  • the method 200 further comprises entering the request in a journal file 122.
  • Each of the one or more clients is responsible for journaling its own file operations independently.
  • the journal file 122 includes all the data that the one or more clients send to the source memory server 104.
  • the method 200 eliminates the need to reread the data from the source memory server 104 at a later time, avoiding any possibility of a race condition and avoiding any possibility of reading overwritten data.
  • journaling is initiated prior to the backup process and the journal file 122 is sent periodically by one or more clients to the memory controller 106 independent of the backup process.
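Per-client journaling with high-precision timestamps and periodic, backup-independent flushing can be sketched as follows. The `ClientJournal` class and its method names are hypothetical, not the claimed implementation.

```python
import time

class ClientJournal:
    """Each client journals its own file operations, stamping every entry
    with a high-precision timestamp, and periodically flushes the journal
    to the memory controller, independent of the backup process."""

    def __init__(self, send):
        self._send = send     # callable delivering a batch to the controller
        self._entries = []

    def record(self, op, inode, data=None):
        # The journal carries all the data the client sends to the source,
        # so nothing has to be reread from the source server later.
        self._entries.append(
            {"ts": time.monotonic_ns(), "op": op, "inode": inode, "data": data}
        )

    def flush(self):
        # Periodic, backup-independent delivery of the accumulated batch.
        batch, self._entries = self._entries, []
        self._send(batch)
        return len(batch)
```

Because each client keeps its own journal, the scheme is fully distributed and no central component has to watch the shared storage for changed files.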
  • the method 200 further comprises initiating backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files.
  • the method 200 initiates the backup process of copying one or more data files from a source site, such as the source memory server 104, to a target site, such as the target memory server 108.
  • the copied files at the target memory server 108 are referred to as the backup files.
  • the backup process is required to protect and recover data in an event of data loss in the source site (i.e., the source memory server 104).
  • the method 200 further comprises analyzing backup files and adapting journal file 122.
  • the memory controller 106 analyzes the backup files to create a translation table to map source metadata (e.g., inodes) to actual restored metadata and accordingly adapt the journal file 122. The analysis is done by the target controller 112. Further, the memory controller 106 processes the data received in the form of the journal file 122 from the one or more clients. The processing is executed in the background without the one or more clients having to wait for the processing to finish. The processing helps in reliable and efficient synchronization of data among the one or more clients without introducing any latency in the production data path (i.e., data flow).
  • the method 200 further comprises replaying requests in journal file 122 on the target memory server 108.
  • the journal file 122 includes the requests from one or more clients, which comprise all the data associated with the file operations of the one or more clients. For efficient and reliable replication, these requests are replicated by the memory controller 106 on the target memory server 108 so as to ensure that no data of any client is lost.
  • the method 200 ensures an improved data replication solution without the need to have a programmatic access to the source memory server 104 and without the requirement to reread all data written to the source memory server 104, as the journal file 122 includes a complete set of file operations and metadata required to replay them at a remote location.
  • the steps 202 to 212 are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein.
  • the method 200 uses the memory controller 106 to provide an improved data replication solution for the shared memory storage system 102, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server 104 and the target memory server 108 to belong to the same product manufacturer or compatible manufacturers, which eliminates the problem of vendor lock-in. Further, as the method 200 is configured to synchronize file operations of multiple clients using the journal file 122, race conditions are avoided and any possibility of reading overwritten data is suppressed.
  • the data replication solution provided by the method 200 works in a completely distributed fashion as each client is responsible for its own journaling. Further, by virtue of adapting the journal file 122, an initial sync point is established to bring the target memory server 108 to a known state that is consistent with the state of the source memory server 104 at the start of some sequence of incremental changes. Hence, the method 200 provides a reliable and efficient data replication solution for the shared memory storage system 102.
  • the present disclosure provides a computer-readable media comprising instructions that when loaded into and executed by a memory controller 106 enables the memory controller 106 to execute the method 200.
  • the computer-readable media refers to a non-transitory computer-readable storage medium. Examples of implementation of the computer-readable media include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, a Secure Digital (SD) card.
  • FIG. 3 is an exemplary sequence diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure.
  • FIG. 3 is described in conjunction with elements of FIG. 1A, 1B, and 2.
  • With reference to FIG. 3, there is shown a sequence diagram 300 that depicts a data replication solution.
  • There is further shown a source memory server 302, a replicator sequencer 304, one or more clients 306, a replicator recipient 308, a backup API 310, and a target memory server 312.
  • the source memory server 302 and the target memory server 312 correspond to the source memory server 104 and the target memory server 108 (of FIG. 1A), respectively.
  • the replicator sequencer 304 refers to an agent that collects records of IO operations transmitted in the form of datasets by the one or more clients 306, processes them, and transmits the incremental changes in them.
  • Each of the one or more clients 306 refers to a client that is communicatively coupled to the source memory server 302 for data access and storage.
  • the one or more clients 306 are operatively connected to one another and also to the replicator sequencer 304 through a low-latency communications network, such as a LAN or WLAN.
  • the one or more clients 306 may be a heterogeneous group of clients, where each of the one or more clients 306 includes suitable logic, circuitry, and interfaces configured to remotely access data from the source memory server 302.
  • Each of the one or more clients 306 may be associated with a user who may perform specific file operations and further store the data associated with such file operations to the source memory server 302.
  • Examples of the one or more clients 306 include, but are not limited to, a thin client, a laptop computer, a desktop computer, a smartphone, a wireless modem, or other computing devices.
  • the replicator recipient 308 refers to a replicator receiver at the target site, such as the target memory server 312.
  • the backup API 310 refers to a backup application programming interface, which provides an intermediary interface to store the backup files at the target site, i.e., the target memory server 312.
  • one or more clients 306 are connected to the source memory server 302.
  • the one or more clients 306 may store one or more data files in the source memory server 302 via one or more file operations.
  • the replicator sequencer 304 joins the high precision clock session.
  • the high precision clock is required to accurately synchronize the journal operations of the one or more clients 306.
  • the one or more clients 306 start sending the IO journal. Journaling is initiated by the one or more clients 306 prior to the backup process, and the IO journal is sent periodically to the replicator sequencer 304 independent of the backup process.
  • the IO journal refers to a data structure of a journaling file system, which is a fault-resilient file system. In the event of a system failure, the IO journal ensures that the data has been restored to its pre-crash configuration. It also recovers unsaved data and stores it in the location where it would have gone if the computer had not crashed. Since the IO journal is captured on each client, such as the one or more clients 306, at the individual file operation level, individual writes to files are captured.
  • the IO journal information is useful to synchronize the file operations received from the one or more clients 306 and to avoid race conditions by the eradication of any possibility to reread overwritten data.
  • the replicator sequencer 304 creates backup and restores the backup files at the backup API 310.
  • the replicator sequencer 304 creates backup of the IO journal received from one or more clients 306.
  • the backup files are then sent to backup API 310 to be restored at the target memory server 312.
  • the backup API 310 restores the backup files at the target memory server 312.
  • the backup files can either be directly piped to a restore session on the target memory server 312 or can be initiated via the backup API 310.
  • the replicator recipient 308 receives the IO journal from the replicator sequencer 304.
  • the IO journal received by the replicator recipient 308 corresponds to the inodes that existed prior to the backup and still exist in the backup after it is done.
  • Such inodes refer to metadata that was not touched, i.e., not renamed, hard-linked, or deleted, during the journaling process.
  • the backup API 310 sends an acknowledgement to the replicator sequencer 304 that the backup is done successfully. Since the IO journal received by the replicator recipient 308 was not touched during the journaling process, the IO journal is backed up as it is at the target memory server 312 and an acknowledgement regarding successful backup is sent to the replicator sequencer 304.
  • the one or more clients 306 again send a journal to the replicator sequencer 304.
  • the replicator sequencer 304 sends the journal to the replicator recipient 308.
  • the replicator recipient 308 analyzes the journal and prunes orphan inode files.
  • the replicator recipient 308 analyzes the journal to create a translation table to map source metadata (e.g., inodes) to actual restored metadata and accordingly adapt the journal. If the metadata of the inode had been touched (i.e., copies of the inode were created with hard links, copies were deleted, the inode's parent was changed (rename), or the inode's file name was changed), the file is indicated as an orphan file. Further, the journal is used to compile a create/delete-related operation call graph (create, link, move, unlink, and delete operations) for each path at which the inode is found.
  • the replicator recipient 308 replays the recorded journal operations for each inode at the target memory server 312. For efficient and reliable data replication, the recorded journal operations for each inode are replayed by the replicator recipient 308 on the target memory server 312 so as to ensure that no data of any client, such as one or more clients 306, is lost.
  • the replicator recipient 308 sends an acknowledgement to the replicator sequencer 304 that the initial sync is on.
  • the replicator recipient 308 sends an acknowledgement to the replicator sequencer 304 indicating that the initial sync is established.
  • FIG. 4 is an exemplary timing diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure.
  • FIG. 4 is described in conjunction with elements of FIG. 1A, 1B, 2, and 3.
  • With reference to FIG. 4, there is shown a timing diagram 400 that depicts a data replication solution.
  • There is further shown a desired consistency point 402 and a target backup complete point 404.
  • the target backup complete point 404 refers to a point in time where the backup process is complete.
  • the target backup complete point 404 must contain j1 operations and may contain j2 to j4 operations when j5 is chosen as the desired consistency point 402.
  • the journal points are time consistent, and the backup process is not atomic.
  • the target memory server 108 compares the journal with the actual state of each inode/parent directory inode in the backup system. All operations of j0 must be present/overwritten in the backup, and the state including j5 is the desired consistency point 402.
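The replay-marking procedure that the tables below illustrate, walking the journal points backwards from the desired consistency point and skipping operations whose data is already present in the backup, can be sketched as follows. The names are hypothetical and the journal points are simplified to per-inode data maps.

```python
def replay_actions(journal_points, backup_state, desired):
    """Walk the journal points backwards from the desired consistency point
    (e.g. j5, j4, ..., j1); for each inode keep only the newest operation,
    and skip the replay when the backup already holds the same data."""
    names = [name for name, _ in journal_points]
    actions, seen = [], set()
    for name, ops in reversed(journal_points[: names.index(desired) + 1]):
        for inode, data in ops.items():
            if inode in seen:
                continue            # a newer journal point already decided this inode
            seen.add(inode)
            if backup_state.get(inode) != data:
                actions.append((name, inode, data))
    return actions
```

In this sketch, an action is emitted only when the backup state differs from the journaled data, matching the observation that a replay action is not performed when the same data already exists.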
  • Table 1 shows the backup status at the target memory server 108.
  • Table 1 shows the inodes (or metadata) associated with the backup files along with the size and the path followed to access them.
  • Table 2 shows the final state after processing. It can be observed from Table 2 that the creation of files is not skipped during the backup process.
  • Table 3 to Table 7 show the replay actions performed to mark the journal entries starting from the desired consistency point 402, i.e., j5.
  • Table 3 shows j5 processing according to target inodes
  • Table 4 shows j4 processing according to target inodes
  • Table 5 shows j3 processing according to target inodes
  • Table 6 shows j2 processing according to target inodes
  • Table 7 shows j1 processing according to target inodes. It can be observed that the replay action is not performed if the same data already exists in the journal.
  • Table 1 Backup status
  • Table 4 j4 processing according to target inodes
  • Table 5 j3 processing according to target inodes
  • the memory controller 106 is configured to be used in a shared memory storage system 102 comprising a source memory server 104 and a target memory server 108 for storing files, a file having meta data and data.
  • the memory controller 106 being configured to: connect to the source memory server 104; initiate a session utilizing a high-precision clock 118 for synchronization, the high- precision clock 118 having a precision exceeding a timing threshold; receive a request related to the source memory server 104; enter the request in a journal file 122; initiate backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files; analyze backup files and adapt journal file 122; and replay requests in journal file 122 on target memory server 108.
  • Various embodiments of the disclosure thus further provide a method 200 for use in a shared memory storage system 102 comprising a source memory server 104 and a target memory server 108 for storing files, a file having meta data and data.
  • the method 200 comprising: initiating a session utilizing a high-precision clock 118 for synchronization, the high-precision clock 118 having a precision exceeding a timing threshold; receiving a request related to the source memory server 104; entering the request in a journal file 122; initiating backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files; analyzing backup files and adapting journal file 122; and replaying requests in journal file 122 on the target memory server 108.


Abstract

A memory controller for use in a shared memory system, which includes a source memory server and a target memory server for storing files. The memory controller is configured to connect to the source memory server and initiate a session utilizing a high-precision clock for synchronization, where the high-precision clock has a precision exceeding a timing threshold. The memory controller further receives a request related to the source memory server, enters the request in a journal file, and initiates backup on the target memory server for the source memory server. One or more files of the source memory server are copied to the target memory server, where the copied files are backup files. The memory controller further analyzes the backup files, adapts the journal file, and replays requests in the journal file on the target memory server. Thus, an initial consistent sync point of the shared memory system with multiple clients is created.

Description

MEMORY CONTROLLER AND METHOD FOR SHARED MEMORY STORAGE
TECHNICAL FIELD
The present disclosure relates generally to the field of shared storage and data replication; and more specifically, to a memory controller, and a method for use in a shared memory storage system.
BACKGROUND
Shared storages, such as Network-attached storages (NAS), are widely used as a convenient method for storing and sharing data files. The network-attached storages store data received from multiple clients in a source site and thus are commonly also referred to as a NAS share. This data is further stored as backup data at a target site, such as a target share. Typically, data backup is used to protect and recover data in an event of data loss in the source site. Examples of the event of data loss may include, but are not limited to, data corruption, hardware or software failure in the source site, accidental deletion of data, hacking, or malicious attack. Thus, for safety reasons, a separate backup storage or the target share is extensively used to store a backup of the data present in the source site.
Conventionally, the NAS share is constantly used by multiple clients for storing new or updated data. Data replication solutions are required to store such data from the NAS share to the target share as a backup. Some NAS manufacturers provide data replication solutions between their own storage devices, i.e., the NAS share and the target share are required to belong to the same product manufacturer or compatible manufacturers. Such solutions force users to use the hardware and software products from the same manufacturer (or vendor) and lead to the situation of vendor lock-in, which is not desirable. Further, some conventional data replication solutions are based on continuous replication of snapshot differences through an application programming interface (API). However, the conventional solutions that are based on snapshot APIs include scanning the entire file systems and can only detect that a whole file has changed. This is highly inefficient and impractical for applications, such as databases, that modify specific data within very large files. For example, if a small change is made in a very large file, then the snapshot differences will indicate that the file has been modified. Thus, the whole file will be replicated rather than updating the small incremental changes made in it, making the data replication solution inefficient and impractical. Further, some conventional data replication solutions take the approach of watching for changed files, which may be based on periodically scanning the entire shared storage, or they may use facilities on the clients to watch for changed files. However, such conventional data replication solutions introduce race conditions as there may be Input/Output operations (IOs) to the files while they are being copied. Thus, there exists a technical problem of inefficiency and unreliability associated with the conventional data replication solutions for shared memory storage.
Therefore, in light of the foregoing discussion, there exists a need to overcome the aforementioned drawbacks associated with conventional data replication solutions.
SUMMARY
The present disclosure provides a memory controller, and a method for use in a shared memory storage system. The present disclosure provides a solution to the existing problem of unreliability and inefficiency in conventional data replication solutions for shared memory storage, where the problem is compounded by the fact that in existing systems, there is a dependency to use compatible vendor services at the source and target site, and a user is bound or forced to employ hardware and software solutions in both the source and target shared storage systems from the same manufacturer (or vendor), which increases the difficulty of solving this problem of unreliable data replication and data recovery. An aim of the present disclosure is to provide a solution that overcomes, at least partially, the problems encountered in the prior art and provides an improved data replication solution by creating an initial consistent sync point of an active shared storage with multiple clients while introducing minimal latency in the data flow with no additional shared content read. Additionally, the disclosed solution eliminates the vendor lock-in issue as the solution does not depend on compatible vendor or manufacturer services at the source and target site.
One or more objects of the present disclosure are achieved by the solutions provided in the enclosed independent claims. Advantageous implementations of the present disclosure are further defined in the dependent claims. In one aspect, the present disclosure provides a memory controller. The memory controller is configured to be used in a shared memory storage system comprising a source memory server and a target memory server for storing files, a file having meta data and data. The memory controller being configured to: connect to the source memory server; initiate a session utilizing a high-precision clock for synchronization, the high-precision clock having a precision exceeding a timing threshold; receive a request related to the source memory server; enter the request in a journal file; initiate backup on the target memory server for the source memory server, wherein one or more files of the source memory server are copied to the target memory server, the copied files being backup files; analyze backup files and adapt journal file; and replay requests in journal file on target memory server.
The memory controller of the present disclosure provides an improved data replication solution for the shared memory storage system, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server and the target memory server to belong to the same product manufacturer or compatible manufacturers. This eliminates the problem of vendor lock-in. Further, as the memory controller is configured to synchronize file operations of multiple clients using the journal file, race conditions are avoided and any possibility of reading overwritten data is suppressed. Moreover, the data replication solution provided by the memory controller works in a completely distributed fashion as each client is responsible for its own journaling. Furthermore, by virtue of adapting the journal file, an initial sync point is established to bring the target memory server to a known state that is consistent with the state of the source memory server at the start of some sequence of incremental changes. Moreover, adapting the journal file does not require any additional shared content read. Hence, the memory controller provides a reliable and efficient data replication solution for the shared memory storage system.
In an implementation form, the memory controller is further configured to enter a request in the journal file along with a time stamp of the request generated by the high-precision clock.
The timestamping of the requests in the journal file enables the memory controller to efficiently synchronize file operations for multiple clients, thereby avoiding the single-point-of-failure problems without introducing any noticeable latency in data flow.
In a further implementation form, the memory controller is further configured to analyze the backup files and adapt the journal file by: generating a map for the backup files, wherein each file of the source memory server is mapped to a file in the target memory server; determining if the meta data of a backup file has been changed, and if so, indicating the backup file as being an orphan file; determining for each orphan file which other backup files are affected by the change to the meta data of the orphan file and linking the orphan file to those other backup files in the map; and deleting the orphan files from the journal file.
Beneficially, the adapted journal file includes all the metadata that is of interest for journaling, which in turn establishes an initial sync point to bring the target memory server to a known state that is consistent with the state of the source memory server at the start of some sequence of incremental changes.
In a further implementation form, the memory controller is further configured to determine that a request in the journal file relates to a file that is not a backup file, determine whether there is a request in the journal file for generating the file, and if not, read the file from the source memory server and copy the file to the target memory server prior to replaying the journal file.
Beneficially, duplicate copies of backup files are avoided.
In a further implementation form, the memory controller is further configured to replay the journal file by executing all requests in the journal file in order of the time stamps.
Beneficially, the memory controller ensures an improved data replication solution without the need to have a programmatic access to the source memory server and without the requirement to reread all data written to the source memory server, as the journal file includes a complete set of file operations and metadata required to replay them at a remote location.
In a further implementation form, the memory controller comprises a client controller. The client controller is configured to: connect to the source memory server; initiate the session utilizing the high-precision clock for synchronization; receive the request related to the source memory server; enter the request in the journal file; and initiate the backup.
By virtue of the client controller, the file operations of one or more clients are efficiently tracked and coordinated so as to ensure that the files present at the source memory server are reliably backed up at the target memory server.
In a further implementation form, the client controller is further configured to execute a replicator sequencer. By virtue of the replicator sequencer, the issue of initial journal sync of an active shared file system, such as the shared memory storage system, is resolved without any complete pause in production of IO operations.
In a further implementation form, the memory controller comprises a target controller. The target controller is configured to analyze backup files and adapt journal file and replay requests in journal file on the target memory server.
By virtue of the target controller, the backup files created by the client controller are efficiently analyzed so as to ensure that all the files present at the source memory server are reliably backed up at the target memory server.
In a further implementation form, the target controller is further configured to execute a replicator recipient.
By virtue of the replicator recipient, the data backup or replication process at the target site, such as the target memory server, is completed successfully.
In another aspect, the present disclosure provides a method for use in a shared memory storage system comprising a source memory server and a target memory server for storing files, a file having meta data and data. The method comprises: initiating a session utilizing a high-precision clock for synchronization, the high-precision clock having a precision exceeding a timing threshold; receiving a request related to the source memory server; entering the request in a journal file; initiating backup on the target memory server for the source memory server, wherein one or more files of the source memory server are copied to the target memory server, the copied files being backup files; analyzing backup files and adapting journal file; and replaying requests in journal file on the target memory server.
The method of the present disclosure provides an improved data replication solution for the shared memory storage system, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server and the target memory server to belong to the same product manufacturer or compatible manufacturers, which eliminates the problem of vendor lock-in. Further, as the method is configured to synchronize file operations of multiple clients using the journal file, race conditions are avoided and any possibility of reading overwritten data is suppressed. Moreover, the data replication solution provided by the method works in a completely distributed fashion as each client is responsible for its own journaling. Hence, the method provides a reliable and efficient data replication solution for the shared memory storage system.
In yet another aspect, the present disclosure provides a computer-readable media comprising instructions that when loaded into and executed by a memory controller enables the memory controller to execute the method of aforementioned aspect.
The computer-readable media achieves all the advantages and effects of the respective method of the present disclosure.
It has to be noted that all devices, elements, circuitry, units and means described in the present application could be implemented in the software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof. It will be appreciated that features of the present disclosure are susceptible to being combined in various combinations without departing from the scope of the present disclosure as defined by the appended claims.
Additional aspects, advantages, features, and objects of the present disclosure would be made apparent from the drawings and the detailed description of the illustrative implementations construed in conjunction with the appended claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
The summary above, as well as the following detailed description of illustrative embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the present disclosure, exemplary constructions of the disclosure are shown in the drawings. However, the present disclosure is not limited to specific methods and instrumentalities disclosed herein. Moreover, those skilled in the art will understand that the drawings are not to scale. Wherever possible, like elements have been indicated by identical numbers. Embodiments of the present disclosure will now be described, by way of example only, with reference to the following diagrams wherein:
FIG. 1A is a network environment diagram of a shared memory storage system, in accordance with an embodiment of the present disclosure;
FIG. 1B is a block diagram that illustrates various exemplary components of a memory controller, in accordance with an embodiment of the present disclosure;
FIG. 2 is a flowchart for a method for use in a shared memory storage system, in accordance with an embodiment of the present disclosure;
FIG. 3 is an exemplary sequence diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure;
FIG. 4 is an exemplary timing diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure.
In the accompanying drawings, an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent. A non-underlined number relates to an item identified by a line linking the non-underlined number to the item. When a number is non-underlined and accompanied by an associated arrow, the non-underlined number is used to identify a general item at which the arrow is pointing.
DETAILED DESCRIPTION OF EMBODIMENTS
The following detailed description illustrates embodiments of the present disclosure and ways in which they can be implemented. Although some modes of carrying out the present disclosure have been disclosed, those skilled in the art would recognize that other embodiments for carrying out or practicing the present disclosure are also possible.
FIG. 1A is a network environment diagram of a shared memory storage system, in accordance with an embodiment of the present disclosure. With reference to FIG. 1A, there is shown a network environment diagram 100 of a shared memory storage system 102. The shared memory storage system 102 includes a source memory server 104, a memory controller 106, and a target memory server 108. In an implementation, the memory controller 106 further includes a client controller 110 and a target controller 112. The shared memory storage system 102 refers to a computer data storage system configured to store and share data files. The shared memory storage system 102 provides faster data access, easier administration, and simple configuration. Further, the shared memory storage system 102 simultaneously stores new or updated data received from multiple clients with an intent to provide communication among them and avoid storing redundant copies of data files. The shared memory storage system 102 comprises one or more source memory servers (e.g., the source memory server 104), one or more processors (e.g., the memory controller 106), and one or more target memory servers (e.g., the target memory server 108). Examples of the shared memory storage system 102 include, but are not limited to, a network-attached-storage (NAS) system, a cloud server, a file storage system, a block storage system, an object storage system, or a combination thereof.
The source memory server 104 refers to a file-level computer data storage connected to a computer network, such as a low-latency communications network, which provides data access to a heterogeneous group of one or more clients. The source memory server 104 includes suitable logic, circuitry, and/or interfaces that is configured to receive and store data from the one or more clients. The source memory server 104 may also be referred to as a NAS share. The source memory server 104 supports multiple file-service protocols, and may enable clients to share (i.e., receive or transmit) data files across different operating environments, such as UNIX or Windows. In an example, the source memory server 104 may refer to a data center, which may include one or more hard disk drives, solid state drives or persistent memory modules operated as a logical storage, redundant storage containers, or Redundant Array of Inexpensive Disks (RAID).
The memory controller 106 refers to a central sequencer that inserts a synchronization point into its journal stream. The memory controller 106 coordinates synchronization points (or sync points) with high accuracy to initiate backup of data files. Such synchronization points define dataset boundaries, where the datasets are journal segments that are coordinated among the multiple clients. The memory controller 106 includes suitable logic, circuitry, interfaces, and/or code that is configured to execute a memory controlling process in the shared memory storage system 102. Examples of implementation of the memory controller 106 may include, but are not limited to, a central sequencer, a central data processing device, a NAS file operations journal consolidation device, and the like. The various components of the memory controller 106 are explained in detail in FIG. IB. The memory controller 106 includes the client controller 110, and the target controller 112. The client controller 110 refers to a source site replication agent, which records file operations in a timed interval and transmits the data as datasets to a replicator sequencer. The client controller 110 may also be referred to as an IO splitter. The client controller 110 includes suitable logic, circuitry, interfaces, and/or code that is configured to connect to the source memory server 104; initiate the session utilizing the high-precision clock for synchronization; receive the request related to the source memory server 104; enter the request in the journal file; and initiate the backup. Further, the target controller 112 refers to a target site replication agent, which executes a replicator recipient. The target controller 112 includes suitable logic, circuitry, interfaces, and/or code that is configured to analyze the backup files, adapt journal file, and replay requests in journal file on the target memory server 108.
The target memory server 108 refers to a file-level computer data storage connected to a computer network, such as a low-latency communications network, which provides data backup for the data files stored in the source memory server 104. The target memory server 108 includes suitable logic, circuitry, and/or interfaces that is configured to back up the source memory server 104. The target memory server 108 may also be referred to as a target share. The target memory server 108 is used to protect and recover data in an event of data loss in the source site (i.e., the source memory server 104). Examples of the event of data loss may include, but are not limited to, data corruption, hardware or software failure in the source site, accidental deletion of data, hacking, or malicious attack. Thus, for safety reasons, a separate backup storage or the target memory server 108 is extensively used to store a backup of the data present in the source memory server 104. Examples of the target memory server 108 include, but are not limited to, a secondary data storage system, a cloud server, a network-attached-storage (NAS) system, a file storage system, a block storage system, an object storage system, or a combination thereof.
FIG. 1B is a block diagram that illustrates various exemplary components of a memory controller, in accordance with an embodiment of the present disclosure. FIG. 1B is described in conjunction with elements from FIG. 1A. With reference to FIG. 1B, there is shown a block diagram of a memory controller 106. In an implementation, the memory controller 106 includes the client controller 110, the target controller 112, a network interface 114, a local memory, such as a memory 116, a clock 118, and a control circuitry 120. Further, the memory 116 may store a journal file 122. The network interface 114 includes a software or hardware interface that may be configured to establish communication among the source memory server 104, the memory controller 106, and the target memory server 108. Examples of the network interface 114 may include, but are not limited to, a computer port, a network socket, a network interface controller (NIC), and any other network interface device.
The memory 116 includes suitable logic, circuitry, and/or interfaces that may be configured to store machine code and/or instructions executable by the memory controller 106. Examples of implementation of the memory 116 may include, but are not limited to, Random Access Memory (RAM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), and/or CPU cache memory.
The clock 118 of the memory controller 106 may refer to a high precision clock which is used to synchronize the file operations of the memory controller 106.
The control circuitry 120 includes suitable logic circuitry that may be configured to send a plurality of dataset-sync messages to synchronize the journaling operations of one or more clients. Examples of the control circuitry 120 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry. In an implementation, the operations executed by the memory controller 106 may be executed and controlled by the control circuitry 120.
The journal file 122 of the memory controller 106 refers to a data structure of a journaling file system, which is a fault-resilient file system. In the event of a system failure, the journal file 122 ensures that the data is restored to its pre-crash configuration. It also recovers unsaved data and stores it in the location where it would have gone if the computer had not crashed. Because the journal file 122 is captured on each client at the individual file-operation level, rather than on just one central device, individual writes to files are captured. These writes can thus be replicated, thereby avoiding replicating large files when only a small update was applied by a user at the one or more clients. The journal file 122 information is used to synchronize the file operations received from the one or more clients and to avoid race conditions by eliminating any possibility of rereading overwritten data. In operation, the memory controller 106 is configured to connect to the source memory server 104. The memory controller 106 is configured to be operatively connected to the source memory server 104 via a wired or a wireless network using known protocols, including, but not limited to, LAN, WLAN, Internet Protocol (IP), and the like. As shown in FIG. 1A, the memory controller 106 is directly connected to the source memory server 104 in a common network and no gateways are used or network hops added, which avoids single-point-of-failure problems due to gateways and minimizes latency in the data flow. In an example, the source memory server 104 provides scalable and shared storage for multiple clients and may act as a primary storage for storing data.
The memory controller 106 is further configured to initiate a session utilizing a high-precision clock 118 for synchronization, the high-precision clock 118 having a precision exceeding a timing threshold. The timing threshold is defined to be finer than the granularity of the system's operating system, i.e., better than microseconds (e.g., according to IEEE 1588) and shorter than the time taken to handle an IO request. Hence, the memory controller 106 ensures that the clocks of one or more clients are accurately synchronized to at least the resolution of the time taken for an IO operation. Thus, the high-precision clock 118 of the memory controller 106 synchronizes an initial known consistent crash backup point for starting a continuous replication session.
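Purely as an illustration of the precision requirement above, the check that a local clock is fine-grained enough to start a session might be sketched as follows; the function names and the one-microsecond threshold are assumptions for illustration, not taken from the disclosure:

```python
import time

IO_HANDLING_TIME_NS = 1_000  # assumed timing threshold: 1 microsecond

def clock_resolution_ns(samples: int = 100) -> int:
    """Estimate the smallest observable tick of the monotonic clock."""
    best = None
    for _ in range(samples):
        t0 = time.monotonic_ns()
        t1 = time.monotonic_ns()
        while t1 == t0:  # spin until the clock visibly advances
            t1 = time.monotonic_ns()
        best = t1 - t0 if best is None else min(best, t1 - t0)
    return best

def may_start_session(threshold_ns: int = IO_HANDLING_TIME_NS) -> bool:
    """A session may start only if the clock is finer than the threshold."""
    return clock_resolution_ns() < threshold_ns
```

A real implementation would additionally synchronize the clients' clocks against one another (e.g., via a PTP-style protocol), which this sketch does not attempt.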
The memory controller 106 is further configured to receive a request related to the source memory server 104. The memory controller 106 receives the request to replicate the data files stored in the source memory server 104 to the target memory server 108. The request is managed by the client controller 110. The request includes the metadata (e.g., inodes) related to all the data files which are to be replicated. The data files refer to the files which are written by one or more clients on the source memory server 104.
The memory controller 106 is further configured to enter the request in a journal file 122. Each of the one or more clients is responsible for journaling its own file operations independently. Alternatively stated, as the journal file 122 is captured on each client, at the individual file operation level, individual writes to files are captured. These writes thus can be replicated, thereby avoiding replicating large files because only a small incremental change was applied by the one or more clients. The journal file 122 includes all the data that the one or more clients send to the source memory server 104. Thus, there is no need to reread the data from the source memory server 104 at a later time, avoiding any possibility of a race condition and avoiding any possibility of reading overwritten data. Further, journaling is initiated prior to the backup process and the journal file 122 is sent periodically by one or more clients to the memory controller 106 independent of the backup process.
In accordance with an embodiment, the memory controller 106 is further configured to enter a request in the journal file 122 along with a time stamp of the request generated by the high- precision clock 118. Each request in the journal file 122 is timestamped to record the time of each request. Each request is entered in the journal file 122 in a sequence of time as it occurs. The order of requests in the journal file 122 refers to the time series of file operations at each of the one or more clients, which enables the memory controller 106 to efficiently synchronize the data for multiple clients, avoid the single-point-of-failure problems, and reduce the latency in data flow.
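A minimal sketch of timestamped journal entries, assuming illustrative names (`Journal`, `enter`) that are not taken from the disclosure:

```python
import time
from dataclasses import dataclass, field

@dataclass(order=True)
class JournalEntry:
    timestamp_ns: int                      # ordering key: time of the request
    operation: str = field(compare=False)  # e.g. "write", "rename"
    payload: dict = field(compare=False, default_factory=dict)

class Journal:
    """Per-client journal; each request is entered with a clock timestamp."""
    def __init__(self) -> None:
        self.entries: list[JournalEntry] = []

    def enter(self, operation: str, **payload) -> JournalEntry:
        entry = JournalEntry(time.time_ns(), operation, payload)
        self.entries.append(entry)
        return entry
```

Because entries order by their timestamp, `sorted(journal.entries)` directly yields the time-series order in which the requests occurred.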
The memory controller 106 is further configured to initiate backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files. The memory controller 106 initiates the backup process of copying one or more data files from a source site, such as the source memory server 104, to a target site, such as the target memory server 108. The copied files at the target memory server 108 are referred to as the backup files. The backup process is required to protect and recover data in an event of data loss in the source site (i.e., the source memory server 104). Examples of the event of data loss may include, but are not limited to, data corruption, hardware or software failure in the source site, accidental deletion of data, hacking, or malicious attack. Thus, for safety reasons, a separate backup storage or the target memory server 108 is extensively used to store a backup of the data present in the source memory server 104.
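The initial copy of source files to the target can be sketched as below; the directory-tree walk and the helper name are illustrative assumptions, not the disclosed mechanism:

```python
import shutil
from pathlib import Path

def initiate_backup(source_root: Path, target_root: Path) -> list[Path]:
    """Copy every file under source_root to target_root, preserving the
    relative paths; the returned paths are the backup files."""
    backup_files = []
    for src in sorted(source_root.rglob("*")):
        if src.is_file():
            dst = target_root / src.relative_to(source_root)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves file metadata
            backup_files.append(dst)
    return backup_files
```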
In accordance with an embodiment, the memory controller 106 comprises a client controller 110. The client controller 110 is configured to: connect to the source memory server 104; initiate the session utilizing the high-precision clock 118 for synchronization; receive the request related to the source memory server 104; enter the request in the journal file 122; and initiate the backup. Hence, it is the responsibility of the client controller 110 to efficiently track and coordinate the file operations of one or more clients so as to ensure that the files present at the source memory server 104 are reliably backed up at the target memory server 108.
In accordance with an embodiment, the client controller 110 is further configured to execute a replicator sequencer. The replicator sequencer refers to an agent that collects records of IO operations transmitted in the form of datasets by the one or more clients, processes them, and transmits the incremental changes in them. Hence, the replicator sequencer solves the issue of initial journal sync of an active shared file system, such as the shared memory storage system 102, and does not require any complete pause in production of IO operations.
The memory controller 106 is further configured to analyze backup files and adapt the journal file 122. The memory controller 106 analyzes the backup files to create a translation table to map source metadata (e.g., inodes) to the actual restored metadata, and adapts the journal file 122 accordingly. The backup files refer to the files copied from the source memory server 104 to the target memory server 108. The analysis is done by the target controller 112. The process of analyzing the backup files may refer to processing the backup files to capture all potential instances where small and incremental changes might have taken place, without the need to scan the whole backup file. Further, the memory controller 106 processes the data received in the form of the journal file 122 from the one or more clients. The processing is executed in the background without the one or more clients having to wait for the processing to finish. The processing helps in reliable and efficient synchronization of data among the one or more clients without introducing any latency in the production data path (i.e., data flow). Further, the journal file 122 is adapted on the basis of the result of the analysis. Adapting the journal file 122 refers to modifying the backup files only for the small incremental changes such that an initial sync point is established to bring the target memory server 108 to a known state that is consistent with the state of the source memory server 104 at the start of some sequence of incremental changes. The initial sync point may refer to a point at which synchronization is achieved between the IO operations performed by one or more clients to store data files in the source memory server 104 and the journaling operation performed by the memory controller 106 to back up the files of the source memory server 104 at the target memory server 108. Moreover, adapting the journal file 122 does not require any additional shared content read.
Thus, the efficiency and reliability of the data replication solution in setting up a disaster recovery solution is improved.
In accordance with an embodiment, the memory controller 106 is further configured to analyze the backup files and adapt the journal file 122 by: generating a map for the backup files, wherein each file of the source memory server 104 is mapped to a file in the target memory server 108; determining if the meta data of a backup file has been changed, and if so, indicating the backup file as being an orphan file; determining for each orphan file which other backup files are affected by the change to the meta data of the orphan file and linking the orphan file to those other backup files in the map; and deleting the orphan files from the journal file 122. The map refers to an inode map that maps a source inode to a target inode. The map includes the name (file or directory) associated with the source inode. If the metadata of the inode has been touched (i.e., copies of the inode were created with a hard link, copies were deleted, the inode parent was changed (rename), or the inode file name was changed), the backup file is indicated as an orphan file. Further, the journal file 122 is used to compile a call graph of the create/delete-related operations (create, link, move, unlink and delete operations) for each path at which the inode is found. This determines all the paths at which the inode should exist at the target site, i.e., the target memory server 108. Further, the inode is linked to all its existing paths (if any), and unlinked or deleted from the orphan directory. Hence, the adapted journal file 122 includes all the inodes that are of interest and to which the journaling should apply.
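The orphan-handling analysis described above might be sketched as follows. The data shapes (a dict of inode records and a flat list of journal tuples) are illustrative assumptions, and rename handling is omitted for brevity:

```python
def adapt_journal(backup_files, journal):
    """
    backup_files: dict source_inode -> {"paths": set[str], "meta_changed": bool}
    journal: list of (op, inode, path) with op in
             {"create", "link", "unlink", "delete", "write"}
    Returns (inode_map, adapted_journal).
    """
    # Map each source inode to the set of paths it has on the target.
    inode_map = {ino: set(info["paths"]) for ino, info in backup_files.items()}
    # Inodes whose metadata was touched become orphans.
    orphans = {ino for ino, info in backup_files.items() if info["meta_changed"]}

    for ino in orphans:
        # Walk the link-related operations to determine every path at
        # which the orphan inode should exist on the target.
        paths = set(inode_map[ino])
        for op, op_ino, path in journal:
            if op_ino != ino:
                continue
            if op in ("create", "link"):
                paths.add(path)
            elif op in ("unlink", "delete"):
                paths.discard(path)
        inode_map[ino] = paths

    # Delete the orphan entries from the journal; their net effect is
    # now captured in the map.
    adapted = [(op, ino, path) for op, ino, path in journal if ino not in orphans]
    return inode_map, adapted
```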
In accordance with an embodiment, the memory controller 106 is further configured to determine that a request in the journal file 122 relates to a file that is not a backup file, determine whether there is a request in the journal file 122 for generating the file, and, if not, read the file from the source memory server 104 and copy the file to the target memory server 108 prior to replaying the journal file. The inodes that existed prior to the backup and still exist in the backup after it is done are kept as they are in the journal file 122. Such inodes refer to the metadata which was not touched, i.e., renamed, hard-linked or deleted, in the journaling process. For inodes that are found in the journal file 122, but for which the journal file 122 has no record of creation and the inode is not in the backup copy, the file is reread from the source site, i.e., the source memory server 104.
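One possible sketch of this pre-replay step, fetching files that are in neither the backup nor the journal's creation records; the callback names and journal tuple layout are illustrative assumptions:

```python
def resolve_missing_files(journal, backup_inodes, read_from_source, copy_to_target):
    """Before replay, fetch from the source every inode that a journal
    request touches but that is neither a backup file nor created by a
    request recorded in the journal."""
    created = {ino for op, ino, _ in journal if op == "create"}
    rereads = []
    for op, ino, path in journal:
        if ino not in backup_inodes and ino not in created:
            copy_to_target(ino, read_from_source(ino))
            backup_inodes.add(ino)  # fetch each missing inode only once
            rereads.append(ino)
    return rereads
```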
The memory controller 106 is further configured to replay requests in journal file 122 on the target memory server 108. The journal file 122 includes the requests from one or more clients, which comprise all the data associated with the file operations of the one or more clients. For efficient and reliable replication, these requests are replicated by the memory controller 106 on the target memory server 108 so as to ensure that no data of any client is lost. Thus, the memory controller 106 ensures an improved data replication solution without the need to have a programmatic access to the source memory server 104 and without the requirement to reread all data written to the source memory server 104, as the journal file 122 includes a complete set of file operations and metadata required to replay them at a remote location.
In accordance with an embodiment, the memory controller 106 is further configured to replay the journal file 122 by executing all requests in the journal file 122 in order of the time stamps. The journal file 122 includes the requests from one or more clients, which comprise all the data associated with the file operations of the one or more clients. For efficient and reliable replication, all requests in the journal file 122 are replayed by the memory controller 106 in order of the time stamps at the target memory server 108 so as to ensure that the replicated data is in sync with the data present at source memory server 104. Thus, the memory controller 106 further ensures an improved data replication solution without the need to have a programmatic access to the source memory server 104 and without the requirement to reread all data written to the source memory server 104, as the journal file 122 includes a complete set of file operations and metadata required to replay them at a remote location.
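Replaying the journal on the target then reduces to executing its requests in timestamp order. In this minimal sketch, the `(timestamp, operation)` entry layout and the `apply` callable (which would perform one file operation on the target) are illustrative assumptions:

```python
def replay(journal_entries, apply):
    """Execute every journal request on the target in order of its
    time stamp (here assumed to be the first tuple element)."""
    for entry in sorted(journal_entries, key=lambda e: e[0]):
        apply(entry)
```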
In accordance with an embodiment, the memory controller 106 comprises a target controller 112. The target controller 112 is configured to analyze backup files and adapt journal file 122 and replay requests in journal file 122 on the target memory server 108. Hence, it is the responsibility of the target controller 112 to efficiently analyze the backup files created by the client controller 110 so as to ensure that all the files present at the source memory server 104 are reliably backed up at the target memory server 108.
In accordance with an embodiment, the target controller 112 is further configured to execute a replicator recipient. The replicator recipient refers to a replicator receiver at the target site, such as the target memory server 108. The replicator recipient replays all the requests of the journal file 122 transmitted by the target controller 112 and successfully completes the data backup or replication process at the target site, such as the target memory server 108.
Thus, the memory controller 106 of the present disclosure provides an improved data replication solution for the shared memory storage system 102, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server 104 and the target memory server 108 to belong to the same product manufacturer or compatible manufacturers. This eliminates the problem of vendor in-lock. Further, as the memory controller 106 is configured to synchronize file operations of multiple clients using the journal file 122, race conditions are avoided and any possibility of reading overwritten data is suppressed. Moreover, the data replication solution provided by the memory controller 106 works in a completely distributed fashion as each client is responsible for its own journaling. Further, by virtue of adapting the journal file 122, an initial sync point is established to bring the target memory server 108 to a known state that is consistent with the state of the source memory server 104 at the start of some sequence of incremental changes. Moreover, adapting the journal file 122, does not require any additional shared content read. Hence, the memory controller 106 provides a reliable and efficient data replication solution for the shared memory storage system 102, which spawns through all the changes and captures all the potential instances of incremental changes in the journal file 122.
FIG. 2 is a flowchart for a method for use in a shared memory storage system, in accordance with an embodiment of the present disclosure. With reference to FIG. 2, there is shown a method 200. FIG. 2 is described in conjunction with elements of FIGs. 1A and 1B. The method 200 is for use in the shared memory storage system 102 described, for example, in FIG. 1A. The method 200 includes steps 202 to 212. The method 200 is executed by the memory controller 106 described, for example, in FIGs. 1A and 1B.
At step 202, the method 200 comprises initiating a session utilizing a high-precision clock 118 for synchronization, the high-precision clock 118 having a precision exceeding a timing threshold. The timing threshold is defined to be finer than the operating system's timing granularity, better than microseconds (e.g., as achievable with IEEE 1588 clock synchronization), i.e., shorter than the time taken to handle an IO request. Hence, the method 200 ensures that the clocks of the one or more clients are accurately synchronized to at least the resolution of the time taken for an IO operation. Thus, the high-precision clock 118 of the memory controller 106 synchronizes an initial known consistent crash backup point for starting a continuous replication session.
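The clock-precision requirement above can be illustrated with a small sketch. The following Python fragment is purely illustrative (the one-microsecond threshold value and the function name are assumptions, not part of the disclosure); it probes whether the local monotonic clock can resolve intervals finer than the assumed threshold:

```python
import time

# Assumed threshold: 1 microsecond, in the spirit of sub-microsecond
# synchronization targets such as IEEE 1588. Not from the disclosure.
TIMING_THRESHOLD_NS = 1_000

def clock_meets_threshold(samples: int = 1000) -> bool:
    """Return True if the monotonic clock can distinguish events
    closer together than the assumed timing threshold."""
    smallest = float("inf")
    last = time.monotonic_ns()
    for _ in range(samples):
        now = time.monotonic_ns()
        if now != last:
            smallest = min(smallest, now - last)
        last = now
    return smallest < TIMING_THRESHOLD_NS
```

A real implementation would validate the synchronized session clock shared by the clients rather than the local clock, but the same precision check applies.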
At step 204, the method 200 further comprises receiving a request related to the source memory server 104. The memory controller 106 receives the request to replicate the data files stored in the source memory server 104 to the target memory server 108. The request is managed by the client controller 110. The request includes the metadata (e.g., inodes) related to all the data files which are to be replicated. The data files refer to the files which are written by one or more clients on the source memory server 104.
At step 206, the method 200 further comprises entering the request in a journal file 122. Each of the one or more clients is responsible for journaling its own file operations independently. Alternatively stated, as the journal file 122 is captured on each client at the individual file operation level, individual writes to files are captured. These writes can thus be replicated, thereby avoiding replicating large files when only a small incremental change was applied by the one or more clients. The journal file 122 includes all the data that the one or more clients send to the source memory server 104. Thus, the method 200 eliminates the need to reread the data from the source memory server 104 at a later time, avoiding any possibility of a race condition and avoiding any possibility of reading overwritten data. Further, journaling is initiated prior to the backup process, and the journal file 122 is sent periodically by the one or more clients to the memory controller 106 independently of the backup process.
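The per-client journaling described above can be sketched as follows. All names and record fields are illustrative assumptions; the sketch only shows a client accumulating its own file operations, with the written payload captured in the entry so the data never needs to be reread from the server:

```python
import time
from dataclasses import dataclass, field

@dataclass
class JournalEntry:
    """One client-side journal record (field names are illustrative)."""
    op: str               # e.g. "write", "rename", "unlink"
    inode: int            # source-server inode the operation touched
    payload: bytes = b""  # data written, captured so it is never reread
    timestamp_ns: int = field(default_factory=time.monotonic_ns)

class ClientJournal:
    """Each client journals its own file operations independently."""
    def __init__(self):
        self.entries = []

    def record(self, op, inode, payload=b""):
        entry = JournalEntry(op, inode, payload)
        self.entries.append(entry)
        return entry

    def drain(self):
        """Hand the accumulated entries to the replicator and reset,
        modelling the periodic send to the memory controller."""
        batch, self.entries = self.entries, []
        return batch
```

Because every client drains its own journal independently, no coordination between clients is needed on the capture path.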
At step 208, the method 200 further comprises initiating backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files. The method 200 initiates the backup process of copying one or more data files from a source site, such as the source memory server 104, to a target site, such as the target memory server 108. The copied files at the target memory server 108 are referred to as the backup files. The backup process is required to protect and recover data in an event of data loss in the source site (i.e., the source memory server 104).
At step 210, the method 200 further comprises analyzing the backup files and adapting the journal file 122. The memory controller 106 analyzes the backup files to create a translation table to map source metadata (e.g., inodes) to actual restored metadata and accordingly adapts the journal file 122. The analysis is done by the target controller 112. Further, the memory controller 106 processes the data received in the form of the journal file 122 from the one or more clients. The processing is executed in the background without the one or more clients having to wait for the processing to finish. The processing helps in reliable and efficient synchronization of data among the one or more clients without introducing any latency in the production data path (i.e., data flow).
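The translation table mentioned in this step can be illustrated with a minimal sketch. The matching-by-path strategy and all names below are assumptions for illustration; the disclosure does not prescribe a concrete matching algorithm:

```python
def build_translation_table(source_files, restored_files):
    """Map source inode numbers to restored inode numbers by matching
    the path each file was restored under. Simplifying assumption:
    paths are unique and preserved by the backup/restore process.

    Both arguments are iterables of (inode, path) pairs."""
    restored_by_path = {path: inode for inode, path in restored_files}
    table = {}
    for inode, path in source_files:
        if path in restored_by_path:
            table[inode] = restored_by_path[path]
    return table
```

Journal entries referring to a source inode can then be rewritten in place to use the restored inode before replay, which is the "adapting" described above.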
At step 212, the method 200 further comprises replaying requests in the journal file 122 on the target memory server 108. The journal file 122 includes the requests from the one or more clients, which comprise all the data associated with the file operations of the one or more clients. For efficient and reliable replication, these requests are replayed by the memory controller 106 on the target memory server 108 so as to ensure that no data of any client is lost. Thus, the method 200 ensures an improved data replication solution without the need for programmatic access to the source memory server 104 and without the requirement to reread all data written to the source memory server 104, as the journal file 122 includes a complete set of file operations and metadata required to replay them at a remote location.
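The replay step can be sketched as a timestamp-ordered application of journal entries on the target. The callback-based structure below is an assumption for illustration, not the disclosed implementation:

```python
def replay(entries, apply):
    """Replay journal entries on the target in timestamp order.

    `apply` is a caller-supplied callback that executes a single
    operation against the target server (left abstract here)."""
    for entry in sorted(entries, key=lambda e: e["timestamp"]):
        apply(entry)
```

Sorting by the high-precision timestamps is what guarantees that writes from different clients land on the target in the same order they were applied at the source.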
The steps 202 to 212 are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein. The method 200 uses the memory controller 106 to provide an improved data replication solution for the shared memory storage system 102, which is independent of device manufacturers (or vendors). Thus, there is no need for the source memory server 104 and the target memory server 108 to belong to the same product manufacturer or compatible manufacturers, which eliminates the problem of vendor lock-in. Further, as the method 200 is configured to synchronize file operations of multiple clients using the journal file 122, race conditions are avoided and any possibility of reading overwritten data is suppressed. Moreover, the data replication solution provided by the method 200 works in a completely distributed fashion as each client is responsible for its own journaling. Further, by virtue of adapting the journal file 122, an initial sync point is established to bring the target memory server 108 to a known state that is consistent with the state of the source memory server 104 at the start of some sequence of incremental changes. Hence, the method 200 provides a reliable and efficient data replication solution for the shared memory storage system 102.
In yet another aspect, the present disclosure provides a computer-readable media comprising instructions that when loaded into and executed by a memory controller 106 enables the memory controller 106 to execute the method 200. The computer-readable media refers to a non-transitory computer-readable storage medium. Examples of implementation of the computer-readable media include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, a Secure Digital (SD) card.
FIG. 3 is an exemplary sequence diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure. FIG. 3 is described in conjunction with elements of FIGs. 1A, 1B, and 2. With reference to FIG. 3, there is shown a sequence diagram 300 that depicts a data replication solution. There is further shown a source memory server 302, a replicator sequencer 304, one or more clients 306, a replicator recipient 308, a backup API 310, and a target memory server 312. There is further shown an exemplary sequence of operations 314 to 336. The source memory server 302 and the target memory server 312 correspond to the source memory server 104 and the target memory server 108 (of FIG. 1A), respectively.
The replicator sequencer 304 refers to an agent that collects records of IO operations transmitted in the form of datasets by the one or more clients 306, processes them, and transmits the incremental changes in them. Each of the one or more clients 306 refers to a client that is communicatively coupled to the source memory server 302 for data access and storage. The one or more clients 306 are operatively connected to one another and also to the replicator sequencer 304 through a low-latency communications network, such as a LAN or WLAN. The one or more clients 306 may be a heterogeneous group of clients, where each of the one or more clients 306 includes suitable logic, circuitry, and interfaces configured to remotely access data from the source memory server 302. Each of the one or more clients 306 may be associated with a user who may perform specific file operations and further store the data associated with such file operations to the source memory server 302. Examples of the one or more clients 306 include, but are not limited to, a thin client, a laptop computer, a desktop computer, a smartphone, a wireless modem, or other computing devices.
The replicator recipient 308 refers to a replicator receiver at the target site, such as the target memory server 312. The backup API 310 refers to a backup application programming interface, which provides an intermediary interface to store the backup files at the target site, i.e., the target memory server 312.
At operation 314, one or more clients 306 are connected to the source memory server 302. The one or more clients 306 may store one or more data files in the source memory server 302 via one or more file operations.
At operation 316, the replicator sequencer 304 joins the high precision clock session. The high precision clock is required to accurately synchronize the journal operations of the one or more clients 306.
At operation 318, one or more clients 306 start sending IO journal. Journaling is initiated by one or more clients 306 prior to the backup process and is sent periodically to the replicator sequencer 304 independent of the backup process. The IO journal refers to a data structure of a journaling file system, which is a fault-resilient file system. In the event of a system failure, the IO journal ensures that the data has been restored to its pre-crash configuration. It also recovers unsaved data and stores it in the location where it would have gone if the computer had not crashed. Since the IO journal is captured on each client, such as the one or more clients 306, at the individual file operation level, individual writes to files are captured. These writes thus can be replicated, thereby avoiding replicating large files when only a small update was applied by a user in the one or more clients 306. The IO journal information is useful to synchronize the file operations received from the one or more clients 306 and to avoid race conditions by the eradication of any possibility to reread overwritten data.
At operation 320, the replicator sequencer 304 creates a backup and restores the backup files via the backup API 310. The replicator sequencer 304 creates a backup of the IO journal received from the one or more clients 306. The backup files are then sent to the backup API 310 to be restored at the target memory server 312.
At operation 322, the backup API 310 restores the backup files at the target memory server 312. The backup files can either be directly piped to a restore session on the target memory server 312 or can be initiated via the backup API 310.
At operation 324, the replicator recipient 308 receives the IO journal from the replicator sequencer 304. The IO journal received by the replicator recipient 308 corresponds to the inodes that existed prior to the backup and still exist in the backup after it is done. Such inodes refer to metadata that was not touched, i.e., not renamed, hard-linked, or deleted during the journaling process.
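The notion of inodes whose metadata was not touched can be illustrated as a filter over the journal. The operation vocabulary and record layout below are assumed for illustration only:

```python
# Assumed set of metadata-modifying operations; illustrative only.
METADATA_OPS = {"rename", "link", "unlink", "delete", "create"}

def untouched_inodes(journal):
    """Return inodes that appear in the journal but whose metadata no
    journal entry modified; their backup copies can be used as-is."""
    touched = {e["inode"] for e in journal if e["op"] in METADATA_OPS}
    return {e["inode"] for e in journal} - touched
```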
At operation 326, the backup API 310 sends an acknowledgement to the replicator sequencer 304 that the backup is done successfully. Since the IO journal received by the replicator recipient 308 was not touched during the journaling process, the IO journal is backed up as it is at the target memory server 312 and an acknowledgement regarding successful backup is sent to the replicator sequencer 304.
At operation 328, the one or more clients 306 again send a journal to the replicator sequencer 304. At operation 330, the replicator sequencer 304 sends the journal to the replicator recipient 308.
At operation 332, the replicator recipient 308 analyzes the journal and prunes orphan inode files. The replicator recipient 308 analyzes the journal to create a translation table to map source metadata (e.g., inodes) to actual restored metadata and accordingly adapt the journal. If the metadata of an inode had been touched (i.e., copies of the inode were created with a hard link, copies were deleted, the inode parent was changed (rename), or the inode file name was changed), the backup file is indicated as an orphan file. Further, the journal is used to compile a call graph of create/delete-related operations (create, link, move, unlink, and delete operations) for each path at which the inode is found. This determines all the paths at which the inode should exist at the target site, i.e., the target memory server 312. Further, the inode is linked to all its existing paths (if any) and unlinked or deleted from the orphan directory. At operation 334, the replicator recipient 308 replays the recorded journal operations for each inode at the target memory server 312. For efficient and reliable data replication, the recorded journal operations for each inode are replayed by the replicator recipient 308 on the target memory server 312 so as to ensure that no data of any client, such as the one or more clients 306, is lost.
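The analysis that determines where an inode should exist on the target can be sketched as a fold over its recorded operations. The operation vocabulary below is an illustrative assumption; the disclosure only requires that create, link, move/rename, unlink, and delete operations be accounted for:

```python
def resolve_paths(ops):
    """Walk the recorded operations for a single inode, in journal
    order, and return the set of paths it should occupy on the target.
    A simplified sketch of the call-graph analysis."""
    paths = set()
    for op in ops:
        kind = op["op"]
        if kind in ("create", "link"):
            paths.add(op["path"])
        elif kind == "rename":
            paths.discard(op["src"])
            paths.add(op["dst"])
        elif kind in ("unlink", "delete"):
            paths.discard(op["path"])
    return paths
```

If the resulting set is empty, the inode has no surviving path and its orphan-directory copy can be deleted; otherwise it is linked at each resolved path.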
At operation 336, the replicator recipient 308 sends an acknowledgement to the replicator sequencer 304 that the initial sync is on. When all the inodes that are of interest are found, the replicator recipient 308 sends an acknowledgement to the replicator sequencer 304 indicating that the initial sync is established.
The operations 314 to 336 are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein.
FIG. 4 is an exemplary timing diagram that depicts a data replication solution, in accordance with an embodiment of the present disclosure. FIG. 4 is described in conjunction with elements of FIGs. 1A, 1B, 2, and 3. With reference to FIG. 4, there is shown a timing diagram 400 that depicts a data replication solution. There is further shown a desired consistency point 402 and a target backup complete point 404.
There are further shown journal points in time j0, j1, j2, j3, j4, j5, and j6. The target backup complete point 404 refers to a point in time where the backup process is complete. The target backup complete point 404 must contain j1 operations and may contain j2 to j4 operations when j5 is chosen as the desired consistency point 402. Further, it is assumed that the journal points are time consistent, and the backup process is not atomic. Further, the target memory server 108 compares the journal with the actual state of each inode/parent directory inode in the backup system. All operations of j0 must be present/overwritten in the backup, and the state including j5 is the desired consistency point 402.
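The selection of journal points to check and replay can be sketched as follows: every batch issued after the last batch known to be fully contained in the backup, up to and including the desired consistency point, must be compared against the target state and replayed where missing. The function below is an illustrative assumption, not part of the disclosure:

```python
def batches_to_verify(journal_ids, fully_backed_up, consistency_point):
    """Return the journal batches that must be verified against the
    target and replayed where missing: everything after the last batch
    guaranteed to be in the backup, up to and including the desired
    consistency point."""
    return [j for j in journal_ids
            if fully_backed_up < j <= consistency_point]
```

With j1 guaranteed to be in the backup and j5 chosen as the desired consistency point, this yields j2 through j5, matching the "must contain j1 operations and may contain j2 to j4 operations" description above.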
Table 1 shows the backup status at the target memory server 108, listing the inodes (or metadata) associated with the backup files along with their size and the path followed to access them. Table 2 shows the final state after processing. It can be observed from Table 2 that creation of files is not skipped during the backup process. Further, Table 3 to Table 7 show the replay actions performed to mark the journal entries starting from the desired consistency point 402, i.e., j5. Hence, Table 3 shows j5 processing according to target inodes, Table 4 shows j4 processing according to target inodes, Table 5 shows j3 processing according to target inodes, Table 6 shows j2 processing according to target inodes, and Table 7 shows j1 processing according to target inodes. It can be observed that a replay action is not performed if the same data already exists in the journal.
Table 1: Backup status
Table 2: Final state after processing
Table 3: j5 processing according to target inodes
Table 4: j4 processing according to target inodes
Table 5: j3 processing according to target inodes
Table 6: j2 processing according to target inodes
Table 7: j1 processing according to target inodes
Various embodiments of the disclosure thus provide a memory controller 106. The memory controller 106 is configured to be used in a shared memory storage system 102 comprising a source memory server 104 and a target memory server 108 for storing files, a file having meta data and data. The memory controller 106 being configured to: connect to the source memory server 104; initiate a session utilizing a high-precision clock 118 for synchronization, the high- precision clock 118 having a precision exceeding a timing threshold; receive a request related to the source memory server 104; enter the request in a journal file 122; initiate backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files; analyze backup files and adapt journal file 122; and replay requests in journal file 122 on target memory server 108.
Various embodiments of the disclosure thus further provide a method 200 for use in a shared memory storage system 102 comprising a source memory server 104 and a target memory server 108 for storing files, a file having meta data and data. The method 200 comprising: initiating a session utilizing a high-precision clock 118 for synchronization, the high-precision clock 118 having a precision exceeding a timing threshold; receiving a request related to the source memory server 104; entering the request in a journal file 122; initiating backup on the target memory server 108 for the source memory server 104, wherein one or more files of the source memory server 104 are copied to the target memory server 108, the copied files being backup files; analyzing backup files and adapting journal file 122; and replaying requests in journal file 122 on the target memory server 108.
Modifications to embodiments of the present disclosure described in the foregoing are possible without departing from the scope of the present disclosure as defined by the accompanying claims. Expressions such as “including”, “comprising”, “incorporating”, “have”, “is” used to describe and claim the present disclosure are intended to be construed in a non-exclusive manner, namely allowing for items, components or elements not explicitly described also to be present. Reference to the singular is also to be construed to relate to the plural. The word “exemplary” is used herein to mean “serving as an example, instance or illustration”. Any embodiment described as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments. The word “optionally” is used herein to mean “is provided in some embodiments and not provided in other embodiments”. It is appreciated that certain features of the present disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the present disclosure, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable combination or as suitable in any other described embodiment of the disclosure.

Claims

1. A memory controller (106) configured to be used in a shared memory storage system (102) comprising a source memory server (104) and a target memory server (108) for storing files, a file having meta data and data, the memory controller (106) being configured to: connect to the source memory server (104); initiate a session utilizing a high-precision clock (118) for synchronization, the high- precision clock (118) having a precision exceeding a timing threshold; receive a request related to the source memory server (104); enter the request in a journal file (122); initiate backup on the target memory server (108) for the source memory server (104), wherein one or more files of the source memory server (104) are copied to the target memory server (108), the copied files being backup files; analyze backup files and adapt journal file (122); and replay requests in journal file (122) on the target memory server (108).
2. The memory controller (106) according to claim 1, wherein the memory controller (106) is further configured to enter a request in the journal file (122) along with a time stamp of the request generated by the high-precision clock (118).
3. The memory controller (106) according to claim 1 or 2, wherein the memory controller (106) is further configured to analyze the backup files and adapt the journal file (122) by: generating a map for the backup files, wherein each file of the source memory server (104) is mapped to a file in the source server; determine if the meta data of a backup file has been changed, and if so, indicate the backup file as being an orphan file; determine for each orphan file which other backup files are affected by the change to the meta data of the orphan file and link the orphan file to those other backup files in the map; and delete the orphan files from the journal file (122).
4. The memory controller (106) according to any of claims 1, 2 or 3, wherein the memory controller (106) is further configured to determine that a request in the journal file (122) relates to a file that is not a backup file, determine whether there is a request in the journal file (122) for generating the file, and if not read the file from the source memory server (104) and copy the file to the target memory server (108) prior to replaying the journal file (122).
5. The memory controller (106) according to any preceding claim, wherein the memory controller (106) is further configured to replay the journal file (122) by executing all requests in the journal file (122) in order of the time stamps.
6. The memory controller (106) according to any preceding claim, wherein the memory controller (106) comprises a client controller (110), which is configured to: connect to the source memory server (104); initiate the session utilizing the high-precision clock (118) for synchronization; receive the request related to the source memory server (104); enter the request in the journal file (122); and initiate the backup.
7. The memory controller (106) according to claim 6, wherein the client controller (110) is further configured to execute a replicator sequencer.
8. The memory controller (106) according to any preceding claim, wherein the memory controller (106) comprises a target controller (112), which is configured to analyze backup files and adapt journal file (122) and replay requests in journal file (122) on the target memory server (108).
9. The memory controller (106) according to claim 8, wherein the target controller (112) is further configured to execute a replicator recipient.
10. A method (200) for use in a shared memory storage system (102) comprising a source memory server (104) and a target memory server (108) for storing files, a file having meta data and data, the method (200) comprising: initiating a session utilizing a high-precision clock (118) for synchronization, the high- precision clock (118) having a precision exceeding a timing threshold; receiving a request related to the source memory server (104); entering the request in a journal file (122); initiating backup on the target memory server (108) for the source memory server (104), wherein one or more files of the source memory server (104) are copied to the target memory server (108), the copied files being backup files; analyzing backup files and adapting journal file (122); and replaying requests in journal file (122) on the target memory server (108).
11. A computer-readable media comprising instructions that when loaded into and executed by a memory controller (106) enables the memory controller (106) to execute the method (200) according to claim 10.