WO2023227224A1 - Method and system for smart caching - Google Patents

Method and system for smart caching

Info

Publication number
WO2023227224A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
data object
cache
replicated
objects
Prior art date
Application number
PCT/EP2022/064385
Other languages
French (fr)
Inventor
Assaf Natanzon
Amit Golander
Original Assignee
Huawei Cloud Computing Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co., Ltd. filed Critical Huawei Cloud Computing Technologies Co., Ltd.
Priority to CN202280036130.3A priority Critical patent/CN117546154A/en
Priority to PCT/EP2022/064385 priority patent/WO2023227224A1/en
Publication of WO2023227224A1 publication Critical patent/WO2023227224A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Definitions

  • the data object (e.g., 120A, Fig. 1) to be replicated to the target location (e.g., 130A, Fig. 1) is replicated in a Kubernetes (K8s) cluster (which may be one of a plurality of Kubernetes clusters) 270.
  • a replication manager 280 receives a list of data objects to be replicated from the Kafka message queue 260.
  • the replication manager 280 determines which data mover (DMS) 290 is to copy each object to be replicated to a target location (such as target locations 130A, 130B, 130C, Fig. 1).
  • Data mover 290 may be the same as or similar to data mover 150 of Fig. 1.
  • the cache manager 140 creates a task for each target location (such as target locations 130A, 130B, 130C, Fig. 1), but the same replication node 270 will replicate the same object for each different task.
  • the data mover 290 reads the data object, such as data object 120A, 120B, 120C of Fig. 1.
  • each target location 130A, 130B, 130C (Fig. 1) may be managed by another task.
  • a caching algorithm is provided to manage the cache 295, which may be the same as or similar to the cache 110 of Fig. 1.
  • the Least Recently Used (LRU) cache eviction algorithm might be used, so that data objects 120A, 120B, 120C are organized in the cache 295 in order of use.
  • the data object that has not been used for the longest time will be evicted from the cache, when the cache is near full.
  • the methods and systems for cache management described above with reference to Fig. 1 are implemented.
  • two replication tasks may be invoked to replicate data objects 120A, 120B, 120C.
  • the first of the two replication tasks may be twice as fast as the second of the two replication tasks. Since the cache 295 is limited in capacity, only a limited number of data objects 120A, 120B, 120C may be held in the cache 295 at any given time. For the sake of example, say that only ten data objects may be held in the cache 295 at any given time. Accordingly, when the first of the two replication tasks replicates a twentieth data item, the second of the two replication tasks is replicating a tenth data item.
  • the caching algorithm will keep data objects 120A, 120B, 120C (Fig. 1) that are going to be read next in the cache 295.
  • one data object 120A, 120B, 120C which is not to be read will be deleted. If no such data object 120A, 120B, 120C is present in the cache 295 when the cache is full, the data object 120A, 120B, 120C that is to be replicated last will be deleted.
  • the data object 120A, 120B, 120C which is deleted from the cache 295 may also be deleted from the PLog device 240 on which said data object 120A, 120B, 120C resides if said PLog device 240 is a least used PLog device 240 among all of the PLog devices 240.
  • PLog devices 240 are often stored on hard disk drives (HDD)
  • a next read from the PLog device 240 will be configured to be sequential.
  • data objects 120A, 120B, 120C may be stored sequentially in the cache 295 so that reads from the cache 295 need not be fragmented.
  • if data objects 120A, 120B, 120C which are next to be replicated are not stored sequentially, the data objects 120A, 120B, 120C may then be prefetched in advance of their replication from their present locations, so that they may be placed sequentially prior to their replication (a minimal sketch of this layout step follows this list).
  • data object 120A, 120B, 120C transfer is through a first network gateway 297.
  • the first network gateway 297 may typically comprise a device or node that connects disparate networks by translating communications from one protocol to another.
  • the data object is then sent over a network 301 to a second object storage 210B, a third object storage 210C, and so forth.
  • the second object storage 210B will be described herein, below.
  • the transferred data object 120A, 120B, 120C is received at a second network gateway 397.
  • items in the second object storage 210B will be labelled with numbers one hundred (100) more than their corresponding items in the first object storage 210A; so, for instance, the first object storage 210A has first network gateway 297, and the second object storage 210B has corresponding second network gateway 397.
  • such items will be described as being a “second” such item, even where the “first” of such items (i.e., in the first object storage 210A) is not explicitly designated as “first”.
  • a second Kubernetes (K8s) cluster (which may be one of a plurality of Kubernetes clusters) 370 comprises a second data mover (DMS) 390 and a second replication manager 380.
  • the second data mover 390 moves the received data object 120A, 120B, 120C from the second Kubernetes (K8s) cluster 370 to an object storage 320 (which may be one among several, not labelled, object storage nodes of the second object storage 210B).
  • the replicated data objects 120A, 120B, 120C are moved by the second data mover 390 to the second OCS 325.
  • the received data object is indexed in the second index layer 335, i.e., the index layer of the second object storage 210B.
  • the replicated data objects 120A, 120B, 120C may be stored in the second PLog device 340 in the second persistence layer 350.
  • Object storage manages data as objects: a single unit of data and associated metadata (such as access policies).
  • An object is identified by some sort of unique id.
  • Fig. 3 is a simplified flowchart illustration of a method of operation of the system and apparatus described in Figs. 1 and 2.
  • a plurality of data objects are stored in a cache, the plurality of data objects being data objects to be replicated to a plurality of different known target locations.
  • the plurality of data objects may be the same as or similar to the plurality of data objects 120A, 120B, 120C of Fig. 1.
  • the cache may be the same as or similar to the cache 110 of Fig. 1 and the cache 295 of Fig. 2.
  • the plurality of different known target locations may be the same as or similar to the target locations 130A, 130B, 130C of Fig. 1 and/or the second object storage 210B and the third object storage 210C of Fig. 2.
  • a cache manager determines a first data object in the cache that is next to be replicated.
  • the cache manager may be the same as or similar to the cache manager 140 of Fig. 1.
  • the cache manager tracks which data objects of the plurality of data objects are to be replicated to which of the known target locations among the plurality of different target locations.
  • the first data object is kept in the cache.
  • when the cache is full, one of a second data object which is not to be replicated or a third data object which is to be replicated last is deleted. Each replication of a data object is performed by a different task.
  • the term “consisting essentially of” means that the composition, method or structure may include additional ingredients, steps and/or parts, but only if the additional ingredients, steps and/or parts do not materially alter the basic and novel characteristics of the claimed composition, method or structure.
  • a data object or “at least one data object” may include a plurality of data objects.
  • range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
  • a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range.
  • the phrases “ranging/ranges between” a first indicated number and a second indicated number and “ranging/ranges from” a first indicated number “to” a second indicated number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals therebetween.
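The following sketch, referred to in the list above, shows one way such a layout step might be expressed; it is illustrative only, with dict insertion order standing in for physical adjacency and the fetch callable an assumption.

```python
def layout_in_replication_order(cache, replication_queue, fetch):
    """Rebuild the cache so object order matches replication order,
    prefetching any queued object that is not yet cached, so that
    subsequent reads are sequential rather than fragmented."""
    ordered = {}
    for obj_id in replication_queue:
        data = cache.get(obj_id)
        if data is None:
            data = fetch(obj_id)    # prefetch from its present location
        ordered[obj_id] = data
    cache.clear()
    cache.update(ordered)           # insertion order now follows the queue
```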

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A device, system, method, and a computer readable storage medium are described, the device including a cache, a plurality of data objects comprised in the cache, the plurality of data objects to be replicated to a plurality of different known target locations; and a cache manager operative to determine a first data object in the cache that is next to be replicated and keep the first data object in the cache, track which data objects of the plurality of data objects are to be replicated to which of the known target locations among the plurality of different target locations, and delete, when the cache is full, one of: a second data object which is not to be replicated and a third data object which is to be replicated last, wherein each replication of a data object is performed by a different task.

Description

METHOD AND SYSTEM FOR SMART CACHING
FIELD AND BACKGROUND OF THE INVENTION
The present invention, in some embodiments thereof, relates to a system and method for caching for object storage replication and, more particularly, but not exclusively, to a method and system for replication of objects to be replicated to multiple targets.
When there are multiple objects to be replicated to multiple targets over links with different bandwidths, the performance of the replication typically degrades to the performance of the slowest link. In storage systems, data is generally stored in block devices, and the data in the block devices is usually stored in a certain file system format. Typically, a block device is a computer data storage device that supports reading and, optionally, writing data in fixed-size blocks, sectors, or clusters. Moreover, data replication is a process of storing data in more than one node (e.g. a primary storage as well as a secondary storage) to improve availability of data, and to ensure data protection in the event of data loss. Thus, for safety reasons, the secondary storage is usually used to replicate data present in a primary storage system of the host storage device. In an example, both the primary and secondary storage systems may be block devices used for data storage.
A vast amount of data is being produced and stored globally, on a regular basis. For example, social networks, internet of things, scientific experiments, commercial services, industrial services, banking services, businesses, and the like, play a significant role in generating this data. Presently, data storage systems (for example, secondary storage systems, remote data storage systems, and the like) are being employed to store this data. Proper management and storage of data in the data storage systems is therefore important, in order to improve and maximize efficiency of the data storage systems.
Some cloud storage providers support replication of stored data objects to multiple targets, but they do not use smart caching options.
A simple solution would be for the same data mover to send the object to all the targets when it reads the object from a logging device. However, if the targets have different bandwidths, the data-moving process will be stuck moving the data objects to all the targets, and its speed will be reduced to that of the slowest link. Another option is for each target to be managed by a separate task. In this case, every time the object needs to be replicated it will be read from the logging device, increasing the load on the logging device. Accordingly, even when the bandwidths of the target devices are similar to one another, the data objects need to be read multiple times from the logging device.
One alternative might be to use the Least Recently Used (LRU) cache eviction algorithm, so that data objects will be organized in the cache in order of use. In LRU, as the name suggests, the data object that has not been used for the longest time will be evicted from the cache when the cache is near full. Typical caching algorithms use methods such as LRU. However, LRU may be a bad option, for instance, when it is known how many times each object will be read from the cache in the future. Additionally, LRU needs to maintain a record of when each stored data object was last accessed.
By way of example, consider two replication tasks replicating the same set of objects a1, a2, ..., an. One of the two replication tasks may be, for example, twice as fast as the other replication task. It is desirable that each task work as fast as possible. However, it is in the nature of cache memory that it typically holds only a small number of objects at any given time. By way of an example, assume that the cache can only hold 10 objects, which for the sake of this example are assumed to be the same size. Accordingly, when the first of the two tasks replicates object 20, the second of the two tasks is only replicating object 10. If objects 11 - 20 are stored in the cache, the second of the two tasks does not yet need to read these objects from the cache, as it is still replicating objects 1 - 10. However, if LRU is implemented, then by the time the first of the two tasks is replicating object 22, objects 11 and 12 will already have been evicted. Accordingly, the second replication task will have a cache miss, namely, an event in which a replication task makes a request to retrieve data from the cache, but that specific data is not currently in cache memory.
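By way of a non-limiting illustration, the following minimal Python sketch (not part of the patent) replays this two-task scenario through a shared 10-object LRU cache; the task names, the access interleaving, and the simulate_lru helper are illustrative assumptions.

```python
from collections import OrderedDict

CACHE_SIZE = 10  # as in the example above: the cache holds 10 equal-size objects

def simulate_lru(n_objects=30):
    """Task A replicates twice as fast as task B; both read objects 1..n
    through one shared LRU cache. Returns per-task cache-miss counts."""
    cache = OrderedDict()                      # object id -> None, recency-ordered
    misses = {"A": 0, "B": 0}

    def access(task, obj):
        if obj in cache:
            cache.move_to_end(obj)             # hit: refresh recency
        else:
            misses[task] += 1                  # miss: re-read from the logging device
            cache[obj] = None
            if len(cache) > CACHE_SIZE:
                cache.popitem(last=False)      # evict the least recently used object

    a = b = 1
    while b <= n_objects:                      # A advances two objects per B object
        for _ in range(2):
            if a <= n_objects:
                access("A", a)
                a += 1
        access("B", b)
        b += 1
    return misses

print(simulate_lru())  # slower task B soon misses: A's progress evicts 11, 12, ...
```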
Other caching algorithms also have shortcomings. For example, Least Frequently Used (LFU) caching uses a counter to keep track of how often an object in cache is accessed. With the LFU cache algorithm, the object with the lowest count is removed first. However, LFU does not account for an object that had an initially high access rate and then was not accessed for a long time.
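For comparison, the following toy sketch (illustrative only; the ToyLFU name and its interface are assumptions) shows the LFU bookkeeping just described, including why a once-popular object that has gone cold is never evicted.

```python
from collections import Counter

class ToyLFU:
    """Minimal LFU cache: evict the object with the lowest access count."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = Counter()    # access counts for currently cached objects
        self.store = {}            # object id -> data

    def access(self, obj_id, load):
        if obj_id not in self.store:
            if len(self.store) >= self.capacity:
                victim = min(self.counts, key=self.counts.get)  # lowest count
                del self.store[victim]
                del self.counts[victim]
            self.store[obj_id] = load(obj_id)   # miss: read from backing storage
        # Note: an object with a high historical count is never evicted here,
        # even if it has not been accessed for a long time.
        self.counts[obj_id] += 1
        return self.store[obj_id]
```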
Accordingly, there is a need for a smart caching method and system in order to overcome the shortcomings of LRU and other caching algorithms.
SUMMARY OF THE INVENTION
A smart caching method and system is provided to overcome the shortcomings of LRU. Data objects stored in the cache are managed by a cache manager, which may be resident in a processor, and are replicated to their respective target locations as needed. Each replication of a data object is performed by an individual and dedicated task created by the cache manager. When the cache is full, either a data object which is not to be replicated or a data object which is to be replicated last is deleted from the cache.
According to an aspect of some embodiments of the present invention there is provided a device including a cache, a plurality of data objects comprised in the cache, the plurality of data objects to be replicated to a plurality of different known target locations, and a cache manager operative to determine a first data object in the cache that is next to be replicated and keep the first data object in the cache, track which data objects of the plurality of data objects are to be replicated to which of the known target locations among the plurality of different target locations, and delete, when the cache is full, one of: a second data object which is not to be replicated and a third data object which is to be replicated last, wherein each replication of a data object is performed by a different task.
According to an aspect of some embodiments of the present invention there is provided a method for storing a plurality of data objects in a cache, the plurality of data objects to be replicated to a plurality of different known target locations, the method including determining, by a cache manager, a first data object in the cache that is next to be replicated, tracking, by the cache manager, which data objects of the plurality of data objects are to be replicated to which of the known target locations among the plurality of different target locations, keeping the first data object in the cache, and deleting, when the cache is full, one of a second data object which is not to be replicated or a third data object which is to be replicated last, wherein each replication of a data object is performed by a different task.
According to an aspect of some embodiments of the present invention there is provided a computer readable storage medium having data stored therein representing software executable by a computer, the software including instructions to store a plurality of data objects in a cache, the plurality of data objects to be replicated to a plurality of different locations, determine, by a cache manager, a first data object in the cache that is next to be replicated, keep the first data object in the cache, and delete, when the cache is full, one of a second data object which is not to be replicated or a third data object which is to be replicated last, wherein each replication of a data object is performed by a different task.
According to some embodiments of the invention, each data object of the plurality of data objects is written to a log device.
According to some embodiments of the invention, the log device is stored in a persistence layer. According to some embodiments of the invention, a data mover reads the first data object data from the log device, and copies the first data object data to a replica site.
According to some embodiments of the invention, each data object of the plurality of data objects is indexed in an index layer.
According to some embodiments of the invention, a data object to be read after the first data object is stored consecutively in the cache with the first data object.
According to some embodiments of the invention, the cache manager prefetches a fourth data object to be read after the first data object if the fourth data object is not in the cache.
According to some embodiments of the invention, replication of the plurality of data objects occurs in a Kubernetes cluster.
According to some embodiments of the invention, the cache manager is operative to delete a latest received data object when the cache is full.
According to some embodiments of the invention, the method also includes the cache manager deleting one data object of the plurality of data objects when the one data object has been copied to all of its known target locations.
According to some embodiments of the invention, the cache manager deletes at least two data objects of the plurality of data objects which are written sequentially on the hard disk and need to be sent one after the other.
According to some embodiments of the invention, the method also includes, after deleting the at least two data objects of the plurality of data objects, the cache manager fetching the at least two data objects of the plurality of data objects from the hard drive.
Unless otherwise defined, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the invention, exemplary methods and/or materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be necessarily limiting.
Implementation of the method and/or system of embodiments of the invention can involve performing or completing selected tasks manually, automatically, or a combination thereof. Moreover, according to actual instrumentation and equipment of embodiments of the method and/or system of the invention, several selected tasks could be implemented by hardware, by software or by firmware or by a combination thereof using an operating system. For example, hardware for performing selected tasks according to embodiments of the invention could be implemented as a chip or a circuit. As software, selected tasks according to embodiments of the invention could be implemented as a plurality of software instructions being executed by a computer using any suitable operating system. In an exemplary embodiment of the invention, one or more tasks according to exemplary embodiments of method and/or system as described herein are performed by a data processor, such as a computing platform for executing a plurality of instructions. Optionally, the data processor includes a volatile memory for storing instructions and/or data and/or a non-volatile storage, for example, a magnetic hard disk and/or removable media, for storing instructions and/or data. Optionally, a network connection is provided as well. A display and/or a user input device such as a keyboard or mouse are optionally provided as well.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
Some embodiments of the invention are herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the invention may be practiced.
In the drawings:
FIG. 1 is an exemplary system for optimizing between continuous data object replication and backup to the cloud;
FIG. 2 is an example of backup and replication of multiple data items in a Huawei cloud environment; and
FIG. 3 is a simplified flowchart illustration of a method of operation of the system and apparatus described in Figs. 1 and 2.
DESCRIPTION OF SPECIFIC EMBODIMENTS OF THE INVENTION
The present invention, in some embodiments thereof, relates to a system and method for caching for object storage replication and, more particularly, but not exclusively, to a method and system for replication of objects to be replicated to multiple targets.
A system and method is described for optimizing caching and minimizing reads from object storage when replicating from a single object storage to multiple targets. In some cloud storage systems, such as, by way of a non-limiting example, a Huawei cloud storage system, there is no smart caching done for object storage replication. For example, when there are multiple targets, an object may need to be sent to all targets one after another. When there are multiple targets, each having a different bandwidth for data transfer, the performance of the replication degrades to the performance of the slowest data transfer link. Other cloud storage systems, for example, Amazon Web Services (AWS), Microsoft Azure, Google cloud platform, and so forth, might provide alternative support for transferring data to multiple targets. Nevertheless, caching may not be performed in an optimal fashion.
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth in the following description and/or illustrated in the drawings and/or the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.
Referring now to the drawings, Figure 1 illustrates the exemplary system 100 for optimizing between continuous data object replication and backup to the cloud. A cache 110 comprising a plurality of data objects 120A, 120B, 120C may be comprised in a cloud storage system. The cloud storage system may be a Huawei cloud storage system, AWS, Microsoft Azure, Google cloud platform, or other cloud storage systems which are available. The cloud storage system may comprise a publicly available cloud storage system or a proprietary cloud storage system. The plurality of data objects 120A, 120B, 120C may be queued in the cache for replication to a plurality of different known target locations, for instance target locations 130A, 130B, 130C and so forth. By way of example, data object 120A may be queued for replication to target location 130A and target location 130C. Data object 120B may be queued for replication to target location 130B and target location 130C. Data object 120C may be queued for replication to target location 130A and target location 130B. Still another data object, not depicted, may be queued for replication to target location 130A, target location 130B, target location 130C and still another target location also not depicted. It is appreciated that the depiction and description of three exemplary data objects 120A, 120B, and 120C and three exemplary target locations 130A, 130B, and 130C is by way of a non-limiting example only, and any number of data objects and target locations may be implemented. Some or all of the plurality of data objects 120A, 120B, 120C may be stored in an encrypted form, or in an unencrypted form.
A cache manager 140 may manage data items in the cache 110. The cache manager 140 may be located on a hardware device associated with one or more physical storage devices which comprise the cache 110. For example, and without limiting the generality of the foregoing, the cache 110 may comprise a plurality of hard disk drives, solid state disk drives, or other devices of other data storage technologies which are located in a remote location (e.g., “the cloud”), accessible, for example, via a network. The network may comprise a local area network (LAN), a wide area network (WAN), a combination of both a LAN and a WAN, and so forth. Network access may be restricted to a VPN, for instance. Network access may be limited to access using a secure token or certificate, for example. It is appreciated that although only one cache manager 140 is depicted, a plurality of cache managers 140 may serve the same or a similar function as a single cache manager 140. A master cache manager 140 (not specifically depicted) may coordinate between individual cache managers 140 of the plurality of cache managers 140, so that the plurality of cache managers 140 operate effectively as a single cache manager 140. The cache manager 140 itself may be implemented in a processor, such as a microprocessor, i.e., an integrated circuit (IC) which incorporates the core functions of a computer's central processing unit (CPU). The microprocessor may be a programmable, multipurpose, clock-driven, register-based silicon chip which accepts binary data as input and provides output after processing it as per the instructions stored in memory.
A data mover 150, depicted, for ease of depiction, as being in the cache, may move the data objects 120A, 120B, 120C, as will be described below, from within the cache 110 to any or all of the target locations 130A, 130B, and 130C, as appropriate. It is appreciated that although only one data mover 150 is depicted, a plurality of data movers 150 may serve the same or a similar function as a single data mover 150. A master data mover 150 (not specifically depicted) may coordinate between individual data movers 150 of the plurality of data movers 150, so that the plurality of data movers 150 operate effectively as a single data mover 150.
The plurality of data objects 120A, 120B, 120C are to be replicated to a plurality of different known target locations 130A, 130B, 130C. At that time, the cache manager 140 determines which data object of the plurality of data objects 120A, 120B, 120C is next to be replicated. By way of example, data object 120A may be the next data object to be replicated. In such a case, the cache manager 140 is operative to keep the data object 120A in the cache 110.
The cache manager 140 will track which data objects 120A, 120B, 120C of the plurality of data objects 120A, 120B, 120C are to be replicated to which of the known target locations 130A, 130B, 130C among the plurality of different target locations 130A, 130B, 130C. As per the example given above, and shown in Fig. 1, data object 120A may be queued for replication to target location 130A and target location 130C; data object 120B may be queued for replication to target location 130B and target location 130C; and data object 120C may be queued for replication to target location 130A and target location 130B. As was mentioned above, another, not depicted, data object may be queued for replication to all of the target locations 130A, 130B, 130C and still another not-depicted target location.
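By way of a non-limiting illustration, such tracking may be sketched as a mapping from each data object to its outstanding target locations; the structure and the mark_replicated helper below are assumptions for illustration, not details taken from the patent.

```python
# Object ids and target ids mirror the Fig. 1 example.
replication_targets = {
    "120A": {"130A", "130C"},
    "120B": {"130B", "130C"},
    "120C": {"130A", "130B"},
}

def mark_replicated(obj_id, target, cache):
    """Record one finished copy; evict the object once all targets are served."""
    replication_targets[obj_id].discard(target)
    if not replication_targets[obj_id]:
        del replication_targets[obj_id]
        cache.pop(obj_id, None)   # copied everywhere needed: safe to delete
```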
It is noted that available bandwidth for replication of the data objects 120A, 120B, 120C to the target locations 130A, 130B, 130C varies according to target location. Various factors may affect available bandwidth, such as, but not limited to, transfer technology, other usage and traffic on networks, physical hardware, and so forth. By way of example, the bandwidth between the cache 110 and target location 130A is depicted as being 100 Mbps; the bandwidth between the cache 110 and target location 130B is depicted as being 50 Mbps; and the bandwidth between the cache 110 and target location 130C is depicted as being 70 Mbps.
When the cache 110 is full (i.e., data objects, such as the plurality of the data objects 120A, 120B, 120C and so forth, are close to exceeding the storage capacity of the cache 110), the cache manager 140 may delete a data object 160 which is not to be replicated to the plurality of different target locations 130A, 130B, 130C. Alternatively, if there is no data object 160 which is not to be replicated to the plurality of different target locations 130A, 130B, 130C, or if said data object 160 requires less storage space than one of the plurality of data objects 120A, 120B, 120C which is to be replicated to one of the plurality of different target locations 130A, 130B, 130C, the cache manager 140 may delete one of the plurality of data objects 120A, 120B, 120C, typically the one of the plurality of data objects 120A, 120B, 120C which is to be the last one of the plurality of data objects 120A, 120B, 120C to be replicated. Alternatively, the cache manager 140 may delete a latest received data object when the cache 110 is full.
Furthermore, the cache manager 140 may determine a cost (in terms of computing power) of reading a data object, such as one of the plurality of data objects 120A, 120B, 120C, from a persistence layer (described below in greater detail). When deleting one data object of the plurality of data objects 120A, 120B, 120C, the cache manager 140 will first delete the data object among the plurality of data objects 120A, 120B, 120C with the lowest cost of reading the data object 120A, 120B, 120C from the persistence layer.
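A hedged sketch combining the two eviction criteria above might read as follows; the function and parameter names are illustrative, and the read-cost tie-break is only one plausible reading of the cost criterion.

```python
def choose_eviction_victim(cached_ids, queue_position, read_cost):
    """Pick an object to delete when the cache is full.

    cached_ids:     ids of objects currently in the cache
    queue_position: dict id -> place in the replication queue
                    (an id absent from the dict has nothing left to replicate)
    read_cost:      dict id -> cost of re-reading the object from the
                    persistence layer, in case it is needed again
    """
    # First choice: an object that is not to be replicated at all,
    # preferring the one that would be cheapest to read back if needed.
    idle = [o for o in cached_ids if o not in queue_position]
    if idle:
        return min(idle, key=lambda o: read_cost.get(o, 0))
    # Otherwise: the object whose replication lies farthest in the future.
    return max(cached_ids, key=lambda o: queue_position[o])
```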
The cache manager 140 may create a different task for replication of each data object of the plurality of data objects 120A, 120B, 120C. For example, a first task may be created and implemented in order to replicate data object 120A to target location 130A and a second task may be created and implemented in order to replicate data object 120A to target location 130C.
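By way of a non-limiting illustration, this task-per-replication pattern might be sketched with Python threads as follows; the replicate stub and the queue contents are assumptions.

```python
import threading

def replicate(obj_id, target):
    """Stub: copy one cached data object to one target location."""
    ...

# One dedicated task per (data object, target location) pair, so a slow
# link to one target does not throttle replication to the other targets.
pairs = [("120A", "130A"), ("120A", "130C"),
         ("120B", "130B"), ("120B", "130C"),
         ("120C", "130A"), ("120C", "130B")]

tasks = [threading.Thread(target=replicate, args=pair) for pair in pairs]
for t in tasks:
    t.start()
for t in tasks:
    t.join()
```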
As will be shown below in more detail with reference to Fig. 2, the plurality of data objects 120A, 120B, 120C might be written to a log device. The log device might be stored in a persistence layer. The persistence layer may comprise a database, a flat file, a registry, and so forth, and the log device might comprise a distributed service that provides persistent storage and delivery of records organized in sequences referred to as logs. For durability, copies of each data object, such as data objects 120A, 120B, 120C, are stored across multiple servers and failure domains.
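The following toy append-only writer suggests how a single node of such a log device might behave; it is a sketch under stated assumptions, with a local file standing in for the distributed log service and the cross-server durability copies omitted.

```python
import os

class ToyLogWriter:
    """Append-only log file standing in for one log-device replica."""
    def __init__(self, path):
        self.f = open(path, "ab")

    def append(self, object_id: str, payload: bytes) -> int:
        offset = self.f.tell()
        self.f.write(f"{object_id},{len(payload)}\n".encode())  # record header
        self.f.write(payload)
        self.f.flush()
        os.fsync(self.f.fileno())   # persist before acknowledging the write
        return offset               # an index layer could record this location
```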
When one of the plurality of data objects 120A, 120B, 120C is to be replicated to a target location, the data mover 150 reads the first data object data from the log device, and copies the first data object data to a replica site, e.g., one of the plurality of different target locations 130A, 130B, 130C.
An index layer may comprise a data structure indexing a location of the plurality of data objects 120A, 120B, 120C within the cache 110.
When a particular data object is to be read for replication by the data mover 150 after a first data object, the cache manager 140 may locate the particular data object in the cache 110 so that the particular data object is stored consecutively in the cache with the first data object. By way of example, if data object 120A is queued to be the first data object replicated to one or more target locations 130A, 130C (as per the above example) and then data object 120B is queued to be the next data object replicated to one or more target locations 130B, 130C (as per the above example), the cache manager 140 may place data object 120B consecutively in the cache with data object 120A. Data object 120C may be disposed in the cache consecutively with data object 120B, if data object 120C is queued to be replicated after data object 120B.
In some cases, when the next data object to be replicated to some target location, such as data object 170, is not in the cache, the cache manager 140 may prefetch the next data object into the cache prior to replication. By way of example, data object 170 may be disposed in target location 130A. The cache manager 140 may have a replication queue which may show that there is an upcoming replication of data object 170 from target location 130A to, for example, target location 130B. In such a case, the cache manager 140 may prefetch data object 170 from target location 130A into the cache 110. When the queued replication of data object 170 to the target location 130B reaches the top of the queue, data object 170 is already in the cache 110 and available for immediate replication to target location 130B. The cache 110 and the plurality of target locations 130A, 130B, 130C may be part of a set of nodes that run containerized applications, i.e., a Kubernetes cluster. Containerizing an application packages the application with its dependencies and some necessary services (depending on the application so containerized). Containerized applications are typically more lightweight and flexible than virtual machines. In this way, Kubernetes clusters of containerized applications allow such applications to be more easily moved and managed.
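Returning to the prefetch behaviour described above, a minimal sketch of the step might read as follows; the queue representation and the `fetch` callable are hypothetical names introduced for illustration:

```python
def prefetch_next(cache: dict, replication_queue: list, fetch) -> None:
    """Ensure the next queued object is cached before its replication starts.

    `replication_queue` holds (object_id, source_location, target_location)
    tuples; `fetch` is a callable that reads an object from its source.
    """
    if not replication_queue:
        return
    object_id, source, _target = replication_queue[0]
    if object_id not in cache:
        # e.g., prefetch data object 170 from target location 130A
        cache[object_id] = fetch(object_id, source)
```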
Once one data object 120A, 120B, 120C has been replicated to all of its known target locations, the cache manager 140 may then delete that data object. By way of example, data object 120A is, as per the above example, queued to be copied to the target locations 130A and 130C. Once data object 120A has been copied to the target locations 130A and 130C, the cache manager 140 may delete data object 120A from the cache 110.
Reference is now made to Fig. 2, which is an example of a backup and replication system 200 for replication of multiple data items in a Huawei cloud environment. It is noted that while Fig. 2 is specific to the Huawei cloud environment, particular elements are exemplary of their corresponding elements in other cloud environments (by way of a non-limiting example, AWS, Microsoft Azure, Google Cloud Platform, and so forth). For example, Fig. 2 may depict one particular type of message queue (i.e., Kafka); however, it is understood that this particular type of message queue is exemplary, and any other appropriate message queue application may be used.
A first object storage 210A has a plurality of nodes 220A, 220B, ..., 220n (where 'n', as used throughout this description, indicates some positive whole number). A data object 120A, 120B, 120C may be stored (or, after it is created, stored) in an object storage (OCS, OpenShift Container Storage) 225 in the nodes 220A, 220B, ..., 220n, which manages data as objects, i.e., single units of data and associated metadata (such as access policies). A data object is typically identified by some sort of unique data object ID.
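Conceptually, such an object may be modelled as data plus metadata keyed by a unique ID. The following is an illustrative sketch only; the field names are assumptions for the example:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class StoredObject:
    data: bytes
    metadata: dict  # e.g., access policies
    object_id: str = field(default_factory=lambda: str(uuid.uuid4()))
```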
When a data object (corresponding to data objects 120A, 120B, 120C, described above with reference to Fig. 1) is created in one of the plurality of nodes 220A, 220B, ..., 220n, an index layer 230 is updated so as to index pertinent information (at least the location of the newly created data object). The data object is written to a log device, denoted in Fig. 2 as PLog 240, stored in the persistence layer 250.
When the index layer 230 is updated to reflect that a newly created data object has been added to the first object storage 210A, the index layer 230, via a messaging service (MSG SVC) 235, notifies a message queue, for example, a Kafka message queue 260. Kafka is a distributed event store and stream-processing platform. Kafka provides a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka uses a binary TCP (Transmission Control Protocol)-based protocol that is optimized for efficiency and relies on a "message set" abstraction that naturally groups messages together to reduce the overhead of the network roundtrip. Kafka aims to provide large network packets, large sequential disk operations, and contiguous memory blocks, turning a bursty stream of random message writes into linear writes.
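By way of illustration only, such a notification might resemble the following, assuming the kafka-python client; the topic name, broker address, and message fields are assumptions of this sketch rather than part of the described system:

```python
import json
from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Announce a newly created data object so the replication manager can act on it.
producer.send("object-created", value={"object_id": "120A", "plog": "PLog-240"})
producer.flush()
```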
The data object (e.g., 120A, Fig. 1) to be replicated to the target location (e.g., 130A, Fig. 1) is replicated in a Kubernetes (K8s) cluster (which may be one of a plurality of Kubernetes clusters) 270.
A replication manager 280 receives a list of data objects to be replicated from the Kafka message queue 260. The replication manager 280 determines which data mover (DMS) 290 is to copy each object to be replicated to a target location (such as target locations 130A, 130B, 130C, Fig. 1). Data mover 290 may be the same as or similar to data mover 150 of Fig. 1. As noted above, with reference to Fig. 1, the cache manager 140 creates a task for each target location (such as target locations 130A, 130B, 130C, Fig. 1), but the same replication node 270 will replicate the same object for each different task. The data mover 290 reads the data object, such as data object 120A, 120B, 120C of Fig. 1, from the PLog device 240. In typical systems, the same data mover 290 sends the data object 120A, 120B, 120C (Fig. 1) to all of the target locations 130A, 130B, 130C (Fig. 1) when the data mover 290 reads the object from the PLog device 240. In such a case, if the targets have very different bandwidths (such as is depicted in Fig. 1), the data mover may get slowed down while moving the objects to all of the locations 130A, 130B, 130C (Fig. 1), and replication speed will be reduced to the speed of the slowest link (50 Mbps in the example of Fig. 1). Alternatively, each target location 130A, 130B, 130C (Fig. 1) may be managed by a separate task. In such a case, each time one of the data objects 120A, 120B, 120C (Fig. 1) needs to be replicated, it will be read from the PLog device 240, increasing load on the PLog device 240. This way, even when the bandwidths of the target locations 130A, 130B, 130C (Fig. 1) are similar to one another, data objects 120A, 120B, 120C (Fig. 1) need to be read multiple times from the PLog device 240.
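A hedged sketch of the dispatch loop follows; the consumer group, topic name, and round-robin data-mover selection are illustrative assumptions, as the actual selection policy of the replication manager 280 is not prescribed here:

```python
import json
from kafka import KafkaConsumer  # kafka-python client

consumer = KafkaConsumer(
    "object-created",
    bootstrap_servers="localhost:9092",
    group_id="replication-manager",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

data_movers = ["dms-0", "dms-1"]

for i, message in enumerate(consumer):
    # Round-robin assignment of each pending replication to a data mover.
    mover = data_movers[i % len(data_movers)]
    print(f"assign {message.value['object_id']} to {mover}")
```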
A caching algorithm is provided to manage the cache 295, which may be the same as or similar to the cache 110 of Fig. 1. The Least Recently Used (LRU) cache eviction algorithm might be used, so that data objects 120A, 120B, 120C are organized in the cache 295 in order of use. In LRU, as the name suggests, the data object that has not been used for the longest time is evicted from the cache when the cache is nearly full. However, since the number of times each data object 120A, 120B, 120C is to be replicated is known, the methods and systems for cache management described above with reference to Fig. 1 are implemented instead.
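For reference, a textbook LRU cache of the kind just described can be sketched in a few lines; this is a generic illustration, not the claimed method:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: OrderedDict = OrderedDict()

    def get(self, key):
        # Assumes the key is present; raises KeyError otherwise.
        self.items.move_to_end(key)        # mark as most recently used
        return self.items[key]

    def put(self, key, value) -> None:
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
```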
By way of example, two replication tasks may be invoked to replicate data objects 120A, 120B, 120C. As discussed above, due to bandwidth considerations, the first of the two replication tasks may be twice as fast as the second. Since the cache 295 is limited in capacity, only a limited number of data objects 120A, 120B, 120C may be held in the cache 295 at any given time; for the sake of example, say ten. Accordingly, when the first of the two replication tasks replicates a twentieth data item, the second of the two replication tasks is replicating a tenth data item.
Accordingly, the caching algorithm will keep data objects 120A, 120B, 120C (Fig. 1) that are going to be read next in the cache 295. When the cache 295 is full, a data object 120A, 120B, 120C which is not to be read again will be deleted. If no such data object 120A, 120B, 120C is present in the cache 295 when the cache is full, the data object 120A, 120B, 120C that is to be replicated last will be deleted. The data object 120A, 120B, 120C which is deleted from the cache 295 may also be deleted from the PLog device 240 on which said data object 120A, 120B, 120C resides, if said PLog device 240 is the least used PLog device 240 among all of the PLog devices 240.
Since PLog devices 240 are often stored on hard disk drives (HDDs), the next read from the PLog device 240 is configured to be sequential. Specifically, data objects 120A, 120B, 120C may be stored sequentially in the cache 295 so that reads from the cache 295 need not be fragmented.
If data objects 120A, 120B, 120C which are next to be replicated are not stored sequentially, the data objects 120A, 120B, 120C may be prefetched in advance of their replication from their present locations, so that they may be placed sequentially prior to their replication.
Returning to the explanation of Fig. 2, transfer of data objects 120A, 120B, 120C is through a first network gateway 297. The first network gateway 297 may typically comprise a device or node that connects disparate networks by translating communications from one protocol to another. The data objects are then sent over a network 301 to a second object storage 210B, a third object storage 210C, and so forth.
For ease of depiction, the second object storage 210B will be described herein, below. The transferred data object 120A, 120B, 120C is received at a second network gateway 397. In general, items in the second object storage 210B will be labelled with numbers one hundred (100) more than their corresponding items in the first object storage 210A; so, for instance, the first object storage 210A has first network gateway 297, and the second object storage 210B has corresponding second network gateway 397. Similarly, such items will be described as being a "second" such item, even where the "first" of such items (i.e., in the first object storage 210A) is not explicitly designated as "first".
A second Kubernetes (K8s) cluster (which may be one of a plurality of Kubernetes clusters) 370 comprises a second data mover (DMS) 390 and a second replication manager 380. The second data mover 390 moves the received data object 120A, 120B, 120C from the second Kubernetes (K8s) cluster 370 to an object storage 320 (which may be one among several, not labelled, object storage nodes of the second object storage 210B).
The replicated data objects 120A, 120B, 120C are moved by the second data mover 390 to the second OCS 325. The received data object is indexed in the second index layer 335, i.e., the index layer of the second object storage 210B. The replicated data objects 120A, 120B, 120C may be stored in the second PLog device 340 in the second persistence layer 350.
Reference is now made to Fig. 3, which is a simplified flowchart illustration of a method of operation of the system and apparatus described in Figs. 1 and 2. In step 410, a plurality of data objects are stored in a cache, the plurality of data objects being data objects to be replicated to a plurality of different known target locations. For example, the plurality of data objects may be the same as or similar to the plurality of data objects 120A, 120B, 120C of Fig. 1. The cache may be the same as or similar to the cache 110 of Fig. 1 and the cache 295 of Fig. 2. The plurality of different known target locations may be the same as or similar to the target locations 130A, 130B, 130C of Fig. 1 and/or the second object storage 210B and the third object storage 210C of Fig. 2.
At step 420, a cache manager determines a first data object in the cache that is next to be replicated. The cache manager may be the same as or similar to the cache manager 140 of Fig. 1. At step 430, the cache manager tracks which data objects of the plurality of data objects are to be replicated to which of the known target locations among the plurality of different target locations. At step 440, the first data object is kept in the cache. At step 450, when the cache is full, one of a second data object which is not to be replicated or a third data object which is to be replicated last is deleted. Each replication of a data object is performed by a different task.
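Steps 410-450 may be strung together roughly as follows. This is a sketch reusing the hypothetical `CachedObject` class introduced above, not a definitive implementation, and it assumes the cache holds more objects than just the first:

```python
def run_replication_cycle(cache: list, capacity: int):
    """Illustrative pass over steps 420-450 of Fig. 3."""
    first = min(cache, key=lambda o: o.queue_position)           # step 420
    targets = {o.object_id: o.pending_targets for o in cache}    # step 430
    # Step 440: `first` is deliberately never considered for eviction.
    while len(cache) > capacity:                                 # step 450
        idle = [o for o in cache if not o.pending_targets and o is not first]
        victim = idle[0] if idle else max(
            (o for o in cache if o is not first),
            key=lambda o: o.queue_position,
        )
        cache.remove(victim)
    return first, targets
```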
It is expected that during the life of a patent maturing from this application many relevant data storage technologies will be developed and the scope of the term of this application is intended to include all such new technologies a priori.
The terms "comprises", "comprising", "includes", "including", “having” and their conjugates mean "including but not limited to".
The term "consisting of" means "including and limited to".
The term "consisting essentially of' means that the composition, method or structure may include additional ingredients, steps and/or parts, but only if the additional ingredients, steps and/or parts do not materially alter the basic and novel characteristics of the claimed composition, method or structure.
As used herein, the singular form "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a data object" or "at least one data object" may include a plurality of data objects.
Throughout this application, various embodiments of this invention may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
Whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range. The phrases "ranging/ranges between" a first indicated number and a second indicated number and "ranging/ranges from" a first indicated number "to" a second indicated number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals therebetween.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those elements. Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
It is the intent of the Applicant(s) that all publications, patents and patent applications referred to in this specification are to be incorporated in their entirety by reference into the specification, as if each individual publication, patent or patent application was specifically and individually noted when referenced that it is to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting. In addition, any priority document(s) of this application is/are hereby incorporated herein by reference in its/their entirety.

Claims

1. A device comprising: a cache; a plurality of data objects comprised in the cache, the plurality of data objects to be replicated to a plurality of different known target locations; and a cache manager operative to: determine a first data object in the cache that is next to be replicated and keep the first data object in the cache; track which data objects of the plurality of data objects are to be replicated to which of the known target locations among the plurality of different target locations; and delete, when the cache is full, one of: a second data object which is not to be replicated and a third data object which is to be replicated last; wherein each replication of a data object is performed by a different task.
2. The device of claim 1 wherein each data object of the plurality of data objects is written to a log device.
3. The device of claim 2 wherein the log device is stored in a persistence layer.
4. The device of any of claim 2 or claim 3 wherein a data mover reads the first data object data from the log device, and copies the first data object data to a replica site.
5. The device of any of claim 1 to claim 4 wherein each data object of the plurality of data objects is indexed in an index layer.
6. The device of any of claim 1 to claim 5 wherein a data object to be read after the first data object is stored consecutively in the cache with the first data object.
7. The device of any of claim 1 to claim 6 wherein the cache manager prefetches a fourth data object to be read after the first data object if the fourth data object is not in the cache.
8. The device of any of claim 1 to claim 7 wherein replication of the plurality of data objects occurs in a Kubernetes cluster.
9. The device of any of claim 1 to claim 8 wherein the cache manager is operative to delete a latest received data object when the cache is full.
10. The device of any of claim 1 to claim 9 and further comprising the cache manager deleting one data object of the plurality of data objects when the one data object has been copied to all of its known target locations.
11. The device of any of claim 1 to claim 10 wherein the cache manager deletes at least two data objects of the plurality of data objects which are written sequentially on the hard disk and need to be sent one after the other.
12. The device of claim 11 further comprising, after deleting the at least two data objects of the plurality of data objects, the cache manager fetching the at least two data objects of the plurality of data objects from the hard drive.
13. A method comprising: storing a plurality of data objects in a cache, the plurality of data objects to be replicated to a plurality of different known target locations; determining, by a cache manager, a first data object in the cache that is next to be replicated; tracking, by the cache manager, which data objects of the plurality of data objects are to be replicated to which of the known target locations among the plurality of different target locations; keeping the first data object in the cache; and deleting, when the cache is full, one of: a second data object which is not to be replicated or a third data object which is to be replicated last, wherein each replication of a data object is performed by a different task.
14. The method of claim 13 wherein each data object of the plurality of data objects is written to a log device.
15. The method of claim 14 wherein a data mover reads the first data object data from the log device, and copies the first data object data to a replica site.
16. The method of claim 14 wherein the log device is stored in a persistence layer.
17. The method of any of claim 13 to claim 16 wherein each data object of the plurality of data objects is indexed in an index layer.
18. The method of any of claim 13 to claim 17 wherein a data object to be read after the first data object is stored consecutively in the cache with the first data object.
19. The method of any of claim 13 to claim 18 wherein the cache manager prefetches a fourth data object to be read after the first data object if the fourth data object is not in the cache.
20. The method of any of claim 13 to claim 19 wherein replication of the plurality of data objects occurs in a Kubernetes cluster.
21. The method of any of claim 13 to claim 20 further comprising the cache manager deleting a latest received data object when the cache is full.
22. The method of claim 14 wherein the cache manager determines a cost of reading a data object from the persistence layer and, when deleting one data object of the plurality of data objects, the cache manager will first delete the data object with the lowest cost of reading from the persistence layer.
23. The method of any of claim 13 to claim 22 and further comprising the cache manager deleting one data object of the plurality of data objects when the one data object has been copied to all of its known target locations.

24. A computer readable storage medium having data stored therein representing software executable by a computer, the software including instructions to: store a plurality of data objects in a cache, the plurality of data objects to be replicated to a plurality of different locations; determine, by a cache manager, a first data object in the cache that is next to be replicated; keep the first data object in the cache; and delete, when the cache is full, one of: a second data object which is not to be replicated or a third data object which is to be replicated last, wherein each replication of a data object is performed by a different task.