WO2012030358A1 - Content addressable storage with reduced latency - Google Patents

Content addressable storage with reduced latency

Info

Publication number
WO2012030358A1
WO2012030358A1 PCT/US2010/058681 US2010058681W WO2012030358A1 WO 2012030358 A1 WO2012030358 A1 WO 2012030358A1 US 2010058681 W US2010058681 W US 2010058681W WO 2012030358 A1 WO2012030358 A1 WO 2012030358A1
Authority
WO
WIPO (PCT)
Prior art keywords
content
data
storage system
address
addressable storage
Prior art date
Application number
PCT/US2010/058681
Other languages
French (fr)
Inventor
Cristian Ungureanu
Original Assignee
Nec Laboratories America, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/905,223 (US8375164B2)
Application filed by Nec Laboratories America, Inc.
Priority to EP10856821.3A (EP2470997A4)
Priority to JP2013527055A (JP5591406B2)
Publication of WO2012030358A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G06F2212/1024 Latency reduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/22 Employing cache memory using specific memory technology
    • G06F2212/222 Non-volatile memory

Definitions

  • the present invention relates to storing data in a content-addressable storage system, and more specifically, to interposing a storage layer between an application and a content-addressable storage system for reducing the latency associated with writing data to the content-addressable storage system.
  • CAS Content-addressable storage
  • a CAS system calculates a hashkey based on the content of the block, performs a check to determine whether or not a block with identical contents (to the one currently being written) has already been written to the CAS system (e.g., by looking up values in a hash table), and writes the block if it determines that the block is unique.
  • acknowledgment also returns a content address, which is equal to or derived from the hashkey.
  • the content address is used during read operations to retrieve the block.
  • a system for storing data in a storage system.
  • the system includes a content-addressable storage system and a persistent cache.
  • the persistent cache includes a temporary address generator that is configured to generate a temporary address which is associated with data to be stored in the persistent cache, and a non-content-addressable storage system configured to store and retrieve data in the persistent cache using the temporary address.
  • the persistent cache further comprises an address translator configured to map a temporary address associated with the data in the non-content addressable storage system with a content address associated with the data in the content-addressable storage system.
  • a method for storing data in a storage system includes determining whether data associated with a write request is to be stored in a non-content-addressable storage system or written directly to a content-addressable storage system. If it is determined that the data is to be stored in the non-content-addressable storage system, a temporary address is generated for the data to be stored in the non-content-addressable store and an acknowledgement that data is persistently stored in the non-content addressable storage system may be sent before the data is written to a content-addressable storage system.
  • At least one temporary address associated with the data in the non-content-addressable store is mapped with a content address of the data in the content-addressable storage system after the data is written to the content-addressable storage system.
  • Figure 1 is a block/flow diagram of a system for storing data in a content-addressable storage system in accordance with the present principles.
  • Figure 2 is a block/flow diagram illustrating in further detail the system in Figure 1 for storing data in a content-addressable storage system.
  • Figure 3 is a block/flow diagram illustrating a method for storing data in a content-addressable storage system in accordance with the present principles.
  • a description of a storage system which can reduce the latency associated with accesses to a content-addressable storage system.
  • the system interposes a storage layer comprised of a low latency block store (LLBS) between a content-addressable block store (CABS) and an application which is issuing I/O operations in accordance with a content-addressable API.
  • LLBS low latency block store
  • CABS content-addressable block store
  • blocks can first be written to the LLBS, acknowledged, and subsequently transferred to the CABS. At some point later in time, the blocks may then be removed from LLBS.
  • An LLBS may utilize a solid-state drive or hard disk drive for persistent storage. These devices are optimized to reduce latency associated with I/O operations.
  • the LLBS can store data temporarily and return an acknowledgement to an application so that the application does not experience the delay associated with calculating a hash or searching for values in a hash table.
  • the LLBS can also initiate a write to CABS which includes the same data that was written to the LLBS. Writes to the CABS experience high latency because of the delays associated with calculating hashes and looking up values in a hash table. However, the latency is not experienced by the application (or an end user utilizing the application) because the LLBS is able to quickly store the data and return an acknowledgment.
  • Embodiments described herein may be entirely hardware, entirely software, or may include both hardware and software elements.
  • the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
  • Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
  • a computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or system) or a propagation medium.
  • the medium may include a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
  • a data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus.
  • the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution.
  • I/O devices and systems including but not limited to keyboards, displays, pointing systems, etc. may be coupled to the system either directly or through intervening I/O controllers.
  • Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems, remote printers, storage devices, or storage systems through intervening private or public networks.
  • Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
  • In FIG. 1, a block/flow diagram illustratively depicts a system 100 for storing data in a content-addressable storage system in accordance with the present principles.
  • an application 130 stores data in a storage system 110.
  • the application 130 may be executing locally on a computer which comprises storage system 110, or may be executing on a client machine that is coupled to a server or other system (e.g., via a network) which comprises storage system 110.
  • Storage system 110 comprises a low latency block store (LLBS) 150 and a content-addressable block store (CABS) 160.
  • the CABS 160 may represent any type of content-addressable storage system.
  • the LLBS 150 may include a solid-state drive (SSD) or hard disk drive (HDD) which is optimized to reduce latency associated with I/O operations.
  • SSD solid-state drive
  • HDD hard disk drive
  • LLBS 150 is not limited to these types of storage devices, and, in general, may utilize any non-content-addressable storage media that has lower latency than CABS 160 with respect to input/output (I/O) operations.
  • the application 130 may initially store data in the LLBS 150. Upon successfully storing data to the LLBS 150, an acknowledgment is returned to the application 130. Since the LLBS 150 provides for reduced latency, the acknowledgement is returned relatively quickly, or at the least, quicker than CABS 160 is able to return an acknowledgment.
  • a content-addressable storage application programming interface permits communication both between the application 130 and the LLBS 150 and between the LLBS 150 and the CABS 160.
  • In FIG. 2, a more detailed view of a system 200 for storing data in a content-addressable storage system is illustratively depicted.
  • Application 130 sends a write request to LLBS 150.
  • the cache manager 210 may forward the request to the non-content addressable storage system 235 which is configured as a key-value store 230 which uses the storage device 240 to store data persistently.
  • the cache manager 210 obtains a temporary address from the temporary address ("TA") generator 250 and this address will be used as the key with which the data may be later retrieved.
  • TA temporary address
  • the key-value store 230 is responsible for controlling the manner in which data is stored in the storage device 240.
  • the key-value store 230 stores both the data and its temporary address in storage device 240.
  • the data can later be retrieved or read using the temporary address.
  • Storage device 240 is preferably a low latency system such as a solid-state drive (SSD), hard disk drive (HDD), or other device that provides for a lower latency than CABS 160 with respect to performing I/O operations.
  • Upon writing the data to the LLBS 150, the cache manager 210 will forward an acknowledgment to the application 130 along with the temporary address that can be used to retrieve the data. The cache manager 210 will write the data, which has already been written to storage device 240, to the CABS 160 as well. In storing the data, the CABS 160 will compute a hash value based on the content of the data and perform de-duplication operations (e.g., which may involve looking up values in a hash table). Even if two identical blocks had been written to the LLBS 150 and each was assigned a separate temporary address, both of these blocks will eventually be mapped to the same content address when the LLBS 150 transfers the data to the CABS 160. Since the LLBS 150 had previously confirmed a successful write operation, the application 130 can avoid the latency associated with these hashing and hash table lookup operations while retaining the de-duplication benefits associated with storing data in the CABS 160.
  • After successfully storing the data, the CABS 160 returns a content address to the cache manager 210 at the LLBS 150, which reflects where the data is stored in the CABS 160.
  • the content address is forwarded to the address translator 220 which will map the temporary address (reflecting the location of the data in the LLBS 150) to the content address (reflecting the location of the data in the CABS 160) and store this mapping information in storage device 240.
  • the data associated with each embedded address should first be written to the CABS 160 and mapped to a corresponding content address before the parent block is written to the CABS 160. This avoids writing temporary addresses to the CABS 160.
  • the LLBS 150 can delete the corresponding data in storage device 240. If the application 130 issues a subsequent read request using the temporary address, the content address associated with the temporary address can first be retrieved by the address translator 220, and this information can be used to retrieve the data from the CABS 160.
  • mapping of a temporary address to a content address may involve the cooperation of the application 130.
  • Cooperation of the application 130 is needed to avoid a situation where the application 130 requests a block using its temporary address, but neither the block, nor the mapping from that temporary address to the content address, is available at the LLBS 150.
  • One way to avoid this situation is to have the application 130 periodically drop all of its addresses. Once this is done, the LLBS 150 can delete all of its mappings.
  • the application 130 can access blocks by issuing a read for the labeled block representing the root of a directed acyclic graph, e.g., in the manner explained in United States Patent Application
  • the LLBS 150 needs to be able to distinguish between temporary addresses and content addresses. This can be achieved by reserving a bit in the address which indicates whether the address is a content address or a temporary address.
  • FIG. 3 a block/flow diagram illustrates a method for storing data in a content-addressable storage system in accordance with the present principles.
  • an application 130 issues a write request to store data on a storage system 110.
  • the storage system 110 may include both a non-content-addressable system (e.g., LLBS 150) and a CABS 160 as shown in Figures 1 and 2.
  • Upon receiving the write request, the LLBS 150 will assign a temporary address to the data in block 320. The temporary address is used to store and retrieve the data in the non-content addressable storage 235.
  • determining a temporary address for storing the data does not involve computing a hash.
  • the temporary address may be generated by the temporary address generator 250 in Figure 2, and used by the key-value store 230 to store the data.
  • the data which is the subject of the write request is stored at the LLBS 150 along with the temporary address which was assigned to the data block.
  • the manner in which this information is stored may differ.
  • the non-content addressable store is configured as a key-value store, where the keys are the temporary addresses and the values are the data contents of the write requests.
  • Figure 2 discloses a single storage device 240 for storing both the mapping from temporary address to content addresses and the data retrievable through the temporary address, in other embodiments the mapping between temporary address and content addresses, and the data retrievable through the temporary address may be stored on separate storage devices.
  • After the data from the application 130 has been stored in the LLBS 150, the LLBS 150 sends an acknowledgement to that application 130 which indicates that the data has been successfully stored (block 340).
  • the acknowledgement sent from the LLBS 150 to the application 130 also includes the temporary address associated with the data to allow the application 130 to later retrieve the data.
  • the storage device 240 at the LLBS 150 provides for relatively low latency with respect to storing information when compared to the CABS 160. Since the LLBS 150 is able to write the data to storage device 240 and return an acknowledgment to the application 130 more quickly than CABS 160 would have been able to do so, the latency experienced by the application 130 is reduced.
  • Upon forwarding the acknowledgment to the application 130, the LLBS 150 will subsequently write the data to the CABS 160 in block 350. Once the data stored at the LLBS 150 has been successfully copied to the CABS 160, the CABS 160 will return a content address to the LLBS 150. The content address, which is based on the content of the data block being written to CABS 160, reflects where the data is written in the CABS 160.
  • storing data in a content-addressable system involves performing latency-intensive operations such as computing a hash and performing de-duplication operations.
  • the application 130 does not have to wait for these latency-intensive operations to be performed.
  • the storage system 110 of the present application allows an application 130 to reap the benefits of content-addressable storage while eliminating, or at least mitigating, the disadvantages of storing data in such a system.
  • the content address will be sent to the address translator 220 which is configured to map the temporary address to the content address and store this information in storage device 240 (block 360).
  • the data (which is currently stored in both the LLBS 150 and the CABS 160) may be deleted from the LLBS 150 in block 370.
  • the read request may include the temporary address of the data.
  • the temporary address may be used by the address translator 220 to identify the corresponding content address of the data in the CABS 160. The data may then be read from the CABS 160 using the content address.
  • the address mappings (i.e., the mappings between temporary addresses and content addresses) on the LLBS 150 are periodically removed. This may be advantageous because the mappings stored at LLBS 150 may grow to be very large in size, thus taking up space in the storage device 240 which can be used otherwise for storing data.
  • the application 130 should drop the addresses (or at least the temporary addresses) that are being stored by the application 130. This ensures that the application 130 does not issue a request for data (using the temporary address of the data) at the LLBS 150 when neither the data itself, nor the mapping of the data, is stored in the LLBS 150.
  • the LLBS 150 may monitor the amount of mapping information being stored. Once the size of the mapping information exceeds a certain threshold, the LLBS 150 may send an "address drop signal" to the application 130 to tell the application 130 that the address information being stored by the application 130 should be dropped. After the application 130 has dropped the addresses, an acknowledgment may be sent to the LLBS 150 which indicates such. Upon confirming that the addresses were dropped by the application 130, the LLBS 150 can then delete the mapping information stored on storage device 240. Other ways of indicating that addresses should be dropped by the application 130 are also contemplated.
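By way of illustration only, the reserved-bit scheme and the read path described in the list above might be sketched as follows. The 64-bit address width, the choice of the highest bit as the flag, and the names `TEMP_FLAG`, `ReadPath`, and `drop_mappings` are assumptions made for this sketch and are not taken from the disclosure.

```python
TEMP_FLAG = 1 << 63  # assumption: the top bit of a 64-bit address marks a temporary address


def make_temporary(counter: int) -> int:
    """Tag a counter value as a temporary address by setting the reserved bit."""
    return TEMP_FLAG | counter


def is_temporary(address: int) -> bool:
    """Return True when the reserved bit is set."""
    return bool(address & TEMP_FLAG)


class ReadPath:
    """Hypothetical read routing at the LLBS, using in-memory dicts as storage."""

    def __init__(self):
        self.llbs_data = {}    # temporary address -> data still held at the LLBS
        self.translation = {}  # temporary address -> content address
        self.cabs_data = {}    # content address -> data held at the CABS

    def read(self, address: int) -> bytes:
        if not is_temporary(address):
            return self.cabs_data[address]      # content address: read the CABS directly
        if address in self.llbs_data:
            return self.llbs_data[address]      # data has not been deleted from the LLBS yet
        # Data was deleted from the LLBS: translate, then read the CABS.
        # Raises KeyError if the mapping has already been dropped.
        return self.cabs_data[self.translation[address]]

    def drop_mappings(self, application_acknowledged: bool) -> bool:
        """Delete all mappings only after the application confirms it dropped its addresses."""
        if application_acknowledged:
            self.translation.clear()
            return True
        return False


path = ReadPath()
temp = make_temporary(7)
content = 0x1234  # reserved bit clear, so this is treated as a content address
path.cabs_data[content] = b"payload"
path.translation[temp] = content
```

In this sketch a read with a temporary address whose mapping was dropped fails with a `KeyError`, which is the situation the address drop handshake is meant to prevent.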

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A system and method for storing data in a content-addressable system is provided. The system includes a content-addressable storage system and a persistent cache. The persistent cache includes a temporary address generator that is configured to generate a temporary address which is associated with data to be stored in the persistent cache, and a non-content-addressable storage system configured to store and retrieve data in the persistent cache using the temporary address. The persistent cache further comprises an address translator configured to map a temporary address associated with the data in the non-content addressable storage system with a content address associated with the data in the content-addressable storage system.

Description

CONTENT ADDRESSABLE STORAGE WITH REDUCED LATENCY
BACKGROUND
Technical Field
[0001] The present invention relates to storing data in a content-addressable storage system, and more specifically, to interposing a storage layer between an application and a content-addressable storage system for reducing the latency associated with writing data to the content-addressable storage system.
Description of the Related Art
[0002] Content-addressable storage (CAS) systems are more complex with respect to writing data than traditional storage systems. Before acknowledging a synchronous write operation, a CAS system calculates a hashkey based on the content of the block, performs a check to determine whether or not a block with identical contents (to the one currently being written) has already been written to the CAS system (e.g., by looking up values in a hash table), and writes the block if it determines that the block is unique. The acknowledgment also returns a content address, which is equal to or derived from the hashkey. The content address is used during read operations to retrieve the block.
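For illustration, the synchronous write sequence of a CAS system (hash, duplicate check, conditional write, return of a content address) might be sketched as below. The use of SHA-256, the in-memory dictionary standing in for the hash table, and the class name `ContentAddressableStore` are assumptions of this sketch; the disclosure does not name a particular hash function.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS; the content address is the SHA-256 hex digest of the block."""

    def __init__(self):
        self._blocks = {}          # hash table: content address -> block bytes
        self.physical_writes = 0   # counts blocks actually stored

    def write(self, block: bytes) -> str:
        content_address = hashlib.sha256(block).hexdigest()  # 1. compute the hashkey
        if content_address not in self._blocks:              # 2. check for an identical block
            self._blocks[content_address] = block            # 3. write only if unique
            self.physical_writes += 1
        return content_address                               # returned with the acknowledgment

    def read(self, content_address: str) -> bytes:
        return self._blocks[content_address]


cas = ContentAddressableStore()
a1 = cas.write(b"hello")
a2 = cas.write(b"hello")  # duplicate content: same address, no second physical write
```

Both the hash computation and the lookup sit on the path before the acknowledgment, which is the source of the write latency discussed in this section.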
[0003] The calculation of the hashkey, as well as the check to determine whether or not a block with identical contents was previously stored, contribute significantly to the latency associated with writing data to a CAS system.
SUMMARY
[0004] In accordance with the present principles, a system is provided for storing data in a storage system. The system includes a content-addressable storage system and a persistent cache. The persistent cache includes a temporary address generator that is configured to generate a temporary address which is associated with data to be stored in the persistent cache, and a non-content-addressable storage system configured to store and retrieve data in the persistent cache using the temporary address. The persistent cache further comprises an address translator configured to map a temporary address associated with the data in the non-content addressable storage system with a content address associated with the data in the content-addressable storage system.
[0005] In accordance with the present principles, a method for storing data in a storage system includes determining whether data associated with a write request is to be stored in a non-content-addressable storage system or written directly to a content-addressable storage system. If it is determined that the data is to be stored in the non-content-addressable storage system, a temporary address is generated for the data to be stored in the non-content-addressable store and an acknowledgement that data is persistently stored in the non-content addressable storage system may be sent before the data is written to a content-addressable storage system. In addition, at least one temporary address associated with the data in the non-content-addressable store is mapped with a content address of the data in the content-addressable storage system after the data is written to the content-addressable storage system.
[0006] These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
BRIEF DESCRIPTION OF DRAWINGS
[0007] The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:
[0008] Figure 1 is a block/flow diagram of a system for storing data in a content-addressable storage system in accordance with the present principles.
[0009] Figure 2 is a block/flow diagram illustrating in further detail the system in Figure 1 for storing data in a content-addressable storage system.
[0010] Figure 3 is a block/flow diagram illustrating a method for storing data in a content-addressable storage system in accordance with the present principles.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
[0011] In accordance with the present principles, a description of a storage system is provided which can reduce the latency associated with accesses to a content-addressable storage system. The system interposes a storage layer comprised of a low latency block store (LLBS) between a content-addressable block store (CABS) and an application which is issuing I/O operations in accordance with a content-addressable API. Rather than writing blocks directly to the CABS, blocks can first be written to the LLBS, acknowledged, and subsequently transferred to the CABS. At some point later in time, the blocks may then be removed from the LLBS. In doing so, the disadvantages (e.g., high latency) associated with writing to content-addressable storage are eliminated or mitigated, while the advantages of using content-addressable storage (e.g., de-duplication) are retained.
[0012] An LLBS may utilize a solid-state drive or hard disk drive for persistent storage. These devices are optimized to reduce latency associated with I/O operations. In accordance with the principles described herein, the LLBS can store data temporarily and return an acknowledgement to an application so that the application does not experience the delay associated with calculating a hash or searching for values in a hash table. The LLBS can also initiate a write to CABS which includes the same data that was written to the LLBS. Writes to the CABS experience high latency because of the delays associated with calculating hashes and looking up values in a hash table. However, the latency is not experienced by the application (or an end user utilizing the application) because the LLBS is able to quickly store the data and return an acknowledgment.
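As a non-authoritative sketch of the ordering described above, the following toy classes acknowledge a write as soon as the data is staged and transfer it to the CABS in a later, separate step. The names `ToyCABS`, `ToyLLBS`, and `flush`, the `tmp-N` address format, and the explicit flush call (in place of a background transfer) are assumptions of this sketch.

```python
import hashlib
import itertools


class ToyCABS:
    """Stand-in content-addressable block store (SHA-256 is an assumed hash)."""

    def __init__(self):
        self.blocks = {}

    def write(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(address, data)
        return address


class ToyLLBS:
    """Stand-in low latency block store that stages writes ahead of the CABS."""

    def __init__(self, cabs: ToyCABS):
        self.cabs = cabs
        self.staged = {}    # temporary address -> data
        self.pending = []   # temporary addresses not yet transferred
        self._counter = itertools.count(1)

    def write(self, data: bytes) -> str:
        temp_address = f"tmp-{next(self._counter)}"  # no hash is computed here
        self.staged[temp_address] = data
        self.pending.append(temp_address)
        return temp_address  # acknowledgment returned before any CABS work

    def flush(self) -> dict:
        """Transfer staged blocks to the CABS; returns temporary -> content address."""
        transferred = {}
        while self.pending:
            temp_address = self.pending.pop(0)
            transferred[temp_address] = self.cabs.write(self.staged[temp_address])
        return transferred


cabs = ToyCABS()
llbs = ToyLLBS(cabs)
ack = llbs.write(b"block")            # application is acknowledged here
cabs_empty_at_ack = not cabs.blocks   # the CABS has not been touched yet
mapping = llbs.flush()                # the slower CABS write happens afterwards
```

A real LLBS would persist the staged data before acknowledging; the dictionaries here only show the order of operations.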
[0013] Embodiments described herein may be entirely hardware, entirely software, or may include both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
[0014] Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or system) or a propagation medium. The medium may include a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
[0015] A data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution. Input/output or I/O devices and systems (including but not limited to keyboards, displays, pointing systems, etc.) may be coupled to the system either directly or through intervening I/O controllers.
[0016] Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems, remote printers, storage devices, or storage systems through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
[0017] Referring now to the drawings in which like numerals represent the same or similar elements and initially to Figure 1, a block/flow diagram illustratively depicts a system 100 for storing data in a content-addressable storage system in accordance with the present principles. As shown therein, an application 130 stores data in a storage system 110. The application 130 may be executing locally on a computer which comprises storage system 110, or may be executing on a client machine that is coupled to a server or other system (e.g., via a network) which comprises storage system 110.
[0018] Storage system 110 comprises a low latency block store (LLBS) 150 and a content-addressable block store (CABS) 160. The CABS 160 may represent any type of content-addressable storage system. On the other hand, the LLBS 150 may include a solid-state drive (SSD) or hard disk drive (HDD) which is optimized to reduce latency associated with I/O operations. However, LLBS 150 is not limited to these types of storage devices, and, in general, may utilize any non-content-addressable storage media that has lower latency than CABS 160 with respect to input/output (I/O) operations.
[0019] Rather than directly storing data to the CABS 160, the application 130 may initially store data in the LLBS 150. Upon successfully storing data to the LLBS 150, an acknowledgment is returned to the application 130. Since the LLBS 150 provides for reduced latency, the acknowledgement is returned relatively quickly, or at the least, quicker than CABS 160 is able to return an acknowledgment.
[0020] As can be seen, a content-addressable storage application programming interface (API) permits communication both between the application 130 and the LLBS 150 and between the LLBS 150 and the CABS 160.
[0021] Moving on to Figure 2, a more detailed view of a system 200 for storing data in a content-addressable storage system is illustratively depicted. Application 130 sends a write request to LLBS 150. Upon receiving a write request from the application 130, the cache manager 210 may forward the request to the non-content addressable storage system 235 which is configured as a key-value store 230 which uses the storage device 240 to store data persistently. To store the data from the write request to the non-content addressable storage system 235, the cache manager 210 obtains a temporary address from the temporary address ("TA") generator 250 and this address will be used as the key with which the data may be later retrieved.
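The interaction among the cache manager 210, the temporary address generator 250, and the key-value store 230 might be sketched as follows. The counter-based address scheme, the class and method names, and the dictionary standing in for storage device 240 are all assumptions of this sketch; the disclosure does not specify how temporary addresses are formed.

```python
import itertools


class TemporaryAddressGenerator:
    """Issues unique temporary addresses from a counter; no hashing is involved."""

    def __init__(self):
        self._next = itertools.count(1)

    def generate(self) -> int:
        return next(self._next)


class KeyValueStore:
    """Stand-in for key-value store 230; a dict plays the role of storage device 240."""

    def __init__(self):
        self._device = {}

    def put(self, key: int, value: bytes) -> None:
        self._device[key] = value

    def get(self, key: int) -> bytes:
        return self._device[key]


class CacheManager:
    """Stand-in for cache manager 210: obtains an address, then stores under it."""

    def __init__(self, generator: TemporaryAddressGenerator, store: KeyValueStore):
        self._generator = generator
        self._store = store

    def handle_write(self, data: bytes) -> int:
        temp_address = self._generator.generate()  # the key for later retrieval
        self._store.put(temp_address, data)
        return temp_address

    def handle_read(self, temp_address: int) -> bytes:
        return self._store.get(temp_address)


manager = CacheManager(TemporaryAddressGenerator(), KeyValueStore())
first = manager.handle_write(b"block A")
second = manager.handle_write(b"block A")  # identical data still gets a new address
```

Because the key does not depend on content, identical blocks receive distinct temporary addresses at this stage.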
[0022] The key-value store 230 is responsible for controlling the manner in which data is stored in the storage device 240. The key-value store 230 stores both the data and its temporary address in storage device 240. The data can later be retrieved or read using the temporary address. Storage device 240 is preferably a low latency system such as a solid-state drive (SSD), hard disk drive (HDD), or other device that provides for a lower latency than CABS 160 with respect to performing I/O operations.
[0023] Upon writing the data to the LLBS 150, the cache manager 210 will forward an acknowledgment to the application 130 along with the temporary address that can be used to retrieve the data. The cache manager 210 will write the data, which has already been written to storage device 240, to the CABS 160 as well. In storing the data, the CABS 160 will compute a hashing value based on the content of the data and perform de-duplication operations (e.g., which may involve looking up values in a hash table). Even if two identical blocks had been written to the LLBS 150 and each was assigned a separate temporary address, both of these blocks will eventually be mapped to the same content address when the LLBS 150 transfers the data to the CABS 160. Since the LLBS 150 had previously confirmed a successful write operation, the application 130 can avoid the latency associated with these hashing and hash table lookup operations while retaining the de-duplication benefits associated with storing data in the CABS 160.
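The write path of paragraph [0023] can be illustrated with a minimal Python sketch. The class names, the in-memory dictionaries, the counter-based temporary addresses, and the choice of SHA-256 are all assumptions made for illustration; the description does not prescribe a hash function or a data layout.

```python
import hashlib
import itertools


class ContentStore:
    """Hypothetical stand-in for CABS 160: the address is a hash of the content."""

    def __init__(self):
        self.blocks = {}  # content address -> data

    def write(self, data: bytes) -> str:
        content_address = hashlib.sha256(data).hexdigest()  # assumed hash
        # De-duplication: identical content is stored only once.
        self.blocks.setdefault(content_address, data)
        return content_address


class LowLatencyStore:
    """Hypothetical stand-in for LLBS 150: a key-value store keyed by temporary address."""

    def __init__(self, cabs: ContentStore):
        self.cabs = cabs
        self.kv = {}       # temporary address -> data
        self.mapping = {}  # temporary address -> content address
        self._counter = itertools.count(1)

    def write(self, data: bytes) -> int:
        # Fast path: no hashing, just a fresh key and a local store.
        temp_address = next(self._counter)
        self.kv[temp_address] = data
        return temp_address  # returned with the acknowledgment

    def transfer(self) -> None:
        # Later, off the application's critical path: copy blocks to the CABS.
        for temp_address, data in self.kv.items():
            self.mapping[temp_address] = self.cabs.write(data)
```

In this sketch, two writes of identical data receive distinct temporary addresses, yet after `transfer()` both map to a single content address and the content store holds one copy.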
[0024] After successfully storing the data, the CABS 160 returns a content address to cache manager 210 at the LLBS 150 which reflects where the data is stored in the CABS 160. The content address is forwarded to the address translator 220 which will map the temporary address (reflecting the location of the data in the LLBS 150) to the content address (reflecting the location of the data in the CABS 160) and store this mapping information in storage device 240. In the case where blocks have embedded addresses, the data associated with each embedded address should first be written to the CABS 160 and mapped to a corresponding content address before the parent block is written to the CABS 160. This avoids writing temporary addresses to the CABS 160.
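The child-before-parent ordering for blocks with embedded addresses can be sketched as a post-order traversal. This is only an illustrative Python sketch: the function name `flush_block`, the `(payload, children)` tuple layout, the serialization with `|` and `,` separators, and the use of SHA-256 are assumptions, not details taken from the description.

```python
import hashlib


def flush_block(temp_address, pending, address_map, cabs):
    """Write one block to the CABS after every block it references.

    pending:     temporary address -> (payload bytes, list of child temporary addresses)
    address_map: temporary address -> content address (filled in as blocks are written)
    cabs:        content address -> stored block (hypothetical in-memory CABS)
    """
    if temp_address in address_map:  # already transferred
        return address_map[temp_address]
    payload, children = pending[temp_address]
    # Children first, so the parent embeds content addresses only.
    child_addresses = [flush_block(c, pending, address_map, cabs) for c in children]
    block = payload + b"|" + ",".join(child_addresses).encode()
    content_address = hashlib.sha256(block).hexdigest()  # assumed hash
    cabs.setdefault(content_address, block)
    address_map[temp_address] = content_address
    return content_address
```

Because the parent is serialized only after its children have content addresses, no temporary address is ever written into the content-addressable store.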
[0025] Once the mapping of addresses has been persistently written to storage device 240, the LLBS 150 can delete the corresponding data in storage device 240. If the application 130 issues a subsequent read request using the temporary address, the content address associated with the temporary address can first be retrieved by the address translator 220, and this information can be used to retrieve the data from the CABS 160.
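A minimal Python sketch of the eviction and read behavior described in paragraph [0025] follows. The function names and the dictionary-based stores are hypothetical; they stand in for storage device 240, the address translator 220, and the CABS 160.

```python
def evict(temp_address, local_blocks, address_map):
    """Delete the local copy only once its mapping has been recorded."""
    if temp_address in address_map:
        local_blocks.pop(temp_address, None)
        return True
    return False  # not yet safe to delete


def read_block(temp_address, local_blocks, address_map, cabs_blocks):
    """Serve a read by temporary address, locally if possible, else via the CABS."""
    if temp_address in local_blocks:
        return local_blocks[temp_address]
    content_address = address_map[temp_address]  # address translation
    return cabs_blocks[content_address]
```

The design point is the ordering: the mapping is made persistent before the data is deleted, so a read by temporary address can always be satisfied from one store or the other.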
[0026] Although data blocks can be removed from the LLBS 150 in the manner explained above, removing the mapping of a temporary address to a content address may involve the cooperation of the application 130. Cooperation of the application 130 is needed to avoid a situation where the application 130 requests a block using its temporary address, but neither the block, nor the mapping from that temporary address to the content address, is available at the LLBS 150. One way to avoid this situation is to have the application 130 periodically drop all of its addresses. Once this is done, the LLBS 150 can delete all of its mappings. After the application 130 has dropped all of its addresses and the LLBS 150 has deleted all of its mappings, the application 130 can access blocks by issuing a read for the labeled block representing the root of a directed acyclic graph, e.g., in the manner explained in United States Patent Application 2010/0070698 which is herein incorporated by reference in its entirety.
[0027] While data is typically stored at the LLBS 150 before being transferred to the CABS 160, there may be certain situations where it is preferable for the data to be stored directly in the CABS 160. For example, consider the case where application 130 issues a write request to the LLBS 150, but the LLBS 150 does not have sufficient space available for storing the data. Rather than waiting for the LLBS 150 to free up space by transferring data to the CABS 160, it may be advantageous to write the incoming data block directly to the CABS 160. It should be noted that this is just one exemplary situation where it may be preferable to store data directly in the CABS 160, and that there may be a variety of other situations where data could be written directly to the CABS 160.
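The fallback in paragraph [0027] can be sketched in Python as a simple capacity check. The function name `store`, the block-count capacity limit, the module-level counter, and SHA-256 are illustrative assumptions; the description leaves the decision criterion open.

```python
import hashlib
import itertools

_temp_addresses = itertools.count(1)  # hypothetical temporary address generator


def store(data: bytes, llbs: dict, capacity: int, cabs: dict):
    """Return (kind, address), where kind is "temporary" or "content"."""
    if len(llbs) < capacity:
        # Normal case: low latency write, acknowledged with a temporary address.
        temp_address = next(_temp_addresses)
        llbs[temp_address] = data
        return ("temporary", temp_address)
    # No room in the LLBS: write straight to the content-addressable store.
    content_address = hashlib.sha256(data).hexdigest()  # assumed hash
    cabs.setdefault(content_address, data)
    return ("content", content_address)
```

With `capacity=1`, a first write returns a temporary address and a second write, issued before any transfer frees space, returns a content address.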
[0028] Since data may sometimes be stored directly to the CABS 160, there may be situations where the LLBS 150 returns a content address, rather than a temporary address, to the application 130. This can be handled transparently by the application 130. However, the LLBS 150 needs to be able to distinguish between temporary addresses and content addresses. This can be achieved by reserving a bit in the address which indicates whether the address is a content address or a temporary address.
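One way to reserve such a bit is sketched below in Python. The 64-bit address width and the choice of the most significant bit as the flag are assumptions made for illustration; the description states only that a bit is reserved.

```python
ADDRESS_BITS = 64                      # assumed address width
TEMP_FLAG = 1 << (ADDRESS_BITS - 1)    # assumed position of the reserved bit


def make_temporary_address(counter: int) -> int:
    """Tag a generator-issued value as a temporary address."""
    return (counter & (TEMP_FLAG - 1)) | TEMP_FLAG


def make_content_address(digest_bits: int) -> int:
    """Keep the low bits of a hash and leave the reserved bit clear."""
    return digest_bits & (TEMP_FLAG - 1)


def is_temporary(address: int) -> bool:
    """Inspect the reserved bit to classify an address."""
    return bool(address & TEMP_FLAG)
```

Under this scheme a read request can be routed by a single bit test, at the cost of one bit of the content address space.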
[0029] Referring now to Figure 3, a block/flow diagram illustrates a method for storing data in a content-addressable storage system in accordance with the present principles. In block 310, an application 130 issues a write request to store data on a storage system 110. The storage system 110 may include both a non-content-addressable system (e.g., LLBS 150) and a CABS 160 as shown in Figures 1 and 2.
[0030] Upon receiving the write request, the LLBS 150 will assign a temporary address to the data in block 320. The temporary address is used to store and retrieve the data in the non-content addressable storage 235. Unlike the content address which will be subsequently assigned by the CABS 160, determining a temporary address for storing the data does not involve computing a hash. In one embodiment, the temporary address may be generated by the temporary address generator 250 in Figure 2, and used by the key-value store 230 to store the data.
[0031] Next, in block 330, the data which is the subject of the write request is stored at the LLBS 150 along with the temporary address which was assigned to the data block. The manner in which this information is stored may differ. For example, in one embodiment, the non-content addressable store is configured as a key-value store, where the keys are the temporary addresses and the values are the data contents of the write requests. Moreover, although Figure 2 discloses a single storage device 240 for storing both the mapping from temporary address to content addresses and the data retrievable through the temporary address, in other embodiments the mapping between temporary address and content addresses, and the data retrievable through the temporary address may be stored on separate storage devices.
[0032] After the data from the application 130 has been stored in the LLBS 150, the LLBS 150 sends an acknowledgement to that application 130 which indicates that the data has been successfully stored (block 340). The acknowledgement sent from the LLBS 150 to the application 130 also includes the temporary address associated with the data to allow the application 130 to later retrieve the data. As explained above, the storage device 240 at the LLBS 150 provides for relatively low latency with respect to storing information when compared to the CABS 160. Since the LLBS 150 is able to write the data to storage device 240 and return an acknowledgment to the application 130 more quickly than CABS 160 would have been able to do so, the latency experienced by the application 130 is reduced.
[0033] Upon forwarding the acknowledgment to the application 130, the LLBS 150 will subsequently write the data to the CABS 160 in block 350. Once the data stored at the LLBS 150 has been successfully copied to the CABS 160, the CABS 160 will return a content address to the LLBS 150. The content address, which is based on the content of the data block being written to CABS 160, reflects where the data is written in the CABS 160.
[0034] As explained above, storing data in a content-addressable system (e.g., CABS 160) involves performing latency-intensive operations such as computing a hash and performing de-duplication operations. However, by storing data initially at LLBS 150 before transferring the data to CABS 160, the application 130 does not have to wait for these latency-intensive operations to be performed. Nevertheless, since the data is eventually transferred to the CABS 160, the application 130 is able to appreciate the benefits of the de-duplication performed by the CABS 160. Hence, the storage system 110 of the present application allows an application 130 to reap the benefits of content-addressable storage while eliminating, or at least mitigating, the disadvantages of storing data in such a system.
[0035] After the data is stored in CABS 160 and the content address is returned to the LLBS 150, the content address will be sent to the address translator 220 which is configured to map the temporary address to the content address and store this information in storage device 240 (block 360). Upon storing the mapping information, the data (which is currently stored in both the LLBS 150 and the CABS 160) may be deleted from the LLBS 150 in block 370. If the application 130 wishes to read the data at some later point, the read request may include the temporary address of the data. Despite the fact that the data which was previously stored at LLBS 150 has been deleted from LLBS 150, the temporary address may be used by the address translator 220 to identify the corresponding content address of the data in the CABS 160. The data may then be read from the CABS 160 using the content address.
[0036] In block 380, the address mappings (i.e., the mappings between the temporary addresses and the content addresses) on the LLBS 150 are periodically removed. This may be advantageous because the mappings stored at LLBS 150 may grow to be very large in size, thus taking up space in the storage device 240 which could otherwise be used for storing data. However, before the mapping information can be deleted from the LLBS 150, the application 130 should drop the addresses (or at least the temporary addresses) that are being stored by the application 130. This ensures that the application 130 does not issue a request for data (using the temporary address of the data) at the LLBS 150 when neither the data itself, nor the mapping of the data, is stored in the LLBS 150.
[0037] The manner in which the application 130 is told to drop addresses may differ. For example, in one embodiment, the LLBS 150 may monitor the amount of mapping information being stored. Once the size of the mapping information exceeds a certain threshold, the LLBS 150 may send an "address drop signal" to the application 130 to tell the application 130 that the address information being stored by the application 130 should be dropped. After the application 130 has dropped the addresses, an acknowledgment may be sent to the LLBS 150 which indicates such. Upon confirming that the addresses were dropped by the application 130, the LLBS 150 can then delete the mapping information stored on storage device 240. Other ways of indicating that addresses should be dropped by the application 130 are also contemplated.
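The threshold-triggered exchange of paragraph [0037] can be sketched in Python as follows. The class names, the method name `on_address_drop_signal`, the entry-count threshold, and the synchronous call standing in for the signal and its acknowledgment are all hypothetical.

```python
class Application:
    """Hypothetical application holding addresses returned by the LLBS."""

    def __init__(self):
        self.addresses = set()

    def on_address_drop_signal(self) -> bool:
        self.addresses.clear()
        return True  # acknowledgment that the addresses were dropped


class MappingTable:
    """Hypothetical temporary-to-content address mappings kept by the LLBS."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.mappings = {}

    def record(self, temp_address, content_address, app: Application) -> None:
        self.mappings[temp_address] = content_address
        if len(self.mappings) > self.threshold:
            # Ask the application to drop its addresses; delete the
            # mappings only after the drop has been acknowledged.
            if app.on_address_drop_signal():
                self.mappings.clear()
```

The mappings are cleared only on a positive acknowledgment, which preserves the invariant that the application never holds a temporary address the LLBS can no longer resolve.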
[0038] Having described the preferred embodiments of a system and method for storing data in a content-addressable storage system (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope of the invention as outlined by the appended claims. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A storage system, comprising:
a content-addressable storage system and a persistent cache, wherein the persistent cache comprises:
a temporary address generator configured to generate a temporary address which is associated with data to be stored in the persistent cache;
a non-content-addressable storage system configured to store and retrieve data in the persistent cache using the temporary address; and
an address translator configured to map a temporary address associated with the data in the non-content addressable storage system with a content address associated with the data in the content-addressable storage system.
2. The system of claim 1, further comprising a cache manager configured to write the data that was stored in the non-content-addressable storage system to the content-addressable storage system.
3. The system of claim 2, wherein the cache manager is further configured to determine whether an address associated with a read request is a temporary address or a content address.
4. The system of claim 3, wherein the cache manager issues a read request to either the non-content-addressable storage system or the content-addressable storage system depending upon whether the address is determined to be a temporary address or a content address.
5. The system of claim 1, wherein the persistent cache handles a read request for a temporary address by reading data associated with the temporary address from the persistent cache if the data resides in the non-content-addressable storage system, or alternatively by obtaining from the address translator the content address associated with the temporary address and issuing a read request to the content-addressable storage system using the content address.
6. The system of claim 1, wherein data is deleted from the persistent cache after the data has been written to the content-addressable storage system, but the mapping between the temporary address and the content address is retained.
7. The system of claim 1, wherein mappings between temporary addresses and content addresses are periodically deleted.
8. The system of claim 7, wherein the mappings are deleted after an application drops all temporary addresses returned to it.
9. The system of claim 1, wherein the non-content addressable storage system comprises a solid-state drive or hard disk drive.
10. A method for storing data in a storage system, comprising:
determining whether data associated with a write request is to be stored in a non-content-addressable storage system or written directly to a content-addressable storage system;
if it is determined that the data is to be stored in the non-content-addressable storage system:
generating a temporary address for the data to be stored in the non-content-addressable store;
acknowledging that data is persistently stored in the non-content addressable storage system before the data is written to a content-addressable storage system; and
mapping at least one temporary address associated with the data in the non-content-addressable store with a content address of the data in the content-addressable storage system after the data is written to the content-addressable storage system.
11. The method of claim 10, further comprising writing the data that was stored in the non-content-addressable storage system to the content-addressable storage system.
12. The method of claim 10, wherein the storage system is configured to determine whether an address associated with a read request is a temporary address or a content address.
13. The method of claim 12, wherein a read request is sent to either the non-content-addressable storage system or the content-addressable storage system depending upon whether the address is determined to be a temporary address or a content address.
14. The method of claim 10, wherein a read request for a temporary address is handled by reading data associated with the temporary address from the non-content addressable storage system if the data resides in the non-content-addressable storage system, or alternatively by obtaining the content address associated with the temporary address and issuing a read request to the content-addressable storage system using the content address.
15. The method of claim 10, wherein data is deleted from the non-content-addressable storage system after the data has been written to the content-addressable storage system, but the mapping between the temporary address and the content address is retained.
16. The method of claim 10, wherein mappings between temporary addresses and content addresses are periodically deleted.
17. The method of claim 16, wherein the mappings are deleted after an application drops all temporary addresses returned to it.
18. The method of claim 10, wherein the non-content addressable storage system comprises a solid-state drive or hard disk drive.
PCT/US2010/058681 2010-09-02 2010-12-02 Content addressable storage with reduced latency WO2012030358A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP10856821.3A EP2470997A4 (en) 2010-09-02 2010-12-02 Content addressable storage with reduced latency
JP2013527055A JP5591406B2 (en) 2010-09-02 2010-12-02 Low latency content address storage device

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US37952410P 2010-09-02 2010-09-02
US61/379,524 2010-09-02
US12/905,223 US8375164B2 (en) 2010-10-15 2010-10-15 Content addressable storage with reduced latency
US12/905,223 2010-10-15

Publications (1)

Publication Number Publication Date
WO2012030358A1 true WO2012030358A1 (en) 2012-03-08

Family

ID=45773184

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2010/058681 WO2012030358A1 (en) 2010-09-02 2010-12-02 Content addressable storage with reduced latency

Country Status (3)

Country Link
EP (1) EP2470997A4 (en)
JP (1) JP5591406B2 (en)
WO (1) WO2012030358A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2621681A (en) * 2022-08-11 2024-02-21 Advanced Risc Mach Ltd Circuitry and method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050015758A1 (en) * 2003-07-15 2005-01-20 Geraint North Shared code caching method and apparatus for program code conversion
US20090254709A1 (en) * 2003-09-30 2009-10-08 Vmware, Inc. Prediction Mechanism for Subroutine Returns in Binary Translation Sub-Systems of Computers
US20090307430A1 (en) * 2008-06-06 2009-12-10 Vmware, Inc. Sharing and persisting code caches

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1128267A1 (en) * 2000-02-25 2001-08-29 Hewlett-Packard Company, A Delaware Corporation Disk storage system having redundant solid state data storage devices
JP2003263276A (en) * 2002-03-08 2003-09-19 Toshiba Corp Disk system and disk access method
JP2005539309A (en) * 2002-09-16 2005-12-22 ティギ・コーポレイション Storage system architecture and multiple cache device
US7263576B2 (en) * 2003-12-09 2007-08-28 Emc Corporation Methods and apparatus for facilitating access to content in a data storage system
US7240150B1 (en) * 2004-04-30 2007-07-03 Emc Corporation Methods and apparatus for processing access requests in a content addressable computer system
US7761649B2 (en) * 2005-06-02 2010-07-20 Seagate Technology Llc Storage system with synchronized processing elements
US7260681B2 (en) * 2005-06-02 2007-08-21 Seagate Technology Llc Stripe buffer list
US7747663B2 (en) * 2008-03-05 2010-06-29 Nec Laboratories America, Inc. System and method for content addressable storage
US8335889B2 (en) * 2008-09-11 2012-12-18 Nec Laboratories America, Inc. Content addressable storage systems and methods employing searchable blocks
JP5321682B2 (en) * 2009-03-31 2013-10-23 日本電気株式会社 Storage system, storage access method and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2470997A4 *

Also Published As

Publication number Publication date
EP2470997A1 (en) 2012-07-04
JP5591406B2 (en) 2014-09-17
JP2013541753A (en) 2013-11-14
EP2470997A4 (en) 2013-05-01

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2013527055

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2010856821

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10856821

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE