WO2014087508A1 - Storage system and storage system control method
- Publication number
- WO2014087508A1 (PCT/JP2012/081566)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- backup
- content
- chunk
- data
- storage
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- G06F11/1453—Management of the data involved in backup or backup restore using de-duplication of the data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1456—Hardware arrangements for backup
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1464—Management of the backup or restore process for networked environments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1469—Backup restoration techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/845—Systems in which the redundancy can be transformed in increased performance
Definitions
- the present invention relates to a storage system for storing data from an external device and a storage system control method.
- a storage device is connected to an external device such as a host computer via a communication network.
- This type of storage device includes, for example, a plurality of hard disk drives (HDD: Hard Disk Drive) as storage devices for storing data.
- a data amount reduction process is executed when data is stored in the storage device.
- file compression reduces the data capacity by contracting data segments having the same contents within one file.
- the deduplication process reduces the total data capacity of the file system and the storage system by reducing the data segments of the same content detected not only within one file but also between files.
- the unit data that is the unit of deduplication processing is referred to as “chunk”.
- data obtained by collecting a plurality of chunks is called a “container”.
- highly related chunks are collected in a container.
- a table in which a hash value calculated for each stored chunk is recorded for each container is referred to as a “container index table (Container Index)”.
- logically grouped data that is a unit stored in the storage device is referred to as “content”.
- content includes not only normal files but also files that aggregate normal files, such as archive files, backup files, and virtual volume files.
- Patent Document 1 describes a method in which a storage apparatus performs duplication determination, the host computer acquires the duplication determination result from the storage apparatus, and stores only new chunks in the storage apparatus.
- in this method, the host computer queries the storage apparatus for a duplication determination on every chunk. The host computer must therefore transmit the information needed for the determination to the storage apparatus and receive the result from it. Compared with a method in which duplication determination is performed by the host computer alone, performance is reduced by this round trip to the storage apparatus.
- Patent Document 2 describes a method of performing duplication determination by a host computer and storing only new chunks in a storage device according to the result.
- an object of the present invention is to provide a storage system and a storage system control method capable of performing efficient deduplication processing in cooperation with an external device and a storage device.
- an embodiment of the present invention is a storage system for storing data from an external device in units of content. It comprises a backup device that executes backup processing for creating backup data from the data of the external device in units of content, and a storage device that is communicably connected to the backup device and stores the backup data received from the backup device.
- the backup device has first duplication determination information, which is information for determining whether or not the content that is the backup data is already stored in the storage device, and a first backup processing unit that uses the first duplication determination information to determine whether the content is already stored in the storage device.
- the storage device has a second backup processing unit that uses second duplication determination information to determine whether the content that is the backup data is already stored in the storage device.
- when the first backup processing unit determines that the content is not stored in the storage device, the content is passed to the second backup processing unit of the storage device. The second backup processing unit transmits the second duplication determination information to the backup device, and the first backup processing unit of the backup device incorporates the received second duplication determination information into the first duplication determination information.
- FIG. 1 is a diagram showing the overall configuration of a storage system 1 according to a first embodiment of the present invention. FIG. 2 is a block diagram showing the configuration of the backup server and the storage apparatus according to the first embodiment.
- FIG. 3 is a diagram illustrating a configuration example of a container index table and a chunk index table used in backup processing in the storage system 1.
- FIG. 4 is a diagram illustrating a configuration example of a container index table and a chunk index table that are used in a restore process in the storage system 1.
- FIG. 5 is a diagram illustrating a configuration example of a content index table used in a restore process in the storage system 1.
- FIG. 6 is a diagram conceptually illustrating the backup processing according to the first embodiment.
- FIG. 14 is a flowchart illustrating an example of the backup processing procedure according to the third embodiment. The remaining drawings show the overall configuration of the storage system 1 according to the fourth embodiment and flowcharts of example backup processing procedures according to the fourth, fifth, sixth, and seventh embodiments.
- FIG. 1 shows an overall configuration of a storage system 1 according to a first embodiment of the present invention.
- the storage system 1 includes backup servers 14 (14a, 14b,..., 14n), one installed in each of a plurality of bases 2 (2a, 2b,..., 2n), and a storage apparatus 10 installed in the data center 3.
- the suffixes a, b, ..., n of the reference symbols may be omitted.
- the plurality of bases 2 and the data center 3 are connected via a communication network 4.
- the communication network 4 can be configured as an appropriate communication line including, for example, a WAN (Wide Area Network), a LAN (Local Area Network), the Internet, a public line, or a dedicated line.
- Each base 2 includes business servers 5 (5a, 5b,..., 5n), clients 6 (6a, 6b,..., 6n), and backup servers 14 (14a, 14b,..., 14n).
- the business server 5, the client 6, and the backup server 14 are connected to be communicable with each other via a communication network 13 (13a, 13b,..., 13n) such as a LAN.
- the business server 5 is a computer that receives a request from the client 6 and provides a service corresponding to the request.
- the business server 5 includes a processor such as a CPU (Central Processing Unit), memory such as a RAM (Random Access Memory) and a ROM (Read Only Memory), and an auxiliary storage device (not shown) such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive).
- the client 6 is also a computer having a configuration substantially similar to that of the business server 5 and functions as a terminal for a user who uses a service provided by the business server 5.
- the backup server 14 periodically backs up the data of the business server 5 and the client 6 connected in each base 2 and transmits the backup data to the data center 3. Further, the backup server 14 restores data from the data center 3 to the business server 5 or the client 6 in response to requests from the business server 5 and the client 6.
- the storage device 10 stores the data received from the plurality of backup servers 14 in the storage medium of the storage device 10. Further, the storage apparatus 10 reads out data stored in the storage medium in response to a request from the backup server 14 and transmits the data to the backup server 14.
- the backup server 14 installed in each base 2 and the storage device 10 installed in the data center 3 cooperate to efficiently perform deduplication of data.
- FIG. 2 is a block diagram showing a configuration example of the backup server 14a installed in the base 2a of the storage system 1 shown in FIG. 1 and the storage apparatus 10 installed in the data center 3.
- the backup server 14a is shown, but the backup servers 14b,..., 14n installed in the other bases 2b to 2n have substantially the same configuration.
- the backup server 14a mainly includes a processor 102 such as a CPU (Central Processing Unit), a memory 103 such as a RAM (Random Access Memory) and a ROM (Read Only Memory), an auxiliary storage device 104 such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive) (hereinafter referred to as “HDD 104”), and a network interface 105 such as a NIC (Network Interface Card), which is a communication interface with the communication network 4.
- the processor 102, the memory 103, the HDD 104, and the network interface 105 are communicably connected to each other via the system bus 108.
- the processor 102 functions as an arithmetic processing unit including a CPU and the like, and controls the operation of the backup server 14a according to programs, arithmetic parameters, and the like stored in the memory 103.
- the memory 103 stores a backup program 106 and a restore program 107 on the backup server 14 side.
- the memory 103 is used not only for storing various types of information read from the HDD 104 but also as a work memory for the processor 102.
- the HDD 104 stores various software, management information, backup data, and the like. Note that the backup program 106 and the restore program 107 may be stored in the HDD 104 and read from the HDD 104 to the memory 103 when the programs are executed by the processor.
- the memory 103 also stores a container index table 320 (first duplication determination information) that is a table that is referred to when the backup program 106 and the restore program 107 are executed.
- the container index table 320 can be stored in the HDD 104, and can be rolled into the memory 103 as necessary when the backup program 106 and the restore program 107 refer to it.
- the backup program 106 provides functions for performing data processing such as backup data determination and duplication determination processing, and transmits backup data to the storage apparatus 10 via the network interface 105. Further, the backup program 106 receives information necessary for duplication determination processing from the storage apparatus 10 via the network interface 105.
- the restore program 107 receives backup data necessary for restore processing from the storage apparatus 10 via the network interface 105 and restores the original data.
- the storage apparatus 10 mainly includes a processor 112, a memory 113, an HDD 114, and a network interface 115.
- the processor 112, the memory 113, the HDD 114, and the network interface 115 are connected to each other via a system bus 118 so that they can communicate with each other.
- the processor 112 functions as an arithmetic processing device including a CPU and the like, and controls the operation of the storage device 10 according to programs, arithmetic parameters, and the like stored in the memory 113.
- the memory 113 stores a backup program 116 and a restore program 117 on the storage device 10 side.
- the memory 113 is used not only for storing various types of information read from the HDD 114 but also as a work memory for the processor 112.
- the HDD 114 stores various software, management information, data after deduplication processing, and the like.
- the memory 113 also stores a chunk index table 310, a container index table 320, and a content index table 370, which are tables that are referred to when the backup program 116 and the restore program 117 are executed.
- the chunk index table 310, the container index table 320, and the content index table 370 are stored in the HDD 114 and can be rolled into the memory 113 as necessary when the backup program 116 and the restore program 117 refer to them.
- the chunk index table 310, the container index table 320, and the content index table 370 constitute second duplication determination information.
- the backup program 116 performs deduplication processing on the data received from the backup server 14a and stores the data after deduplication processing in the HDD 114.
- the backup program 116 also transmits information necessary for performing the duplication determination process in the backup server 14a to the backup server 14a via the network interface 115.
- the restore program 117 reads data corresponding to the restore request received from the backup server 14a from the HDD 114 and transmits the data to the backup server 14a via the network interface 115.
- the backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10 are equipped with a processing function for reducing the amount of data to be backed up.
- data processing such as file compression processing and deduplication processing is used.
- the file compression process is a process for reducing the data capacity by contracting data segments (unit data) having the same contents included in one file.
- deduplication processing is a process for reducing the total data capacity of data stored in a file system, storage system, etc. by contracting data segments of the same content detected not only within one file but also between multiple files.
- a data segment that is a unit of deduplication processing for backup data is referred to as “chunk”, and data in which a plurality of chunks are collected is referred to as “container”.
- logically grouped data that is a unit stored in the storage device is referred to as “content”.
- content includes files that are a collection of normal files, such as archive files, backup files, and virtual volume files.
- One container is created so that chunks that are highly related to each other are aggregated. For example, by setting a predetermined number of chunks or data capacity for each container in advance and collecting chunks generated from one or more contents until the container is full, a container that takes the locality of data into account can be generated. In other words, when restoring original data from backup data of a certain content, once the container storing the first chunk of that content is identified, subsequent chunks can very likely be obtained from the same container. It is therefore expected that fewer different containers need to be read from the HDD 114 into the memory 113 to restore specific content.
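- As a minimal sketch of this aggregation, assuming a byte-capacity limit per container, the following packs chunks in arrival order and opens a new container when the current one is full. The names `Container` and `pack` and the capacity value are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    capacity: int                      # assumed per-container limit in bytes
    chunks: list[bytes] = field(default_factory=list)
    used: int = 0

    def has_room_for(self, chunk: bytes) -> bool:
        return self.used + len(chunk) <= self.capacity

    def append(self, chunk: bytes) -> int:
        """Append a chunk and return its offset inside the container."""
        offset = self.used
        self.chunks.append(chunk)
        self.used += len(chunk)
        return offset


def pack(chunks, capacity):
    """Fill containers in arrival order so neighbouring chunks stay together."""
    containers = [Container(0, capacity)]
    for chunk in chunks:
        current = containers[-1]
        # Open a new container only when the current one holds something
        # and cannot take this chunk (an oversized chunk gets its own container).
        if current.chunks and not current.has_room_for(chunk):
            containers.append(Container(len(containers), capacity))
        containers[-1].append(chunk)
    return containers


packed = pack([b"aaaa", b"bbbb", b"cccc"], capacity=8)
print([c.chunks for c in packed])
```

- Because chunks are appended in the order they are produced, chunks of the same content tend to land in the same container, which is the locality property the text relies on.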
- the size of one chunk is several kilobytes or more. For this reason, when the duplication determination process is executed, if the chunks are compared in order from the top of the chunk, a lot of processing time and cost are required. Therefore, in the storage apparatus 10 according to the present embodiment, it is possible to execute duplication determination processing in a short time and at low cost by using a message digest of chunks.
- Message digest is a technique for outputting a fixed-length digest in response to an arbitrary length of data input.
- the output result of the message digest is referred to as a “fingerprint (FP)”.
- the fingerprint can be obtained using any hash function. It is preferable to use a hash function such as SHA-256, whose output is highly random and which is very likely to yield a unique hash value for each chunk.
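- A minimal sketch of fingerprint-based duplication determination, using SHA-256 from the Python standard library. The helper names `fingerprint` and `is_duplicate` are hypothetical; the specification only requires some hash function.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Return a fixed-length digest (hex string) for a chunk of any length."""
    return hashlib.sha256(chunk).hexdigest()


def is_duplicate(chunk: bytes, known_fingerprints: set[str]) -> bool:
    """Compare fingerprints instead of comparing chunk contents byte by byte."""
    return fingerprint(chunk) in known_fingerprints


known = {fingerprint(b"chunk a payload")}
print(is_duplicate(b"chunk a payload", known))  # True
print(is_duplicate(b"chunk e payload", known))  # False
print(len(fingerprint(b"x" * 4096)))            # 64 hex characters for any input size
```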
- the backup program 106 of the backup server 14 determines whether the chunk to be transmitted is a chunk whose content is already stored in the storage apparatus 10 (hereinafter referred to as a “duplicate chunk”) or a chunk not yet stored (hereinafter referred to as a “new chunk”). Since the backup server 14 does not have information on all the chunks stored in the storage device 10, it may determine that a duplicate chunk is a new chunk.
- when the chunk is determined to be a new chunk, the backup program 106 transmits the chunk and the fingerprint (hash value) of the chunk to the storage apparatus 10.
- when the chunk is determined to be a duplicate chunk, the backup program 106 transmits link information indicating the storage location to the storage apparatus 10 without transmitting the chunk itself.
- the backup program 116 of the storage apparatus 10 determines whether the received chunk is a duplicate chunk having the same content as a chunk already stored in the HDD 114 or a new chunk that is not yet stored in the HDD 114.
- when the chunk is a new chunk, the backup program 116 stores the chunk in the HDD 114 as it is.
- when the chunk is a duplicate chunk, the backup program 116 stores link information indicating the storage location in the HDD 114 without storing the chunk in the HDD 114.
- when link information of a duplicate chunk is received from the backup server 14, it is stored in the HDD 114 as it is.
- the backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10 repeatedly execute the chunk deduplication processing in cooperation with each other to prevent multiple registration of duplicate chunks.
- This duplicate chunk elimination process can reduce the used capacity of the HDD 114 and speed up the backup process.
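- The two-stage determination described above can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the message types `NewChunk` and `Link`, the `Storage` class, and the use of a bare fingerprint as link information are all hypothetical simplifications.

```python
import hashlib
from dataclasses import dataclass


def fp(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


@dataclass
class NewChunk:          # sent when the backup server believes the chunk is new
    fingerprint: str
    data: bytes


@dataclass
class Link:              # sent instead of the data for a known duplicate
    fingerprint: str


class Storage:
    """Storage-side stage: re-checks every chunk the backup server thought was new."""

    def __init__(self):
        self.chunks: dict[str, bytes] = {}   # fingerprint -> stored chunk
        self.recipe: list[str] = []          # link information, in arrival order

    def receive(self, msg) -> str:
        if isinstance(msg, NewChunk) and msg.fingerprint not in self.chunks:
            self.chunks[msg.fingerprint] = msg.data   # genuinely new: store the data
            result = "stored"
        else:
            result = "linked"                          # duplicate: keep only the link
        self.recipe.append(msg.fingerprint)
        return result


def backup(chunks, local_fps: set[str], storage: Storage):
    """Backup-server stage: decides with local knowledge only, so it may miss duplicates."""
    results = []
    for chunk in chunks:
        f = fp(chunk)
        msg = Link(f) if f in local_fps else NewChunk(f, chunk)
        results.append(storage.receive(msg))
        local_fps.add(f)
    return results


storage = Storage()
storage.chunks[fp(b"x")] = b"x"            # stored earlier, unknown to this backup server
print(backup([b"x", b"y", b"x"], set(), storage))  # ['linked', 'stored', 'linked']
```

- The first `b"x"` is sent in full because the backup server has no local information, and the storage side still recognizes it as a duplicate. The second `b"x"` is caught locally, so only a link crosses the network.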
- a “container” is a processing unit for storing data in the HDD 114, which is composed of a plurality of chunks obtained by dividing one or more contents. Further, for each “container”, the backup program 116 of the storage apparatus 10 creates a container index table for managing the arrangement of each chunk constituting each container.
- the container index table stores a chunk offset (a chunk position in the container) and a chunk size.
- the container index table is used for determining chunk duplication.
- the backup program 116 of the storage apparatus 10 also creates a chunk index table.
- the chunk index table is a table indicating in which container index table a chunk generated by dividing backup data is stored.
- the chunk index table is created by the storage apparatus 10 when a container for storing a chunk is determined.
- the chunk index table is used to determine a container index table to be used for chunk deduplication determination when executing backup processing. Details of the container index table and the chunk index table will be described later.
- the fingerprints of each chunk are stored in the container index table described above, and the fingerprints of the chunks are compared with each other during the duplication determination process. Thereby, compared with the case where the chunks are compared bit by bit, it is possible to realize speeding up and cost reduction of the duplication determination process.
- a write-once storage device may be used in order to guarantee data integrity and realize highly reliable backup processing.
- in a write-once storage device, data can be written only once, but the written data can be read any number of times. Since data written in a write-once storage device cannot be erased or altered, it is suitable for archiving data for evidence preservation.
- An example of such a write-once storage device is an optical disk device that uses a ROM optical disk.
- a magnetic disk device is not a write-once type storage device because it can update written data.
- the magnetic disk device can be used as a write-once type storage device by devising the configuration of the file system, driver device, etc. and allowing only appending (ie, prohibiting overwriting of data).
- a predetermined number of chunks or data capacity is set in advance in the container described above. For this reason, chunks are aggregated on the memory 113 side until the container is full, and when the container is full, the chunk is written in the backup storage device (HDD 114) in units of containers. For example, when a write-once hard disk device is used as a storage device, the storage device 10 appends a chunk to the container on the memory 113 until the container is full. At the same time, the storage apparatus 10 creates a container index table that manages the arrangement of the chunks in the container and a chunk index table that manages the correspondence between the chunks and the container index table. In the backup data, there is a universal chunk that always appears for each backup generation, and the universal chunk is stored in a container prepared at the time of initial backup.
- the container index table 320 is a table created for each container.
- the chunk index table 310 is a table for managing the chunks stored in the container.
- FIG. 3 illustrates a container index table Tg (320) created for a specific container in the container index table 320.
- the container index table Tg (320) includes items of a fingerprint 321, a container offset 322, and a chunk length 323.
- the fingerprint 321 stores the fingerprint of each chunk (in this embodiment, a hash value calculated by an appropriate hash function).
- the container offset 322 stores an offset value that gives the start position of the chunk in the container.
- the chunk length 323 stores information indicating the length of the chunk. That is, management information for each chunk is stored in each row of the container index table Tg (320).
- the container index table Tg (320) illustrated in FIG. 3 stores chunk b management information 320b, chunk c management information 320c, and chunk f management information 320f.
- the management information related to each chunk is attached with a code representing each chunk as a subscript. For example, the fingerprint 321 calculated for the chunk b is represented as FPb.
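- A minimal sketch of one container index table, assuming each row holds the three items named above (fingerprint 321, container offset 322, chunk length 323). The names `IndexRow`, `build_container`, and `read_chunk`, and the placeholder payloads, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRow:
    fingerprint: str        # item 321
    container_offset: int   # item 322: start position of the chunk in the container
    chunk_length: int       # item 323


def build_container(chunks: dict[str, bytes]):
    """Concatenate chunks into one container and record one index row per chunk."""
    data = bytearray()
    rows = []
    for fingerprint, chunk in chunks.items():
        rows.append(IndexRow(fingerprint, len(data), len(chunk)))
        data += chunk
    return bytes(data), rows


def read_chunk(container: bytes, rows, fingerprint: str):
    """Locate a chunk through the index table; return None if it is not listed."""
    for row in rows:
        if row.fingerprint == fingerprint:
            start = row.container_offset
            return container[start:start + row.chunk_length]
    return None


container, tg = build_container({"FPb": b"bbb", "FPc": b"cc", "FPf": b"ffff"})
print(tg[1])                               # IndexRow(fingerprint='FPc', container_offset=3, chunk_length=2)
print(read_chunk(container, tg, "FPf"))    # b'ffff'
```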
- the plurality of container index tables 320 are managed by the chunk index table 310.
- in the chunk index table 310, a container ID 312, which is a code for identifying each container, and a fingerprint 311 of each chunk are recorded in association with each other.
- the container ID 312 here is also used as pointer information that allows the container index table 320 to be referred to.
- the container ID 312 and the corresponding container index table 320 share a common identifier called a UUID (Universally Unique Identifier).
- whether or not to refer to the chunk index table 310 may be determined according to the processing result of the filter processing for identifying whether or not the chunk is a new chunk. That is, for a chunk that is surely not recorded in the chunk index table 310, the reference processing itself of the chunk index table 310 may be skipped, and the chunk may be directly stored in the new container. If this processing technique is adopted, the number of times the backup program 116 of the storage apparatus 10 refers to the chunk index table 310 can be reduced, and the backup processing can be further speeded up.
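- The text does not say which filter is used. As one possible illustration, the sketch below assumes a Bloom filter, which can answer "surely not recorded" without touching the chunk index table. The class `BloomFilter`, the function `find_container`, and the sizing parameters are hypothetical.

```python
import hashlib


class BloomFilter:
    """Assumed filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, fingerprint: str):
        # Derive several bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{fingerprint}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, fingerprint: str) -> None:
        for p in self._positions(fingerprint):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, fingerprint: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(fingerprint))


def find_container(fingerprint: str, bloom: BloomFilter, chunk_index: dict):
    """Skip the chunk index lookup entirely when the filter rules the chunk out."""
    if not bloom.might_contain(fingerprint):
        return None                      # surely new: store directly in a new container
    return chunk_index.get(fingerprint)  # possibly recorded: consult the chunk index


chunk_index = {"FPb": "Tg"}
bloom = BloomFilter()
bloom.add("FPb")
print(find_container("FPb", bloom, chunk_index))  # Tg
```

- A positive answer from such a filter is only a hint, so the chunk index table remains the authority for chunks that pass the filter.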
- the fingerprint 311 and the container ID 312 of all chunks are registered in the chunk index table 310, but the number of registered chunks can be reduced.
- each container 380 is created in consideration of data locality.
- since the backup data includes a lot of data that is the same or only partially modified between backup generations, if a chunk stored in a certain container 380 is included in an arbitrary content, it is very likely that other chunks stored in the same container are also included in that content. Therefore, after locating the container index table 320 via the chunk index table 310, duplication determination can be performed using the container index table 320.
- FIG. 4 shows an example of the container index table 320 and the chunk index table 310 when chunks registered in the chunk index table 310 are reduced.
- the backup program 116 of the storage apparatus 10 searches the chunk index table 310 using the fingerprint FPb of the chunk b.
- the fingerprint FPb is associated with the container ID Tg (320). Therefore, the backup program 116 reads the container index table Tg (320) from the HDD 114 and expands it on the memory 113.
- the duplication determination of the chunk c and the chunk f can be performed using the expanded container index table Tg (320).
- by reducing the number of chunks registered in the chunk index table 310, it is possible to reduce the storage capacity and the memory usage necessary for the deduplication processing. Further, since the number of chunk registrations in the chunk index table 310 is reduced, it is possible to speed up the search for the fingerprint 311 corresponding to an arbitrary chunk.
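- The reduced chunk index can be sketched as follows, assuming only one representative fingerprint per container is registered. The class `Deduplicator` and the dictionary layout are hypothetical; the table contents follow the chunk b, chunk c, chunk f example.

```python
class Deduplicator:
    def __init__(self, chunk_index, tables_on_disk):
        self.chunk_index = chunk_index        # sparse: fingerprint -> container ID
        self.tables_on_disk = tables_on_disk  # container ID -> fingerprints in that container
        self.in_memory = {}                   # container index tables expanded so far
        self.disk_reads = 0

    def is_duplicate(self, fingerprint: str) -> bool:
        # 1) Try the container index tables already expanded in memory.
        if any(fingerprint in table for table in self.in_memory.values()):
            return True
        # 2) Fall back to the sparse chunk index to find a table to expand.
        container_id = self.chunk_index.get(fingerprint)
        if container_id is None:
            return False
        self.in_memory[container_id] = self.tables_on_disk[container_id]
        self.disk_reads += 1
        return True


dedup = Deduplicator(
    chunk_index={"FPb": "Tg"},                       # only chunk b is registered
    tables_on_disk={"Tg": {"FPb", "FPc", "FPf"}},
)
print([dedup.is_duplicate(f) for f in ["FPb", "FPc", "FPf", "FPx"]])  # [True, True, True, False]
print(dedup.disk_reads)                                              # 1
```

- In this sketch a chunk whose fingerprint is not registered is judged new if it arrives before its container index table has been expanded, so the smaller index trades some missed duplicates for lower memory use.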
- the content index table 370 is a table created for each content, and is a table for managing chunks included in the content.
- the content index table 370 includes a content ID 371, a fingerprint 372, a container ID 373, a content offset 374, and a chunk length 375.
- the content ID 371 stores information for identifying each content from each other.
- the fingerprint 372 stores the fingerprint of each chunk (a hash value calculated using an appropriate hash function for each chunk).
- the container ID 373 stores identification information for identifying each container in which chunks are stored.
- the content offset 374 stores information indicating the position of the chunk within each content.
- the chunk length 375 stores information indicating the length of each chunk.
- FIG. 5 shows Sf1 (370), Sf2 (370), Sf3 (370), and Sfn (370) as examples of the content index table.
- according to the information of Sf3 (370) corresponding to the content f3, the content f3 can be reconstructed from chunk b, chunk c, chunk d, chunk e, and chunk f, based on the content offset 374 and the chunk length 375.
- the content offset 374 and the chunk length 375 of the content constituting the content index table 370 indicate the logical arrangement of the chunks in the content. Note that the offset 322 and the chunk length 323 in the container index table 320 (FIG. 3) described above indicate the logical arrangement of the chunks in each container.
- When executing the restore process, the restore program 117 of the storage apparatus 10 refers to the content index table 370, acquires the container ID 373 of each chunk, and searches the container index table 320 using the container ID 373. Next, the restore program 117 acquires the corresponding chunk from the container 380 read from the HDD 114 based on the storage position information of each chunk stored in the container index table 320. Thereafter, the restore program 117 reconstructs the content to be restored according to the logical arrangement of the content index table 370.
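- The restore steps above can be sketched as follows. The in-memory dictionaries stand in for the tables and containers on the HDD 114, and the chunk payloads, offsets, and the assignment of chunks to containers Tg and Tc are made-up illustration data.

```python
# Content index table: content ID -> rows of
# (fingerprint, container ID, content offset, chunk length).
content_index = {
    "f3": [
        ("FPb", "Tg", 0, 3),
        ("FPc", "Tg", 3, 2),
        ("FPd", "Tc", 5, 4),
        ("FPe", "Tc", 9, 2),
        ("FPf", "Tg", 11, 1),
    ]
}

# Container index tables: container ID -> fingerprint -> (container offset, chunk length).
container_index = {
    "Tg": {"FPb": (0, 3), "FPc": (3, 2), "FPf": (5, 1)},
    "Tc": {"FPd": (0, 4), "FPe": (4, 2)},
}

# Container bodies (assumed contents).
containers = {"Tg": b"bbbccf", "Tc": b"ddddee"}


def restore(content_id: str) -> bytes:
    rows = content_index[content_id]
    size = max(offset + length for _, _, offset, length in rows)
    out = bytearray(size)
    for fingerprint, container_id, content_offset, length in rows:
        # Physical position of the chunk inside its container.
        c_offset, c_length = container_index[container_id][fingerprint]
        chunk = containers[container_id][c_offset:c_offset + c_length]
        # Logical position of the chunk inside the content.
        out[content_offset:content_offset + length] = chunk
    return bytes(out)


print(restore("f3"))  # b'bbbccddddeef'
```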
- FIG. 6 schematically shows an outline of the deduplication processing realized in the storage system 1 of the present embodiment.
- the backup server 14a is illustrated as the backup server 14 provided in the storage system 1, but, as in FIG. 1, a plurality of backup servers 14 (14a, 14b,..., 14n) are connected to the storage device 10 via the communication network 4.
- the content to be backed up is composed of chunk a, chunk b, chunk c, chunk d, chunk e, and chunk f.
- the storage device 10 stores a chunk index table U (310) and container index tables Tg (320) and Tc (320).
- assume that the backup server 14a executes its first backup process and that no container index table 320 is yet stored in the memory 103 or the HDD 104 of the backup server 14a.
- the backup program 106 of the backup server 14a determines the duplication of the first chunk a. However, since the container index table 320 is not stored in the memory 103 and the HDD 104, the backup program 106 determines that the chunk a is a new chunk, and transmits the fingerprint FPa of the chunk a and the chunk a to the storage apparatus 10.
- the backup program 116 of the storage apparatus 10 performs duplication determination on the received chunk a using the chunk index table U (310).
- the container index table 320 can be referred to first to find out whether the chunk a is stored in duplicate. Good.
- the backup program 116 processes the chunk a as a duplicate chunk, and the container index table Tg. (320) is transmitted to the backup server 14a.
- The backup program 106 of the backup server 14a expands the received container index table Tg (320) in the memory 103 and uses it for the duplication determination process for the chunks to be backed up thereafter. Since the fingerprints FPb, FPc, and FPd of chunk b, chunk c, and chunk d are registered in the container index table Tg (320), the backup program 106 determines that chunk b, chunk c, and chunk d are duplicate chunks.
- The backup program 106 determines that the chunk e is a new chunk and transmits the chunk e and its fingerprint FPe to the storage apparatus 10.
- the backup program 116 of the storage device 10 performs duplication determination on the chunk e as in the case of the chunk a, and transmits the corresponding container index table Tc (320) to the backup server 14a.
- The backup program 106 of the backup server 14a uses the received container index table Tc (320) and the container index table Tg (320) already expanded in the memory 103 to perform the duplication determination process on the subsequent chunk f.
- The chunks are aggregated in each container in consideration of the locality of data. Therefore, the chunks b, c, and d that follow the chunk a are highly likely to be stored in the same container Tg (380) as the chunk a, and efficient deduplication processing can be performed.
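The exchange outlined for FIG. 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class `StorageApparatus`, the functions `fingerprint` and `backup`, the use of SHA-1, and the container name "new" are all hypothetical names introduced here.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    # Assumption: SHA-1 stands in for the fingerprint; the text does not name a hash.
    return hashlib.sha1(chunk).hexdigest()


class StorageApparatus:
    """Hypothetical stand-in for the storage apparatus 10."""

    def __init__(self):
        self.chunk_index = {}      # fingerprint -> container ID (chunk index table 310)
        self.container_index = {}  # container ID -> set of fingerprints (table 320)

    def preload(self, container_id, chunks):
        """Register chunks that an earlier backup already stored in one container."""
        fps = {fingerprint(c) for c in chunks}
        self.container_index[container_id] = fps
        for fp in fps:
            self.chunk_index[fp] = container_id

    def receive(self, fp, chunk):
        """Return (container ID, fingerprints) for a duplicate, None for a new chunk."""
        cid = self.chunk_index.get(fp)
        if cid is None:
            # New chunk: register it (writing the chunk body is omitted here).
            self.container_index.setdefault("new", set()).add(fp)
            self.chunk_index[fp] = "new"
            return None
        return cid, set(self.container_index[cid])


def backup(content_chunks, storage):
    """Return the chunks that actually had to be sent to the storage side."""
    local_tables = {}  # container index tables expanded on the backup server
    sent = []
    for chunk in content_chunks:
        fp = fingerprint(chunk)
        if any(fp in fps for fps in local_tables.values()):
            continue  # judged a duplicate locally, nothing is transmitted
        sent.append(chunk)
        reply = storage.receive(fp, chunk)
        if reply is not None:
            cid, fps = reply
            local_tables[cid] = fps  # reuse the table for the following chunks
    return sent
```

With chunks a to d preloaded in container Tg and chunks e and f in container Tc, only the chunks a and e are transmitted, which matches the walkthrough above.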
- When the backup program 106 of the backup server 14 performs duplication determination processing, it refers to at least one fingerprint 321 of the container index table 320, except when no container index table 320 is stored in either the memory 103 or the HDD 104. For this reason, the backup program 106 needs to expand the container index table 320 on the memory 103. However, the capacity of the memory 103 is finite, and it is difficult to keep all the container index tables 320 used by the backup program 106 expanded on the memory 103 at all times. Therefore, the backup server 14 rolls the container index table 320 in from the HDD 104 to the memory 103 and rolls it out from the memory 103 to the HDD 104, so as to use the storage resources of the memory 103 effectively. Note that a rolled-out container index table 320 may be deleted from the HDD 104. The same processing is also performed on the memory 113 and the HDD 114 of the storage apparatus 10 when the backup program 116 of the storage apparatus 10 performs duplication determination.
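The roll-in and roll-out behavior can be sketched as a small bounded cache. The text does not say which table is rolled out first, so the least-recently-used policy below is an assumption, and `ContainerIndexCache` with its methods is a hypothetical name.

```python
from collections import OrderedDict


class ContainerIndexCache:
    """Hypothetical model of the memory 103 (bounded) backed by the HDD 104."""

    def __init__(self, capacity, disk=None):
        self.capacity = capacity                  # max tables held in memory
        self.memory = OrderedDict()               # container ID -> set of fingerprints
        self.disk = {} if disk is None else disk  # rolled-out tables

    def roll_in(self, container_id, table=None):
        """Expand a table in memory, rolling out the oldest one if memory is full."""
        if container_id in self.memory:
            self.memory.move_to_end(container_id)  # mark as recently used
            return self.memory[container_id]
        if table is None:
            table = self.disk[container_id]        # raises KeyError if never stored
        self.memory[container_id] = table
        while len(self.memory) > self.capacity:
            old_id, old_table = self.memory.popitem(last=False)
            self.disk[old_id] = old_table          # roll out (assumed LRU order)
        return table

    def is_duplicate(self, fp):
        """Only tables currently expanded in memory take part in the determination."""
        return any(fp in table for table in self.memory.values())
```

A rolled-out table stays on the modeled disk here; deleting it instead, as the text permits, would only mean dropping the `self.disk[old_id] = old_table` line.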
- The backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10 perform duplication determination by comparing the fingerprints 321 of each chunk. To improve the reliability of duplication determination, however, the chunks themselves may be compared bit by bit. In that case, the backup program 116 of the storage apparatus 10 transmits the main body of the container 380 including the corresponding chunk to the backup server 14.
- FIG. 7 shows a processing flow example of the backup processing operation executed by the backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10 provided in the storage system 1.
- the symbol S attached to each processing step is an abbreviation for the step.
- The backup program 106 of the backup server 14 starts backup processing based on an instruction from the client 6 (S100), and acquires a content ID 371 for specifying the content to be backed up from the storage device 10 (S101). This step is provided because the content ID 371 is managed in the content index table 370 by the backup program 116 of the storage apparatus 10.
- The chunk management information msi includes a chunk fingerprint 321, a chunk position (offset) 322 in the content, and a chunk length 323.
- the backup program 106 searches the container index table 320 expanded on the memory 103, and performs duplication determination for each chunk (S105). Specifically, the backup program 106 determines whether or not the container index table 320 includes a fingerprint 321 that matches the fingerprint of the chunk disassembled in S102. The backup program 106 determines “duplicate” when the fingerprint of the determination target chunk matches the fingerprint 321 of the container index table 320, and determines “no overlap” when they do not match.
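The management information and the determination of S105 can be sketched as below. Fixed-size chunking and SHA-1 are assumptions made only to keep the example short, and `ChunkInfo`, `split_into_chunks`, and `judge` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkInfo:
    """Management information msi: fingerprint, offset, and length of one chunk."""
    fingerprint: str
    offset: int
    length: int


def split_into_chunks(content: bytes, size: int = 4):
    """Split content into chunks (fixed size is an assumption) with their msi."""
    chunks = []
    for offset in range(0, len(content), size):
        data = content[offset:offset + size]
        info = ChunkInfo(hashlib.sha1(data).hexdigest(), offset, len(data))
        chunks.append((data, info))
    return chunks


def judge(info: ChunkInfo, container_index_tables) -> str:
    """Compare the chunk fingerprint with every table expanded in memory."""
    for table in container_index_tables:
        if info.fingerprint in table:
            return "duplicate"
    return "no overlap"
```

Each table is modeled as a plain set of fingerprints, so the determination is a membership test per table.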
- The backup program 106 transmits the chunk si and the management information msi of the chunk si to the storage device 10.
- The backup program 116 of the storage apparatus 10 performs duplication determination on the chunk si received from the backup server 14. If the chunk si is determined to be a new chunk in S107, the backup program 116 executes the processing of S108.
- the backup program 116 of the storage device 10 transmits the duplication determination result in S107 to the backup server 14.
- The backup program 116 of the storage device 10 writes the chunk si to the container 380, writes the management information msi of the chunk si to the container index table 320, writes the message digest (hash value) of the chunk si to the chunk index table 310 (S109), and executes the process of S111.
- The backup program 116 of the storage apparatus 10 transmits, to the backup server 14, the container index table 320 that includes the fingerprint 321 matching the fingerprint of the chunk si received from the backup server 14 (S110), and executes S111. Note that when the backup program 106 of the backup server 14 receives the container index table 320 from the storage apparatus 10, it is assumed that the determination result "duplicate" has been received.
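The storage-side handling of S107 to S110 can be sketched as follows. The single open container, the in-memory dictionaries, and the names `StorageSide` and `handle_chunk` are illustrative assumptions.

```python
class StorageSide:
    """Hypothetical model of the tables kept by the storage apparatus 10."""

    def __init__(self):
        self.chunk_index = {}      # fingerprint -> container ID (table 310)
        self.container_index = {}  # container ID -> {fingerprint: (offset, length)} (table 320)
        self.containers = {}       # container ID -> bytearray (container 380)
        self.open_container = 0    # assumption: one container is open for writing

    def handle_chunk(self, fp, chunk):
        """Return ("duplicate", table copy) or ("no overlap", None)."""
        cid = self.chunk_index.get(fp)
        if cid is not None:
            # Duplicate: return the container index table that holds the fingerprint.
            return "duplicate", dict(self.container_index[cid])
        # New chunk: append it to the container and update both index tables.
        cid = self.open_container
        body = self.containers.setdefault(cid, bytearray())
        offset = len(body)
        body.extend(chunk)
        self.container_index.setdefault(cid, {})[fp] = (offset, len(chunk))
        self.chunk_index[fp] = cid
        return "no overlap", None
```

Returning the whole table for a duplicate, rather than a bare yes or no, is what lets the backup server judge the neighboring chunks locally afterwards.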
- The backup program 116 of the storage apparatus 10 creates the content index table 370 illustrated in FIG. 5 for the content to be backed up, and registers the management information msi of the chunk in it for use in the restore process.
- the backup program 106 of the backup server 14 determines whether or not duplication determination processing and registration processing in the content index table 370 have been completed for all chunks (S112). Specifically, the backup program 106 of the backup server 14 compares the number of chunks n included in the content to be backed up with the counter number of the counter i.
- The backup program 106 of the backup server 14 creates a stub file for the restore process (S114), and terminates the backup processing for the content (S115).
- the stub file stores a content ID 371 for searching the corresponding content index table 370 when executing the restore process.
- In the above flow, the backup program 106 of the backup server 14 starts processing the next chunk si+1 after acquiring the duplication determination result. Alternatively, after transmitting the chunk si and the management information msi of the chunk si to the storage device 10 in S106, the backup program 106 may execute the processing of the next chunk si+1 without waiting for the result.
- In that case, a plurality of chunks registered in the same container index table 320 may be transmitted to the storage apparatus 10. The storage apparatus 10 therefore records, for each container index table 320 expanded in the memory 113, which backup servers 14 (14a, 14b, ..., 14n) it has been transmitted to, so that a container index table 320 that has already been transmitted is not transmitted again to the same backup server 14.
- The backup program 106 of the backup server 14a determines "no overlap" for the chunk si, and transmits the chunk si and its management information msi to the storage device 10.
- Since the overlap determination for the chunk si+1 is performed before the duplication determination result for the chunk si is received from the storage device 10, the chunk si+1 is also determined as "no overlap" and is transmitted to the storage device 10.
- The backup program 116 of the storage apparatus 10 determines that the chunk si is a duplicate, and transmits the corresponding container index table 320 to the backup server 14a.
- At that time, the backup program 116 records in the memory 113 that the corresponding container index table 320 has been transmitted to the backup server 14a.
- When the backup program 116 then tries to transmit the corresponding container index table 320 to the backup server 14a in the duplication determination of the chunk si+1, it determines that the table has already been transmitted and does not transmit it again.
- the processing performance of the deduplication processing in the storage system 1 can be improved.
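The bookkeeping that suppresses repeated transmission of the same container index table can be sketched as below. `TableSender` and `should_send` are hypothetical names, and the record is kept in an in-memory dictionary as an assumption.

```python
from collections import defaultdict


class TableSender:
    """Hypothetical record of which backup servers already received which table."""

    def __init__(self):
        # container ID -> identifiers of backup servers the table was sent to
        self.sent_to = defaultdict(set)

    def should_send(self, container_id, server_id):
        """Return True, and record the transmission, only for the first request."""
        if server_id in self.sent_to[container_id]:
            return False  # already transmitted to this backup server
        self.sent_to[container_id].add(server_id)
        return True
```

The record is kept per backup server, so a table already sent to the backup server 14a is still sent once to the backup server 14b.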
- FIG. 8 shows a processing flow example of the restore processing executed by the restore program 107 of the backup server 14 and the restore program 117 of the storage apparatus 10.
- The restore program 107 of the backup server 14 acquires the content ID of the content to be restored from the stub file stored in the HDD 104, and transmits it to the storage apparatus 10 (S201).
- The restore program 117 of the storage apparatus 10 sets the counter i, which counts the chunks necessary for the restore process, to 0 (S203). Thereafter, the restore program 117 reads the management information msi in the container index table 320 (S204). Specifically, based on the information on the chunk si in the content index table 370 acquired in S202, the restore program 117 reads the container index table 320 to which the chunk si belongs from the HDD 114, and reads the management information msi of the corresponding chunk si. As described above, the management information msi of the chunk si includes information on the fingerprint 321 of the chunk, the position (offset) 322 in the container, and the length 323 of the chunk.
- The restore program 117 of the storage apparatus 10 reads the chunk si stored in the container 380 corresponding to the container index table 320, based on the management information msi of the chunk si read in S204 (S205).
- the restore program 117 of the storage apparatus 10 determines whether or not reading has been completed for all chunks included in the content to be restored (S206). Specifically, the restore program 117 compares the number of chunks n included in the content to be restored with the count number of the counter i.
- When reading has been completed for all chunks, the restore program 117 reconstructs the content from the read chunks (S208), transmits it to the backup server 14, and terminates the restore process (S209, S210).
- Specifically, the restore program 117 recombines the read chunks si into the content based on the offset 374 in the content and the chunk length 375 recorded in the content index table 370.
- When reading has not been completed, the restore program 117 of the storage apparatus 10 adds 1 to the counter i and returns the process to S204 (S207).
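The restore loop can be sketched as follows, assuming the three structures are plain dictionaries. The function name `restore` and the tuple layouts are illustrative only.

```python
def restore(content_id, content_index, container_index, containers):
    """Rebuild one content from its chunks.

    content_index:   content ID -> list of (fingerprint, container ID,
                     offset in content, length), modeled on table 370
    container_index: container ID -> {fingerprint: (offset in container, length)},
                     modeled on table 320
    containers:      container ID -> bytes, modeled on container 380
    """
    entries = content_index[content_id]
    size = max((off + length for _, _, off, length in entries), default=0)
    content = bytearray(size)
    for fp, cid, off, length in entries:
        # Look up where the chunk sits inside its container (S204).
        c_off, c_len = container_index[cid][fp]
        # Read the chunk body from the container (S205).
        chunk = containers[cid][c_off:c_off + c_len]
        # Place the chunk at its logical position in the content (S208).
        content[off:off + length] = chunk
    return bytes(content)
```

A chunk referenced twice by the content is stored once in its container and copied to both logical positions.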
- The storage system 1 implements efficient deduplication processing by appropriately acquiring, from the storage apparatus 10, the container index table 320 used for the deduplication processing of the backup server 14.
- the amount of data transmitted to the storage apparatus 10 can be reduced. Since the amount of data transmitted to the storage device 10 is reduced, the load on the communication network 4 can be reduced.
- Since the container index table 320 necessary for the deduplication process performed by the backup server 14 is appropriately acquired from the storage apparatus 10, the storage capacity used by the backup server 14 for the deduplication process can be reduced.
- Since the container index table 320 acquired from the storage apparatus 10 takes the locality of data into account and is highly likely to include information on chunks to be deduplicated thereafter, the memory 103 of the backup server 14 can be used efficiently.
- duplication determination can be performed at high speed.
- FIG. 9 shows an example of the overall configuration of the storage system 1 according to this embodiment.
- The configuration of the second embodiment illustrated in FIG. 9 is the same as that of the first embodiment, except that it includes a plurality of data centers 3 (3a, 3b, ..., 3m), each having a storage device 10 (10a, 10b, ..., 10m). Therefore, a detailed description regarding the configuration of the storage system 1 is omitted.
- FIG. 10 shows a processing flow example of the backup processing according to the present embodiment.
- The backup program 106 of the backup server 14 transmits the chunk si and the management information msi of the chunk si to all the storage devices 10 (10a, 10b, ..., 10m) connected to the storage system 1.
- The backup program 116 of the storage apparatus 10 performs duplication determination on the chunk si received from the backup server 14. If the chunk si is determined to be a new chunk in S307, the backup program 116 of the storage apparatus 10 executes the processing of S308.
- the backup program 116 of the storage apparatus 10 transmits the duplication determination result “no duplication” to the backup server 14.
- the backup program 116 of the storage apparatus 10 executes the processing of S309.
- The backup program 116 of the storage apparatus 10 transmits, to the backup server 14, the container index table 320 containing the fingerprint 321 that matches the fingerprint of the received chunk si, and executes S313.
- the backup program 106 of the backup server 14 executes the process of S311 when the determination result from all the storage apparatuses 10 is “no duplication” (S310, Yes). Note that when the backup program 106 of the backup server 14 receives the container index table 320 from any of the storage apparatuses 10, it is assumed that the determination result “Duplicate” has been received.
- The backup program 106 of the backup server 14 selects a storage device 10 for storing the chunk si, and transmits a storage request for the chunk si to it. At this time, the chunk si itself is not transmitted, because it has already been transmitted in S305.
- The storage device 10 that stores the chunk si can be selected by any method.
- The storage apparatus 10 selected in S311 writes the chunk si to the container 380, registers the management information msi of the chunk si in the container index table 320, registers the message digest of the chunk si in the chunk index table 310 (S312), and executes the process of S313.
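The flow across several storage devices can be sketched as below. The `Storage` class, the `backup_chunk` function, the default choice of the first device, and the discarding of unused pending copies are all assumptions added for illustration; the text leaves the selection method open.

```python
class Storage:
    """Hypothetical storage device that holds received chunks until asked to store."""

    def __init__(self, name):
        self.name = name
        self.stored = {}   # fingerprint -> chunk actually stored
        self.pending = {}  # fingerprint -> chunk received only for determination

    def judge(self, fp, chunk):
        if fp in self.stored:
            return "duplicate"
        self.pending[fp] = chunk
        return "no duplication"

    def store_request(self, fp):
        # The chunk body was already received, so only the request is needed.
        self.stored[fp] = self.pending.pop(fp)


def backup_chunk(fp, chunk, storages, select=lambda ss: ss[0]):
    """Return the name of the device that stored the chunk, or None for a duplicate."""
    results = [s.judge(fp, chunk) for s in storages]
    target = None
    if all(r == "no duplication" for r in results):
        target = select(storages)  # any selection method may be plugged in here
        target.store_request(fp)
    for s in storages:
        s.pending.pop(fp, None)    # assumption: unused pending copies are dropped
    return target.name if target else None
```

Passing a different `select` callable, for example one that picks the device with the most free capacity, changes the placement without touching the rest of the flow.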
- The backup program 116 of the storage apparatus 10 creates the content index table 370 illustrated in FIG. 5 for the content to be backed up, and registers the management information msi of the chunk in it.
- The backup program 106 of the backup server 14 determines whether or not duplication determination processing and registration processing have been completed for all chunks (S314). Specifically, the backup program 106 of the backup server 14 compares the number of chunks n included in the content to be processed with the count value of the counter i.
- the backup program 106 of the backup server 14 creates a stub file for restore processing (S316). Then, the backup processing of the content is terminated (S317).
- the stub file stores a content ID 371 for searching the corresponding content index table 370 at the time of restore processing.
- the restore process according to the present embodiment is substantially the same as that of the first embodiment except that the backup server 14 that has received the restore process execution instruction transmits the content ID 371 related to the restore target content to the plurality of storage apparatuses 10. Therefore, detailed description is omitted.
- efficient deduplication processing can be applied to a plurality of storage apparatuses 10 as in the case of the first embodiment.
- one chunk is transmitted to the storage apparatus 10 in S306, but a plurality of chunks may be transmitted together. For example, ten chunks from chunks determined as new chunks may be collected and transmitted to the storage apparatus 10. These processes may improve the speed of the deduplication process.
- FIG. 11 shows an example of the overall configuration of the storage system 1 according to this embodiment.
- the overall configuration of the storage system 1 according to this embodiment is the same as that of the second embodiment, except that the data center 11 and the chunk management server 12 provided in the data center 11 are provided. Therefore, detailed description of the same configuration as that of the second embodiment is omitted.
- The chunk management server 12 manages the chunk index table 310, the container index table 320, and the content index table 370 stored in all the data centers 3 (3a, 3b, ..., 3m) provided in the storage system 1.
- A block diagram is shown that illustrates a configuration example of the backup server 14 provided at the base 2 of the storage system 1 illustrated in FIG. 11, the storage device 10 provided at the data center 3, and the chunk management server 12 provided at the data center 11.
- the configuration of the backup server 14 of this embodiment is the same as the configuration of the backup server 14 of the first embodiment, and thus detailed description thereof is omitted.
- The configuration of the storage apparatus 10 of this embodiment is the same as that of the storage apparatus 10 of the first embodiment except that the backup program 116 and the restore program 117 are removed, and thus detailed description thereof is omitted.
- the functions of the backup program 116 and the restore program 117 in the first embodiment are performed by the backup program 126 and the restore program 127 provided in the chunk management server 12.
- the configuration of the chunk management server 12 according to the present embodiment is substantially the same as the configuration of the storage apparatus 10 according to the first embodiment, and thus detailed description thereof is omitted.
- FIG. 13 shows a processing flow example of the backup processing operation according to the present embodiment.
- the backup process illustrated in FIG. 13 is executed by the backup program 106 of the backup server 14 and the backup program 126 of the chunk management server 12.
- the backup program 106 of the backup server 14 receives the backup process execution instruction from the client 6 or the like, and starts the backup process of the present embodiment (S400).
- the subsequent processing in S401 to S404 is the same as the processing in S101 to S104 in the first embodiment, and thus detailed description thereof is omitted.
- If it is determined in S405 that the container index table 320 of the backup server 14 contains a fingerprint 321 that matches the fingerprint of the processing target chunk si (that is, when "duplication" is determined), the backup program 106 of the backup server 14 executes the process of S412. On the other hand, if it is determined in S405 that there is no matching fingerprint 321 (that is, when "no overlap" is determined), the backup program 106 of the backup server 14 executes the process of S406. In S406, the backup program 106 of the backup server 14 transmits the chunk si and the management information msi of the chunk si to the chunk management server 12.
- The backup program 126 of the chunk management server 12 performs duplication determination on the chunk si received from the backup server 14. If the chunk si is determined to be a new chunk in S407, the backup program 126 of the chunk management server 12 executes the process of S408.
- the backup program 126 of the chunk management server 12 transmits the duplication determination result to the backup server 14, and executes S409.
- The backup program 126 of the chunk management server 12 selects the storage device 10 (10a, 10b, ..., 10m) that is to store the new chunk received from the backup server 14, and registers the chunk si in the container of the selected storage device 10.
- the storage device 10 that stores the new chunk can be selected by an arbitrary method.
- The backup program 126 of the chunk management server 12 registers the management information msi of the chunk si in the container index table 320, records the message digest of the chunk si in the chunk index table 310, and executes S412.
- The backup program 126 of the chunk management server 12 transmits, to the backup server 14, the container index table 320 including the fingerprint 321 that matches the fingerprint of the received chunk si (S411), and executes S412.
- The backup program 126 of the chunk management server 12 creates the content index table 370 for the content to be backed up, and registers the management information msi of the chunk si in it.
- the backup program 106 of the backup server 14 determines whether or not duplication determination processing and writing processing have been completed for all chunks (S413). Specifically, the backup program 106 of the backup server 14 compares the number of chunks n included in the content to be backed up with the counter number of the counter i.
- The backup program 106 of the backup server 14 creates a stub file for the restore process (S415), and terminates the backup processing of the content (S416).
- the stub file stores a content ID 371 for searching the corresponding content index table 370 when restoring the backup data.
- Although the number of chunk management servers 12 in FIG. 11 is one, a plurality of chunk management servers 12 may be provided. Inquiries that would otherwise concentrate on one chunk management server 12 can thereby be distributed, and the speed of the deduplication processing may be improved.
- In that case, the entries of the chunk index table are sorted among the chunk management servers 12 using a part of the fingerprint, and each chunk management server 12 manages its own portion of the chunk index table.
- For example, where X is an arbitrary natural number, the entries can be sorted by the upper X bits or the lower X bits of the fingerprint, or by a bit pattern extracted according to a predetermined pattern.
- By configuring the chunk index table so that the storage device identification information and the container ID can be acquired from the fingerprint, distributed processing by a plurality of chunk management servers 12 can be realized.
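Sorting by the upper X bits of the fingerprint can be sketched as below. The function name `server_for`, the hexadecimal fingerprint representation, and the final modulo mapping onto the number of servers are assumptions; the text only says that a part of the fingerprint is used.

```python
def server_for(fingerprint_hex: str, x_bits: int, num_servers: int) -> int:
    """Pick a chunk management server from the upper x_bits of a fingerprint.

    Assumes 0 < x_bits <= 4 * len(fingerprint_hex) and num_servers >= 1.
    """
    total_bits = len(fingerprint_hex) * 4  # each hex digit carries 4 bits
    value = int(fingerprint_hex, 16)
    prefix = value >> (total_bits - x_bits)  # keep only the upper x_bits
    # Assumption: the prefix is folded onto the server count with a modulo.
    return prefix % num_servers
```

Because the mapping depends only on the fingerprint, every backup server sends an inquiry for a given chunk to the same chunk management server without any coordination.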
- In the restore process, a restore process execution instruction specifying a content ID is transmitted from the client 6 or the like to the chunk management server 12. The restore process is substantially the same as that of the first embodiment, except that the functions of the restore program 117 of the storage apparatus 10 in the first embodiment are realized by the restore program 127 of the chunk management server 12, and detailed description thereof is omitted.
- more efficient deduplication processing can be performed on a plurality of storage apparatuses 10.
- Since the information necessary for deduplication processing is collected in the chunk management server 12, the chunk and the chunk management information need only be exchanged between the backup server 14 and the chunk management server 12, and the network load can be reduced.
- FIG. 14 is a block diagram illustrating a configuration example of the backup server 14 provided in the base 2 of the storage system 1 according to the present embodiment and the data center 3 including the storage device 10.
- the configuration of the backup server 14 is the same as that of the backup server 14 of the first embodiment illustrated in FIG. 2 except for the network monitoring unit 109, and thus detailed description thereof is omitted.
- The network monitoring unit 109 is a functional block that monitors the amount of data that the backup server 14 transmits to and receives from the communication network 4 via the network interface 105. Any appropriate hardware and software that can realize this function can be used.
- the configuration of the storage apparatus 10 is the same as that of the storage apparatus 10 of the first embodiment illustrated in FIG. 2 except for the network monitoring unit 119, and thus detailed description thereof is omitted.
- the network monitoring unit 119 of the storage apparatus 10 has the same function as the network monitoring unit 109 of the backup server 14 and monitors the amount of data transmitted to and received from the communication network 4 via the network interface 115.
- FIG. 15 shows a processing flow example of the backup processing operation according to the present embodiment.
- the backup processing illustrated in FIG. 15 is executed by the backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10.
- the backup program 106 of the backup server 14 receives the backup process execution instruction from the client 6 or the like, and starts the backup process of this embodiment (S500).
- the subsequent processing in S501 to S504 is the same as the processing in S101 to S104 in the first embodiment, and thus detailed description thereof is omitted.
- the backup program 106 of the backup server 14 measures the load of the communication network 4 via the network monitoring unit 109, and if the load measurement value of the communication network 4 is equal to or greater than a preset threshold value (S506, Yes), the process of S507 is executed.
- the threshold value of the load measurement value of the communication network 4 can be determined in consideration of conditions such as the performance of the communication network 4.
- The backup program 106 of the backup server 14 and the backup program 116 of the storage device 10 perform processes similar to those of S105 to S109 in FIG. 7 of the first embodiment. After the processes of S105 to S109 are performed, the backup program 116 of the storage apparatus 10 executes the process of S514.
- The backup program 106 of the backup server 14 transmits the management information msi of the chunk si to the storage device 10.
- In S508, the backup program 116 of the storage apparatus 10 determines the duplication of the chunk si using the management information msi of the chunk si received from the backup server 14.
- the backup program 116 of the storage device 10 transmits a corresponding container index table 320 to the backup server 14 (S510). Note that when the backup program 106 of the backup server 14 receives the container index table 320 from the storage apparatus 10, it is assumed that the determination result “duplicate” has been received.
- the backup program 106 of the backup server 14 acquires the duplication determination result from the storage apparatus 10, and performs duplication determination processing. Note that when the result of the duplication determination received from the storage apparatus 10 is “duplication”, the backup program 106 of the backup server 14 performs duplication determination in consideration of the container index table 320 acquired in S510.
- The backup program 106 of the backup server 14 transmits the chunk si to be processed to the storage device 10 (S512).
- Since the management information msi of the chunk si has already been transmitted to the storage device 10, it is not transmitted in S512.
- The backup program 116 of the storage apparatus 10 registers the chunk si acquired from the backup server 14 in the container 380, registers the management information msi of the chunk si in the container index table 320, records the message digest of the chunk si in the chunk index table 310, and executes S514.
- the backup program 116 of the storage apparatus 10 executes the process of S514.
- The backup program 116 of the storage apparatus 10 creates the content index table 370 illustrated in FIG. 5 for the content to be backed up, and registers the management information msi of the chunk si in it for use in the restore process.
- The backup program 106 of the backup server 14 determines whether or not duplication determination processing and registration processing into the chunk index table and the content index table have been completed for all chunks constituting the content that is the target of backup processing (S515). Specifically, the backup program 106 of the backup server 14 compares the number of chunks n included in the content with the count value of the counter i.
- The backup program 106 of the backup server 14 creates a stub file for the restore process (S517), and terminates the backup processing for the content (S518).
- the stub file stores a content ID 371 for searching the corresponding content index table 370 when restoring the backup data.
- the restore process according to this embodiment is substantially the same as the restore process in the first embodiment illustrated in FIG.
- deduplication processing can be performed in consideration of the network load.
- the communication amount of the communication network 4 used for the deduplication processing can be reduced, and more efficient deduplication processing can be performed.
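The load-dependent switch between the two transmission orders can be sketched as below. The `Storage` class, the `backup_chunk` function, and the way the load value is passed in are assumptions; the local duplication determination on the backup server is left out to keep the focus on the switch.

```python
class Storage:
    """Hypothetical storage device keyed by chunk fingerprint."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk

    def judge_by_info(self, fp):
        """Duplication determination from the management information alone."""
        return "duplicate" if fp in self.chunks else "no duplication"

    def store(self, fp, chunk):
        self.chunks[fp] = chunk

    def receive(self, fp, chunk):
        """Chunk and management information arrive together."""
        if self.judge_by_info(fp) == "no duplication":
            self.store(fp, chunk)


def backup_chunk(fp, chunk, storage, network_load, threshold):
    """Return the number of chunk bytes placed on the network for this chunk."""
    if network_load >= threshold:
        # High load: send the management information first, then the chunk
        # body only when the storage side reports that it is new.
        if storage.judge_by_info(fp) == "duplicate":
            return 0
        storage.store(fp, chunk)
        return len(chunk)
    # Low load: send the chunk together with its management information.
    storage.receive(fp, chunk)
    return len(chunk)
```

Under high load a duplicate costs no chunk bytes at the price of an extra round trip, while under low load the chunk is sent at once even if it turns out to be a duplicate.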
- the overall configuration of the storage system 1 according to this embodiment is the same as that of the first embodiment illustrated in FIG.
- the block configurations of the backup server 14 and the storage apparatus 10 are also the same as those in the first embodiment illustrated in FIG.
- FIG. 16 shows a processing flow example of the backup processing operation according to the present embodiment.
- the backup process illustrated in FIG. 16 is executed by the backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10.
- the backup program 106 of the backup server 14 receives the backup process execution instruction from the client 6 or the like, and starts the backup process of the present embodiment (S600).
- the subsequent processing in S601 to S604 is the same as the processing in S101 to S104 in the first embodiment, and thus detailed description thereof is omitted.
- The backup program 106 of the backup server 14 adds the chunk si and its management information msi to a queue set, for example, in the memory 103 of the backup server 14, and executes S606.
- the queue is provided to store the chunk si determined to have no overlap as a result of the overlap determination process in S604 and its management information msi up to a predetermined threshold. With this configuration, information about the chunk si determined to have no duplication in S604 is not transmitted to the storage apparatus 10 each time it is determined.
- the backup program 106 of the backup server 14 executes S613.
- the backup program 106 of the backup server 14 transmits to the storage apparatus 10 a queue in which the chunk si determined as “no duplication” in S605 and the management information msi of the chunk si are stored.
- The backup program 116 of the storage apparatus 10 extracts one chunk si and its management information msi from the head of the queue acquired from the backup server 14.
- For the extracted chunk si and the management information msi of the chunk si, the backup program 116 of the storage apparatus 10 performs the same duplication determination process as in S106 to S110 of FIG. 7 in the first embodiment, and executes registration processing into the container, the container index table, and the chunk index table (S610). After performing the process corresponding to S110, the backup program 116 of the storage apparatus 10 executes the process of S612.
- If the backup program 116 of the storage apparatus 10 determines that the queue acquired from the backup server 14 is empty (S612, Yes), S613 is executed. If it determines that chunks remain in the queue (S612, No), the process returns to S609.
- the backup program 106 of the backup server 14 determines whether or not duplication determination processing and registration processing into the container, container index table, and chunk index table have been completed for all chunks (S613). Specifically, the backup program 106 of the backup server 14 compares the number of chunks n included in the content to be backed up with the number of counters i.
- the backup program 106 of the backup server 14 creates a stub file for restore processing (S614), and terminates the content backup process (S615).
- the stub file stores a content ID 371 for searching the corresponding content index table 370 when restoring the backup data.
- the backup program 106 of the backup server 14 adds 1 to the counter i and returns the process to S605 (S616).
- the restore processing according to the present embodiment is substantially the same as the restore processing in the first embodiment illustrated in FIG.
- a plurality of chunks determined as “no duplication” by the backup server 14 are collectively transmitted to the storage apparatus 10, so that overhead such as command analysis in the communication network 4 is reduced and the processing performance of the deduplication process in the storage system 1 can be further improved.
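The queue-based batching of this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `BatchingSender`, the `send` callback, and the threshold value are hypothetical, and the queue is flushed explicitly at the end of the content to cover a final, partially filled batch.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class BatchingSender:
    """Buffers (chunk, management info) pairs and sends them in batches."""
    threshold: int                         # hypothetical queue limit
    send: Callable[[list], None]           # hypothetical transport callback
    queue: list = field(default_factory=list)

    def add(self, chunk: bytes, info: dict) -> None:
        # Called only for chunks the backup server judged "no duplication".
        self.queue.append((chunk, info))
        if len(self.queue) >= self.threshold:
            self.flush()

    def flush(self) -> None:
        # One transmission carries the whole queue instead of one per chunk.
        if self.queue:
            self.send(list(self.queue))
            self.queue.clear()


batches = []
sender = BatchingSender(threshold=3, send=batches.append)
for n in range(7):
    sender.add(bytes([n]), {"id": n})
sender.flush()  # send whatever is left once every chunk has been examined
```

With seven non-duplicate chunks and a threshold of three, the storage side receives three transmissions instead of seven.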
- the overall configuration of the storage system 1 according to this embodiment is the same as that of the first embodiment illustrated in FIG.
- the block configurations of the backup server 14 and the storage apparatus 10 are also the same as those in the first embodiment illustrated in FIG.
- FIG. 17 shows a processing flow example of the backup processing operation according to the present embodiment.
- the backup program 106 of the backup server 14 receives the backup process execution instruction from the client 6 or the like, and starts the backup process of this embodiment (S700).
- the subsequent processing in S701 to S704 is the same as the processing in S101 to S104 in the first embodiment, and thus detailed description thereof is omitted.
- the backup program 106 of the backup server 14 determines the number of chunks m (i + m ≤ n), which is the number of chunks si to be transmitted to the storage apparatus 10.
- the backup program 106 of the backup server 14 transmits the chunks s_i, s_{i+1}, ..., s_{i+m} and the management information ms_i, ms_{i+1}, ..., ms_{i+m} to the storage apparatus 10.
- the variable j is used to determine whether the backup program 116 of the storage apparatus 10 has completed the processing of m chunks.
- the backup program 116 of the storage apparatus 10 performs duplication determination processing on the chunk s_{i+j}.
- the backup program 116 of the storage apparatus 10 records the chunk s_{i+j} in the container 380, records the management information ms_{i+j} of the chunk s_{i+j} in the container index table 320, and records the message digest of the chunk s_{i+j} in the chunk index table 310 (S710), and the process proceeds to S713.
- the backup program 116 of the storage apparatus 10 determines whether or not the corresponding container index table 320 has been transmitted to the backup server 14 (S711).
- the backup program 116 of the storage apparatus 10 records chunk management information in the content index table 370 for restore processing.
- the backup program 116 of the storage apparatus 10 determines whether or not duplication determination processing and registration processing into the container, container index table, and chunk index table have been completed for all chunks received from the backup server 14 (S714). Specifically, the backup program 116 of the storage apparatus 10 compares the received chunk count m with the value of the counter j.
- when it is determined that these processes have been completed for all the chunks (S714, Yes), the backup program 116 of the storage apparatus 10 executes the process of S716. If it is determined that the duplication determination process and the registration process to the container, container index table, and chunk index table have not been completed for all the chunks (S714, No), the backup program 116 of the storage apparatus 10 adds 1 to the counter j, and the process returns to S709 (S715).
- the backup program 106 of the backup server 14 adds the number of chunks m transmitted to the storage apparatus 10 to the counter i.
- the backup program 106 of the backup server 14 determines whether or not duplication determination processing and registration processing into the container, container index table, and chunk index table have been completed for all chunks (S718).
- the backup program 106 of the backup server 14 creates a stub file for restore processing (S720), and terminates the content backup processing (S721).
- the backup program 106 of the backup server 14 adds 1 to the counter i and returns the process to S705 (S719).
- the processing performance of the deduplication processing can be further improved in this embodiment. For example, when the total size of the chunks to be transmitted and the management information is smaller than the size of the container index table 320, the network traffic can be reduced, so that the processing performance of the storage system 1 can be further improved.
- the restore processing according to the present embodiment is substantially the same as the restore processing in the first embodiment illustrated in FIG.
- the chunk determined as “no duplication” by the backup server 14 and the subsequent chunks are collectively transmitted to the storage apparatus 10, so that the overhead can be reduced and the processing performance of the deduplication processing can be further improved.
- the processing performance of the deduplication processing can be further improved by setting the number of chunks to be transmitted together according to the amount of data to be transmitted.
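One way to choose the number of chunks m from a preset upper limit on the transmitted data size is sketched below. The function name `chunks_to_send` and the assumption that each entry of `sizes` already includes the chunk's management information are illustrative only; the source does not specify how the limit is applied.

```python
def chunks_to_send(sizes: list[int], i: int, max_bytes: int) -> int:
    """Return m, the number of consecutive chunks starting at index i to send.

    The chunk at index i (the one judged "no duplication") is always sent,
    and following chunks are added while the total stays within max_bytes.
    The loop bound keeps i + m within the number of chunks n = len(sizes).
    """
    m = 0
    total = 0
    while i + m < len(sizes) and (m == 0 or total + sizes[i + m] <= max_bytes):
        total += sizes[i + m]
        m += 1
    return m


# Five chunks of 4 bytes each; a 10-byte limit allows two chunks per transfer.
m = chunks_to_send([4, 4, 4, 4, 4], i=1, max_bytes=10)
```

A larger limit reduces the number of transmissions but risks sending chunks that the storage apparatus already holds, which is the trade-off this embodiment tunes.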
- the overall configuration of the storage system 1 according to this embodiment is the same as that of the first embodiment illustrated in FIG.
- the block configurations of the backup server 14 and the storage apparatus 10 are also the same as those in the first embodiment illustrated in FIG.
- FIG. 16 shows a processing flow example of the backup processing operation according to the present embodiment.
- the backup program 106 of the backup server 14 receives the backup process execution instruction from the client 6 or the like, and starts the backup process of this embodiment (S800). In step S801 that is subsequently executed, the backup program 106 of the backup server 14 identifies the type of content that is the target of the backup process.
- the backup program 106 of the backup server 14 executes the processing of S803.
- the storage location of the files aggregated in the content can be identified as the location of the header. Even if the file is in a format other than the archive file, the present embodiment can be applied as long as the content includes information for identifying the storage location of the aggregated file.
- when it is determined that the content to be backed up is a file other than the above (a file that does not contain information identifying the storage location of the aggregated files) (S801, other format), the backup program 106 of the backup server 14 performs the same processing as the backup processing in the first embodiment illustrated in FIG.
- the backup program 106 of the backup server 14 acquires a content ID 371 for specifying the content to be backed up from the storage device 10 (S802). Next, the backup program 106 of the backup server 14 divides the content into a plurality of chunks and creates management information for each chunk (S803, S804).
- the backup program 106 of the backup server 14 searches for the first chunk of each file aggregated in the content, and performs deduplication processing on each first chunk.
- the deduplication process in S805 is the same process as S104 to S110 in the backup process of the first embodiment.
- the backup program 106 of the backup server 14 performs deduplication processing on the remaining chunks.
- the de-duplication process in S806 is the same process as S104 to S110 in the backup process of the first embodiment.
- the backup program 106 of the backup server 14 creates a stub file for restore processing, and ends the backup processing of the content (S808).
- because the container 380 is created in consideration of locality, there is a high possibility that the necessary container index table 320 is different for each file aggregated in the content. Therefore, by obtaining in advance the container index tables 320 required for the entire content, deduplication processing can be performed more efficiently.
- the first chunk of each file aggregated in the content is deduplicated first, but chunks other than the first chunk may instead be sampled and deduplicated first.
- depending on the size of a file included in the content, the file may be spread over two or more container index tables 320. Therefore, a plurality of chunks may be selected from one file and deduplicated first.
- deduplication processing may be performed for several files. If the content size is large, the entire container index table cannot be stored in the memory 103 of the backup server 14, and the duplication determination processing performance in the backup program 106 may deteriorate. Therefore, for example, deduplication processing may be executed in order on the aggregated files, and only the first chunk of the file to be processed next may be deduplicated first.
- the restore processing according to the present embodiment is substantially the same as the restore processing in the first embodiment illustrated in FIG.
- by identifying each aggregated file and acquiring in advance, from the storage apparatus 10, the container index table used for determining the duplication of each file, more efficient deduplication processing can be performed.
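The processing order of this embodiment, in which the first chunk of every aggregated file is deduplicated before the remaining chunks, can be sketched as follows. The function name `processing_order` and the representation of an archive as a list of per-file chunk counts are assumptions made for illustration.

```python
def processing_order(chunks_per_file: list[int]) -> list[int]:
    """Return chunk indices in the order they would be deduplicated.

    Chunks are numbered consecutively across the archive. The first chunk
    of each file comes first (so the matching container index tables can be
    fetched early), followed by all remaining chunks in their original order.
    """
    first_chunks = []
    remaining = []
    start = 0
    for count in chunks_per_file:
        if count > 0:
            first_chunks.append(start)
            remaining.extend(range(start + 1, start + count))
        start += count
    return first_chunks + remaining


# An archive of three files holding 3, 2, and 1 chunks respectively.
order = processing_order([3, 2, 1])
```

For this archive the chunks at indices 0, 3, and 5 are handled first, then 1, 2, and 4.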
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
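Configuration of the storage system according to the first embodiment
FIG. 1 shows the overall configuration of the storage system 1 according to the first embodiment of the present invention. The storage system 1 comprises backup servers 14 (14a, 14b, ..., 14n) installed at a plurality of sites 2 (2a, 2b, ..., 2n), respectively, and a storage apparatus 10 installed in a data center 3. When the sites are described collectively without distinguishing one from another, the suffixes a, b, ..., n may be omitted.
Next, an overview of the backup processing and restore processing according to this embodiment is described.
First, an overview of the deduplication function according to this embodiment is described. The backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10 according to this embodiment have a processing function that reduces the volume of the data to be backed up. Data processing such as file compression and deduplication is used to reduce the data volume. File compression reduces data volume by condensing data segments (unit data) with identical content within a single file. Deduplication, by contrast, condenses data segments with identical content detected not only within a single file but also across multiple files, thereby reducing the total volume of data stored in a file system, storage system, or the like.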
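The difference between per-file compression and cross-file deduplication can be illustrated with a small sketch. It assumes fixed-size chunks and SHA-256 fingerprints; the chunk size, the function names, and the in-memory dictionaries are hypothetical stand-ins for the containers and index tables described in this document.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, deliberately tiny so the example is readable


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Divide content into fixed-size unit data (chunks)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def deduplicate(files: dict[str, bytes], size: int = CHUNK_SIZE):
    """Store each distinct chunk once, across all files."""
    store: dict[str, bytes] = {}        # fingerprint -> chunk data
    recipes: dict[str, list[str]] = {}  # file name -> ordered fingerprints
    for name, data in files.items():
        recipe = []
        for chunk in split_into_chunks(data, size):
            fingerprint = hashlib.sha256(chunk).hexdigest()
            store.setdefault(fingerprint, chunk)  # keep the first copy only
            recipe.append(fingerprint)
        recipes[name] = recipe
    return store, recipes


def restore(name: str, store: dict[str, bytes], recipes: dict[str, list[str]]) -> bytes:
    """Rebuild a file from its ordered list of fingerprints."""
    return b"".join(store[fp] for fp in recipes[name])


files = {"a": b"AAAABBBBCCCC", "b": b"BBBBCCCCDDDD"}
store, recipes = deduplicate(files)
```

The two files contain six chunks in total, but only four distinct chunks are stored because `BBBB` and `CCCC` occur in both files.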
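Next, configuration examples of the chunk index table 310 and the container index table 320 in this embodiment are described. FIGS. 3 and 4 show a configuration example of the container index table 320 and a configuration example of the chunk index table 310 used in the backup processing and restore processing of this embodiment. The container index table 320 is a table created for each container. The chunk index table 310 is a table for managing the chunks stored in the containers.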
Container/uuid-Cf … container body
ContainerIndex/uuid-Cf … container index table database
ChunkIndex/upper N bits of fp … chunk index table database
Contents/uuid-Cf … content index table database
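The layout above suggests that a chunk index entry is located from the upper N bits of the chunk fingerprint (fp). A minimal sketch of that mapping follows; the value of N, the hexadecimal rendering of the prefix, and the function names are assumptions, since the listing does not state them.

```python
N_BITS = 16  # hypothetical value of N


def chunk_index_path(fingerprint: bytes, n_bits: int = N_BITS) -> str:
    """Build the ChunkIndex path from the upper n_bits of the fingerprint."""
    value = int.from_bytes(fingerprint, "big")
    total_bits = len(fingerprint) * 8
    prefix = value >> (total_bits - n_bits)  # keep only the upper n_bits
    width = (n_bits + 3) // 4                # hex digits needed for n_bits
    return f"ChunkIndex/{prefix:0{width}x}"


def container_paths(container_uuid: str) -> dict[str, str]:
    """Paths of a container body and its index table for a given identifier."""
    return {
        "container": f"Container/{container_uuid}",
        "container_index": f"ContainerIndex/{container_uuid}",
    }


example_fp = bytes.fromhex("abcd" + "00" * 30)  # a made-up 32-byte fingerprint
path = chunk_index_path(example_fp)
```

Grouping fingerprints by prefix keeps each chunk index database small enough to load on demand.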
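Next, an overview of the deduplication processing realized in the storage system 1 of this embodiment is described. FIG. 6 schematically shows an overview of the deduplication processing realized in the storage system 1 of this embodiment. Although FIG. 6 shows only the backup server 14a as the backup server 14 provided in the storage system 1, it is assumed that, as in FIG. 1, a plurality of backup servers 14 (14a, 14b, ..., 14n) are connected to the storage apparatus 10 via the communication network 4.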
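The cooperative duplicate check that the claims of this document describe (a local check on the backup server, a second check on the storage apparatus, and the return of the matching container index table when only the storage apparatus finds a duplicate) can be sketched as follows. The class and method names, the container capacity, and the use of SHA-256 are illustrative assumptions, not details taken from the source.

```python
import hashlib


class StorageApparatus:
    """Holds the containers and the authoritative chunk index (simplified)."""

    def __init__(self, container_capacity: int = 4):
        self.capacity = container_capacity
        self.containers: list[dict[str, bytes]] = [{}]  # container id -> chunks
        self.chunk_index: dict[str, int] = {}            # fingerprint -> container id

    def receive(self, fingerprint: str, chunk: bytes):
        """Return the container's fingerprint set on a duplicate, else store."""
        container_id = self.chunk_index.get(fingerprint)
        if container_id is not None:
            return set(self.containers[container_id])
        if len(self.containers[-1]) >= self.capacity:
            self.containers.append({})                   # start a new container
        self.containers[-1][fingerprint] = chunk
        self.chunk_index[fingerprint] = len(self.containers) - 1
        return None


class BackupServer:
    """Checks a local index first and merges tables returned by the storage."""

    def __init__(self, storage: StorageApparatus):
        self.storage = storage
        self.local_index: set[str] = set()
        self.sent = 0                                    # chunks sent over the network

    def backup(self, content: bytes, chunk_size: int = 4) -> None:
        for offset in range(0, len(content), chunk_size):
            chunk = content[offset:offset + chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint in self.local_index:
                continue                                 # duplicate found locally
            self.sent += 1
            table = self.storage.receive(fingerprint, chunk)
            if table is not None:
                self.local_index |= table                # incorporate returned table


storage = StorageApparatus()
site_a, site_b = BackupServer(storage), BackupServer(storage)
site_a.backup(b"AAAABBBBCCCCDDDD")
site_b.backup(b"AAAABBBBCCCCDDDD")
```

In this run the first site sends four chunks, while the second site sends only one: the table returned for its first chunk lets it resolve the other three locally.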
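Next, the backup processing executed in the storage system 1 of this embodiment is described. FIG. 7 shows an example processing flow of the backup processing operation executed by the backup program 106 of the backup server 14 provided in the storage system 1 and the backup program 116 of the storage apparatus 10. In the example processing flow of FIG. 7, the letter S prefixed to each processing step is an abbreviation for "step".
Next, the restore processing executed in the storage system 1 of this embodiment is described. FIG. 6 shows an example processing flow of the restore processing executed by the restore program 107 of the backup server 14 and the restore program 117 of the storage apparatus 10.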
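Next, the storage system 1 according to the second embodiment of the present invention is described.
FIG. 9 shows an example of the overall configuration of the storage system 1 according to this embodiment. The configuration of the second embodiment illustrated in FIG. 9 is the same as that of the first embodiment illustrated in FIG. 1, except that it includes a plurality of data centers 3 (3a, 3b, ..., 3m), each having a storage apparatus 10 (10a, 10b, ..., 10m). A detailed description of the configuration of the storage system 1 is therefore omitted.
Next, the backup processing and restore processing executed in the storage system 1 according to the second embodiment are described.
First, the backup processing executed in the storage system 1 of this embodiment is described. FIG. 10 shows an example processing flow of the backup processing according to this embodiment.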
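Next, the storage system 1 according to the third embodiment of the present invention is described.
FIG. 11 shows an example of the overall configuration of the storage system 1 according to this embodiment. The overall configuration of the storage system 1 according to this embodiment is the same as that of the second embodiment, except that it includes a data center 11 and a chunk management server 12 provided in the data center 11. A detailed description of the components that are the same as in the second embodiment is therefore omitted.
Next, the backup processing and restore processing executed in the storage system 1 of this embodiment are described.
First, the backup processing executed in the storage system 1 of this embodiment is described. FIG. 13 shows an example processing flow of the backup processing operation according to this embodiment. The backup processing illustrated in FIG. 13 is executed by the backup program 106 of the backup server 14 and the backup program 126 of the chunk management server 12.
The restore processing according to this embodiment is substantially the same as in the first embodiment, except that a restore processing execution instruction specifying a content ID is transmitted from the client 6 or the like to the chunk management server 12, and the function of the restore program 117 of the storage apparatus 10 in the first embodiment is realized by the restore program 127 of the chunk management server 12. A detailed description is therefore omitted.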
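Next, the storage system 1 according to the fourth embodiment of the present invention is described.
The overall configuration of the storage system 1 according to this embodiment is the same as that of the storage system 1 of the first embodiment illustrated in FIG. 2. A detailed description is therefore omitted.
Next, the backup processing and restore processing executed in the storage system 1 of this embodiment are described.
First, the backup processing executed in the storage system 1 of this embodiment is described. FIG. 15 shows an example processing flow of the backup processing operation according to this embodiment. The backup processing illustrated in FIG. 15 is executed by the backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10.
The restore processing according to this embodiment is substantially the same as the restore processing in the first embodiment illustrated in FIG. 8, so a detailed description is omitted.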
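Next, the storage system 1 according to the fifth embodiment of the present invention is described.
The overall configuration of the storage system 1 according to this embodiment is the same as that of the first embodiment illustrated in FIG. 1, so a detailed description is omitted. The block configurations of the backup server 14 and the storage apparatus 10 are also the same as those of the first embodiment illustrated in FIG. 3, so a detailed description is omitted.
Next, the backup processing and restore processing executed in the storage system 1 of this embodiment are described.
First, the backup processing executed in the storage system 1 of this embodiment is described. FIG. 16 shows an example processing flow of the backup processing operation according to this embodiment. The backup processing illustrated in FIG. 16 is executed by the backup program 106 of the backup server 14 and the backup program 116 of the storage apparatus 10.
The restore processing according to this embodiment is substantially the same as the restore processing in the first embodiment illustrated in FIG. 6, so a detailed description is omitted.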
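Next, the storage system 1 according to the sixth embodiment of the present invention is described.
The overall configuration of the storage system 1 according to this embodiment is the same as that of the first embodiment illustrated in FIG. 1, so a detailed description is omitted. The block configurations of the backup server 14 and the storage apparatus 10 are also the same as those of the first embodiment illustrated in FIG. 3, so a detailed description is omitted.
Next, the backup processing and restore processing executed in the storage system 1 of this embodiment are described.
First, the backup processing executed in the storage system 1 of this embodiment is described. FIG. 17 shows an example processing flow of the backup processing operation according to this embodiment.
The restore processing according to this embodiment is substantially the same as the restore processing in the first embodiment illustrated in FIG. 6, so a detailed description is omitted.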
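Next, the storage system 1 according to the seventh embodiment of the present invention is described.
The overall configuration of the storage system 1 according to this embodiment is the same as that of the first embodiment illustrated in FIG. 1, so a detailed description is omitted. The block configurations of the backup server 14 and the storage apparatus 10 are also the same as those of the first embodiment illustrated in FIG. 3, so a detailed description is omitted.
Next, the backup processing and restore processing executed in the storage system 1 of this embodiment are described.
First, the backup processing executed in the storage system 1 of this embodiment is described. FIG. 16 shows an example processing flow of the backup processing operation according to this embodiment.
The restore processing according to this embodiment is substantially the same as the restore processing in the first embodiment illustrated in FIG. 6, so a detailed description is omitted.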
Claims (11)
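- A storage system that stores data from an external device in units of content, comprising:
a backup apparatus that executes backup processing for creating backup data in units of the content from the data from the external device; and a storage apparatus that is communicably connected to the backup apparatus and stores the backup data received from the backup apparatus,
wherein the backup apparatus comprises:
first duplication determination information, which is information for determining whether the content constituting the backup data has already been stored in the storage apparatus; and
a first backup processing unit that uses the first duplication determination information to determine whether the content has already been stored in the storage apparatus,
wherein the storage apparatus comprises:
second duplication determination information, which is information for determining whether the content constituting the backup data has already been stored in the storage apparatus; and
a second backup processing unit that uses the second duplication determination information to determine whether the content has already been stored in the storage apparatus,
and wherein, for the content as the backup data, when the first backup processing unit determines that the content is not stored in the storage apparatus and the second backup processing unit determines that the content is stored in the storage apparatus, the second backup processing unit transmits the second duplication determination information to the backup apparatus, and the first backup processing unit of the backup apparatus executes processing to incorporate the received second duplication determination information into the first duplication determination information.
- The storage system according to claim 1,
wherein the first duplication determination information and the second duplication determination information store, in association with each other, unit data obtained by dividing the content to be backed up into a plurality of pieces of a predetermined size, and unit data unique information, which is unique information obtained for each piece of the unit data.
- The storage system according to claim 2,
wherein the backup apparatus comprises:
restore information, which is information for identifying the content stored in the storage apparatus; and
a first restore processing unit that, at the time of a restore, transmits to the storage apparatus the restore information identifying the content to be restored,
and wherein the storage apparatus comprises
a second restore processing unit that identifies the content to be restored from the restore information received from the backup apparatus, identifies the unit data constituting the identified content using the second duplication determination information, restores the content from the identified unit data, and transmits the content to the backup apparatus.
- The storage system according to claim 1,
wherein a plurality of the storage apparatuses are communicably connected to the backup apparatus, and the second backup processing unit of each storage apparatus receives, from the first backup processing unit of the backup apparatus, a determination result based on the first duplication determination information; when the determination result indicates that the content to be backed up is not stored in the storage apparatus, further determines, using the second duplication determination information, whether the content is stored in the storage apparatus; and, when it determines that the content is stored, transmits the second duplication determination information to the backup apparatus.
- The storage system according to claim 1,
wherein a plurality of the storage apparatuses, which have neither the second duplication determination information nor the second backup processing unit, and at least one management apparatus are communicably connected to the backup apparatus,
the management apparatus comprises:
the second duplication determination information relating to each of the storage apparatuses; and
the second backup processing unit,
and the second backup processing unit of the management apparatus receives, from the first backup processing unit of the backup apparatus, a determination result based on the first duplication determination information; when the determination result indicates that the content to be backed up is not stored in each of the storage apparatuses, further determines, using the second duplication determination information, whether the content is stored in each of the storage apparatuses; and, when it determines that the content is stored in any one of the storage apparatuses, transmits the second duplication determination information to the backup apparatus.
- The storage system according to claim 1,
wherein at least the backup apparatus comprises a network monitoring unit that monitors traffic on a communication network communicably connecting the backup apparatus and the storage apparatus, and, when the first backup processing unit is about to transmit the content to be backed up to the storage apparatus and the network monitoring unit determines that the network load of the communication network is equal to or greater than a predetermined threshold, only the unique information for identifying the content is transmitted to the storage apparatus and the data of the content is not transmitted.
- The storage system according to claim 2,
wherein the backup apparatus comprises a unit data storage area, which is a storage area capable of storing a predetermined number of pairs of the unit data and the unit data unique information,
and the first backup processing unit of the backup apparatus transmits the pairs of the unit data and the unit data unique information to the storage apparatus when it determines that the number of the pairs has reached the predetermined number.
- The storage system according to claim 2,
wherein the first backup processing unit of the backup apparatus determines, using the first duplication determination information, whether the unit data is stored in the storage apparatus; when it determines that the unit data is not stored, calculates, in accordance with a preset upper limit data size, the number of pieces of the unit data to be transmitted to the storage apparatus from among the unit data constituting the content to be backed up; and transmits to the storage apparatus the calculated number of consecutive pieces of the unit data, including the unit data determined not to be stored in the storage apparatus.
- The storage system according to claim 1, wherein, when the first backup processing unit of the backup apparatus determines that the content to be backed up is archive data, which is a set of a plurality of files delimited from one another by division information that the first backup processing unit can identify, first, for the unit data at the head of each of the files in the content, when the first backup processing unit determines that the unit data is not stored in the storage apparatus and the second backup processing unit determines that the unit data is stored in the storage apparatus, the second backup processing unit transmits the second duplication determination information to the backup apparatus, and the first backup processing unit of the backup apparatus executes processing to incorporate the received second duplication determination information into the first duplication determination information; and then the same processing is executed sequentially for the subsequent unit data in each of the files.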
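- A method of controlling a storage system that stores data from an external device in units of content,
wherein the storage system comprises a backup apparatus that executes backup processing for creating backup data in units of the content from the data from the external device, and a storage apparatus that is communicably connected to the backup apparatus and stores the backup data received from the backup apparatus,
the backup apparatus
determines whether the content constituting the backup data has already been stored in the storage apparatus, and
determines, using the first duplication determination information, whether the content has already been stored in the storage apparatus,
the storage apparatus comprises:
second duplication determination information, which is information for determining whether the content constituting the backup data has already been stored in the storage apparatus; and
a second backup processing unit that uses the second duplication determination information to determine whether the content has already been stored in the storage apparatus,
and, for the content as the backup data, when the first backup processing unit determines that the content is not stored in the storage apparatus and the second backup processing unit determines that the content is stored in the storage apparatus, the second backup processing unit transmits the second duplication determination information to the backup apparatus, and the first backup processing unit of the backup apparatus incorporates the received second duplication determination information into the first duplication determination information.
- The method of controlling a storage system according to claim 10,
wherein the first duplication determination information and the second duplication determination information store, in association with each other, unit data obtained by dividing the content to be backed up into a plurality of pieces of a predetermined size, and unit data unique information, which is unique information obtained for each piece of the unit data.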
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2012/081566 WO2014087508A1 (ja) | 2012-12-05 | 2012-12-05 | Storage system and method of controlling storage system |
US14/425,675 US9952936B2 (en) | 2012-12-05 | 2012-12-05 | Storage system and method of controlling storage system |
JP2014550853A JP5774794B2 (ja) | 2012-12-05 | 2012-12-05 | Storage system and method of controlling storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2012/081566 WO2014087508A1 (ja) | 2012-12-05 | 2012-12-05 | Storage system and method of controlling storage system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014087508A1 true WO2014087508A1 (ja) | 2014-06-12 |
Family
ID=50882958
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/081566 WO2014087508A1 (ja) | Storage system and method of controlling storage system | 2012-12-05 | 2012-12-05 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9952936B2 (ja) |
JP (1) | JP5774794B2 (ja) |
WO (1) | WO2014087508A1 (ja) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015198371A1 (ja) * | 2014-06-23 | 2015-12-30 | Hitachi, Ltd. | Storage system and storage control method |
JP2016134133A (ja) * | 2015-01-22 | 2016-07-25 | NEC Corporation | Storage system |
WO2016181479A1 (ja) * | 2015-05-12 | 2016-11-17 | Hitachi, Ltd. | Storage system and storage control method |
JP2017204706A (ja) * | 2016-05-10 | 2017-11-16 | Nippon Telegraph and Telephone Corporation | Content distribution system, content distribution method, content generation device, and content generation program |
JP6337982B1 (ja) * | 2017-03-22 | 2018-06-06 | NEC Corporation | Storage system |
JP2018527681A (ja) * | 2015-09-18 | 2018-09-20 | Alibaba Group Holding Limited | Data deduplication using a solid-state drive controller |
JP2019095984A (ja) * | 2017-11-21 | 2019-06-20 | Keyence Corporation | Image processing system |
JP2019159785A (ja) * | 2018-03-13 | 2019-09-19 | NEC Solution Innovators, Ltd. | Backup server, backup method, program, and storage system |
JP2019160245A (ja) * | 2018-03-16 | 2019-09-19 | NEC Solution Innovators, Ltd. | Storage system, storage control device, storage control method, and storage control program |
JP2020047114A (ja) * | 2018-09-20 | 2020-03-26 | Fuji Xerox Co., Ltd. | Data processing device, data processing method, and data processing program |
KR102471662B1 (ko) * | 2022-04-04 | 2022-11-28 | 김상준 | Integrated management server for a sewage treatment plant and system including the same |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9811423B2 (en) * | 2013-01-11 | 2017-11-07 | Commvault Systems, Inc. | Partial file restore in a data storage system |
US9760444B2 (en) * | 2013-01-11 | 2017-09-12 | Commvault Systems, Inc. | Sharing of secondary storage data |
US9483489B2 (en) | 2013-01-14 | 2016-11-01 | Commvault Systems, Inc. | Partial sharing of secondary storage files in a data storage system |
US10437784B2 (en) * | 2015-01-30 | 2019-10-08 | SK Hynix Inc. | Method and system for endurance enhancing, deferred deduplication with hardware-hash-enabled storage device |
US10110660B2 (en) * | 2015-04-20 | 2018-10-23 | Cisco Technology, Inc. | Instant file upload to a collaboration service by querying file storage systems that are both internal and external to the collaboration service |
US10705750B2 (en) | 2016-06-09 | 2020-07-07 | Informatique Holistec Inc. | Data storage system and method for performing same |
US20170364581A1 (en) * | 2016-06-16 | 2017-12-21 | Vmware, Inc. | Methods and systems to evaluate importance of performance metrics in data center |
US20190129802A1 (en) * | 2017-11-02 | 2019-05-02 | EMC IP Holding Company LLC | Backup within a file system using a persistent cache layer to tier data to cloud storage |
US10216580B1 (en) * | 2018-03-29 | 2019-02-26 | Model9 Software Ltd. | System and method for mainframe computers backup and restore on object storage systems |
JP6884128B2 (ja) | 2018-09-20 | 2021-06-09 | Hitachi, Ltd. | Data deduplication device, data deduplication method, and data deduplication program |
US11269536B2 (en) | 2019-09-27 | 2022-03-08 | Open Text Holdings, Inc. | Method and system for efficient content transfer to distributed stores |
US11082495B1 (en) * | 2020-04-07 | 2021-08-03 | Open Text Holdings, Inc. | Method and system for efficient content transfer to a server |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009116839A (ja) * | 2007-10-19 | 2009-05-28 | Hitachi Ltd | Content transfer system, method therefor, and home server |
JP2012093827A (ja) * | 2010-10-25 | 2012-05-17 | Internatl Business Mach Corp <Ibm> | Apparatus and method for eliminating file duplication |
JP2012150792A (ja) * | 2011-01-14 | 2012-08-09 | Symantec Corp | System and method for improving the scalability of a deduplication storage system |
JP2012529684A (ja) * | 2009-06-08 | 2012-11-22 | Symantec Corporation | Source classification for performing deduplication in a backup operation |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996025801A1 (en) * | 1995-02-17 | 1996-08-22 | Trustus Pty. Ltd. | Method for partitioning a block of data into subblocks and for storing and communicating such subblocks |
JP4391265B2 (ja) * | 2004-02-26 | 2009-12-24 | Hitachi, Ltd. | Storage subsystem and performance tuning method |
JP4546387B2 (ja) * | 2005-11-17 | 2010-09-15 | Fujitsu Limited | Backup system, method, and program |
CN101170553B (zh) * | 2006-10-24 | 2011-07-20 | Huawei Technologies Co., Ltd. | Method and apparatus for implementing disaster recovery for an Internet Protocol multimedia subsystem |
JP5014821B2 (ja) * | 2007-02-06 | 2012-08-29 | Hitachi, Ltd. | Storage system and control method thereof |
JP2008276596A (ja) * | 2007-05-01 | 2008-11-13 | Hitachi Ltd | Method and computer for determining a storage device |
US8819205B2 (en) * | 2007-10-19 | 2014-08-26 | Hitachi, Ltd. | Content transfer system, content transfer method and home server |
JP5224240B2 (ja) * | 2008-03-25 | 2013-07-03 | Hitachi, Ltd. | Computer system and management computer |
US7894334B2 (en) * | 2008-08-15 | 2011-02-22 | Telefonaktiebolaget L M Ericsson | Hierarchical redundancy for a distributed control plane |
US8499191B2 (en) * | 2010-12-17 | 2013-07-30 | Hitachi, Ltd. | Failure recovery method for information processing service and virtual machine image generation apparatus |
WO2012101674A1 (en) | 2011-01-26 | 2012-08-02 | Hitachi, Ltd. | Computer system and data de-duplication method |
US20120260051A1 (en) * | 2011-03-01 | 2012-10-11 | Hitachi, Ltd. | Computer system, management system and data management method |
US8713577B2 (en) * | 2011-06-03 | 2014-04-29 | Hitachi, Ltd. | Storage apparatus and storage apparatus management method performing data I/O processing using a plurality of microprocessors |
WO2014068617A1 (en) * | 2012-10-31 | 2014-05-08 | Hitachi, Ltd. | Storage apparatus and method for controlling storage apparatus |
-
2012
- 2012-12-05 WO PCT/JP2012/081566 patent/WO2014087508A1/ja active Application Filing
- 2012-12-05 US US14/425,675 patent/US9952936B2/en active Active
- 2012-12-05 JP JP2014550853A patent/JP5774794B2/ja not_active Expired - Fee Related
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9703497B2 (en) | 2014-06-23 | 2017-07-11 | Hitachi, Ltd. | Storage system and storage control method |
WO2015198371A1 (ja) * | 2014-06-23 | 2015-12-30 | Hitachi, Ltd. | Storage system and storage control method |
JP2016134133A (ja) * | 2015-01-22 | 2016-07-25 | NEC Corporation | Storage system |
WO2016181479A1 (ja) * | 2015-05-12 | 2016-11-17 | Hitachi, Ltd. | Storage system and storage control method |
US10678434B2 (en) | 2015-05-12 | 2020-06-09 | Hitachi, Ltd. | Storage system and storage control method for improving a deduplication process |
JP2018527681A (ja) * | 2015-09-18 | 2018-09-20 | Alibaba Group Holding Limited | Data deduplication using a solid-state drive controller |
JP2017204706A (ja) * | 2016-05-10 | 2017-11-16 | Nippon Telegraph and Telephone Corporation | Content distribution system, content distribution method, content generation device, and content generation program |
JP2018159999A (ja) * | 2017-03-22 | 2018-10-11 | NEC Corporation | Storage system |
JP6337982B1 (ja) * | 2017-03-22 | 2018-06-06 | NEC Corporation | Storage system |
JP2019095984A (ja) * | 2017-11-21 | 2019-06-20 | Keyence Corporation | Image processing system |
JP2019159785A (ja) * | 2018-03-13 | 2019-09-19 | NEC Solution Innovators, Ltd. | Backup server, backup method, program, and storage system |
JP7075077B2 (ja) | 2018-03-13 | 2022-05-25 | NEC Solution Innovators, Ltd. | Backup server, backup method, program, and storage system |
JP2019160245A (ja) * | 2018-03-16 | 2019-09-19 | NEC Solution Innovators, Ltd. | Storage system, storage control device, storage control method, and storage control program |
JP7099690B2 (ja) | 2018-03-16 | 2022-07-12 | NEC Solution Innovators, Ltd. | Storage system, storage control device, storage control method, and storage control program |
JP2020047114A (ja) * | 2018-09-20 | 2020-03-26 | Fuji Xerox Co., Ltd. | Data processing device, data processing method, and data processing program |
KR102471662B1 (ko) * | 2022-04-04 | 2022-11-28 | 김상준 | Integrated management server for a sewage treatment plant and system including the same |
Also Published As
Publication number | Publication date |
---|---|
JPWO2014087508A1 (ja) | 2017-01-05 |
US9952936B2 (en) | 2018-04-24 |
JP5774794B2 (ja) | 2015-09-09 |
US20150212900A1 (en) | 2015-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5774794B2 (ja) | Storage system and method of controlling storage system | |
JP5434705B2 (ja) | Storage device, storage device control program, and storage device control method | |
US10162555B2 (en) | Deduplicating snapshots associated with a backup operation | |
US20200167238A1 (en) | Snapshot format for object-based storage | |
US10275397B2 (en) | Deduplication storage system with efficient reference updating and space reclamation | |
EP2997497B1 (en) | Selecting a store for deduplicated data | |
KR20170054299A (ko) | Technique for aggregating reference blocks into reference sets for deduplication in memory management | |
US20080270436A1 (en) | Storing chunks within a file system | |
JP5313600B2 (ja) | Storage system and method of operating storage system | |
US20130046944A1 (en) | Storage apparatus and additional data writing method | |
US20130212070A1 (en) | Management apparatus and management method for hierarchical storage system | |
WO2014185914A1 (en) | Deduplicated data storage system having distributed manifest | |
US8806062B1 (en) | Adaptive compression using a sampling based heuristic | |
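JP5650982B2 (ja) | Apparatus and method for eliminating file duplication | |
CN108415986B (zh) | Data processing method, apparatus, system, medium, and computing device | |
CN105493080B (zh) | Method and apparatus for context-aware data deduplication | |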
US9594643B2 (en) | Handling restores in an incremental backup storage system | |
JP2000200208A (ja) | File backup method, apparatus, and program recording medium therefor | |
EP2997474B1 (en) | Reporting degraded state of data retrieved for distributed object | |
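JP5621229B2 (ja) | Storage system, management method, and program | |
JP2018185562A (ja) | Control program, control method, and information processing device | |
JP7007565B2 (ja) | Information processing device and information processing program | |
JP5494817B2 (ja) | Storage system, data management device, method, and program | |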
US11989124B2 (en) | Garbage collection for a deduplicated cloud tier with encrypted segments | |
US10877945B1 (en) | Optimized block storage for change block tracking systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 12889461; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2014550853; Country of ref document: JP; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 14425675; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 12889461; Country of ref document: EP; Kind code of ref document: A1 |