WO2018092288A1 - Storage apparatus and control method therefor - Google Patents

Publication number
WO2018092288A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
copy
original data
logical volume
volume
Prior art date
Application number
PCT/JP2016/084371
Other languages
English (en)
Japanese (ja)
Inventor
伊織 米川
啓 池田
竹内 久治
Original Assignee
株式会社日立製作所
Application filed by 株式会社日立製作所
Priority to PCT/JP2016/084371
Publication of WO2018092288A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures

Definitions

  • The present invention relates to a storage apparatus and a control method therefor, and is suitable for application to, for example, a storage apparatus equipped with a deduplication function.
  • A deduplication function is widely used as a function of a storage apparatus for satisfying such a request (see, for example, Patent Document 1 and Patent Document 2).
  • When a plurality of pieces of data with the same content exist in a storage device, the deduplication function leaves only one of them in the storage device and deletes all the remaining data.
  • Hereinafter, the processing that the storage apparatus executes based on the deduplication function, namely leaving only one piece of data with the same content in the storage device and deleting all the remaining data, is called deduplication processing.
  • The data left in the storage device of the storage apparatus by this deduplication processing is called original data.
  • Patent Document 1 proposes that, when deduplicating data, in order to avoid further concentrating load on a high-load volume, files that are stored redundantly across a plurality of volumes are determined to be aggregation target files, the plurality of volumes storing the aggregation target files are identified, one or more of the identified volumes are selected as aggregation volumes based on the load of the identified volumes, and the aggregation target files stored in the unselected volumes are deleted.
  • Patent Document 2 discloses, in order to suppress performance degradation of a storage apparatus equipped with a deduplication function, a configuration including a duplication determination unit that determines whether storage target data is already stored in a storage device, a storage destination determination unit that determines a storage destination of non-duplicate data, that is, storage target data that is not duplicated, and a data storage control unit that stores the non-duplicate data in the storage device determined as the storage destination. The storage destination determination unit determines, according to a predetermined criterion, the storage location of duplicate data judged to be related to the non-duplicate data, and determines the storage location of the non-duplicate data based on that result.
  • When a volume that stores the original data of deduplicated data is deleted, or when the original data is updated, the original data must first be moved to another volume for the sake of the other files or the like that contain the deduplicated data.
  • The present invention has been made in consideration of the above points, and proposes a storage apparatus and a control method therefor that can reduce the cost of migrating the original data of deduplicated data.
  • In the storage apparatus of the present invention, a logical volume is provided as a storage area to a higher-level device, and deduplication processing that leaves one of the pieces of data having the same content as original data is executed on the data stored in the logical volume.
  • The storage apparatus includes a management unit that manages, in units of logical volumes, copy attribute information indicating whether the logical volume has been a copy source, and a deduplication processing execution unit that executes the deduplication processing, based on the copy attribute information, so as to leave the original data in a logical volume that has been a copy source.
  • In the control method of the present invention, a logical volume is provided to a host device as a storage area, and one of the pieces of data having the same content is left as original data for the data stored in the logical volume.
  • In this method, the storage apparatus manages, in units of logical volumes, copy attribute information indicating whether the logical volume has been a copy source.
  • According to the present invention, it is possible to reduce the resource consumption caused by moving the original data of deduplicated data, and thus to reduce the cost of moving the original data.
  • Reference numeral 1 denotes the storage apparatus according to this embodiment as a whole.
  • The storage apparatus 1 includes a channel adapter package 3, a microprocessor board 4, and a cache memory package 5, which together form a storage controller 2, and a hard disk unit 6 that provides a storage area to the storage controller 2.
  • The channel adapter package 3 includes one or a plurality of channel adapters (not shown). Each channel adapter is an interface that performs protocol control during communication with the host device 8 via the network 7, and includes a port. A unique WWN (World Wide Name) for identifying the port on the network 7 is assigned to the port.
  • The microprocessor board 4 is a board on which a CPU 11, having one or a plurality of microprocessors 10 each composed of a CPU (Central Processing Unit) core, and a processor memory 12 composed of a semiconductor memory are mounted.
  • Each microprocessor 10 of the CPU 11 has a local memory 10A.
  • The local memory 10A stores a microprogram, which is a program with which the microprocessor 10 executes various processes, and virtual volume information 23 described later, both loaded from a shared memory area 22, described later, of the cache memory package 5.
  • The processor memory 12 is a memory shared by the microprocessors 10. Page update frequency information 13 and an address conversion table 14, both described later, are stored and held in the processor memory 12.
  • The cache memory package 5 includes a plurality of DIMMs (Dual In-line Memory Modules) 20.
  • Each DIMM 20 is a memory module in which a plurality of semiconductor memories such as DRAM (Dynamic Random Access Memory) are mounted on a printed circuit board.
  • Part of the storage area provided by the semiconductor memories constituting these DIMMs 20 is used as a cache memory area 21 that temporarily holds data to be read from or written to the storage devices 30, described later, that constitute the hard disk unit 6. The remaining area is used as a shared memory area 22 for storing control information and the like shared by the microprocessors 10 of the CPU 11. Virtual volume information 23 and local copy pair information 24, both described later, are stored and held in this shared memory area 22.
  • The hard disk unit 6 includes a plurality of storage devices 30.
  • Each storage device 30 is an expensive, high-performance disk device such as an FC (Fibre Channel) disk or a SAS (Serial Attached SCSI) disk, an inexpensive, low-performance disk device such as a SATA (Serial AT Attachment) disk, an SSD (Solid State Drive), or the like.
  • FIG. 2 shows the logical configuration of the storage apparatus 1.
  • In the storage apparatus 1, one or more storage devices 30 constituting the hard disk unit 6 are managed as a RAID (Redundant Arrays of Inexpensive Disks) group RG, and the storage area provided by the storage devices 30 constituting one or more RAID groups RG is managed as a pool PL.
  • The storage area in the pool PL is managed in units of a partial area having a predetermined size (for example, 42 MB).
  • Hereinafter, this partial area is referred to as a “page” or “physical page”.
  • Each pool PL is associated with one or a plurality of virtual logical volumes (hereinafter referred to as “virtual volumes”) VVOL formed using Thin Provisioning technology, and these virtual volumes VVOL are provided to the host device 8 as storage areas for reading and writing data.
  • Hereinafter, this virtual volume VVOL (the storage space provided to the host device 8) may be referred to as the “overwrite space”.
  • A unique identifier (hereinafter referred to as “LUN (Logical Unit Number)”) is assigned to each virtual volume VVOL.
  • The storage area of the virtual volume VVOL is managed in units of a partial area called a logical block having a predetermined size (for example, 512 bytes).
  • Each logical block is given a unique identifier (hereinafter referred to as “LBA (Logical Block Address)”).
  • The storage area of the virtual volume VVOL is also managed by being divided into partial areas of the same size as the physical page, each composed of a plurality of logical blocks.
  • Hereinafter, this partial area is referred to as a “virtual page”.
  • The host device 8 reads data from or writes data to a virtual volume VVOL by issuing to the storage apparatus 1 a read request or a write request that designates the LUN of the virtual volume VVOL, the LBA of the first logical block of the area in the virtual volume VVOL where the data is to be read or written, and the data length of the data.
  • The microprocessor 10 with the lowest load at that time in the CPU 11 of the storage controller 2 is put in charge of processing the read request or write request.
  • If the request given from the host device 8 is a write request and no physical page has been allocated to the virtual page to which the data specified in the write request is to be written, the assigned microprocessor 10 allocates an unused physical page to that virtual page from the pool PL associated with the virtual volume VVOL. The microprocessor 10 then writes the data from the host device 8 to the physical page allocated to the virtual page.
  • If the request given from the host device 8 is a read request or a write request and a physical page has already been allocated to the area designated as the read or write destination, the microprocessor 10 reads the data from the physical page and transfers it to the host device 8 that issued the request (in the case of a read request), or writes the data given from the host device 8 to the physical page (in the case of a write request).
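  • The allocate-on-first-write behaviour described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `Pool` and `VirtualVolume`, the per-offset dictionary used as page contents, and the assumption that 42 MB means 42 × 2^20 bytes are all hypothetical.

```python
PAGE_SIZE = 42 * 1024 * 1024  # example page size from the text (assumed binary MB)


class Pool:
    """Hypothetical pool PL that hands out unused physical page numbers."""

    def __init__(self, page_count):
        self.free_pages = list(range(page_count))

    def allocate(self):
        if not self.free_pages:
            raise RuntimeError("pool has no unused physical pages")
        return self.free_pages.pop(0)


class VirtualVolume:
    """Hypothetical thin-provisioned virtual volume VVOL backed by a pool."""

    def __init__(self, pool):
        self.pool = pool
        self.page_map = {}  # virtual page number -> physical page number
        self.pages = {}     # physical page number -> {offset in page: data}

    def write(self, offset, data):
        vpage, in_page = divmod(offset, PAGE_SIZE)
        if vpage not in self.page_map:
            # First write to this virtual page: allocate an unused physical page.
            self.page_map[vpage] = self.pool.allocate()
        self.pages.setdefault(self.page_map[vpage], {})[in_page] = data

    def read(self, offset):
        vpage, in_page = divmod(offset, PAGE_SIZE)
        if vpage not in self.page_map:
            return None  # nothing was ever written, so no page is allocated
        return self.pages[self.page_map[vpage]].get(in_page)
```

  • For simplicity the sketch stores each write as a whole under its starting offset and does not handle writes that span page boundaries.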
  • In the storage apparatus 1, the user can configure a virtual volume VVOL so that data deduplication is applied to it.
  • Hereinafter, a virtual volume VVOL configured in this way is referred to as a deduplication-compatible volume.
  • The area in a virtual page is managed by being divided, in order from the top of the virtual page, into partial areas called “chunks” of a predetermined size (for example, 8 KB) that is an integral multiple of the logical block size.
  • Each chunk is given a unique address (hereinafter referred to as LA (Logical Address)).
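  • As a worked example of how the three granularities relate, the following sketch maps an LBA to its virtual page, chunk, and block position using the example sizes given in the text (512-byte blocks, 8 KB chunks, 42 MB pages). The function name `locate` and the binary interpretation of KB and MB are assumptions made for illustration.

```python
BLOCK_SIZE = 512                 # logical block size (example from the text)
CHUNK_SIZE = 8 * 1024            # chunk size (example from the text)
PAGE_SIZE = 42 * 1024 * 1024     # page size (example from the text)

BLOCKS_PER_CHUNK = CHUNK_SIZE // BLOCK_SIZE   # 16 logical blocks per chunk
CHUNKS_PER_PAGE = PAGE_SIZE // CHUNK_SIZE     # 5376 chunks per page


def locate(lba):
    """Return (virtual page index, chunk index in page, block index in chunk)."""
    chunk_index, block_in_chunk = divmod(lba, BLOCKS_PER_CHUNK)
    page_index, chunk_in_page = divmod(chunk_index, CHUNKS_PER_PAGE)
    return page_index, chunk_in_page, block_in_chunk
```

  • With these sizes, LBA 17 falls in the second chunk of the first virtual page, and LBA 16 × 5376 is the first block of the second virtual page.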
  • For each deduplication-compatible volume, the microprocessor 10 with the lowest load at that time in the CPU 11 executes deduplication processing asynchronously with the I/O processing for read requests and write requests from the host device 8. At a predetermined period (for example, every 50 msec), it determines in units of chunks whether data have the same content and, for chunks with the same content, leaves the data of only one chunk and deletes the data of the other chunks.
  • In doing so, the microprocessor 10 calculates a check code, that is, a small feature value (for example, about 8 bytes) computed from the data to be compared, such as a hash value calculated using a hash function, and uses the calculated check codes to determine duplication between chunks.
  • Hereinafter, the check code generated from the data of one chunk is referred to as “FPK (FingerPrint Key)”.
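  • A minimal sketch of an 8-byte check code follows. The text does not name the hash function, so truncating SHA-256 to 8 bytes is purely an assumption here, as are the function names `fpk` and `candidate_duplicates`.

```python
import hashlib


def fpk(chunk: bytes) -> bytes:
    """Return an 8-byte fingerprint of a chunk (SHA-256 truncation is assumed)."""
    return hashlib.sha256(chunk).digest()[:8]


def candidate_duplicates(chunks):
    """Group chunk addresses (LA) by FPK and keep groups with more than one LA.

    `chunks` maps an LA to the chunk's data. Matching FPKs only make chunks
    candidates; a byte-for-byte comparison is still needed to confirm them.
    """
    groups = {}
    for la, data in chunks.items():
        groups.setdefault(fpk(data), []).append(la)
    return {key: las for key, las in groups.items() if len(las) > 1}
```

  • Comparing 8-byte keys first keeps the periodic scan cheap, and the full comparison is then needed only for the few chunks whose keys match.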
  • When the microprocessor 10 detects duplication of certain data for the first time as a result of the duplication determination, it leaves the data of only one of the chunks with the same content as the original data and deletes the data of the other chunks with the same content. At this time, the microprocessor 10 compresses the data to be left as the original data using a reversible compression algorithm such as the LZW algorithm. For the data to be deleted, it registers the LA of the chunk in which the data was stored, in association with the FPK, in a table stored in a dedicated virtual volume VVOL, and manages it there. Hereinafter, this table is referred to as “FPT (FingerPrint Table)” 31 (FIG. 2).
  • The compressed data of the original data generated by the compression processing is stored in a location different from the physical page in which the uncompressed data was stored (hereinafter referred to as the “write-once space”).
  • The write-once space is not a storage space accessible by the host device 8, but a storage space (virtual volume VVOL) that can be used only by the storage controller 2.
  • The storage controller 2 uses the write-once space to store the compressed data in the storage devices 30.
  • The compressed data is stored in the write-once space by appending.
  • The correspondence between the LA of the original data stored in the FPT 31 and the address in the write-once space where the compressed data of the original data is stored is managed in a table (hereinafter referred to as the address conversion table) 14 (FIG. 1).
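  • The append-only storage and the LA-to-address mapping can be sketched as below. This is an illustration under stated assumptions: `zlib` (DEFLATE) stands in for the LZW algorithm mentioned in the text, the class name `WriteOnceSpace` is hypothetical, and recording the compressed length next to the address is an addition needed only to make the sketch readable back.

```python
import zlib


class WriteOnceSpace:
    """Hypothetical append-only space plus an address conversion table."""

    def __init__(self):
        self.log = bytearray()    # compressed data, only ever appended to
        self.address_table = {}   # LA -> (PA in the log, compressed length)

    def store(self, la, data):
        compressed = zlib.compress(data)   # DEFLATE used in place of LZW
        pa = len(self.log)                 # next append position
        self.log += compressed
        self.address_table[la] = (pa, len(compressed))
        return pa

    def load(self, la):
        pa, length = self.address_table[la]
        return zlib.decompress(bytes(self.log[pa:pa + length]))
```

  • Because data is only appended, existing entries are never overwritten in place, and the table is the single place that says where the current compressed copy of a chunk lives.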
  • The storage apparatus 1 may again allocate a physical page to the virtual page in the overwrite space, decompress the data that was moved to the write-once space, write the decompressed data back to the physical page allocated to the virtual page, and update (overwrite) the data on that physical page.
  • In addition to the deduplication function for executing deduplication processing as described above, the storage apparatus 1 of this embodiment is equipped, as part of the deduplication function, with a function that determines the placement of the original data, and the destination virtual volume VVOL when the original data later has to be migrated, to be the virtual volume VVOL estimated to carry the least risk of the original data being migrated.
  • When the storage apparatus 1 executes deduplication processing on certain data for the first time, and the data belongs to a use case in which master data is copied to generate a plurality of pieces of data with the same content, as shown for example in FIG. 3 (hereinafter referred to as the first use case), the storage apparatus 1 executes the deduplication processing so as to leave the master image data (hereinafter referred to as master data) as the original data and delete the remaining generated data (hereinafter referred to as copy data).
  • An example of the first use case is VDI (Virtual Desktop Infrastructure).
  • In the first use case, the original data is concentrated on the virtual volume storing the master data (hereinafter referred to as the master volume) and the copy data is deduplicated. As a result, movement of the original data is unlikely to occur, and the processing time when a virtual volume storing copy data is deleted is expected to be shortened.
  • Whether the data to be deduplicated belongs to the first use case can be determined from the issuance information of the XCOPY command, which is a data replication command. That is, in the first use case, the virtual volume VVOL designated as the copy source in the XCOPY command can be judged to be the master volume.
  • Therefore, by attaching the information “XCOPY copy source” to a virtual volume VVOL that has been a copy source, the master volume can be identified easily during deduplication processing.
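  • The placement rule for the first use case can be illustrated as follows. The function name `choose_original`, the tuple representation of a chunk location, and the fallback to the first location when no volume carries the attribute are assumptions made for this sketch.

```python
def choose_original(duplicates, xcopy_sources):
    """Pick which duplicate chunk to keep as the original data.

    duplicates:    list of (volume number, LA) holding identical data
    xcopy_sources: set of volume numbers flagged "XCOPY copy source"
    """
    for volume, la in duplicates:
        if volume in xcopy_sources:
            return (volume, la)   # keep the copy that lives on a master volume
    return duplicates[0]          # assumed fallback: no master volume involved
```

  • For example, with duplicates on volumes 3, 1, and 4 and volume 1 flagged as an XCOPY copy source, the chunk on volume 1 is kept and the other two are deduplicated.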
  • When the storage apparatus 1 can determine that the deduplicated data is backup data obtained by regular backup (hereinafter, this use case is referred to as the second use case), and the original data has to be moved to another backup volume because the original data is updated or the virtual volume storing it is deleted, the storage apparatus 1 moves the original data to the most recently updated backup volume among the candidate destination backup volumes.
  • For example, suppose that a backup volume (virtual volume VVOL) is prepared for each day of the week, and that certain data (hereinafter referred to as backup target data) is backed up to these backup volumes daily.
  • The backup destination of the backup target data on a given day is the backup volume corresponding to that day of the week.
  • An example is shown in which deduplication processing is performed on the data stored in the total of seven backup volumes for one week, the data “A” stored in the backup volume for Wednesday is left as the original data, and the data with the same content (data “A”) in the backup volumes for the other days of the week is deduplicated.
  • Suppose that the backup target data is then updated on Wednesday.
  • When the update changes the data “A” to the data “B”, the data “B” is backed up over the data “A” stored in the Wednesday backup volume, so the original data (data “A”) has to be moved to another backup volume.
  • In this case, the appropriate destination is not the Thursday backup volume, from which the original data would have to be moved again by the update on the following day, but the Tuesday backup volume, whose next update lies furthest in the future.
  • In other words, it is preferable that the migration destination of the original data be the backup volume that was updated last.
  • The backup operation is often realized by the local copy function. Therefore, the migration destination of the original data in the second use case can be determined using the pair information (latest operation time) of the local copy function.
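  • The selection rule for the second use case reduces to taking the candidate with the latest pair operation time. The sketch below assumes a hypothetical mapping from volume name to last pair operation time; the function name and the sample dates are illustrative only.

```python
from datetime import datetime


def choose_backup_destination(candidates, pair_operation_time):
    """Return the candidate backup volume whose last pair operation is newest."""
    return max(candidates, key=lambda volume: pair_operation_time[volume])


# Hypothetical state just before the Wednesday backup overwrites data "A".
last_operation = {
    "Thu": datetime(2016, 11, 10),
    "Fri": datetime(2016, 11, 11),
    "Sat": datetime(2016, 11, 12),
    "Sun": datetime(2016, 11, 13),
    "Mon": datetime(2016, 11, 14),
    "Tue": datetime(2016, 11, 15),
}
destination = choose_backup_destination(list(last_operation), last_operation)
```

  • Here `destination` is the Tuesday volume, which matches the reasoning in the text: the volume updated last is the one that will be overwritten latest.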
  • For a use case in which the use of the original data cannot be determined to be a periodic backup operation (hereinafter referred to as the third use case), the storage apparatus 1 moves the original data to the virtual page with the lowest update frequency among the migration destination candidates, as illustrated in the drawings.
  • Choosing the virtual page with the lowest update frequency among the candidates as the migration destination reduces the risk that the original data will have to be migrated again because of a data update.
  • This method can be used not only to determine the destination of the original data in use cases other than the second use case, but also to determine the location of the original data in the first deduplication process (first use case).
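  • A minimal sketch of this rule is shown below. It assumes the update counts are available as a `collections.Counter` keyed by LA; the function names are hypothetical, and a candidate with no recorded updates is treated as having a count of zero.

```python
from collections import Counter


def record_update(update_count, la):
    """Increment the update count of the chunk or page at address `la` by 1."""
    update_count[la] += 1


def choose_least_updated(candidates, update_count):
    """Return the candidate address with the lowest recorded update count."""
    return min(candidates, key=lambda la: update_count[la])
```

  • Ties are resolved in favour of the earliest candidate in the list, which is a property of `min` rather than anything stated in the text.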
  • The page update frequency information 13 and the address conversion table 14 are stored in the processor memory 12 of the microprocessor board 4, and the virtual volume information 23 and the local copy pair information 24 are stored in the shared memory area 22 of the cache memory package 5. Further, as described above with reference to FIG. 2, a virtual volume VVOL that can be used only by the storage controller 2 (hereinafter referred to as the FPT volume) is defined in the storage apparatus 1, and the FPT 31 is stored in this FPT volume.
  • The page update frequency information 13 has a table structure that stores, for each chunk of each virtual volume VVOL, the number of updates (update frequency) within a predetermined time (for example, several seconds to several hours).
  • Each time data written in a virtual volume VVOL is updated, the microprocessor 10 (FIG. 1) in charge of the processing increments the corresponding update frequency value in the page update frequency information 13 by 1.
  • The address conversion table 14 is used to manage the destination of each chunk when chunk data in the overwrite space is moved to the write-once space.
  • When data stored in the overwrite space is compressed and stored in the write-once space, the microprocessor 10 in charge of the processing stores in the address conversion table 14 the address in the overwrite space where the data was stored, in association with the address in the write-once space where the compressed data is stored.
  • The address conversion table 14 includes an overwrite space address column 14A and a write-once space address column 14B.
  • The overwrite space address column 14A stores the address (LA) in the overwrite space of a chunk that has been compressed and moved to the write-once space, and the write-once space address column 14B stores the destination address (PA) of the corresponding chunk in the write-once space.
  • The virtual volume information 23 is used to manage each virtual volume VVOL defined in the storage apparatus 1 and, as shown in FIG. 7, has a table structure including a volume number column 23A, a capacity column 23B, a deduplication setting column 23C, and an XCOPY attribute column 23D.
  • The volume number column 23A stores the identification numbers (volume numbers) assigned to all the virtual volumes VVOL defined in the storage apparatus 1, and the capacity column 23B stores the capacity set for the corresponding virtual volume VVOL.
  • The deduplication setting column 23C stores information indicating whether the corresponding virtual volume VVOL is set as a deduplication-compatible volume (in FIG. 7, “present” when it is set and “nothing” when it is not).
  • The XCOPY attribute column 23D stores an attribute indicating whether the corresponding virtual volume VVOL has been the copy source of a copy based on the XCOPY command (hereinafter referred to as the XCOPY attribute).
  • FIG. 7 shows an example in which the character string “XCOPY copy source” is stored when the corresponding virtual volume VVOL has been the copy source of a copy based on the XCOPY command. This information is registered by the microprocessor 10 (FIG. 1) in charge of controlling the copy processing when the virtual volume becomes an XCOPY copy source.
  • The local copy pair information 24 is used to manage each local copy pair defined in the storage apparatus 1 and, as shown in FIG. 8, has a table structure including a volume number column 24A, a pair attribute column 24B, a pair operation time column 24C, a pair number column 24D, and a partner volume number column 24E.
  • The volume number column 24A stores the volume numbers of all the virtual volumes VVOL defined in the storage apparatus 1.
  • When the corresponding virtual volume VVOL is set as a local copy pair with another virtual volume VVOL, the pair attribute column 24B stores information indicating whether it is the primary volume (primary VOL), which is the copy source of the copy pair, or the secondary volume (secondary VOL), which is the copy destination. If the corresponding virtual volume VVOL is not set as a copy pair with any virtual volume VVOL, nothing is stored in the pair attribute column 24B.
  • When the corresponding virtual volume VVOL is set as a copy pair with another virtual volume VVOL, the pair operation time column 24C stores the time at which a predetermined operation, such as formation of the copy pair or resync (resynchronization), was last performed.
  • The pair number column 24D stores the number of partner virtual volumes VVOL with which the corresponding virtual volume VVOL is set as a copy pair, and the partner volume number column 24E stores the volume numbers of all of these partner virtual volumes VVOL.
  • The FPT 31 is a table for managing the FPK of each chunk in each deduplication-compatible volume calculated during deduplication processing. As shown in FIG. 9, it consists of one column 31A for each distinct FPK calculated in the deduplication processing.
  • The uppermost row of each column 31A (hereinafter referred to as the FPK row) 31B stores the corresponding FPK value, and the rows below the FPK row 31B in each column 31A (hereinafter referred to as LA rows) 31C store the LAs of all chunks whose stored data has an FPK value matching the FPK value stored in the FPK row 31B.
  • When the original data is updated, or when the virtual volume VVOL storing the original data is deleted, the original data is moved to the LA stored in the next LA row 31C of the same column 31A. In the example of FIG. 9, therefore, when the data (original data) stored in the virtual page with LA “LA1” is updated, or when the virtual volume VVOL containing the virtual page “LA1” is deleted, the original data stored in the virtual page with LA “LA1” is moved to the virtual page with LA “LA1031”.
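  • The column structure of the FPT and the hand-over of the original data to the next LA can be sketched with a dictionary of lists. This is only an illustration: the function name, the use of a Python list for a column, removing the old LA from the column, and the extra address “LA2050” in the usage are all assumptions.

```python
def promote_next_original(fpt, fpk, old_la):
    """Drop `old_la` from the column for `fpk` and report the new original.

    fpt maps an FPK to a list of LAs; the first LA in each list is taken to be
    the chunk holding the original data. Returns the LA that now holds the
    original data, or None if the original did not move or the column is empty.
    """
    las = fpt[fpk]
    was_original = las[0] == old_la
    las.remove(old_la)
    if not las:
        del fpt[fpk]   # no chunk with this fingerprint remains
        return None
    return las[0] if was_original else None
```

  • With a column holding “LA1”, “LA1031”, and “LA2050”, calling the function for “LA1” returns “LA1031”, mirroring the FIG. 9 example in which the original data moves from LA1 to LA1031.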
  • FIG. 10 shows the procedure of the deduplication processing executed by one of the microprocessors 10 (FIG. 1) of the CPU 11 (FIG. 1) in response to an activation command given periodically (for example, every 50 msec) by a scheduler (not shown).
  • When the activation command is given by the scheduler, one of the microprocessors 10 in the CPU 11 starts the deduplication processing shown in FIG. 10 and first determines, from among the deduplication-compatible volumes defined in the storage apparatus 1, one deduplication-compatible volume (hereinafter referred to as the target volume) to be processed in step S2 and subsequent steps (S1).
  • The target volume may be determined either at random from the deduplication-compatible volumes or in a predetermined order from the deduplication-compatible volumes.
  • For example, a method is conceivable in which a predetermined prime number is added to the volume number of the deduplication-compatible volume that was the last target volume in the previous deduplication processing, or to the volume number of the previous target volume in the current deduplication processing, and the deduplication-compatible volume with the resulting volume number is used as the target volume.
  • Alternatively, a method of choosing deduplication-compatible volumes as the target volume in ascending or descending order of volume number is conceivable.
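  • The prime-stepping method can be sketched as follows. The text does not say what happens when the sum exceeds the largest volume number, so wrapping around with a modulo is an assumption, as are the function name, the default prime of 7, and the simplification that volume numbers run from 0 to `volume_count - 1` and are all deduplication-compatible.

```python
def next_target(last_volume, volume_count, prime=7):
    """Return the next target volume number by stepping forward by a prime."""
    return (last_volume + prime) % volume_count


# Visit order for 10 volumes, starting after volume 0.
order = []
volume = 0
for _ in range(10):
    volume = next_target(volume, 10)
    order.append(volume)
```

  • As long as the prime does not divide the number of volumes, this stepping visits every volume exactly once before repeating, while scattering the order so that neighbouring volume numbers are not processed back to back.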
  • Subsequently, the microprocessor 10 acquires the XCOPY attribute stored in the XCOPY attribute column 23D (FIG. 7) corresponding to the target volume in the virtual volume information 23 described above with reference to FIG. 7 (S2), and determines, based on the acquired XCOPY attribute, whether the target volume has so far been the copy source of a copy executed according to the XCOPY command (S3). If the microprocessor 10 obtains a negative result in this determination, it proceeds to step S5.
  • If the microprocessor 10 obtains a positive result in the determination at step S3, it executes deduplication on the data stored in the target volume (S4), and then determines whether the processing of steps S1 to S4 has been executed for all the deduplication-compatible volumes in the storage apparatus 1 (S5).
  • If the microprocessor 10 obtains a negative result in this determination, it returns to step S1, and thereafter repeats the processing of steps S1 to S5 while sequentially switching the target volume determined in step S1 to another unprocessed deduplication-compatible volume.
  • As a result, deduplication is executed on the data stored in each deduplication-compatible volume that has been the copy source of a copy executed according to the XCOPY command.
  • When the microprocessor 10 eventually obtains a positive result in step S5 by completing deduplication on the data stored in all the deduplication-compatible volumes that have been the copy source of a copy executed according to the XCOPY command, it determines, from among the deduplication-compatible volumes on which the deduplication execution processing has not yet been performed, one deduplication-compatible volume (target volume) to be processed in the subsequent steps (S6).
  • The target volume in step S6 may likewise be determined either at random from among the deduplication-compatible volumes not yet subjected to the deduplication execution processing, or in a predetermined order from among them.
  • Subsequently, the microprocessor 10 executes deduplication on the data stored in the target volume determined in step S6 (S7), and then determines whether deduplication has been executed on the data stored in all the deduplication-compatible volumes defined in the storage apparatus 1 (S8).
  • If the microprocessor 10 obtains a negative result in this determination, it returns to step S6, and thereafter repeats the processing of steps S6 to S8 while sequentially switching the target volume determined in step S6 to another unprocessed deduplication-compatible volume.
  • As a result, deduplication is executed on the data stored in each deduplication-compatible volume that has never been the copy source of a copy executed according to the XCOPY command.
  • When the microprocessor 10 eventually obtains a positive result in step S8 by completing deduplication on the data stored in all the deduplication-compatible volumes that have not been the copy source of a copy executed according to the XCOPY command, it ends this deduplication processing.
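  • The net effect of steps S1 to S8 is a two-pass ordering: volumes flagged as XCOPY copy sources are deduplicated first, then the remaining deduplication-compatible volumes. A minimal sketch of that ordering follows; the dictionary keys and the function name are hypothetical, and each pass simply preserves list order rather than choosing targets at random.

```python
def deduplication_order(volumes):
    """Return volume numbers in the order they would be deduplicated.

    volumes: list of dicts with keys 'number', 'dedup' (deduplication-compatible
    or not) and 'xcopy_source' (has been an XCOPY copy source or not).
    """
    targets = [v for v in volumes if v["dedup"]]
    first_pass = [v["number"] for v in targets if v["xcopy_source"]]       # S1 to S5
    second_pass = [v["number"] for v in targets if not v["xcopy_source"]]  # S6 to S8
    return first_pass + second_pass
```

  • Processing the copy sources first means their chunks are registered before any copies are examined, so the original data tends to end up on the master volumes.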
  • FIG. 11 shows the specific content of the deduplication execution processing executed by the microprocessor 10 in step S4 and step S7 of the deduplication processing.
  • When the microprocessor 10 proceeds to step S4 or step S7 of the deduplication processing, it starts the deduplication execution processing shown in FIG. 11 and first determines, from among the virtual pages in the target volume, one virtual page (hereinafter referred to as the target page) to be processed in step S11 and subsequent steps (S10).
  • The target page may be determined either at random from the virtual pages of the target volume or in a predetermined order from the virtual pages of the target volume.
  • Subsequently, the microprocessor 10 determines whether deduplication is necessary for the data of the target page determined in step S10 (hereinafter referred to as target data) (S11).
  • Specifically, the microprocessor 10 refers to the page update frequency information 13 (FIG. 1) and determines whether to deduplicate the target data based on whether the target page has been updated within a predetermined time.
  • If the microprocessor 10 obtains a negative result in this determination, it proceeds to step S21. If, on the other hand, it obtains a positive result in the determination at step S11, it calculates a hash value of the target data using a predetermined hash function as the FPK of the target data (S12).
  • Subsequently, the microprocessor 10 sequentially compares the hash value (FPK) of the target data calculated in step S12 with each FPK stored in the FPT 31 (FIG. 9) (S13), and determines whether the hash value (FPK) of the target data matches any FPK already registered in the FPT 31 (S14).
  • A negative result in this determination means that the hash value (FPK) of the target data has not yet been registered in the FPT 31.
  • In this case, the microprocessor 10 newly registers the hash value calculated in step S12 in the FPT 31 as the FPK of the target page, and stores the LA of the target page in the top LA row 31C of the column 31A (FIG. 9) corresponding to that FPK in the FPT 31 (S18).
  • The microprocessor 10 then compresses the target data, writes the resulting compressed data to the write-once space, registers in the address conversion table 14 the correspondence between the LA of the target page and the address (PA) in the write-once space where the compressed data is stored (S19), and proceeds to step S20.
  • On the other hand, obtaining a positive result in the determination in step S14 means that the hash value (FPK) of the target data is already registered in the FPT 31.
  • However, the target data and the data whose FPK registered in the FPT 31 equals the hash value of the target data are not necessarily identical. Therefore, at this time, the microprocessor 10 compares the target data with the data whose FPK, registered in the FPT 31, has the same value as the hash value of the target data (S15).
  • Specifically, the microprocessor 10 acquires from the FPT 31 the LA of the chunk storing the original data of the data whose registered FPK has the same value as the hash value of the target data (the LA registered in the top LA row 31C of the column 31A of that FPK in the FPT 31), refers to the address conversion table 14 (FIG. 6) to acquire the address in the write-once space corresponding to that LA, and reads the compressed data of the original data from that address in the write-once space. The microprocessor 10 then decompresses the read compressed data to restore the original data before compression, and compares the restored original data with the data of the target page.
  • Next, based on the comparison result of step S15, the microprocessor 10 determines whether the target data matches the original data of the data whose FPK registered in the FPT 31 has the same value as the hash value of the target data (S16).
  • If the microprocessor 10 obtains a negative result in this determination, it compresses the target data, writes the resulting compressed data to the write-once space, registers in the address conversion table 14 the correspondence between the LA of the target page and the address (PA) at which the compressed data is stored in the write-once space (S19), and then proceeds to step S20.
  • On the other hand, when the microprocessor 10 obtains a positive result in the determination at step S16, it additionally registers the LA of the target page in the last LA row 31C of the column 31A of the FPK having the same value as the hash value of the target data in the FPT 31. Thereafter, it discards the page in the overwrite space where the target data is stored (deletes the data on that page) (S20).
  • Next, the microprocessor 10 determines whether the processing of steps S11 to S20 has been executed for all virtual pages in the target volume (S21). If the microprocessor 10 obtains a negative result in this determination, it returns to step S10 and thereafter repeats the processing of steps S10 to S21 while sequentially switching the target page determined in step S10 to another unprocessed virtual page in the target volume.
  • When the microprocessor 10 eventually obtains a positive result in step S21 by completing the processing of steps S11 to S20 for all virtual pages in the target volume, it ends this deduplication execution process and returns to the deduplication process.
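  • The per-page flow of steps S12 to S19 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the specification: the FPT is modeled as a dict from FPK to a list of LAs (the first entry standing for the top row 31C), the address conversion table as a dict from LA to an index into a list that stands in for the write-once space, SHA-256 stands in for the "predetermined hash function", and zlib for the unspecified compression. Pointing a duplicate's LA at the original's address is also an assumption, inferred from the later description of step S38.

```python
import hashlib
import zlib


def fingerprint(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the "predetermined hash function".
    return hashlib.sha256(data).hexdigest()


def dedup_page(la, data, fpt, addr_table, log_space):
    """Process one target page; return True if it was deduplicated."""
    fpk = fingerprint(data)                       # S12: compute the FPK
    column = fpt.get(fpk)                         # S13/S14: look up the FPT
    if column is not None:
        original_la = column[0]                   # top row holds the original
        stored = log_space[addr_table[original_la]]
        original = zlib.decompress(stored)        # S15: restore and compare
        if original == data:                      # S16: true duplicate
            column.append(la)                     # register LA in the last row
            # Assumption: the duplicate shares the original's address.
            addr_table[la] = addr_table[original_la]
            return True
        # Hash collision: fall through and store the data on its own.
    else:
        fpt[fpk] = [la]                           # S18: new FPK, LA in top row
    log_space.append(zlib.compress(data))         # S19: append compressed data
    addr_table[la] = len(log_space) - 1
    return False
```

  • In this sketch the first page seen with a given content becomes the original, and later identical pages only add an LA to the column, which mirrors the role of the top row described above.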
  • In the deduplication process described above, deduplication-compatible volumes that have been the copy source of a copy made in accordance with the XCOPY command are processed before deduplication-compatible volumes that have never been the copy source of such a copy.
  • Consequently, in the FPT 31, the LA of a virtual page of a deduplication-compatible volume that has been the copy source of a copy executed in accordance with the XCOPY command is registered at a higher position than the LA of a virtual page of a deduplication-compatible volume that has never been the copy source of such a copy.
  • In addition, the data of the virtual page whose LA is stored in the uppermost LA row 31C of the FPT 31 is left as the original data, as described above.
  • Therefore, because the LA of the virtual page of the deduplication-compatible volume that has been the copy source of a copy performed in accordance with the XCOPY command is stored in the uppermost LA row 31C of the FPT 31, the data stored in that copy source volume is left as the original data, and the data stored in the deduplication-compatible volume that is the copy destination of the copy is deduplicated.
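  • The effect of this processing order can be shown with a small Python sketch. The `Volume` class, its `was_copy_source` flag (standing in for the per-volume copy attribute information), and the helper names are all hypothetical; each volume is assumed to hold a single page with identical content so that every LA lands in the same FPT column.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    was_copy_source: bool  # hypothetical per-volume copy attribute flag


def processing_order(volumes):
    # Stable sort: volumes that were a copy source come first,
    # and the relative order within each group is preserved.
    return sorted(volumes, key=lambda v: not v.was_copy_source)


def build_fpt_column(volumes):
    # Assumption: page 0 of every volume holds the same content, so the
    # LAs are appended to one column in processing order.
    return [(v.name, 0) for v in processing_order(volumes)]
```

  • Because the copy source is processed first, its LA occupies the top of the column and its data is the one kept as the original.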
  • FIG. 12 shows the processing procedure of the original data movement process executed by any of the microprocessors 10 of the CPU 11 when a situation arises in which original data should be moved from the current virtual volume VVOL to another virtual volume VVOL that is a migration destination candidate, due to deletion of the virtual volume VVOL or overwriting of the original data.
  • In such a situation, the microprocessor 10 moves the original data to another virtual volume VVOL that is a migration destination candidate according to the processing procedure shown in FIG. 12.
  • In practice, when such a situation arises, the microprocessor 10 starts the original data movement process shown in FIG. 12 and first acquires, from the local copy pair information 24 (FIG. 8), information about the local copy pairs in which the virtual volume VVOL storing the original data to be moved (hereinafter referred to as the original data storage volume) is the data copy source or copy destination (S30).
  • Specifically, the microprocessor 10 acquires the information of the record (row) corresponding to the original data storage volume in the local copy pair information 24, and the information of all records (rows) in which the volume number of the original data storage volume is stored in the partner volume number column 24E (FIG. 8).
  • Next, the microprocessor 10 determines whether the original data storage volume is set as the secondary volume (copy destination virtual volume VVOL) of a copy pair with another virtual volume VVOL (S31). This determination is made by checking whether the pair attribute stored in the pair attribute column 24B (FIG. 8) of the record of the original data storage volume acquired in step S30 is “secondary volume”.
  • A negative result in this determination means either that the original data storage volume is not set in a local copy pair with any virtual volume VVOL, or that the original data storage volume is set in a local copy pair with another virtual volume VVOL as the primary volume.
  • Thus, at this time, the microprocessor 10 determines that the use case of the original data is the third use case described above, and proceeds to step S35.
  • On the other hand, when the microprocessor 10 obtains a positive result in the determination at step S31, it acquires from the local copy pair information 24 the information of the volume set as the primary volume in the copy pair in which the original data storage volume is set as the secondary volume (hereinafter referred to as the specific volume) (S32). Specifically, the information of the record (row) corresponding to the specific volume is acquired from the local copy pair information 24.
  • Subsequently, the microprocessor 10 determines whether, among the migration destination candidates of the original data, there is a virtual volume VVOL (secondary volume) other than the original data storage volume that is set in a copy pair having the specific volume as the primary volume (S33). This determination is made based on whether the volume number of a virtual volume VVOL other than the original data storage volume is stored in the partner volume number column 24E (FIG. 8) of the information acquired in step S32.
  • Obtaining a positive result in this determination means that there are a plurality of copy pairs whose copy source is the specific volume, and that the original data storage volume is the secondary volume of one of those copy pairs.
  • Thus, at this time, the microprocessor 10 determines that the use case of the original data is the second use case and, referring to the local copy pair information 24 (FIG. 8), determines as the migration destination of the original data the secondary volume, other than the original data storage volume, with the latest update time among the secondary volumes of the plurality of copy pairs having the specific volume as the copy source (S34).
  • Specifically, in step S34, the microprocessor 10 compares the records in the local copy pair information 24 of the virtual volumes VVOL detected in step S33, whose volume numbers differ from that of the original data storage volume, and determines the virtual volume VVOL with the latest time stored in the operation time column 24C (FIG. 8) as the migration destination of the original data. The microprocessor 10 then proceeds to step S37.
  • On the other hand, obtaining a negative result in the determination in step S33 means that the specific volume is not set in a local copy pair with any virtual volume VVOL other than the original data storage volume.
  • Thus, at this time, the microprocessor 10 determines that the use case of the original data is the third use case, acquires from the page update frequency information 13 (FIG. 1) the update frequencies of the virtual pages that can be the migration destination of the original data in each virtual volume VVOL that can be the migration destination of the original data, and compares them (S35).
  • Here, a “virtual volume VVOL that can be the migration destination of the original data” is a deduplication-compatible volume in which data having the same content as the original data has been deduplicated, and a “virtual page that can be the migration destination of the original data” is a virtual page in that deduplication-compatible volume in which data having the same content as the original data has been deduplicated.
  • Then, the microprocessor 10 determines as the migration destination of the original data the virtual page with the lowest update frequency among the virtual pages that can be the migration destination of the original data in each virtual volume VVOL that can be the migration destination of the original data (S36).
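  • The destination selection of steps S31 to S36 can be summarized in a short Python sketch. The data shapes are assumptions made for illustration only: `pairs` maps each secondary volume number to a tuple of its primary volume number and last update time (a simplified stand-in for the local copy pair information 24), and `candidates` is a non-empty list of (volume, page, update frequency) tuples for pages where identical data was deduplicated.

```python
def choose_destination(src_volume, pairs, candidates):
    """Return ("volume", number) for the second use case,
    or ("page", (volume, page)) for the third use case."""
    candidate_volumes = {vol for vol, _page, _freq in candidates}
    if src_volume in pairs:                            # S31: source is a secondary
        primary, _ = pairs[src_volume]                 # S32: the specific volume
        siblings = {                                   # S33: other secondaries
            vol: updated
            for vol, (prim, updated) in pairs.items()
            if prim == primary
            and vol != src_volume
            and vol in candidate_volumes
        }
        if siblings:                                   # S34: latest update wins
            return ("volume", max(siblings, key=siblings.get))
    # S35/S36: otherwise pick the candidate page updated least often.
    vol, page, _freq = min(candidates, key=lambda c: c[2])
    return ("page", (vol, page))
```

  • For example, with secondaries 2, 3 and 4 all paired to primary 1, moving the original out of volume 2 selects whichever of 3 and 4 was updated most recently; a source volume with no such sibling falls back to the least frequently updated candidate page.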
  • Next, the microprocessor 10 copies the original data to the migration destination determined in step S34 or step S36 (S37). This copying is performed by newly writing the compressed data of the original data stored in the write-once space to the write-once space.
  • The microprocessor 10 then rewrites the address (PA) in the write-once space associated with the source LA of the original data in the address conversion table 14 (FIG. 6) to the address (PA) of the copy destination of the original data in the write-once space.
  • The microprocessor 10 also rewrites, in the address conversion table 14, the address (PA) in the write-once space of each piece of deduplicated data having the same FPK as the original data to the address (PA) of the copy destination of the original data from step S37 (S38).
  • Next, the microprocessor 10 deletes the LA of the virtual page in which the original data had been stored, which is held in the first LA row 31C of the column 31A of the FPK corresponding to the original data among the columns 31A of the FPT 31 (S39).
  • Further, the microprocessor 10 stores the LA of the virtual page to which the original data was moved in the first LA row 31C of that column 31A in the FPT 31 and, as necessary, shifts the other LAs stored in the column 31A so that they are packed toward the top (S40).
  • The microprocessor 10 thereafter ends this original data movement process.
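  • The table updates of steps S37 to S40 can be sketched in Python as follows. As before, the structures are hypothetical stand-ins: the FPT is a dict from FPK to a list of LAs whose first entry is the original, the address conversion table is a dict from LA to an index into a list modeling the write-once space, and the destination LA is assumed to already appear in the FPT column.

```python
def move_original(fpk, dest_la, fpt, addr_table, log_space):
    """Move the original data for `fpk` to the page identified by `dest_la`."""
    column = fpt[fpk]
    src_la = column[0]                       # current holder of the original

    # S37: copy the compressed data by appending it to the write-once space.
    log_space.append(log_space[addr_table[src_la]])
    new_pa = len(log_space) - 1

    # S38: repoint the source LA and every deduplicated LA to the new copy.
    for la in column:
        addr_table[la] = new_pa

    # S39: drop the old original's LA from the top row.
    column.remove(src_la)
    # S40: put the destination LA in the top row and keep the rest packed.
    column.remove(dest_la)
    column.insert(0, dest_la)
```

  • Using a Python list for the column makes the "pack toward the top" step implicit, since removing an element never leaves a gap.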
  • As described above, when the storage apparatus 1 of the present embodiment can determine that the data to be deduplicated was created by copying (first use case), it performs the deduplication processing so that the original data remains in the copy source virtual volume.
  • In the case where the original data of the deduplicated data exists in one of a plurality of backup destination virtual volumes VVOL (second use case), when the original data is updated or that virtual volume VVOL is deleted, the storage apparatus 1 moves the original data to the most recently updated virtual volume VVOL among the other backup destination virtual volumes VVOL. Further, when the use case of the deduplicated data is other than the second use case, the storage apparatus 1 moves the original data to the virtual page with the lowest update frequency.
  • Therefore, in the present storage apparatus 1, movement of the original data due to an update of the original data or deletion of a virtual volume VVOL is less likely to occur. As a result, the resource consumption caused by moving the original data of deduplicated data can be reduced, and the movement processing cost of the original data can be reduced.
  • In addition, in the present storage apparatus 1, the deduplication processing is executed so that the original data is left in the copy source virtual volume.
  • Accordingly, the probability that the original data remains in the deduplication-compatible volume is low. The probability that the original data needs to be moved when a deduplication-compatible volume is deleted can therefore be reduced as much as possible, and the (average) processing time required to delete a deduplication-compatible volume can be shortened.
  • In the above embodiment, the case has been described in which the following are realized by the microprocessor 10 of the CPU 11 executing a microprogram: management, in units of logical volumes, of copy attribute information indicating whether a logical volume has been a copy source; the deduplication processing execution unit, which executes the deduplication processing so as to leave the original data in a logical volume that has been a copy source, based on the copy attribute information; and the original data migration unit, which, when a situation arises in which the original data should be moved to another logical volume, determines as the migration destination of the original data the logical volume estimated to have the least risk of the original data having to be moved again, and migrates the original data to the determined logical volume.
  • However, the present invention is not limited to this, and some or all of these units, including the deduplication processing execution unit and the original data migration unit, may be configured by dedicated hardware.
  • The present invention can be widely applied to storage apparatuses equipped with a deduplication function.
  • SYMBOLS 1 ... Storage device, 2 ... Storage controller, 8 ... Host device, 10 ... Microprocessor, 11 ... CPU, 12 ... Processor memory, 13 ... Page update frequency information, 14 ... Address conversion table, 22 ... Shared memory area, 23 ... Virtual volume information, 24 ... Local copy pair information, 30 ... Storage device, 31 ... FPT, VVOL ... Virtual volume.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

[Problem] To provide a storage apparatus and a control method therefor that can reduce the movement processing cost for the original data of deduplicated data. [Solution] A storage apparatus that executes deduplication processing is configured to manage, in units of logical volumes, copy attribute information indicating whether a logical volume has been the copy source of a copy, and to execute the deduplication processing so as to leave the original data in the logical volume that has been the copy source, based on the copy attribute information; consequently, the probability of original data being moved as a result of original data updates or logical volume deletions can be reduced. It is thereby possible to realize a storage apparatus and a control method that can reduce the movement processing cost for the original data of deduplicated data.
PCT/JP2016/084371 2016-11-18 2016-11-18 Storage apparatus and control method therefor WO2018092288A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/084371 WO2018092288A1 (fr) 2016-11-18 2016-11-18 Storage apparatus and control method therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/084371 WO2018092288A1 (fr) 2016-11-18 2016-11-18 Storage apparatus and control method therefor

Publications (1)

Publication Number Publication Date
WO2018092288A1 true WO2018092288A1 (fr) 2018-05-24

Family

ID=62146303

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/084371 WO2018092288A1 (fr) 2016-11-18 2016-11-18 Storage apparatus and control method therefor

Country Status (1)

Country Link
WO (1) WO2018092288A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012063902A (ja) * 2010-09-15 2012-03-29 Nec Corp File management device, program, and method
JP2013541055A (ja) * 2011-09-16 2013-11-07 日本電気株式会社 Storage device
JP2015503780A (ja) * 2012-02-13 2015-02-02 株式会社日立製作所 Management apparatus and management method for a hierarchical storage system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16921962

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16921962

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP