WO2018092288A1 - Storage device and control method therefor - Google Patents

Storage device and control method therefor

Publication number
WO2018092288A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
copy
original data
logical volume
volume
Prior art date
Application number
PCT/JP2016/084371
Other languages
French (fr)
Japanese (ja)
Inventor
伊織 米川
啓 池田
竹内 久治
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Priority to PCT/JP2016/084371
Publication of WO2018092288A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures

Definitions

  • the present invention relates to a storage apparatus and its control method, and is suitable for application to a storage apparatus equipped with a deduplication function, for example.
  • a deduplication function is widely used as a function of a storage apparatus for satisfying such a request (see, for example, Patent Document 1 and Patent Document 2).
  • When a storage device holds multiple pieces of data with the same content, the deduplication function leaves only one of them in the storage device and deletes all the rest.
  • The processing that the storage apparatus executes based on the deduplication function (leaving only one piece of data with the same content in the storage device and deleting all remaining data) is called deduplication processing.
  • data left in the storage device in the storage apparatus by this deduplication processing is called original data.
  • Patent Document 1 proposes that, when data is deduplicated, in order to avoid further concentration of load on a high-load volume, files stored redundantly across a plurality of volumes are determined to be aggregation target files, the plurality of volumes storing the aggregation target files are identified, one or more of the identified volumes are selected as aggregation volumes based on the load of those volumes, and the aggregation target files stored in the unselected volumes are deleted.
  • Patent Document 2 discloses, in order to suppress performance degradation of a storage device equipped with a deduplication function, a configuration comprising a duplication determination unit that determines whether or not storage target data is already stored in the storage device, a storage destination determination unit that determines a storage destination for non-duplicate data (storage target data that is not duplicated), and a data storage control unit that stores the non-duplicate data in the storage device determined as the storage destination. The storage destination determination unit determines the storage location of duplicate data that is judged, according to a predetermined criterion, to be related to the non-duplicate data, and determines the storage location of the non-duplicate data based on that result.
  • However, when a volume that stores the original data of deduplicated data is deleted, or when that original data is updated, the original data must first be moved to another volume for the sake of the other files or the like that include the deduplicated data.
  • the present invention has been made in consideration of the above points, and intends to propose a storage apparatus and a control method thereof that can reduce the migration processing cost of the original data of the deduplicated data.
  • To solve this problem, the present invention provides a storage apparatus that provides a logical volume as a storage area to a higher-level device and executes deduplication processing that leaves one of the pieces of data having the same content, among the data stored in the logical volume, as original data. The storage apparatus includes a management unit that manages, in units of logical volumes, copy attribute information on whether or not the logical volume has been a copy source, and a deduplication processing execution unit that, based on the copy attribute information, executes the deduplication processing so as to leave the original data in the logical volume that has been the copy source.
  • The present invention also provides a control method for a storage apparatus that provides a logical volume to a host device as a storage area and executes deduplication processing that leaves one of the pieces of data having the same content, among the data stored in the logical volume, as original data.
  • In this method, the storage apparatus manages, in units of logical volumes, copy attribute information indicating whether or not the logical volume has been a copy source.
  • According to the present invention, it is possible to reduce the resource consumption caused by the movement of the original data of the deduplicated data, and to reduce the movement processing cost of the original data.
  • reference numeral 1 denotes a storage device according to this embodiment as a whole.
  • the storage device 1 includes a channel adapter package 3, a microprocessor board 4 and a cache memory package 5 that form a storage controller 2, and a hard disk unit 6 that provides a storage area to the storage controller 2.
  • The channel adapter package 3 includes one or a plurality of channel adapters (not shown). Each channel adapter is an interface that performs protocol control during communication with the host device 8 via the network 7 and includes a port. A unique WWN (World Wide Name) for identifying the port on the network 7 is assigned to the port.
  • the microprocessor board 4 is a board on which a CPU 11 having one or a plurality of microprocessors 10 each composed of a CPU (Central Processing Unit) core and a processor memory 12 composed of a semiconductor memory are mounted.
  • Each microprocessor 10 of the CPU 11 has a local memory 10A.
  • The local memory 10A stores a microprogram, which is the program the microprocessor 10 uses to execute various processes, and virtual volume information 23 described later, both loaded from a shared memory area 22 (described later) of the cache memory package 5.
  • the processor memory 12 is a memory that is shared and used by the microprocessors 10. Page update frequency information 13 and an address conversion table 14 to be described later are stored and held in the processor memory 12.
  • the cache memory package 5 includes a plurality of DIMMs (Dual In-line Memory Module) 20.
  • the DIMM 20 is a memory module in which a plurality of semiconductor memories such as DRAM (Dynamic Random Access Memory) are mounted on a printed circuit board.
  • a part of the storage area provided by each semiconductor memory that constitutes each of these DIMMs 20 is used as a cache memory area 21 that temporarily holds data to be read / written to a storage device 30 that will be described later that constitutes the hard disk unit 6,
  • the remaining area is used as a shared memory area 22 for storing control information and the like shared by the microprocessors 10 of the CPU 11. Virtual volume information 23 and local copy pair information 24 described later are stored and held in this shared memory area 22.
  • the hard disk unit 6 includes a plurality of storage devices 30.
  • The storage device 30 is an expensive, high-performance disk device such as an FC (Fibre Channel) disk or a SAS (Serial Attached SCSI) disk, an inexpensive, low-performance disk device such as a SATA (Serial ATA) disk, or an SSD (Solid State Drive) or the like.
  • FIG. 2 shows a logical configuration of the storage apparatus 1.
  • One or more storage devices 30 constituting the hard disk unit 6 are managed as a RAID (Redundant Arrays of Inexpensive Disks) group RG, and the storage area provided by the storage devices 30 constituting one or more RAID groups RG is managed as a pool PL.
  • the storage area in the pool PL is managed in units of a partial area having a predetermined size (for example, 42 MB).
  • this partial area is referred to as “page” or “physical page”.
  • Each pool PL is associated with one or a plurality of virtual logical volumes (hereinafter referred to as “virtual volumes”) VVOL formed using Thin Provisioning technology, and each virtual volume VVOL is provided to the host device 8 as a storage area for reading and writing data.
  • this virtual volume VVOL (storage space provided to the host apparatus 8) may be referred to as “overwrite space”.
  • a unique identifier (hereinafter referred to as “LUN (Logical Unit Number)”) is assigned to each virtual volume VVOL.
  • the storage area of the virtual volume VVOL is managed in units of a partial area called a logical block having a predetermined size (for example, 512 bytes).
  • Each logical block is given a unique identifier (hereinafter referred to as “LBA (Logical Block Address)”).
  • the storage area of the virtual volume VVOL is managed by being divided into partial areas having the same size as the physical page, which are configured by a plurality of logical blocks.
  • this partial area is referred to as a “virtual page”.
  • The host device 8 reads data from or writes data to the virtual volume VVOL by issuing to the storage apparatus 1 a read request or a write request that designates the LUN of the virtual volume VVOL, the LBA of the first logical block in the area of the virtual volume VVOL where the data is read or written, and the data length of the data.
  • the microprocessor 10 having the lowest load at that time in the CPU 11 of the storage controller 2 is assigned as a person in charge of processing the read request or write request.
  • When the request given from the host device 8 is a write request and no physical page is allocated to the virtual page to which the data specified in the write request is to be written, the assigned microprocessor 10 allocates an unused physical page to the virtual page from the pool PL associated with the virtual volume VVOL. Then, the microprocessor 10 writes the data from the host device 8 to the physical page allocated to the virtual page.
  • When the request given from the host device 8 is a read request or a write request and a physical page is already allocated to the read / write destination area designated in the request, the microprocessor 10 reads the data from the physical page and transfers it to the host device 8 that issued the read request (in the case of a read request), or writes the data given from the host device 8 to the physical page (in the case of a write request).
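The allocate-on-write behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the apparatus's implementation: the class `ThinVolume`, its field names, and returning `None` when an unallocated virtual page is read are all hypothetical choices made for the example.

```python
class ThinVolume:
    """Toy model of a virtual volume backed by a pool of physical pages."""

    def __init__(self, free_pages):
        self.free_pages = list(free_pages)  # unused physical page IDs in the pool (assumed representation)
        self.page_map = {}                  # virtual page index -> physical page ID
        self.store = {}                     # physical page ID -> data held in that page

    def write(self, vpage, data):
        # Allocate an unused physical page only on the first write to this virtual page.
        if vpage not in self.page_map:
            if not self.free_pages:
                raise RuntimeError("pool has no unused physical page")
            self.page_map[vpage] = self.free_pages.pop(0)
        self.store[self.page_map[vpage]] = data

    def read(self, vpage):
        # Reading an unallocated virtual page returns None (an assumption; the text does not cover this case).
        ppage = self.page_map.get(vpage)
        return None if ppage is None else self.store[ppage]


vol = ThinVolume(free_pages=[10, 11])
vol.write(3, b"first")    # allocates physical page 10 to virtual page 3
vol.write(3, b"second")   # reuses physical page 10, no new allocation
```

In this sketch a second write to the same virtual page overwrites the data in place, which corresponds to the already-allocated case in the text.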
  • The user can configure a virtual volume VVOL so that data deduplication is applied to it.
  • A virtual volume VVOL configured in this way is referred to as a deduplication-compatible volume.
  • the area in the virtual page is managed by being divided into partial areas called “chunks” having a predetermined size (for example, 8 KB) that is an integral multiple of the logical block in order from the top of the virtual page.
  • Each chunk is given a unique address (hereinafter referred to as LA (Logical Address)).
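The relationship between logical blocks, chunks, and virtual pages can be made concrete with a small address calculation. The sizes below are the example values given in the text (512 bytes, 8 KB, 42 MB); treating "MB" and "KB" as binary units and the function name `locate` are assumptions for illustration only.

```python
BLOCK_SIZE = 512               # logical block size in bytes (example value from the text)
CHUNK_SIZE = 8 * 1024          # chunk size in bytes (example value; KB taken as 1024 bytes)
PAGE_SIZE = 42 * 1024 * 1024   # page size in bytes (example value; MB taken as 1024*1024 bytes)


def locate(lba: int) -> tuple[int, int]:
    """Return (virtual page index, chunk index within that page) for a logical block address."""
    byte_offset = lba * BLOCK_SIZE
    page_index = byte_offset // PAGE_SIZE
    chunk_in_page = (byte_offset % PAGE_SIZE) // CHUNK_SIZE
    return page_index, chunk_in_page


blocks_per_chunk = CHUNK_SIZE // BLOCK_SIZE   # 16 logical blocks per chunk
chunks_per_page = PAGE_SIZE // CHUNK_SIZE     # 5376 chunks per page
```

With these values a chunk is an integral multiple of the logical block, as the text requires, and chunks are counted from the top of each virtual page.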
  • For each deduplication-compatible volume, the microprocessor 10 of the CPU 11 having the lowest load at that time executes deduplication processing asynchronously with the I / O processing for read requests and write requests from the host device 8.
  • At a predetermined period (for example, every 50 msec), it determines in units of chunks whether or not the contents are the same and, for chunks with the same contents, leaves the data of only one chunk and deletes the data of the other chunks.
  • For this determination, the microprocessor 10 calculates a check code, which is a small feature value (for example, about 8 bytes) computed from the data to be compared, such as a hash value calculated using a hash function, and performs the duplication determination between chunks using the calculated check code.
  • a check code generated from data of one chunk is referred to as “FPK (FingerPrint Key)”.
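A small sketch of computing one FPK per chunk follows. The text specifies neither the hash function nor the exact layout; BLAKE2b truncated to 8 bytes is used here purely as a stand-in, and the function names `fpk` and `chunk_fpks` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # example chunk size from the text


def fpk(chunk: bytes) -> bytes:
    """Return an 8-byte check code for one chunk (BLAKE2b is an assumed stand-in hash)."""
    return hashlib.blake2b(chunk, digest_size=8).digest()


def chunk_fpks(page: bytes) -> list[bytes]:
    """Split page data into chunks from the top of the page and fingerprint each chunk."""
    return [fpk(page[i:i + CHUNK_SIZE]) for i in range(0, len(page), CHUNK_SIZE)]


# Two chunks with identical content yield identical FPKs, making them duplicate candidates.
keys = chunk_fpks(b"a" * CHUNK_SIZE * 2)
```

Because the check code is much smaller than the chunk, equal FPKs only indicate a candidate duplicate; the byte-level comparison described later in the text is still needed.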
  • When the microprocessor 10 detects duplication of certain data for the first time as a result of the duplication determination, it leaves the data of only one chunk among the chunks having the same content as the original data and deletes the data of the other chunks having the same content. At this time, the microprocessor 10 compresses the data to be left as the original data using a reversible compression algorithm such as the LZW algorithm, and, for the data to be deleted, registers the LA of the chunk in which the data was stored, in association with the FPK, in a table stored in a dedicated virtual volume VVOL. Hereinafter, this table is referred to as “FPT (FingerPrint Key Table)” 31 (FIG. 2).
  • The compressed data of the original data generated by the compression processing is stored in a location different from the physical page in which the uncompressed data is stored (hereinafter referred to as the “write-once space”).
  • The write-once space is not a storage space accessible by the host device 8, but a storage space (virtual volume VVOL) that can be used only by the storage controller 2.
  • The storage controller 2 uses the write-once space to store the compressed data in the storage device 30.
  • The compressed data is stored in the write-once space by appending.
  • The correspondence between the LA of the original data stored in the FPT 31 and the address in the write-once space where the compressed data of the original data is stored is managed in a table (hereinafter referred to as the address conversion table) 14 (FIG. 1).
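The interplay of compression, appending to the write-once space, and the address conversion table can be sketched as below. Several points are assumptions for the example: `zlib` stands in for the reversible compression algorithm (the text names LZW only as one example), the class `WriteOnceSpace` is hypothetical, and the table here also records the compressed length, which the text does not mention.

```python
import zlib


class WriteOnceSpace:
    """Toy append-only area plus an LA -> PA address conversion table."""

    def __init__(self):
        self.log = bytearray()   # append-only storage area, visible to the controller only
        self.addr_table = {}     # LA in the overwrite space -> (PA, compressed length)

    def store(self, la, data: bytes) -> int:
        compressed = zlib.compress(data)   # zlib is a stand-in for the compression algorithm
        pa = len(self.log)                 # data is always appended at the current end
        self.log += compressed
        self.addr_table[la] = (pa, len(compressed))
        return pa

    def load(self, la) -> bytes:
        pa, length = self.addr_table[la]
        return zlib.decompress(bytes(self.log[pa:pa + length]))


space = WriteOnceSpace()
space.store("LA1", b"A" * 8192)
restored = space.load("LA1")   # decompresses back to the original chunk data
```

Keeping the mapping in a separate table lets the original LA stay stable in the FPT while the compressed bytes live wherever the append position happened to be.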
  • The storage apparatus 1 again assigns a physical page to the virtual page in the overwrite space, decompresses the data moved to the write-once space, and writes the decompressed data to the physical page assigned to the virtual page.
  • The data on the physical page may then be updated (overwritten).
  • The storage apparatus 1 of the present embodiment is equipped with a deduplication function for executing the deduplication processing described above. As part of this deduplication function, it has a function that determines the placement position of the original data, and the migration destination virtual volume VVOL when migration of the original data later becomes necessary, to be the virtual volume VVOL estimated to have the lowest risk of migration of the original data.
  • When the storage apparatus 1 executes deduplication processing on certain data for the first time, and the data belongs to a use case in which master data is copied to generate a plurality of pieces of data having the same contents, as shown in FIG. 3 (hereinafter referred to as the first use case), the storage apparatus 1 executes the deduplication processing so as to leave the master image data (hereinafter referred to as master data) as the original data and delete the remaining generated data (hereinafter referred to as copy data).
  • VDI Virtual Desktop Infrastructure
  • A virtual volume storing master data is hereinafter referred to as a master volume.
  • When the original data is concentrated on the master volume and the copy data is deduplicated, migration of the original data is unlikely to occur, and the processing time for deleting a virtual volume in which copy data is stored can be shortened.
  • whether or not the data to be deduplicated is in the first use case described above can be determined based on the issuance information of the XCOPY command that is a data replication command. That is, in the case of the first use case, it can be determined that the virtual volume VVOL designated as the copy source in the XCOPY command is the master volume in the first use case.
  • By giving the information “XCOPY copy source” to the virtual volume VVOL that is the copy source, the master volume can be easily identified during deduplication processing.
  • When the storage apparatus 1 can determine that the deduplicated data is backup data obtained by regular backup (hereinafter, this use case is referred to as the second use case), and it becomes necessary to move the original data to another backup volume because the original data is updated or the virtual volume storing the original data is deleted, the storage apparatus 1 moves the original data to the most recently updated backup volume among the candidate destination backup volumes.
  • a backup volume (virtual volume VVOL) for each day of the week is prepared, and a certain backup target data (hereinafter referred to as backup target data) is backed up daily to these backup volumes.
  • the backup destination of the backup target data in this case is the backup volume corresponding to the day of the week.
  • In this example, deduplication processing is performed on the data stored in a total of seven backup volumes for one week, the data “A” stored in the backup volume for Wednesday is left as the original data, and the data having the same content (data “A”) in the backup volumes for the other days of the week is deduplicated.
  • the data to be backed up is updated on Wednesday.
  • When the data “A” is updated to the data “B” by an update of the backup target data, the data “B” is backed up over the data “A” stored in the Wednesday backup volume.
  • In this case, the appropriate migration destination for the original data is not the backup volume for Thursday, from which the original data would have to be moved again immediately after the update on the next day, but the backup volume for Tuesday, whose next update is furthest in the future.
  • it can be said that it is preferable that the migration destination of the original data is a backup volume updated last.
  • the backup operation is often realized by the local copy function. Therefore, the determination of the migration destination of the original data in the second use case can be performed using the pair information (latest operation time) of the local copy function.
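The destination choice for the second use case reduces to picking the candidate with the most recent pair operation time. The sketch below assumes the times are available as comparable numbers keyed by volume; the function name and the sample data are hypothetical.

```python
def pick_backup_destination(candidates, pair_operation_time):
    """Return the candidate backup volume whose last pair operation is the most recent."""
    return max(candidates, key=lambda vol: pair_operation_time[vol])


# Hypothetical example: Wednesday's volume is about to be overwritten, so the other six are candidates.
# Larger numbers mean a more recent pair operation; Tuesday was updated most recently.
last_op = {"Thu": 1, "Fri": 2, "Sat": 3, "Sun": 4, "Mon": 5, "Tue": 6}
destination = pick_backup_destination(list(last_op), last_op)
```

In a weekly rotation the most recently updated volume is also the one that will be overwritten last, which is why this choice postpones the next migration as long as possible.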
  • For use cases in which the original data cannot be determined to belong to a periodic backup operation (hereinafter referred to as the third use case), the storage apparatus 1 moves the original data to the virtual page with the lowest update frequency among the migration destination candidates.
  • Choosing the virtual page with the lowest update frequency as the migration destination reduces the risk that the original data has to be migrated again because of a data update.
  • This method can be applied not only when determining the destination of the original data in use cases other than the second use case, but also when determining the location of the original data in the first deduplication process (first use case).
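For the third use case, the selection can be sketched as a minimum over update counts. The dictionary layout, the default of zero for chunks without a recorded count, and the function name are assumptions made for this illustration.

```python
def pick_lowest_update_destination(candidate_las, update_count):
    """Return the candidate LA with the fewest recorded updates (ties go to the first candidate)."""
    return min(candidate_las, key=lambda la: update_count.get(la, 0))


# Hypothetical update counts within the observation window.
counts = {"LA1031": 7, "LA2000": 2, "LA4096": 5}
destination = pick_lowest_update_destination(["LA1031", "LA2000", "LA4096"], counts)
```

The counts here play the role of the page update frequency information 13 described in the text.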
  • The page update frequency information 13 and the address conversion table 14 are stored in the processor memory 12 of the microprocessor board 4, and the virtual volume information 23 and the local copy pair information 24 are stored in the shared memory area 22 of the cache memory package 5. Further, as described above with reference to FIG. 2, a virtual volume VVOL that can be used only by the storage controller 2 (hereinafter referred to as an FPT volume) is defined in the storage apparatus 1, and the FPT 31 is stored in this FPT volume.
  • the page update frequency information 13 has a table structure in which the number of updates (update frequency) within a predetermined time (for example, several seconds to several hours) for each chunk of each virtual volume VVOL is stored.
  • the page update frequency information 13 is updated so that the value of the update frequency is incremented by 1 every time the data written in the virtual volume VVOL is updated by the microprocessor 10 (FIG. 1) in charge of the processing.
  • the address conversion table 14 is a table used for managing the movement destination of each chunk when the chunk data on the overwrite space is moved to the write-once space.
  • When data stored in the overwrite space is compressed and stored in the write-once space, the microprocessor 10 in charge of the processing stores in the address conversion table 14, in association with each other, the address in the overwrite space where the data was stored and the address in the write-once space where the compressed data of the original data is stored.
  • The address conversion table 14 includes an overwrite space address column 14A and a write-once space address column 14B.
  • The overwrite space address column 14A stores the address (LA) in the overwrite space of a chunk that has been compressed and moved to the write-once space, and the write-once space address column 14B stores the destination address (PA) of the corresponding chunk in the write-once space.
  • The virtual volume information 23 is information used to manage each virtual volume VVOL defined in the storage apparatus 1 and, as shown in FIG. 7, has a table structure comprising a volume number column 23A, a capacity column 23B, a deduplication setting column 23C, and an XCOPY attribute column 23D.
  • The volume number column 23A stores the identification numbers (volume numbers) assigned to all the virtual volumes VVOL defined in the storage apparatus 1, and the capacity column 23B stores the capacity set for the corresponding virtual volume VVOL.
  • The deduplication setting column 23C stores information indicating whether or not the corresponding virtual volume VVOL is set as a deduplication-compatible volume (in FIG. 7, “present” when it is set and “none” when it is not set).
  • the XCOPY attribute column 23D stores an attribute indicating whether or not the corresponding virtual volume VVOL is a copy source of a copy based on the XCOPY command (hereinafter referred to as an XCOPY attribute).
  • FIG. 7 shows an example in which the character string “XCOPY copy source” is stored when the corresponding virtual volume VVOL is the copy source based on the XCOPY command. This information is registered by the microprocessor 10 (FIG. 1) in charge of controlling the copy processing when the virtual volume becomes the XCOPY copy source.
  • The local copy pair information 24 is information for managing each local copy pair defined in the storage apparatus 1 and, as shown in FIG. 8, has a table structure comprising a volume number column 24A, a pair attribute column 24B, a pair operation time column 24C, a pair number column 24D, and a partner volume number column 24E.
  • The volume number column 24A stores the volume numbers of all virtual volumes VVOL defined in the storage apparatus 1.
  • When the corresponding virtual volume VVOL is set as a local copy pair with another virtual volume VVOL, the pair attribute column 24B stores information indicating whether it is the primary volume (primary VOL), which is the copy source of the copy pair, or the secondary volume (secondary VOL), which is the copy destination. If the corresponding virtual volume VVOL is not set as a copy pair with any virtual volume VVOL, nothing is stored in the pair attribute column 24B.
  • When the corresponding virtual volume VVOL is set as a copy pair with another virtual volume VVOL, the pair operation time column 24C stores the time at which a predetermined operation, such as formation of the copy pair or resync (resynchronization), was last performed.
  • The pair number column 24D stores the number of counterpart virtual volumes VVOL with which the corresponding virtual volume VVOL is set as a copy pair.
  • The partner volume number column 24E stores the volume numbers of all of these counterpart virtual volumes VVOL.
  • The FPT 31 is a table for managing the FPK of each chunk in each deduplication-compatible volume calculated during deduplication processing. As shown in FIG. 9, it comprises one column 31A for each distinct FPK calculated in the deduplication processing.
  • The uppermost row (hereinafter referred to as the FPK row) 31B of each column 31A stores the corresponding FPK value, and the rows below the FPK row 31B in that column 31A (hereinafter referred to as LA rows) 31C store the LAs of all chunks whose stored data has an FPK value matching the FPK value stored in the FPK row 31B.
  • When the original data is updated, or when the virtual volume VVOL storing the original data is deleted, the original data is moved to the LA stored in the next LA row 31C of the same column 31A. Therefore, in the example of FIG. 9, when the data (original data) stored in the virtual page whose LA is “LA1” is updated, or when the virtual volume VVOL having the virtual page “LA1” is deleted, the original data stored in the virtual page “LA1” is moved to the virtual page whose LA is “LA1031”.
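The column structure of the FPT and the hand-over of the original data to the next LA row can be sketched with a dictionary of lists. The class name `FPT`, its methods, and the use of Python lists for the LA rows are assumptions for illustration; the first list entry plays the role of the top LA row that holds the original data.

```python
from collections import defaultdict


class FPT:
    """Toy fingerprint table: one column (list of LAs) per FPK; the first LA holds the original data."""

    def __init__(self):
        self.columns = defaultdict(list)   # FPK -> LAs in row order

    def register(self, fpk, la):
        self.columns[fpk].append(la)

    def original_la(self, fpk):
        return self.columns[fpk][0]

    def drop(self, fpk, la):
        """Remove an LA (updated or deleted) and return the LA that now holds the original, or None."""
        column = self.columns[fpk]
        column.remove(la)
        if not column:
            del self.columns[fpk]
            return None
        return column[0]


table = FPT()
for la in ("LA1", "LA1031", "LA2000"):
    table.register("FPK-x", la)
new_original = table.drop("FPK-x", "LA1")   # the next LA row takes over the original data
```

This follows the simple "next row" rule stated in the text; a destination chosen by use case, as described earlier, would replace the `column[0]` choice with a selection function.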
  • FIG. 10 shows the procedure of the deduplication process executed by one of the microprocessors 10 (FIG. 1) of the CPU 11 (FIG. 1) based on an activation command given periodically (for example, every 50 msec) from a scheduler (not shown).
  • When the activation command is given, one of the microprocessors 10 in the CPU 11 starts the deduplication process shown in FIG. 10 and first determines, from among the deduplication-compatible volumes defined in the storage apparatus 1, one deduplication-compatible volume (hereinafter referred to as the target volume) to be processed in step S2 and subsequent steps (S1).
  • the method of determining the target volume may be either a method of determining at random from the deduplication-compatible volume or a method of determining in a predetermined order from the deduplication-compatible volume.
  • As an example of the former, a predetermined prime number is added to the volume number of the deduplication-compatible volume that was the last target volume in the previous deduplication process, or to the volume number of the previous target volume in the current deduplication process, and the deduplication-compatible volume to which the resulting volume number is assigned is used as the target volume.
  • As an example of the latter, deduplication-compatible volumes are determined as the target volume in ascending or descending order of volume numbers.
  • the microprocessor 10 acquires the XCOPY attribute stored in the XCOPY attribute column 23D (FIG. 7) corresponding to the target volume in the virtual volume information 23 described above with reference to FIG. 7 (S2), and acquires the acquired XCOPY attribute. Based on the above, it is determined whether or not the target volume has been the copy source of the copy executed according to the XCOPY command by that time (S3). If the microprocessor 10 obtains a negative result in this determination, it proceeds to step S5.
  • If the microprocessor 10 obtains a positive result in the determination at step S3, it performs deduplication execution processing on the data stored in the target volume (S4), and thereafter determines whether or not the processing of steps S1 to S4 has been executed for all the deduplication-compatible volumes in the storage apparatus 1 (S5).
  • If the microprocessor 10 obtains a negative result in this determination, it returns to step S1, and thereafter repeats the processing from step S1 to step S5 while sequentially switching the deduplication-compatible volume determined as the target volume in step S1 to another unprocessed deduplication-compatible volume.
  • As a result, deduplication is performed on the data stored in each deduplication-compatible volume that has been the copy source of a copy executed in accordance with the XCOPY command.
  • When the microprocessor 10 obtains a positive result in step S5 by completing the deduplication on the data stored in all the deduplication-compatible volumes that have been the copy source of a copy executed in accordance with the XCOPY command, it determines, from among the deduplication-compatible volumes on which the deduplication execution processing has not yet been performed, one deduplication-compatible volume (target volume) to be processed in step S6 and subsequent steps (S6).
  • The target volume in step S6 may likewise be determined either at random or in a predetermined order from among the deduplication-compatible volumes on which the deduplication execution processing has not yet been performed.
  • The microprocessor 10 performs deduplication execution processing on the data stored in the target volume determined in step S6 (S7), and thereafter determines whether or not deduplication has been performed on the data stored in all the deduplication-compatible volumes defined in the storage apparatus 1 (S8).
  • If the microprocessor 10 obtains a negative result in this determination, it returns to step S6, and thereafter repeats the processing from step S6 to step S8 while sequentially switching the deduplication-compatible volume determined as the target volume in step S6 to another unprocessed deduplication-compatible volume.
  • As a result, deduplication is executed on the data stored in each deduplication-compatible volume that has never been the copy source of a copy executed in accordance with the XCOPY command.
  • When the microprocessor 10 obtains a positive result in step S8 by completing the deduplication on the data stored in all the deduplication-compatible volumes that have not been the copy source of a copy executed in accordance with the XCOPY command, it ends this deduplication process.
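The two-pass ordering of steps S1 to S8 can be summarized as: process XCOPY copy-source volumes first, then the remaining deduplication-compatible volumes. The sketch below only computes that order; the dictionary keys and the function name are assumptions, and the volume order within each pass is kept as given since the text allows either random or predetermined order.

```python
def dedup_order(volumes):
    """Return volume numbers in processing order: XCOPY copy sources first, then the rest.

    `volumes` is a list of dicts with the hypothetical keys 'number', 'dedup', and 'xcopy_source'.
    Volumes that are not deduplication-compatible are skipped entirely.
    """
    eligible = [v for v in volumes if v["dedup"]]
    first_pass = [v["number"] for v in eligible if v["xcopy_source"]]        # steps S1 to S5
    second_pass = [v["number"] for v in eligible if not v["xcopy_source"]]   # steps S6 to S8
    return first_pass + second_pass


volumes = [
    {"number": 0, "dedup": True, "xcopy_source": False},
    {"number": 1, "dedup": True, "xcopy_source": True},
    {"number": 2, "dedup": False, "xcopy_source": False},
    {"number": 3, "dedup": True, "xcopy_source": True},
]
order = dedup_order(volumes)
```

Processing copy sources first means their chunks are registered in the FPT before any copies are examined, so the original data tends to stay in the master volume.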
  • FIG. 11 shows specific processing contents of the deduplication execution process executed by the microprocessor 10 in step S4 and step S7 of the deduplication process.
  • When the microprocessor 10 proceeds to step S4 or step S7 of the deduplication process, it starts the deduplication execution process shown in FIG. 11 and first determines, from among the virtual pages in the target volume, one virtual page (hereinafter referred to as the target page) to be processed in step S11 and subsequent steps (S10).
  • the method for determining the target page may be either a method for determining at random from the virtual pages of the target volume or a method for determining in a predetermined order from the virtual pages of the target volume.
  • the microprocessor 10 determines whether or not deduplication is necessary for the data of the target page determined in step S10 (hereinafter referred to as target data) (S11).
  • Specifically, the microprocessor 10 refers to the page update frequency information 13 (FIG. 1) and determines whether or not to deduplicate the target data based on whether or not the target page has been updated within a predetermined time.
  • If the microprocessor 10 obtains a negative result in this determination, it proceeds to step S21. On the other hand, if the microprocessor 10 obtains a positive result in the determination at step S11, it calculates a hash value of the target data using a predetermined hash function as the FPK of the target data (S12).
  • The microprocessor 10 then sequentially compares the hash value (FPK) of the target data calculated in step S12 with each FPK stored in the FPT 31 (FIG. 9) (S13), and determines whether or not the hash value (FPK) of the target data matches any FPK already registered in the FPT 31 (S14).
  • Obtaining a negative result in this determination means that the hash value (FPK) of the target data has not been registered in the FPT 41 yet.
  • the microprocessor 10 newly registers the hash value calculated in step S12 in the FPT 31 as the FPK of the target page, and at the top FPK row 31C of the column 31A (FIG. 9) corresponding to the FPK in the FPT 31.
  • the LA of the target page is stored in (S18).
  • the microprocessor 10 then compresses the target data, writes the compressed data thus obtained to the write-once space, and the LA of the target page and the address (PA) where the compressed data in the write-once space is stored. After the correspondence relationship is registered in the address conversion table 14 (S19), the process proceeds to step S20.
  • By contrast, a positive result in the determination of step S14 means that the hash value (FPK) of the target data is already registered in the FPT 31.
  • However, the target data is not necessarily identical to the data whose FPK registered in the FPT 31 equals the hash value of the target data. The microprocessor 10 therefore compares the target data with the data whose FPK, of the same value as the hash value of the target data, is registered in the FPT 31 (S15).
  • Specifically, the microprocessor 10 acquires from the FPT 31 the LA of the chunk that stores the original data of the data whose FPK equals the hash value of the target data (the LA registered in the top row 31C of the column 31A of that FPK in the FPT 31), refers to the address conversion table 14 (FIG. 6) to acquire the address in the write-once space corresponding to that LA, and reads the compressed data of the original data from that address in the write-once space. The microprocessor 10 then decompresses the read compressed data to restore the original data before compression, and compares the restored original data with the data of the target page.
  • Based on the comparison result of step S15, the microprocessor 10 determines whether the target data matches the original data of the data whose FPK, of the same value as the hash value of the target data, is registered in the FPT 31 (S16).
  • If the microprocessor 10 obtains a negative result in this determination, it compresses the target data, writes the resulting compressed data to the write-once space, registers in the address conversion table 14 the correspondence between the LA of the target page and the address (PA) at which the compressed data is stored in the write-once space (S19), and proceeds to step S20.
  • If the microprocessor 10 obtains a positive result in the determination of step S16, it additionally registers the LA of the target page in the last row 31C of the column 31A of the FPK that has the same value as the hash value of the target data in the FPT 31. It then discards the page in the overwrite space in which the target data is stored, that is, deletes the data on that page (S20).
  • The microprocessor 10 then determines whether the processing of steps S11 to S20 has been executed for all virtual pages in the target volume (S21). If it obtains a negative result in this determination, it returns to step S10 and thereafter repeats the processing of steps S10 to S21 while sequentially switching the target page determined in step S10 to another unprocessed virtual page in the target volume.
  • When the microprocessor 10 eventually obtains a positive result in step S21 by completing the processing of steps S11 to S20 for all virtual pages in the target volume, it ends this deduplication execution process and returns to the deduplication process.
  • As described above, in the deduplication process of the present embodiment, deduplication-compatible volumes that have been the copy source of a copy made in accordance with an XCOPY command are processed before deduplication-compatible volumes that have never been such a copy source.
  • As a result, in the FPT 31, the LA of a virtual page of a deduplication-compatible volume that has been the copy source of a copy made in accordance with an XCOPY command is registered at a higher position than the LA of a virtual page of a deduplication-compatible volume that has never been such a copy source.
  • As described above, the data of the virtual page whose LA is stored in the uppermost row 31C of the FPT 31 is left as the original data.
  • Therefore, because the LA of the virtual page of the deduplication-compatible volume that has been the copy source of the copy is stored in the uppermost row 31C of the FPT 31, the data stored in that copy source volume is left as the original data, and the data stored in the deduplication-compatible volume that is the copy destination of the copy is deduplicated.
  • FIG. 12 shows the processing procedure of the original data movement process executed by any of the microprocessors 10 of the CPU 11 when a situation arises in which original data should be moved from its current virtual volume VVOL to another virtual volume VVOL that is a migration destination candidate, because the virtual volume VVOL is deleted or the original data is overwritten.
  • In such a situation, the microprocessor 10 moves the original data to another virtual volume VVOL that is a migration destination candidate according to the processing procedure shown in FIG. 12.
  • When such a situation arises, the microprocessor 10 starts the original data movement process shown in FIG. 12 and first acquires, from the local copy pair information 24 (FIG. 8), information about the local copy pairs in which the virtual volume VVOL storing the original data to be moved (hereinafter referred to as the original data storage volume) is the copy source or the copy destination (S30).
  • Specifically, the microprocessor 10 acquires the information of the record (row) corresponding to the original data storage volume in the local copy pair information 24, together with the information of all records (rows) in which the volume number of the original data storage volume is stored in the partner volume number column 24E (FIG. 8).
  • The microprocessor 10 then determines whether the original data storage volume is set as the secondary volume (copy destination virtual volume VVOL) of a copy pair with another virtual volume VVOL (S31). This determination is made by checking whether the pair attribute stored in the pair attribute column 24B (FIG. 8) of the record of the original data storage volume acquired in step S30 is "secondary volume".
  • A negative result in this determination means either that the original data storage volume is not set in a local copy pair with any virtual volume VVOL, or that the original data storage volume is set as the primary volume of a local copy pair with another virtual volume VVOL.
  • In this case, the microprocessor 10 determines that the use case of the original data is the third use case described above, and proceeds to step S35.
  • If the microprocessor 10 instead obtains a positive result in the determination of step S31, it acquires from the local copy pair information 24 the information on the volume set as the primary volume of the copy pair in which the original data storage volume is set as the secondary volume (hereinafter referred to as the specific volume) (S32). Specifically, it acquires the information of the record (row) corresponding to the specific volume from the local copy pair information 24.
  • The microprocessor 10 then determines whether, among the migration destination candidates of the original data, there is a virtual volume VVOL (secondary volume) other than the original data storage volume that is set in a copy pair having the specific volume as its primary volume (S33). This determination is made based on whether the volume number of a virtual volume VVOL other than the original data storage volume is stored in the partner volume number column 24E (FIG. 8) of the information acquired in step S32.
  • A positive result in this determination means that there are a plurality of copy pairs whose copy source is the specific volume, and that the original data storage volume is the secondary volume of one of those copy pairs.
  • In this case, the microprocessor 10 determines that the use case of the original data is the second use case and, referring to the local copy pair information 24 (FIG. 8), determines as the migration destination of the original data the secondary volume that, among the secondary volumes of the plurality of copy pairs having the specific volume as the copy source, is other than the original data storage volume and has the latest update time (S34).
  • Specifically, in step S34 the microprocessor 10 refers to the records in the local copy pair information 24 of the virtual volumes VVOL detected in step S33, whose volume numbers differ from that of the original data storage volume, and determines the virtual volume VVOL with the latest time stored in the operation time column 24C (FIG. 8) as the migration destination of the original data. The microprocessor 10 then proceeds to step S37.
  • By contrast, a negative result in the determination of step S33 means that the specific volume is not set in a local copy pair with any virtual volume VVOL other than the original data storage volume.
  • In this case, the microprocessor 10 determines that the use case of the original data is the third use case, acquires from the page update frequency information 13 (FIG. 1) the update frequencies of the virtual pages that can be the migration destination of the original data in each virtual volume VVOL that can be the migration destination of the original data, and compares them (S35).
  • Here, a "virtual volume VVOL that can be the migration destination of the original data" is a deduplication-compatible volume in which data having the same content as the original data has been deduplicated, and a "virtual page that can be the migration destination of the original data" is a virtual page in that deduplication-compatible volume in which data having the same content as the original data has been deduplicated.
  • The microprocessor 10 then determines, as the migration destination of the original data, the virtual page with the lowest update frequency among the virtual pages that can be the migration destination of the original data (S36).
  • The microprocessor 10 then copies the original data to the migration destination determined in step S34 or step S36 (S37). This copy is performed by newly writing, in the write-once space, a copy of the compressed data of the original data stored in the write-once space.
  • At this time, the microprocessor 10 rewrites the address (PA) in the write-once space that is associated with the source LA of the original data in the address conversion table 14 (FIG. 6) to the address (PA) of the copy destination of the original data in the write-once space.
  • The microprocessor 10 then rewrites, in the address conversion table 14, the address (PA) in the write-once space of each piece of deduplicated data having the same FPK as the original data to the address (PA) of the copy destination of the original data copied in step S37 (S38).
  • The microprocessor 10 then deletes the LA of the virtual page in which the original data had been stored, which is held in the first LA row 31C of the column 31A of the FPK corresponding to the original data among the columns 31A of the FPT 31 (S39).
  • The microprocessor 10 further stores the LA of the virtual page to which the original data has been moved in the first LA row 31C of that column 31A in the FPT 31 and, as necessary, moves the other LAs stored in the column 31A so that they are packed toward the front (S40).
  • The microprocessor 10 thereafter ends this original data movement process.
  • As described above, when the storage apparatus 1 of the present embodiment can determine that the data to be deduplicated was produced by a copy (first use case), it performs deduplication processing so that the original data remains in the copy source virtual volume VVOL.
  • In the case where the original data of the deduplicated data exists in one of a plurality of backup destination virtual volumes VVOL (second use case), when the original data is updated or that virtual volume VVOL is deleted, the storage apparatus 1 moves the original data to the most recently updated virtual volume VVOL among the other backup destination virtual volumes VVOL. When the use case of the deduplicated data is other than the second use case, the storage apparatus 1 moves the original data to the virtual page with the lowest update frequency.
  • Therefore, in the storage apparatus 1, movement of the original data due to an update of the original data or deletion of a virtual volume VVOL is less likely to occur. As a result, the resource consumption caused by moving the original data of deduplicated data can be reduced, and the movement processing cost of the original data can be reduced.
  • In addition, because the deduplication processing is executed so that the original data is left in the copy source virtual volume VVOL, the probability that the original data remains in a deduplication-compatible volume other than the copy source is low. The probability that the original data must be moved when such a deduplication-compatible volume is deleted can therefore be reduced as far as possible, and the average processing time required to delete a deduplication-compatible volume can be shortened.
  • In the embodiment described above, the following units are all implemented by the microprocessor 10 of the CPU 11 executing the microprogram: the management unit that manages, for each logical volume, copy attribute information indicating whether the logical volume has been the copy source of a copy; the deduplication processing execution unit that performs deduplication processing, based on the copy attribute information, so as to leave the original data in a logical volume that has been a copy source; and the original data migration unit that, when a situation arises in which the original data should be moved to another logical volume, determines as the migration destination the logical volume estimated to carry the least risk that the original data will next have to be moved, and migrates the original data to that logical volume.
  • However, the present invention is not limited to this, and some or all of the management unit, the deduplication processing execution unit, and the original data migration unit may be configured as dedicated hardware.
  • the present invention can be widely applied to storage apparatuses equipped with a deduplication function.
  • Reference signs: 1: storage apparatus; 2: storage controller; 8: host device; 10: microprocessor; 11: CPU; 12: processor memory; 13: page update frequency information; 14: address conversion table; 22: shared memory area; 23: virtual volume information; 24: local copy pair information; 30: storage device; 31: FPT; VVOL: virtual volume.
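
The destination selection in the original data movement process of FIG. 12 (roughly steps S30 to S36) can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: `PairRecord`, `choose_destination`, and the one-record-per-pair-side layout are hypothetical stand-ins for the local copy pair information 24 and the page update frequency information 13.

```python
from dataclasses import dataclass


@dataclass
class PairRecord:
    """Hypothetical, simplified row of the local copy pair information."""
    volume: int            # volume number of this side of the pair
    attribute: str         # "primary" or "secondary"
    operation_time: float  # time of the last copy operation on this pair
    partner: int           # volume number of the other side of the pair


def choose_destination(source_volume, pair_info, candidate_pages, update_frequency):
    """Pick where original data goes when it must leave source_volume.

    candidate_pages:  list of (volume, page) that hold deduplicated copies of the data
    update_frequency: {(volume, page): updates per unit time}
    Raises ValueError if candidate_pages is empty and no sibling volume exists.
    """
    candidate_volumes = {vol for vol, _ in candidate_pages}
    # S30/S31: is the source volume the secondary volume of some copy pair?
    own = [r for r in pair_info
           if r.volume == source_volume and r.attribute == "secondary"]
    if own:
        primary = own[0].partner  # S32: the "specific volume"
        # S33: other candidate secondary volumes paired with the same primary
        siblings = [r for r in pair_info
                    if r.attribute == "secondary" and r.partner == primary
                    and r.volume != source_volume and r.volume in candidate_volumes]
        if siblings:
            # S34: second use case, take the most recently updated sibling
            latest = max(siblings, key=lambda r: r.operation_time)
            return ("volume", latest.volume)
    # S35/S36: otherwise take the least frequently updated candidate page
    return ("page", min(candidate_pages, key=lambda p: update_frequency[p]))
```

With a primary volume 1 backed up to secondary volumes 2, 3, and 4, moving original data out of volume 2 selects whichever of volumes 3 and 4 was copied to most recently; for a volume that is not a secondary volume, the least frequently updated candidate page is selected instead.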

Abstract

[Problem] To provide a storage device and control method therefor that can reduce movement processing costs for source data of de-duplicated data. [Solution] A storage device that executes de-duplication processing is configured so as to manage copy attribute information, as to whether a logical volume has served as a copy source for a copy, on a logical volume unit basis, and to execute de-duplication processing so as to leave the source data on the logical volume that has served as a copy source, on the basis of the copy attribute information; and as a result of the foregoing, the likelihood of source data movement in conjunction with source data updates or logical volume deletions can be reduced. Consequently, it is possible to realize a storage device and control method which can reduce movement processing costs for source data of de-duplicated data.

Description

Storage apparatus and control method thereof
The present invention relates to a storage apparatus and a control method thereof, and is suitable for application to, for example, a storage apparatus equipped with a deduplication function.
Conventionally, storage apparatuses have been required to store large amounts of data at low cost. A deduplication function is widely used as a storage apparatus function for meeting this requirement (see, for example, Patent Document 1 and Patent Document 2). When the storage apparatus detects that multiple pieces of data with identical content exist in the storage apparatus, the deduplication function leaves only one of them in a storage device in the storage apparatus and deletes all the rest.
In the following, the processing executed by the storage apparatus based on this deduplication function (processing that leaves only one piece of data with identical content in a storage device in the storage apparatus and deletes all the remaining data) is called deduplication processing. The data left in the storage device in the storage apparatus by this deduplication processing is called original data.
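
As a rough illustration of deduplication processing as defined here, the following Python sketch keeps one stored copy of each distinct piece of content as the original data and makes later identical data refer to it. The fingerprint dictionary, the SHA-256 hash, zlib compression, and every name in the code are assumptions chosen for illustration; the description does not prescribe them.

```python
import hashlib
import zlib


class DedupStore:
    """Toy model: one stored copy (the original data) per distinct content."""

    def __init__(self):
        self.append_space = []   # compressed blobs; the list index acts as a physical address
        self.fingerprints = {}   # fingerprint -> logical addresses; the first entry holds the original
        self.address_map = {}    # logical address -> index into append_space

    def write(self, la, data):
        """Store data for logical address la; return True if it was deduplicated."""
        key = hashlib.sha256(data).hexdigest()
        owners = self.fingerprints.get(key)
        if owners:
            stored = self.append_space[self.address_map[owners[0]]]
            if zlib.decompress(stored) == data:   # byte-wise check guards against hash collisions
                owners.append(la)
                self.address_map[la] = self.address_map[owners[0]]
                return True
        # First occurrence (or a hash collision): store a new compressed copy.
        self.append_space.append(zlib.compress(data))
        self.address_map[la] = len(self.append_space) - 1
        if owners is None:
            self.fingerprints[key] = [la]
        return False

    def read(self, la):
        """Return the uncompressed data visible at logical address la."""
        return zlib.decompress(self.append_space[self.address_map[la]])
```

Writing the same 8 KB of data to two logical addresses stores a single compressed copy; the first address listed for the fingerprint identifies where the original data is considered to live.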
In recent years, various techniques related to such a deduplication function have been proposed. For example, Patent Document 1 proposes that, to avoid concentrating further load on an already heavily loaded volume when deduplicating data, files stored redundantly in a plurality of volumes are determined to be aggregation target files, the plurality of volumes storing the aggregation target files are identified, one or more of the identified volumes are selected as aggregation volumes based on the loads of the identified volumes, and the aggregation target files stored in the volumes that were not selected are deleted.
Patent Document 2 discloses, in order to suppress performance degradation of a storage apparatus equipped with a deduplication function, providing a duplication determination unit that determines whether storage target data is in a duplicate state, that is, already stored in a storage device; a storage destination determination unit that determines the storage destination of non-duplicate data, which is storage target data that is not duplicated; and a data storage control unit that stores the non-duplicate data in the storage device determined as the storage destination. The storage destination determination unit determines the storage destination of duplicate data judged to be related to the non-duplicate data according to a preset criterion and, based on that result, determines the storage destination of the non-duplicate data.
Patent Document 1: JP 2009-80671 A
Patent Document 2: JP 2015-170345 A
In a storage apparatus equipped with a deduplication function, when a volume storing the original data of deduplicated data is deleted, or when that original data is updated, the original data must first be moved to another volume for the sake of the other files or the like that contain the deduplicated data.
However, moving original data in this way requires considerable time and resources. There is therefore a demand to reduce the cost of moving original data in the practical operation of the deduplication function.
The present invention has been made in view of the above points, and proposes a storage apparatus and a control method thereof that can reduce the movement processing cost of the original data of deduplicated data.
To solve this problem, one embodiment of the present invention provides a storage apparatus that provides a logical volume to a host device as a storage area and executes, on the data stored in the logical volume, deduplication processing that leaves one of the pieces of data with identical content as original data and deletes the others. The storage apparatus comprises a management unit that manages, for each logical volume, copy attribute information indicating whether the logical volume has been the copy source of a copy, and a deduplication processing execution unit that executes the deduplication processing, based on the copy attribute information, so as to leave the original data in a logical volume that has been a copy source.
Another embodiment of the present invention provides a control method for a storage apparatus that provides a logical volume to a host device as a storage area and executes, on the data stored in the logical volume, deduplication processing that leaves one of the pieces of data with identical content as original data and deletes the others. The method comprises a first step in which the storage apparatus manages, for each logical volume, copy attribute information indicating whether the logical volume has been the copy source of a copy, and a second step in which the storage apparatus executes the deduplication processing, based on the copy attribute information, so as to leave the original data in a logical volume that has been a copy source.
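
One way to picture these two steps is to process volumes that have been a copy source first, so that their data is recorded as original data before any other volume is examined. The Python sketch below illustrates only that ordering idea; the `Volume` class, its `was_copy_source` flag, and the dictionary of originals are hypothetical names and not part of the claimed apparatus.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    was_copy_source: bool = False               # copy attribute information, kept per logical volume
    pages: dict = field(default_factory=dict)   # page number -> page contents (bytes)


def deduplicate(volumes):
    """Return {content: (volume name, page number)} naming where each original is left."""
    originals = {}
    # sorted() is stable: copy-source volumes come first, the rest keep their given order.
    ordered = sorted(volumes, key=lambda v: not v.was_copy_source)
    for vol in ordered:
        for page_no in sorted(vol.pages):
            data = vol.pages[page_no]
            if data in originals:
                vol.pages[page_no] = None       # duplicate: drop the data, a reference suffices
            else:
                originals[data] = (vol.name, page_no)
    return originals
```

Because the copy source is visited first, a page that also exists in a backup volume stays in the copy source, and the backup can later be deleted without having to move the original data.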
According to the present invention, the resource consumption caused by moving the original data of deduplicated data can be reduced, and the movement processing cost of the original data can be lowered.
FIG. 1 is a block diagram showing the overall configuration of the storage apparatus according to the present embodiment.
FIG. 2 is a block diagram showing the logical configuration of the storage apparatus.
FIG. 3 is a conceptual diagram for explaining the first use case.
FIG. 4 is a conceptual diagram for explaining the second use case.
FIG. 5 is a conceptual diagram for explaining the third use case.
FIG. 6 is a conceptual diagram showing the schematic configuration of the address conversion table.
FIG. 7 is a conceptual diagram showing the schematic configuration of the virtual volume information.
FIG. 8 is a conceptual diagram showing the schematic configuration of the local copy pair information.
FIG. 9 is a conceptual diagram showing the schematic configuration of the FPT.
FIG. 10 is a flowchart showing the processing procedure of the deduplication process according to the present embodiment.
FIG. 11 is a flowchart showing the processing procedure of the deduplication execution process.
FIG. 12 is a flowchart showing the processing procedure of the original data movement process.
Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.
(1) Configuration of the storage apparatus according to the present embodiment
In FIG. 1, reference numeral 1 denotes the storage apparatus according to the present embodiment as a whole. The storage apparatus 1 comprises a channel adapter package 3, a microprocessor board 4, and a cache memory package 5, which together form a storage controller 2, and a hard disk unit 6 that provides storage areas to the storage controller 2.
The channel adapter package 3 includes one or more channel adapters (not shown). Each channel adapter is an interface that performs protocol control during communication with the host device 8 via the network 7, and each has a port. Each port is assigned a unique WWN (World Wide Name) that identifies the port on the network 7.
The microprocessor board 4 is a board on which are mounted a CPU 11 having one or more microprocessors 10, each consisting of a CPU (Central Processing Unit) core, and a processor memory 12 consisting of semiconductor memory. Each microprocessor 10 of the CPU 11 has its own local memory 10A. The local memory 10A stores the microprogram, which is the program the microprocessor 10 uses to execute various kinds of processing, and the virtual volume information 23 described later, both loaded from the shared memory area 22 (described later) of the cache memory package 5. The processor memory 12 is memory shared by the microprocessors 10. The page update frequency information 13 and the address conversion table 14, both described later, are stored and held in the processor memory 12.
The cache memory package 5 comprises a plurality of DIMMs (Dual In-line Memory Modules) 20. Each DIMM 20 is a memory module in which a plurality of semiconductor memories, such as DRAM (Dynamic Random Access Memory), are mounted on a printed circuit board. Part of the storage area provided by the semiconductor memories constituting the DIMMs 20 is used as a cache memory area 21 that temporarily holds data read from or written to the storage devices 30 (described later) constituting the hard disk unit 6, and the remainder is used as a shared memory area 22 for storing control information and the like shared by the microprocessors 10 of the CPU 11. The virtual volume information 23 and the local copy pair information 24, both described later, are stored and held in this shared memory area 22.
The hard disk unit 6 comprises a plurality of storage devices 30. The storage devices 30 are expensive, high-performance disk devices such as FC (Fibre Channel) disks or SAS (Serial Attached SCSI) disks, inexpensive, low-performance disk devices such as SATA (Serial AT Attachment) disks, or SSDs (Solid State Drives).
FIG. 2 shows the logical configuration of the storage apparatus 1. As shown in FIG. 2, in the storage apparatus 1, one or more of the storage devices 30 constituting the hard disk unit 6 are managed together as a RAID (Redundant Arrays of Inexpensive Disks) group RG, and the storage areas provided by the storage devices 30 constituting one or more RAID groups RG are managed as a pool PL. The storage area in the pool PL is managed in units of partial areas of a predetermined size (for example, 42 MB). In the following, such a partial area is called a "page" or "physical page".
Each pool PL is associated with one or more virtual logical volumes VVOL formed using thin provisioning technology (hereinafter called "virtual volumes"), and each of these virtual volumes VVOL is provided to the host device 8 as a storage area for reading and writing data. In the following, a virtual volume VVOL (the storage space provided to the host device 8) is also called the "overwrite space".
Each virtual volume VVOL is assigned a unique identifier (hereinafter called a "LUN (Logical Unit Number)"). The storage area of a virtual volume VVOL is managed in units of partial areas of a predetermined size (for example, 512 bytes) called logical blocks. Each logical block is assigned an identifier unique to that logical block (hereinafter called an "LBA (Logical Block Address)"). The storage area of a virtual volume VVOL is further divided into, and managed as, partial areas that each consist of a plurality of logical blocks and have the same size as a physical page. In the following, such a partial area is called a "virtual page".
 ホスト装置8から仮想ボリュームVVOLに対するデータのリード/ライトは、その仮想ボリュームVVOLのLUNと、その仮想ボリュームVVOLにおけるデータのリード/ライトを行う領域の最初の論理ブロックのLUNと、そのデータのデータ長とを指定したリード要求やライト要求をストレージ装置1に発行することにより行われる。 Data read / write from the host device 8 to the virtual volume VVOL includes the LUN of the virtual volume VVOL, the LUN of the first logical block in the area where data is read / written in the virtual volume VVOL, and the data length of the data Is issued by issuing to the storage apparatus 1 a read request or a write request designating.
 ストレージ装置1では、かかるリード要求やライト要求を受信すると、ストレージコントローラ2のCPU11におけるそのとき負荷が最も低いマイクロプロセッサ10がそのリード要求やライト要求に対する処理の担当として割り当てられる。 When the storage device 1 receives such a read request or write request, the microprocessor 10 having the lowest load at that time in the CPU 11 of the storage controller 2 is assigned as a person in charge of processing the read request or write request.
If the request from the host apparatus 8 is a write request and no physical page is allocated to the virtual page specified as the write destination, the assigned microprocessor 10 allocates an unused physical page to that virtual page from the pool PL associated with the virtual volume VVOL. The microprocessor 10 then writes the data from the host apparatus 8 to the physical page allocated to the virtual page.
If the request from the host apparatus 8 is a read request or write request and a physical page is already allocated to the area specified as the read or write destination, the microprocessor 10 either reads the data from that physical page and transfers it to the host apparatus 8 that sent the read request (for a read request), or writes the data supplied by the host apparatus 8 to that physical page (for a write request).
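The allocate-on-first-write behavior described above can be illustrated with a minimal sketch. The class names `Pool` and `VirtualVolume`, the integer page numbers, the page size passed by the caller, and the choice to return zeros when an unallocated area is read are all assumptions made for illustration; the embodiment does not specify them.

```python
class Pool:
    """Pool PL holding unused physical pages, identified here by integers."""

    def __init__(self, num_pages):
        self.free_pages = list(range(num_pages))

    def allocate(self):
        if not self.free_pages:
            raise RuntimeError("pool has no unused physical page")
        return self.free_pages.pop(0)


class VirtualVolume:
    """Thin-provisioned virtual volume VVOL backed by a pool."""

    def __init__(self, pool, page_size):
        self.pool = pool
        self.page_size = page_size
        self.page_map = {}   # virtual page number -> physical page number
        self.physical = {}   # physical page number -> page contents

    def write(self, offset, data):
        vpage, inner = divmod(offset, self.page_size)
        # For brevity, a write is assumed not to cross a page boundary.
        if inner + len(data) > self.page_size:
            raise ValueError("write crosses a page boundary")
        if vpage not in self.page_map:
            # First write to this virtual page: allocate a physical page.
            ppage = self.pool.allocate()
            self.page_map[vpage] = ppage
            self.physical[ppage] = bytearray(self.page_size)
        page = self.physical[self.page_map[vpage]]
        page[inner:inner + len(data)] = data

    def read(self, offset, length):
        vpage, inner = divmod(offset, self.page_size)
        if vpage not in self.page_map:
            # Assumption: an unallocated area reads back as zeros.
            return bytes(length)
        page = self.physical[self.page_map[vpage]]
        return bytes(page[inner:inner + length])
```

In this sketch a physical page is consumed only when a virtual page is written for the first time, which is the property that lets the capacity of the pool PL be smaller than the total capacity of the virtual volumes.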
In this embodiment, the user can configure a virtual volume VVOL so that data deduplication is applied to it. In the following, a virtual volume VVOL configured in this way is referred to, as appropriate, as a deduplication-compatible volume.
In a deduplication-compatible volume, the area within each virtual page is divided, in order from the top of the virtual page, into partial areas called “chunks”, each having a predetermined size that is an integer multiple of the logical block size (for example, 8 KB). Each chunk is assigned an address unique to that chunk (hereinafter referred to as an LA (Logical Address)).
In the storage controller 2 (FIG. 1), the microprocessor 10 of the CPU 11 with the lowest load at that time executes deduplication processing for each deduplication-compatible volume at a predetermined period (for example, every 50 ms), asynchronously with the I/O processing for read requests and write requests from the host apparatus 8. In this processing, the microprocessor 10 determines, chunk by chunk, whether chunks have identical content; for chunks with identical content, it keeps the data of only one chunk and deletes the data of the others.
Comparing two pieces of data bit by bit or byte by byte during duplication determination would make the determination take a long time. The microprocessor 10 therefore calculates, for each piece of data to be compared, a check code consisting of a small feature value (for example, about 8 bytes) derived from that data, such as a hash value calculated with a hash function, and uses the calculated check codes to determine duplication between chunks. In the following embodiment, the check code generated from the data of one chunk is referred to as an “FPK (FingerPrint Key)”.
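The following sketch shows one way such a check code could be used to group chunks. It is only an illustration: the embodiment does not name a hash function, so BLAKE2b truncated to 8 bytes is an assumption, as are the function names `compute_fpk` and `find_duplicate_candidates`. Because a short hash can collide, the sketch treats chunks with equal FPKs as duplicate candidates only.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # 8 KB, the example chunk size given in the text
FPK_SIZE = 8           # about 8 bytes, the example FPK size given in the text


def compute_fpk(chunk: bytes) -> bytes:
    """Return a small fingerprint of one chunk (hash choice is an assumption)."""
    return hashlib.blake2b(chunk, digest_size=FPK_SIZE).digest()


def find_duplicate_candidates(chunks):
    """Group chunk addresses (LA) by FPK.

    chunks maps an LA to the chunk data stored there. The result maps each
    FPK shared by two or more chunks to the list of LAs carrying that FPK.
    """
    groups = {}
    for la, data in chunks.items():
        groups.setdefault(compute_fpk(data), []).append(la)
    return {fpk: las for fpk, las in groups.items() if len(las) > 1}
```

Comparing 8-byte values instead of 8 KB chunks reduces the work per comparison by roughly three orders of magnitude, which is the motivation stated above.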
When, as a result of duplication determination, the microprocessor 10 detects duplication of certain data for the first time, it keeps the data of only one of the chunks with identical content as the original data and deletes the data of the other chunks with identical content. At this time, the microprocessor 10 compresses the data to be kept as the original data using a lossless compression algorithm such as the LZW algorithm. For the data to be deleted, it registers the LA of the chunk in which that data was stored, in association with the FPK of that data, in a table stored in a dedicated virtual volume VVOL. In the following, this table is referred to as the “FPT (FingerPrint key Table)” 31 (FIG. 2).
The compressed data of the original data generated by the compression processing is stored in a location different from the physical page in which the uncompressed data was stored (hereinafter referred to as the “write-once space”). The write-once space is not a storage space that the host apparatus 8 can access; it is a storage space (virtual volume VVOL) that only the storage controller 2 can use. The storage controller 2 uses the write-once space to store compressed data in the storage devices 30. Compressed data is stored in the write-once space by appending. The correspondence between the LA of the original data registered in the FPT 31 and the address in the write-once space at which the compressed data of that original data is stored is managed using a table (hereinafter referred to as the address conversion table) 14 (FIG. 1).
When all the data stored in the chunks of a virtual page has been compressed and written to the write-once space, the physical page allocated to that virtual page is released. This makes effective use of the storage capacity.
If the host apparatus 8 issues a write request for update data (that is, an update request) to a virtual page whose data has been moved to the write-once space, the update data is compressed and appended to the write-once space. As another embodiment, however, the storage apparatus 1 may allocate a physical page to the virtual page in the overwrite space again, decompress the data that was moved to the write-once space, write the decompressed data back to the physical page allocated to the virtual page, and then update (overwrite) the data on that physical page.
(2) Deduplication function according to the present embodiment
As part of the deduplication function that executes the deduplication processing described above, the storage apparatus 1 of this embodiment has a function that determines where to place the original data when data is deduplicated for the first time, and which virtual volume VVOL to move the original data to when it later has to be moved between volumes. In both cases the function selects the virtual volume VVOL estimated to carry the lowest risk of the original data having to be moved again.
Specifically, when the storage apparatus 1 executes deduplication processing on certain data for the first time and that data belongs to a use case in which a master image is copied to generate a plurality of pieces of identical data (hereinafter referred to as the first use case), as shown for example in FIG. 3, it executes the deduplication processing so that the master image data (hereinafter referred to as master data) is kept as the original data and the remaining data generated by copying the master data (hereinafter referred to as copy data) is deleted.
One example of the first use case is VDI (Virtual Desktop Infrastructure). In VDI operation, most data other than user data is duplicated across users, and the duplicated data is rarely updated. It is also normally unlikely that the virtual volume storing the master data (hereinafter referred to as the master volume) is deleted before the copy-destination virtual volumes. Therefore, by concentrating the original data in the master volume and deduplicating the copy data in the first deduplication processing, the original data is unlikely to need moving, and the processing time for deleting a virtual volume that stores copy data can be expected to be shorter.
Whether the data to be deduplicated belongs to the first use case can be determined from information on the issuance of the XCOPY command, which is a data replication command. That is, the virtual volume VVOL specified as the copy source in an XCOPY command can be judged to be the master volume of the first use case. If, when an XCOPY command is issued, information indicating “XCOPY copy source” is attached to the copy-source virtual volume VVOL, the master volume can be identified easily during deduplication processing.
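As a rough sketch of this idea, the fragment below marks a volume when it serves as an XCOPY copy source and later prefers such a volume as the place to keep the original data. The record type `VolumeInfo`, its field names, and the functions `on_xcopy` and `choose_original_volume` are hypothetical names introduced for illustration; they are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class VolumeInfo:
    """Hypothetical per-volume record."""
    number: int
    capacity_gb: int
    dedup_enabled: bool
    xcopy_source: bool = False  # set once the volume has been an XCOPY copy source


def on_xcopy(volumes, source_number):
    """Record that the given volume was specified as an XCOPY copy source."""
    volumes[source_number].xcopy_source = True


def choose_original_volume(volumes, candidates):
    """Among volumes holding identical data, prefer an XCOPY copy source.

    Returns the volume number of the presumed master volume, or None when
    no candidate has been an XCOPY copy source (not the first use case).
    """
    for number in candidates:
        if volumes[number].xcopy_source:
            return number
    return None
```

Keeping the flag on the volume record means the deduplication processing does not need to inspect command history; it only reads one attribute per volume.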
On the other hand, in a case where the deduplicated data can be judged to be backup data obtained by periodic backup (hereinafter this use case is referred to as the second use case), when the original data has to be moved to another backup volume because, for example, the original data is updated or the virtual volume storing it is deleted, the storage apparatus 1 moves the original data to whichever of the candidate backup volumes was updated most recently.
For example, consider an operation in which, as shown in FIG. 4, a backup volume (virtual volume VVOL) is prepared for each day of the week and certain data to be backed up (hereinafter referred to as backup target data) is backed up to these backup volumes every day. In this case, the backup destination of the backup target data is naturally the backup volume corresponding to the current day of the week.
FIG. 4 shows an example in which deduplication processing has been performed on the data stored in a total of seven backup volumes covering one week, data “A” stored in the backup volume for Wednesday has been kept as the original data, and the identical data (data “A”) in the backup volumes for the other days has been deduplicated.
Now consider the case where the backup target data is updated on Wednesday. If the update changes data “A” to data “B”, the original data of data “A” must be moved to another backup volume before data “B” is backed up and overwrites data “A” stored in the backup volume for Wednesday. This is because, once data “B” overwrites data “A”, the data “A” contained in the backup data of the preceding week can no longer be restored.
In this case, the appropriate destination for the original data of data “A” is not the backup volume for Thursday, which will be updated the next day and would immediately require the original data to be moved again, but the backup volume for Tuesday, whose next update lies furthest in the future. In other words, in the second use case, the original data is preferably moved to the backup volume that was updated most recently.
Backup operation of this kind is often implemented with a local copy function. In the second use case, therefore, the destination of the original data can be determined using the pair information (latest operation time) of the local copy function.
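The selection rule for the second use case can be sketched as follows. The function name `choose_backup_destination`, the weekday labels used as volume identifiers, and the dates are hypothetical values chosen to mirror the weekly example above; the embodiment only states that the latest pair operation time is used.

```python
from datetime import datetime


def choose_backup_destination(pair_operation_time, candidates):
    """Return the candidate backup volume whose last pair operation is most recent.

    A volume that was just updated in a cyclic backup is the one whose next
    update is furthest away, so original data moved there stays put longest.
    """
    return max(candidates, key=lambda volume: pair_operation_time[volume])


# Weekly rotation: it is Wednesday 16 November and the Wednesday volume,
# which holds the original data, is about to be overwritten.
last_operation = {
    "Wed": datetime(2016, 11, 9),
    "Thu": datetime(2016, 11, 10),
    "Fri": datetime(2016, 11, 11),
    "Sat": datetime(2016, 11, 12),
    "Sun": datetime(2016, 11, 13),
    "Mon": datetime(2016, 11, 14),
    "Tue": datetime(2016, 11, 15),
}
candidates = [day for day in last_operation if day != "Wed"]
destination = choose_backup_destination(last_operation, candidates)  # "Tue"
```

With these values the Tuesday volume is chosen, matching the reasoning that Thursday's volume would force another move the very next day.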
When the original data has to be moved because, for example, it is overwritten or the virtual volume storing it is deleted, and the use case of that original data cannot be judged to be periodic backup operation (hereinafter this is referred to as the third use case), the storage apparatus 1 moves the original data to the virtual page with the lowest update frequency among the destination candidates, as shown in FIG. 5.
Moving the original data to the destination candidate with the lowest update frequency in this way reduces the risk that a data update will force the original data to be moved again. This method can be applied not only when determining the destination of the original data in use cases other than the second use case, but also when determining where to place the original data in the first deduplication processing (except in the first use case).
As means for realizing the functions described above, in the storage apparatus 1 of this embodiment, page update frequency information 13 and the address conversion table 14 are stored in the processor memory 12 of the microprocessor board 4, and virtual volume information 23 and local copy pair information 24 are stored in the shared memory area 22 of the cache memory package 5, as shown in FIG. 1. In addition, as described above with reference to FIG. 2, a virtual volume VVOL that only the storage controller 2 can use (hereinafter referred to as the FPT volume) is defined in the storage apparatus 1, and the FPT 31 is stored in this FPT volume.
The page update frequency information 13 has a table structure that stores, for each chunk of each virtual volume VVOL, the number of updates (update frequency) within a predetermined time (for example, several seconds to several hours). Each time data written to a virtual volume VVOL is updated, the microprocessor 10 (FIG. 1) that handled the update increments the corresponding update frequency value by one.
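A minimal sketch of such a counter, together with the lowest-update-frequency selection used in the third use case, is shown below. The class name `UpdateFrequency`, the sliding-window bookkeeping with explicit timestamps, and the string keys are assumptions; the embodiment only states that a per-chunk count within a predetermined time is kept and incremented on each update.

```python
from collections import defaultdict, deque


class UpdateFrequency:
    """Per-chunk update counts within a time window (window handling is assumed)."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = defaultdict(deque)  # chunk key -> timestamps of updates

    def record_update(self, key, now):
        """Called each time the chunk identified by key is updated."""
        self.events[key].append(now)

    def count(self, key, now):
        """Number of updates to key within the last window_seconds."""
        timestamps = self.events[key]
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()  # drop updates that fell out of the window
        return len(timestamps)

    def least_updated(self, candidates, now):
        """Return the candidate with the fewest recent updates."""
        return min(candidates, key=lambda key: self.count(key, now))
```

A caller would record an update on every host write and ask `least_updated` for a destination when the original data has to move; ties are resolved in favor of the first candidate listed.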
The address conversion table 14 is used to manage where each chunk was moved to when the data of a chunk in the overwrite space was moved to the write-once space. When data stored in the overwrite space is compressed and stored in the write-once space, the microprocessor 10 that handled the processing stores in the address conversion table 14, in association with each other, the address in the overwrite space at which the data was stored and the address in the write-once space at which the compressed data is stored.
In practice, as shown in FIG. 6, the address conversion table 14 comprises an overwrite space address column 14A and a write-once space address column 14B. The overwrite space address column 14A stores the address (LA) in the overwrite space of each chunk whose data has been compressed and moved to the write-once space, and the write-once space address column 14B stores the destination address (PA) of the corresponding chunk in the write-once space.
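The interplay between appending compressed data and the LA-to-PA mapping can be sketched as follows. This is an illustration under several assumptions: the class name `WriteOnceSpace` is hypothetical, Python's `zlib` (DEFLATE) stands in for the LZW algorithm mentioned in the text because the standard library has no LZW codec, and the table additionally records the compressed length so that data can be read back, which FIG. 6 as described does not show.

```python
import zlib


class WriteOnceSpace:
    """Append-only store of compressed chunks with an LA -> PA table."""

    def __init__(self):
        self.log = bytearray()   # the write-once space itself
        self.address_table = {}  # LA -> (PA, compressed length)

    def store(self, la, data):
        """Compress data, append it, and point the LA at the new location."""
        compressed = zlib.compress(data)
        pa = len(self.log)       # appended at the current end of the space
        self.log += compressed
        self.address_table[la] = (pa, len(compressed))
        return pa

    def load(self, la):
        """Look up the PA for an LA and return the decompressed chunk."""
        pa, length = self.address_table[la]
        return zlib.decompress(bytes(self.log[pa:pa + length]))
```

Storing update data for an LA that is already in the table appends a new compressed copy and redirects the table entry, so earlier bytes in the space are never overwritten in place.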
The virtual volume information 23 is used to manage each virtual volume VVOL defined in the storage apparatus 1 and, as shown in FIG. 7, comprises a volume number column 23A, a capacity column 23B, a deduplication setting column 23C, an XCOPY attribute column 23D, and so on.
The volume number column 23A stores the identification numbers (volume numbers) assigned to all the virtual volumes VVOL defined in the storage apparatus 1, and the capacity column 23B stores the capacity set for the corresponding virtual volume VVOL.
The deduplication setting column 23C stores information indicating whether the corresponding virtual volume VVOL is set as a deduplication-compatible volume (in FIG. 7, “Yes” if it is set and “No” if it is not).
The XCOPY attribute column 23D stores an attribute indicating whether the corresponding virtual volume VVOL has been the copy source of a copy based on an XCOPY command (hereinafter referred to as the XCOPY attribute). FIG. 7 shows an example in which the character string “XCOPY copy source” is stored when the corresponding virtual volume VVOL has been the copy source of a copy based on an XCOPY command. This information is registered by the microprocessor 10 (FIG. 1) that controlled the copy processing when the virtual volume became an XCOPY copy source.
The local copy pair information 24 is used to manage each local copy pair defined in the storage apparatus 1 and, as shown in FIG. 8, has a table structure comprising a volume number column 24A, a pair attribute column 24B, a pair operation time column 24C, a pair count column 24D, and a partner volume number column 24E.
The volume number column 24A stores the volume numbers of all the virtual volumes VVOL defined in the storage apparatus 1. When the corresponding virtual volume VVOL forms a local copy pair with another virtual volume VVOL, the pair attribute column 24B stores information indicating whether it is the primary volume (primary VOL), which is the copy source of the pair, or the secondary volume (secondary VOL), which is the copy destination. If the corresponding virtual volume VVOL does not form a copy pair with any virtual volume VVOL, nothing is stored in the pair attribute column 24B.
When the corresponding virtual volume VVOL forms a copy pair with another virtual volume VVOL, the pair operation time column 24C stores the time at which a predetermined operation, such as creation of the copy pair or resynchronization, was last performed.
The pair count column 24D stores the number of partner virtual volumes VVOL with which the corresponding virtual volume VVOL forms copy pairs, and the partner volume number column 24E stores the volume numbers of all of these partner virtual volumes VVOL.
The FPT 31, on the other hand, is a table for managing the FPK of each chunk in each deduplication-compatible volume calculated during deduplication processing and, as shown in FIG. 9, comprises a column 31A for each distinct FPK calculated in the deduplication processing.
The top row of each column 31A (hereinafter referred to as the FPK row) 31B stores the value of the corresponding FPK. The rows below the FPK row 31B in each column 31A (hereinafter referred to as LA rows) 31C store the LAs of all chunks whose stored data has an FPK value equal to the FPK value stored in the FPK row 31B.
Accordingly, each column 31A of the FPT 31 holds the LAs of all chunks found by the deduplication processing so far to store data whose FPK equals the FPK stored in the FPK row 31B of that column 31A. Of these chunks, the data of only one chunk is kept as the original data (compressed and stored in the write-once space), and the data of the other chunks in the same column 31A has been deleted by the deduplication processing.
In this embodiment, in each column 31A of the FPT 31, the LA of the chunk whose data is kept as the original data is stored in the topmost LA row 31C. The example of FIG. 9 therefore shows that the data whose FPK is “FPK1” has been deduplicated and that its original data is stored in the virtual page to which the LA “LA1” is assigned.
Also in this embodiment, when the original data is updated or the virtual volume VVOL storing the original data is deleted, the original data is moved to the LA stored in the next LA row 31C of the same column 31A. In the example of FIG. 9, therefore, when the data (original data) stored in the virtual page with the LA “LA1” is updated, or when the virtual volume VVOL containing the virtual page “LA1” is deleted, the original data stored in the virtual page with the LA “LA1” is moved to the virtual page with the LA “LA1031”.
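The column layout of the FPT and the rule for moving the original data can be sketched with a dictionary of lists. The class name `FingerprintTable` and its methods are hypothetical, and removing the old LA from the column when the original data moves is an assumption; the text only states that the original data moves to the LA in the next LA row.

```python
class FingerprintTable:
    """One list of LAs per FPK; index 0 is the LA holding the original data."""

    def __init__(self):
        self.columns = {}  # FPK -> list of LAs (the LA rows of one column)

    def register(self, fpk, la):
        """Add an LA to the column for fpk, creating the column if needed."""
        las = self.columns.setdefault(fpk, [])
        if la not in las:
            las.append(la)

    def original(self, fpk):
        """LA of the chunk whose data is kept as the original data."""
        return self.columns[fpk][0]

    def move_original(self, fpk):
        """Move the original data to the next LA row of the same column.

        Called before the current original is updated or its volume is
        deleted. Returns the new original LA, or None if no LA remains.
        """
        las = self.columns[fpk]
        las.pop(0)  # assumption: the old LA leaves the column
        if not las:
            del self.columns[fpk]
            return None
        return las[0]
```

For a column registered in the order “LA1”, “LA1031”, a call to `move_original` makes “LA1031” the new location of the original data, matching the example of FIG. 9 described above.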
(3) Various processes related to the deduplication function
Next, the specific content of the various processes related to the deduplication function of this embodiment described above is explained. The processes below are executed by the microprocessor 10 of the CPU 11 with the lowest load at that time, based on the microprogram stored in the local memory 10A.
(3-1) Deduplication processing
FIG. 10 shows the procedure of the deduplication processing executed by one of the microprocessors 10 (FIG. 1) of the CPU 11 (FIG. 1) each time an activation command is issued periodically (for example, every 50 ms) by a scheduler (not shown).
When the CPU 11 receives this activation command, one of the microprocessors 10 in the CPU 11 starts the deduplication processing shown in FIG. 10 and first selects, from among the deduplication-compatible volumes defined in the storage apparatus 1, one deduplication-compatible volume to be processed in step S2 and subsequent steps (hereinafter referred to as the target volume) (S1).
The target volume may be selected either randomly from among the deduplication-compatible volumes or in a predetermined order. One possible form of the former method is to add a predetermined prime number to the volume number of the deduplication-compatible volume that was the last target volume in the preceding deduplication processing, or to the volume number of the immediately preceding target volume in the current deduplication processing, and to use the deduplication-compatible volume with the resulting volume number as the target volume. One possible form of the latter method is to select deduplication-compatible volumes as target volumes in ascending or descending order of volume number.
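The prime-number stepping mentioned above can be sketched as follows. The text does not say what happens when the sum exceeds the largest volume number, so wrapping around with a modulo over the number of deduplication-compatible volumes is an assumption, as are the function names.

```python
def next_target(previous_index, prime, volume_count):
    """Step to the next target volume by adding a fixed prime (wrap-around assumed)."""
    return (previous_index + prime) % volume_count


def visiting_order(start, prime, volume_count):
    """Return the sequence of volume indexes visited in one full cycle."""
    order = []
    index = start
    for _ in range(volume_count):
        order.append(index)
        index = next_target(index, prime, volume_count)
    return order


# With 10 volumes and the prime 7, every volume is visited exactly once.
order = visiting_order(0, 7, 10)  # [0, 7, 4, 1, 8, 5, 2, 9, 6, 3]
```

Under this assumption the sequence covers every volume once per cycle as long as the prime does not divide the number of volumes, while still looking scattered compared with a plain ascending scan.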
The microprocessor 10 then acquires the XCOPY attribute stored in the XCOPY attribute column 23D (FIG. 7) corresponding to the target volume in the virtual volume information 23 described above with reference to FIG. 7 (S2), and determines from the acquired XCOPY attribute whether the target volume has ever been the copy source of a copy executed in accordance with an XCOPY command (S3). If the result of this determination is negative, the microprocessor 10 proceeds to step S5.
If the result of the determination in step S3 is positive, the microprocessor 10 executes deduplication on the data stored in the target volume (S4) and then determines whether the processing of steps S1 to S4 has been completed for all deduplication-compatible volumes in the storage apparatus 1 (S5).
If the result of this determination is negative, the microprocessor 10 returns to step S1 and repeats the processing of steps S1 to S5 while switching the deduplication-compatible volume selected as the target volume in step S1 to another deduplication-compatible volume that has not yet been processed.
Through this repetition of steps S1 to S5, deduplication is executed on the data stored in each deduplication-compatible volume that has been the copy source of a copy executed in accordance with an XCOPY command.
When deduplication has been completed for the data stored in all deduplication-compatible volumes that have been the copy source of a copy executed in accordance with an XCOPY command and the microprocessor 10 thus obtains a positive result in step S5, it selects, from among the deduplication-compatible volumes on which the deduplication execution process has not yet been performed (deduplication-compatible volumes that have never been the copy source of a copy executed in accordance with an XCOPY command), one deduplication-compatible volume (target volume) to be processed from this step onward (S6).
The target volume in step S6 may likewise be selected either randomly or in a predetermined order from among the deduplication-compatible volumes on which the deduplication execution process has not yet been performed.
Next, the microprocessor 10 executes the deduplication execution process on the data stored in the target volume selected in step S6 (S7) and then determines whether deduplication has been completed for the data stored in all deduplication-compatible volumes defined in the storage apparatus 1 (S8).
If the microprocessor 10 obtains a negative result in this determination, it returns to step S6 and thereafter repeats the processing of steps S6 to S8 while sequentially switching the deduplication-compatible volume determined as the target volume in step S6 to another unprocessed deduplication-compatible volume.
By repeating steps S6 to S8 in this manner, deduplication is executed on the data stored in each deduplication-compatible volume that has never been the copy source of a copy executed in accordance with an XCOPY command.
When the microprocessor 10 eventually obtains a positive result in step S8 by completing deduplication of the data stored in all deduplication-compatible volumes that have never been the copy source of a copy executed in accordance with an XCOPY command, it ends this deduplication processing.
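The two-pass ordering of steps S1 to S8 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the `Volume` class and its `dedup_enabled` and `was_copy_source` fields are hypothetical stand-ins for the per-volume copy attribute information, and a fixed order is used within each pass although the description also allows a random order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Volume:
    number: int
    dedup_enabled: bool    # hypothetical flag: deduplication-compatible volume
    was_copy_source: bool  # hypothetical flag: has ever been an XCOPY copy source


def dedup_order(volumes: List[Volume]) -> List[Volume]:
    """Order volumes so that former copy sources are processed first."""
    candidates = [v for v in volumes if v.dedup_enabled]
    # First pass (S1 to S5): volumes that have ever been a copy source.
    first_pass = [v for v in candidates if v.was_copy_source]
    # Second pass (S6 to S8): the remaining deduplication-compatible volumes.
    second_pass = [v for v in candidates if not v.was_copy_source]
    return first_pass + second_pass


volumes = [
    Volume(0, True, False),
    Volume(1, True, True),
    Volume(2, False, True),
    Volume(3, True, True),
]
order = [v.number for v in dedup_order(volumes)]
```

In this sketch, volume 2 is skipped because it is not deduplication-compatible, and volumes 1 and 3 precede volume 0.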
FIG. 11 shows the specific contents of the deduplication execution processing executed by the microprocessor 10 in steps S4 and S7 of the deduplication processing.
When the microprocessor 10 proceeds to step S4 or step S7 of the deduplication processing, it starts the deduplication execution processing shown in FIG. 11 and first selects, from among the virtual pages in the target volume, one virtual page to be subjected to the processing of step S11 and subsequent steps (hereinafter referred to as the target page) (S10). The target page may be determined either at random or in a predetermined order from among the virtual pages of the target volume.
Subsequently, the microprocessor 10 determines whether deduplication is necessary for the data of the target page determined in step S10 (hereinafter referred to as the target data) (S11). In this embodiment, to avoid frequent updates of original data that has been compressed and stored in the write-once space, deduplication is not performed when the target data has been updated within a predetermined time (for example, one hour). Accordingly, in step S11 the microprocessor 10 refers to the page update frequency information 13 (FIG. 1) and determines whether to perform deduplication on the target data based on whether the target page has been updated within the predetermined time.
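The check in step S11 can be illustrated with a small sketch. The function name, the use of timestamps in seconds, and the treatment of the boundary case (exactly one hour counts as eligible) are assumptions made for illustration; the description only states that pages updated within the predetermined time are not deduplicated.

```python
def needs_dedup(last_update_time: float, now: float,
                quiet_period_s: float = 3600.0) -> bool:
    """Return True when the page was not updated within the quiet period.

    quiet_period_s defaults to one hour, the example value in the description.
    """
    return (now - last_update_time) >= quiet_period_s
```

A page last updated 30 minutes ago is skipped (the flow proceeds to step S21), whereas a page last updated two hours ago proceeds to the hash calculation of step S12.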
If the microprocessor 10 obtains a negative result in this determination, it proceeds to step S21. If it obtains a positive result in the determination of step S11, it calculates the hash value of the target data using a predetermined hash function and uses it as the FPK of the target data (S12).
Next, the microprocessor 10 sequentially compares the hash value (FPK) of the target data calculated in step S12 with each FPK stored in the FPT 31 (FIG. 9) (S13), and determines whether the hash value (FPK) of the target data matches any FPK already registered in the FPT 31 (S14).
Obtaining a negative result in this determination means that the hash value (FPK) of the target data has not yet been registered in the FPT 31. In this case, the microprocessor 10 newly registers the hash value calculated in step S12 in the FPT 31 as the FPK of the target page, and stores the LA of the target page in the uppermost FPK row 31C of the column 31A (FIG. 9) corresponding to that FPK in the FPT 31 (S18).
The microprocessor 10 then compresses the target data, writes the resulting compressed data to the write-once space, registers in the address conversion table 14 the correspondence between the LA of the target page and the address (PA) in the write-once space at which the compressed data is stored (S19), and proceeds to step S20.
In contrast, obtaining a positive result in the determination of step S14 means that the hash value (FPK) of the target data is already registered in the FPT 31. Even in this case, however, the target data does not necessarily match exactly the data whose FPK registered in the FPT 31 equals the hash value of the target data. The microprocessor 10 therefore compares the target data with the data whose FPK registered in the FPT 31 has the same value as the hash value of the target data (S15).
Specifically, the microprocessor 10 acquires from the FPT 31 the LA of the chunk storing the original data of the data whose registered FPK has the same value as the hash value of the target data (the LA registered in the uppermost FPK row 31C of the column 31A of that FPK in the FPT 31). It then refers to the address conversion table 14 (FIG. 6) to acquire the address in the write-once space corresponding to that LA, and reads the compressed data of the original data from that address in the write-once space. The microprocessor 10 then decompresses the read compressed data to restore the original data before compression, and compares the restored original data with the data of the target page.
Thereafter, based on the comparison result of step S15, the microprocessor 10 determines whether the target data matches the original data of the data whose FPK registered in the FPT 31 has the same value as the hash value of the target data (S16).
If the microprocessor 10 obtains a negative result in this determination, it compresses the target data, writes the resulting compressed data to the write-once space, registers in the address conversion table 14 the correspondence between the LA of the target page and the address (PA) in the write-once space at which the compressed data is stored (S19), and proceeds to step S20.
In contrast, if the microprocessor 10 obtains a positive result in the determination of step S16, it additionally registers the LA of the target page in the last FPK row 31C of the column 31A of the FPK in the FPT 31 that has the same value as the hash value of the target data (S17), and then discards the page in the overwrite space in which the target data is stored (that is, deletes the data of that page) (S20).
Subsequently, the microprocessor 10 determines whether the processing of steps S11 to S20 has been completed for all virtual pages in the target volume (S21). If the microprocessor 10 obtains a negative result in this determination, it returns to step S10 and thereafter repeats the processing of steps S10 to S21 while sequentially switching the target page determined in step S10 to another unprocessed virtual page in the target volume.
When the microprocessor 10 eventually obtains a positive result in step S21 by completing the processing of steps S11 to S20 for all virtual pages in the target volume, it ends this deduplication execution processing and returns to the deduplication processing.
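The per-page flow of steps S12 to S20 can be sketched as below. This is a simplified model under stated assumptions, not the embodiment's implementation: the FPT is modeled as a dictionary from FPK to a list of LAs whose first element identifies the original data, the write-once space as a Python list whose index serves as the PA, and SHA-256 and zlib stand in for the unspecified hash function and compression scheme. Pointing the duplicate page's LA at the original data's PA is also an assumption, suggested by the description of step S38.

```python
import hashlib
import zlib


def dedup_page(la, overwrite_space, append_space, address_table, fpt):
    """Process one target page; return "deduplicated" or "stored"."""
    data = overwrite_space[la]
    fpk = hashlib.sha256(data).digest()          # S12: fingerprint of the target data
    las = fpt.get(fpk)                           # S13/S14: look the FPK up in the FPT
    if las is not None:
        original_pa = address_table[las[0]]      # S15: head LA identifies the original data
        original = zlib.decompress(append_space[original_pa])
        if original == data:                     # S16: byte-for-byte comparison
            las.append(la)                       # S17: register the LA at the end of the column
            address_table[la] = original_pa      # assumption: share the original data's PA
            del overwrite_space[la]              # S20: discard the page in the overwrite space
            return "deduplicated"
    else:
        fpt[fpk] = [la]                          # S18: new FPK, target page becomes the head LA
    append_space.append(zlib.compress(data))     # S19: append the compressed data
    address_table[la] = len(append_space) - 1    # S19: record the LA to PA mapping
    del overwrite_space[la]                      # S20
    return "stored"
```

Because the head of each list is the first LA registered for that FPK, processing former copy-source volumes first means their pages become the original data, which is the ordering effect the description relies on.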
As described above, in the storage apparatus 1 of this embodiment, the deduplication execution processing is executed on the deduplication-compatible volumes that have ever been the copy source of a copy performed in accordance with an XCOPY command before it is executed on the deduplication-compatible volumes that have never been such a copy source. As a result, in the FPT 31, the LAs of the virtual pages of the former volumes are registered at higher positions than the LAs of the virtual pages of the latter volumes.
In the storage apparatus 1 of this embodiment, as described above, the data of the virtual page whose LA is stored in the uppermost row 31C of the FPT 31 is retained as the original data. Through the series of processing shown in FIG. 10 and FIG. 11, the LA of a virtual page of a deduplication-compatible volume that has ever been the copy source of a copy performed in accordance with an XCOPY command is therefore stored in the uppermost row 31C of the FPT 31. Consequently, the data stored in the deduplication-compatible volume that was the copy source is retained as the original data, and the data stored in the deduplication-compatible volume that was the copy destination of that copy is deduplicated.
(3-2) Original Data Movement Processing
 FIG. 12 shows the processing procedure of the original data movement processing executed by one of the microprocessors 10 of the CPU 11 when a situation arises in which original data must be moved from its current virtual volume VVOL to another virtual volume VVOL that is a movement destination candidate, for example upon deletion of the virtual volume VVOL or overwriting of the original data. When such a situation arises, the microprocessor 10 moves the original data to another virtual volume VVOL that is a movement destination candidate in accordance with the processing procedure shown in FIG. 12.
In practice, when such a situation arises, the microprocessor 10 starts the original data movement processing shown in FIG. 12 and first acquires, from the local copy pair information 24 (FIG. 8), information on the local copy pairs in which the virtual volume VVOL storing the original data to be moved (hereinafter referred to as the original data storage volume) is the data copy source or copy destination (S30).
Specifically, the microprocessor 10 acquires the information of the record (row) corresponding to the original data storage volume in the local copy pair information 24, and the information of all records (rows) in which the volume number of the original data storage volume is stored in the partner volume number column 24E (FIG. 8).
Subsequently, based on the information acquired in step S30, the microprocessor 10 determines whether the original data storage volume is set as the secondary volume (copy destination virtual volume VVOL) of a copy pair with another virtual volume VVOL (S31). This determination is made by checking whether the pair attribute stored in the pair attribute column 24B (FIG. 8) of the record of the original data storage volume acquired in step S30 is "secondary volume".
Obtaining a negative result in this determination means either that the original data storage volume is not set in a local copy pair with any virtual volume VVOL, or that it is set in a local copy pair with another virtual volume VVOL as the primary volume. In this case, the microprocessor 10 determines that the use case of the original data is the third use case described above and proceeds to step S35.
In contrast, if the microprocessor 10 obtains a positive result in the determination of step S31, it acquires from the local copy pair information 24 the information of all copy pairs relating to the volume set as the primary volume in the copy pair in which the original data storage volume is set as the secondary volume (hereinafter referred to as the specific volume) (S32). Specifically, it acquires the information of the record (row) corresponding to the specific volume from the local copy pair information 24.
Then, based on the information acquired in step S32, the microprocessor 10 determines whether the movement destination candidates of the original data include a virtual volume VVOL (secondary volume) other than the original data storage volume that is set in a copy pair having the specific volume as the primary volume (S33). This determination is made based on whether, in the information acquired in step S32, the partner volume number column 24E (FIG. 8) stores the volume number of a virtual volume VVOL other than the original data storage volume.
Obtaining a positive result in this determination means that there are a plurality of copy pairs having the specific volume as the copy source, and that the original data storage volume is the secondary volume of one of these copy pairs.
In this case, the microprocessor 10 determines that the use case of the original data is the second use case. Referring to the local copy pair information 24 (FIG. 8), it determines as the movement destination of the original data the most recently updated secondary volume, other than the original data storage volume, among the secondary volumes of the plurality of copy pairs having the specific volume as the copy source (S34).
Specifically, in step S34 the microprocessor 10 examines the records in the local copy pair information 24 of the virtual volumes VVOL detected in step S33, that is, the virtual volumes VVOL whose volume numbers differ from that of the original data storage volume. It determines as the movement destination of the original data the virtual volume VVOL whose record has the most recent time stored in the pair operation time column 24C (FIG. 8). The microprocessor 10 then proceeds to step S37.
In contrast, obtaining a negative result in the determination of step S33 means that the specific volume is not set in a local copy pair with any virtual volume VVOL other than the original data storage volume. In this case, the microprocessor 10 determines that the use case of the original data is the third use case. It acquires from the page update frequency information 13 (FIG. 1) the update frequency of each virtual page that can be a movement destination of the original data in each virtual volume VVOL that can be a movement destination of the original data, and compares these update frequencies (S35).
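The branch structure of steps S31 to S34 can be sketched as follows. The `PairRecord` layout is a hypothetical simplification of the local copy pair information 24: each secondary volume carries one record naming its primary volume as partner, and the pair operation time is a plain number. The sketch also omits the restriction that the chosen volume must be a movement destination candidate.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PairRecord:
    volume: int                 # volume number of this record
    pair_attribute: str         # "primary" or "secondary" (hypothetical encoding)
    partner: int                # partner volume number
    pair_operation_time: float  # time of the last pair operation


def choose_by_copy_pairs(pair_info: List[PairRecord],
                         source_volume: int) -> Tuple[str, Optional[int]]:
    """Return (use_case, destination_volume); None for the third use case."""
    # S31: is the original data storage volume a secondary volume?
    own = [r for r in pair_info
           if r.volume == source_volume and r.pair_attribute == "secondary"]
    if not own:
        return "third", None
    specific_volume = own[0].partner  # primary volume of that pair
    # S33: other secondary volumes paired with the same primary volume.
    siblings = [r for r in pair_info
                if r.pair_attribute == "secondary"
                and r.partner == specific_volume
                and r.volume != source_volume]
    if not siblings:
        return "third", None
    # S34: pick the sibling with the most recent pair operation time.
    newest = max(siblings, key=lambda r: r.pair_operation_time)
    return "second", newest.volume
```

When the third use case is returned, the destination is instead chosen by update frequency in steps S35 and S36.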
Here, a "virtual volume VVOL that can be a movement destination of the original data" is a deduplication-compatible volume in which data having the same content as the original data has been deduplicated, and a "virtual page that can be a movement destination of the original data" is a virtual page in that deduplication-compatible volume in which data having the same content as the original data has been deduplicated.
Then, based on the comparison result of step S35, the microprocessor 10 determines as the movement destination of the original data the virtual page with the lowest update frequency among the virtual pages that can be a movement destination of the original data in the virtual volumes VVOL that can be a movement destination of the original data (S36).
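The selection in steps S35 and S36 reduces to taking the minimum over the candidate pages. In the sketch below, the `update_frequency` mapping is a hypothetical stand-in for the page update frequency information 13, and each candidate page is identified by a (volume, page) tuple chosen only for illustration.

```python
def pick_least_updated(candidate_pages, update_frequency):
    """Return the candidate virtual page with the lowest update frequency (S35/S36)."""
    return min(candidate_pages, key=lambda la: update_frequency[la])


update_frequency = {("vol2", 7): 12, ("vol3", 4): 1, ("vol4", 9): 5}
destination = pick_least_updated(list(update_frequency), update_frequency)
```

Here `destination` is `("vol3", 4)`, the page that has been updated least often and is therefore least likely to force another movement of the original data.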
Subsequently, the microprocessor 10 copies the original data to the movement destination of the original data determined in step S34 or step S36 (S37). This copy is performed by newly appending to the write-once space a copy of the compressed data of the original data stored in the write-once space. At this time, the microprocessor 10 also rewrites the address (PA) in the write-once space that is associated in the address conversion table 14 (FIG. 6) with the LA of the movement source of the original data to the address (PA) of the copy destination of the original data in the write-once space.
Furthermore, the microprocessor 10 rewrites the address (PA) in the write-once space that is recorded in the address conversion table 14 for the deduplicated data having the same FPK as the original data to the address (PA) of the copy destination of the original data appended in step S37 (S38).
Next, the microprocessor 10 deletes the LA of the virtual page in which the original data had been stored until then, which is stored in the first LA row 31C of the column 31A of the FPK corresponding to the original data among the columns 31A of the FPT 31 (S39).
The microprocessor 10 also stores the LA of the virtual page that is the movement destination of the original data in the first LA row 31C of that column 31A in the FPT 31 and, as necessary, shifts the LAs stored in that column 31A so that the LAs other than the LA of that virtual page are packed toward the top (S40).
The microprocessor 10 then ends this original data movement processing.
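Steps S37 to S40 can be sketched as a single update over three structures. The model is an illustrative assumption: `fpt_column` is a Python list holding the LAs of one FPK column with the original data's LA at index 0, `append_space` is a list whose index serves as the PA, and `address_table` maps each LA to a PA. The destination LA is assumed to be already registered in the column as a deduplicated page.

```python
def move_original(fpt_column, address_table, append_space, new_la):
    """Move the original data of one FPK column to new_la; return the new PA."""
    old_la = fpt_column[0]
    old_pa = address_table[old_la]
    # S37: append a copy of the compressed original data to the write-once space.
    append_space.append(append_space[old_pa])
    new_pa = len(append_space) - 1
    # S37/S38: repoint the movement source LA and every deduplicated LA to the new PA.
    for la in fpt_column:
        address_table[la] = new_pa
    # S39: delete the LA of the page that used to hold the original data.
    fpt_column.remove(old_la)
    # S40: place the destination LA at the head; the remaining LAs stay packed.
    fpt_column.remove(new_la)
    fpt_column.insert(0, new_la)
    return new_pa
```

For example, with a column `["A", "B", "C"]` and destination `"C"`, the column becomes `["C", "B"]` and all three LAs in the address table point at the newly appended PA.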
(4) Effects of This Embodiment
 As described above, in a case where the data to be deduplicated can be determined to have been duplicated by a copy (the first use case), the storage apparatus 1 of this embodiment performs the deduplication processing so that the original data remains in the copy source virtual volume.
In a case where the original data of deduplicated data exists in one of a plurality of backup destination virtual volumes VVOL (the second use case), when that original data is updated or that virtual volume VVOL is deleted, the storage apparatus 1 moves the original data to the most recently updated virtual volume VVOL among the other backup destination virtual volumes VVOL. Furthermore, when the use case of the deduplicated data is other than the second use case, the storage apparatus 1 moves the original data to the virtual page with the lowest update frequency.
Accordingly, with the storage apparatus 1, movement of original data due to an update of the original data or deletion of a virtual volume VVOL is unlikely to occur. As a result, the resources consumed by moving the original data of deduplicated data can be reduced, and the processing cost of moving original data can be lowered.
In addition, with the storage apparatus 1, when the first use case described above is presumed, the deduplication processing is executed so that the original data remains in the copy source virtual volume, so the probability that original data remains in a deduplication-compatible volume at the time that volume is deleted is low. The probability that original data must be moved when a deduplication-compatible volume is deleted can therefore be reduced as far as possible, which shortens the (average) processing time required to delete a deduplication-compatible volume.
(5) Other Embodiments
 In the embodiment described above, the present invention is applied to the storage apparatus 1 configured as shown in FIG. 1. However, the present invention is not limited to this and can be widely applied to storage apparatuses of various other configurations equipped with a deduplication function.
The embodiment described above includes three units. The management unit manages, in units of logical volumes, copy attribute information indicating whether a logical volume has ever been the copy source of a copy. The deduplication processing execution unit executes the deduplication processing, based on the copy attribute information, so that the original data remains in a logical volume that has ever been a copy source. The original data movement unit, when a situation arises in which the original data must be moved to another logical volume, determines as the movement destination the logical volume estimated to have the lowest risk of a subsequent movement of the original data, and moves the original data to the determined logical volume. In that embodiment, these units have a software configuration realized by the microprocessor 10 of the CPU 11 executing a microprogram. However, the present invention is not limited to this, and some or all of the management unit, the deduplication processing execution unit, and the original data movement unit may be configured as dedicated hardware.
The present invention can be widely applied to storage apparatuses equipped with a deduplication function.
1……storage apparatus, 2……storage controller, 8……host apparatus, 10……microprocessor, 11……CPU, 12……processor memory, 13……page update frequency information, 14……address conversion table, 22……shared memory area, 23……virtual volume information, 24……local copy pair information, 30……storage device, 31……FPT, VVOL……virtual volume.

Claims (8)

  1.  A storage apparatus that provides logical volumes as storage areas to a host apparatus and executes, on data stored in the logical volumes, deduplication processing that retains one of a plurality of pieces of data having the same content as original data and deletes the other pieces of data, the storage apparatus comprising:
     a management unit that manages, in units of the logical volumes, copy attribute information indicating whether each logical volume has ever been the copy source of a copy; and
     a deduplication processing execution unit that executes the deduplication processing, based on the copy attribute information, so that the original data remains in a logical volume that has ever been the copy source.
  2.  The storage apparatus according to claim 1, further comprising
     an original data movement unit that, when a situation arises in which the original data must be moved to another of the logical volumes, determines as the movement destination of the original data the logical volume estimated to have the lowest risk of a subsequent movement of the original data, and moves the original data to the determined logical volume.
  3.  The storage apparatus according to claim 2, wherein
     when the movement source logical volume of the original data is the data copy destination logical volume in a first copy pair, and there exist logical volumes that are set in second copy pairs with the data copy source logical volume of the first copy pair and that are the data copy destinations in the second copy pairs, the original data movement unit determines the most recently updated logical volume among those logical volumes as the logical volume estimated to have the lowest risk of a subsequent movement of the original data.
  4.  The storage apparatus according to claim 3, wherein
     in cases other than the case where the movement source logical volume of the original data is the data copy destination logical volume in the first copy pair and there exists a logical volume that is set in a second copy pair with the data copy source logical volume of the first copy pair and that is the data copy destination in the second copy pair, the original data movement unit determines, as the movement destination of the original data, the partial area with the lowest update frequency among the partial areas of all the logical volumes that can be movement destination candidates for the original data.
  5.  A method of controlling a storage apparatus that provides logical volumes as storage areas to a host apparatus and executes, on data stored in the logical volumes, deduplication processing that retains one of a plurality of pieces of data having the same content as original data and deletes the other pieces of data, the method comprising:
     a first step in which the storage apparatus manages, in units of the logical volumes, copy attribute information indicating whether each logical volume has ever been the copy source of a copy; and
     a second step in which the storage apparatus executes the deduplication processing, based on the copy attribute information, so that the original data remains in a logical volume that has ever been the copy source.
  6.  The method of controlling a storage apparatus according to claim 5, further comprising
     a third step in which, when a situation arises in which the original data must be moved to another of the logical volumes, the storage apparatus determines as the movement destination of the original data the logical volume estimated to have the lowest risk of a subsequent movement of the original data, and moves the original data to the determined logical volume.
  7.  The control method for a storage apparatus according to claim 6, wherein, in the third step, the storage apparatus,
      when the migration-source logical volume of the original data is the data copy-destination logical volume in a first copy pair and there exist logical volumes that are set in a second copy pair with the data copy-source logical volume of the first copy pair and that are the data copy destination in the second copy pair,
      determines, as the logical volume estimated to have the least risk of a subsequent movement of the original data, the most recently updated logical volume among those logical volumes.
  8.  The control method for a storage apparatus according to claim 7, wherein, in the third step, the storage apparatus,
      in any case other than the case where the migration-source logical volume of the original data is the data copy-destination logical volume in a first copy pair and there exists a logical volume that is set in a second copy pair with the data copy-source logical volume of the first copy pair and that is the data copy destination in the second copy pair,
      determines, as the migration destination of the original data, the partial area with the lowest update frequency among the partial areas of all the logical volumes that can be migration destination candidates for the original data.
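
The three steps recited in claims 5 to 8 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Volume`, `pick_holder`, and `pick_destination`, the integer `last_update` timestamp, the `(volume_name, area_index)` keys, and the fallback to the first holder are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    was_copy_source: bool = False  # copy attribute information (first step)
    last_update: int = 0           # hypothetical logical time of the latest update


def pick_holder(holders):
    """Second step: keep the original data in a volume that has been a copy source."""
    for vol in holders:
        if vol.was_copy_source:
            return vol
    return holders[0]  # assumption: fall back to the first holder


def pick_destination(source, copy_pairs, volumes, area_update_freq):
    """Third step: choose where the original data should move.

    copy_pairs: list of (copy_source_name, copy_destination_name) tuples
    area_update_freq: {(volume_name, area_index): update frequency} for all
        candidate partial areas
    Returns (volume_name, area_index); area_index is None when a whole
    volume is chosen by the copy-pair rule.
    """
    siblings = []
    for pair_src, pair_dst in copy_pairs:
        if pair_dst == source:
            # Other copy destinations that share the same copy source.
            siblings.extend(d for s, d in copy_pairs if s == pair_src and d != source)
    if siblings:
        # Claim 7: the most recently updated of those volumes.
        newest = max(siblings, key=lambda name: volumes[name].last_update)
        return (newest, None)
    # Claim 8: otherwise the partial area with the lowest update frequency.
    return min(area_update_freq, key=area_update_freq.get)
```

In this sketch the copy-pair rule takes priority, and the update-frequency rule applies only when no sibling copy destination exists, which mirrors the "in any case other than" wording of claim 8.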
PCT/JP2016/084371 2016-11-18 2016-11-18 Storage device and control method therefor WO2018092288A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/084371 WO2018092288A1 (en) 2016-11-18 2016-11-18 Storage device and control method therefor

Publications (1)

Publication Number Publication Date
WO2018092288A1 (en) 2018-05-24

Family

ID=62146303

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/084371 WO2018092288A1 (en) 2016-11-18 2016-11-18 Storage device and control method therefor

Country Status (1)

Country Link
WO (1) WO2018092288A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012063902A (en) * 2010-09-15 2012-03-29 Nec Corp File management device, program and method
JP2013541055A (en) * 2011-09-16 2013-11-07 日本電気株式会社 Storage device
JP2015503780A (en) * 2012-02-13 2015-02-02 株式会社日立製作所 Hierarchical storage system management apparatus and management method

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 16921962

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 16921962

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP