WO2015173925A1 - Storage device - Google Patents

Storage device

Info

Publication number
WO2015173925A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
storage
stored
storage medium
compressed
Application number
PCT/JP2014/062959
Other languages
French (fr)
Japanese (ja)
Inventor
Akifumi Suzuki (鈴木 彬史)
Yoshihiro Yoshii (吉井 義裕)
Kazue Hironaka (弘中 和衛)
Akira Yamamoto (山本 彰)
Original Assignee
Hitachi, Ltd. (株式会社日立製作所)
Application filed by Hitachi, Ltd.
Priority to PCT/JP2014/062959
Publication of WO2015173925A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 — Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 12/00 — Accessing, addressing or allocating within memory systems or architectures

Definitions

  • The present invention relates to a storage apparatus that uses a semiconductor recording device as a primary data storage device, and to a device control method.
  • Storage devices generally include a component called a cache for the purpose of improving request-processing performance (hereinafter simply referred to as performance).
  • The cache plays two major roles in the storage device. The first role is to hold data from areas with relatively high read/write access frequency in the cache, thereby improving the average performance of the storage apparatus.
  • The second role is to temporarily store write data when a write request is issued from the server to the storage apparatus.
  • NVM: nonvolatile semiconductor memory
  • FM: NAND-type flash memory
  • Storage devices are required to store data at low cost while maintaining the reliability of data retention.
  • For this purpose, storage apparatuses that record data using lossless compression (hereinafter simply referred to as compression) are known.
  • Compression: lossless compression
  • Data retention cost: bit cost of the storage medium, power consumption cost of the storage device, etc.
  • A storage apparatus having such a compression function generally conceals from the server, to which it provides a storage area, the fact that recorded data is compressed, and provides the storage area as if the data were recorded uncompressed. With this function, the user can enjoy the lower retention cost brought by compression without changing existing software such as applications and operating systems.
  • A storage apparatus having a data compression function and a function for concealing the data changes caused by compression must associate the virtual uncompressed storage area provided to the server (hereinafter, the virtual uncompressed volume) with the physical area that is the recording destination of the compressed data.
  • In general, the compression ratio changes depending on the data content.
  • The data size after compression depends on the content of the data to be compressed and can only be determined heuristically, by actually compressing the data. Accordingly, the association between the virtual uncompressed volume and the physical recording destination of the compressed data changes dynamically every time the recorded data changes.
  • The storage device therefore manages the correspondence by dividing the virtual uncompressed volume into fixed-size areas, and updates the information managing this dynamically changing correspondence every time a data update and the accompanying data compression are completed.
  • Hereinafter, the information for managing this correspondence relationship is referred to as compression management information; one entry is sketched below.
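  • As a rough illustration (not taken from the patent text), one entry of such compression management information might look like the following C sketch; all field names and widths are hypothetical assumptions.

```c
#include <stdint.h>

/* Hypothetical sketch of one compression-management entry: it maps one
 * fixed-size area of the virtual uncompressed volume to the physical
 * location of its compressed image. Field names/widths are assumptions. */
typedef struct {
    uint64_t virtual_lba;    /* start of the fixed area in the virtual volume */
    uint32_t medium_id;      /* final storage medium holding the compressed data */
    uint64_t physical_lba;   /* recording-destination address on that medium */
    uint32_t comp_sectors;   /* compressed size, in 512-byte sectors */
} comp_map_entry_t;
```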
  • Compression management information is generally larger than the other management information managed by the storage apparatus.
  • For example, suppose an 800 TB virtual uncompressed volume is provided using a 100 TB physical area.
  • Much of a storage device's management information is generally stored in DRAM, which the processor controlling the device can access at high speed.
  • However, storing gigabytes of compression management information in DRAM, which has a high bit cost and high power consumption, increases the data retention cost.
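  • As a back-of-the-envelope estimate (with assumed unit sizes, not taken from the patent): if the 800 TB virtual volume above is managed in fixed 1 MB areas with 16-byte entries, the table holds 800 TB / 1 MB ≈ 8.4 × 10^8 entries, i.e. roughly 13 GB of compression management information — gigabytes, as stated above.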
  • Furthermore, the compression management information maps the virtual uncompressed volume to the physical recording destination of the compressed data, and is indispensable for responding to data read requests from the server. From the viewpoint of the reliability of the storage apparatus, losing the compression management information is therefore equivalent to losing the retained data itself, so it must be held with at least the same reliability as the data.
  • To solve this, the storage apparatus of the present invention provides a virtual uncompressed volume to a host apparatus such as a server in order to conceal the data changes caused by compression.
  • The storage apparatus also divides the area of the virtual uncompressed volume into stripe units and manages each stripe in association with one of the plurality of final storage media constituting a RAID group.
  • Parity is generated from the data of each stripe; the generated parity and the data of each stripe are compressed; and the compressed parity and stripe data are stored in the final storage medium associated with each stripe.
  • Further, the storage device divides the compression management information, which manages the correspondence between compressed data and its recording destination on the final storage media, per recording medium, and records the compression management information covering only the correspondences of one recording medium in a specific area of that same medium.
  • In this way, the compression management information can be held with the same reliability as the data in the storage device. Furthermore, the processing load of updating the compression management information is reduced and the performance of the storage apparatus improves.
  • FIG. 1 is a diagram showing a schematic configuration of a computer system centered on a storage apparatus according to an embodiment of the present invention.
  • FIG. 2-A is a conceptual diagram showing a logical space configuration of the storage apparatus according to the embodiment of the present invention.
  • FIG. 2-B is another conceptual diagram showing the logical space configuration of the storage apparatus according to the embodiment of the present invention.
  • FIG. 3-A is a diagram showing a data flow when the storage apparatus receives a write command from the host apparatus.
  • FIG. 3-B is a diagram showing a data flow when the storage apparatus receives a write command from the host apparatus.
  • FIG. 4 is a diagram showing an internal configuration of the NVM module.
  • FIG. 5 is a diagram showing the internal configuration of the FM.
  • FIG. 6 is a diagram showing the internal configuration of the physical block.
  • FIG. 7 is a diagram showing the concept of associating the LBA0 and LBA1 spaces, which are logical spaces provided by the NVM module to the storage controller, and the PBA space, which is a physical area designating address space.
  • FIG. 8 is a diagram showing the contents of the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820 managed by the NVM module.
  • FIG. 9 is a diagram showing block management information used by the NVM module.
  • FIG. 10 is a diagram showing a write command and response information to the write command received by the NVM module.
  • FIG. 11 is a diagram showing a compressed data size acquisition command and response information to the compressed data size acquisition command received by the NVM module.
  • FIG. 12 is a diagram showing an LBA1 mapping command received by the NVM module and response information for the LBA1 mapping command.
  • FIG. 13 is a diagram illustrating a full stripe parity generation command and response information to the full stripe parity generation command received by the NVM module.
  • FIG. 14 is a diagram illustrating an update parity generation command and response information to the update parity generation command received by the NVM module.
  • FIG. 15 is a diagram showing a compressed information acquisition command and response information to the compressed information acquisition command received by the NVM module.
  • FIG. 16 is a diagram showing a read command and response information to the read command received by the NVM module.
  • FIG. 17 is a diagram illustrating a mapping cancellation command and response information to the mapping cancellation command received by the NVM module.
  • FIG. 18 is a diagram illustrating a compressed information transfer command and response information to the compressed information transfer command received by the NVM module.
  • FIG. 19 is a diagram illustrating the LBA0 mapping command and response information to the LBA0 mapping command received by the NVM module.
  • FIG. 20A is a diagram showing an example of cache management information.
  • FIG. 20B is a diagram showing an example of a free list.
  • FIG. 21 is a conceptual diagram showing a correspondence relationship between a virtual volume, a RAID group, and a PDEV in the storage apparatus according to the embodiment of the present invention.
  • FIG. 22 is a diagram showing an example of management information for managing the correspondence between virtual volumes, RAID groups, and PDEVs in the storage apparatus according to an embodiment of the present invention.
  • FIG. 23 is a diagram showing a configuration of compression management information used by the storage apparatus according to the embodiment of the present invention.
  • FIG. 24 is a flowchart of the decompression read process.
  • FIG. 25 is a flowchart of the write data cache storage process.
  • FIG. 26 is a flowchart of parity generation processing.
  • FIG. 27 is a flowchart of the destage process.
  • FIG. 28 is a flowchart of the partial recovery operation of the compression management information 2300.
  • FIG. 29 is a flowchart of the rebuild process.
  • FM: NAND flash memory
  • The present invention is not limited to FM and covers all nonvolatile memories.
  • In this embodiment, a mode in which data compression is performed by a dedicated hardware circuit is described.
  • However, the present invention is not limited to this embodiment; data may instead be compressed by a data compression process executed on a general-purpose processor.
  • Similarly, a mode in which parity generation is implemented by a dedicated hardware circuit is described.
  • The present invention is not limited to this embodiment either; RAID parity may be generated by a parity generation process executed on a general-purpose processor.
  • FIG. 1 is a diagram showing a schematic configuration of a computer system centered on a storage device according to an embodiment of the present invention.
  • The NVM module 126 shown in FIG. 1 is a semiconductor recording device using FM as a recording medium.
  • The storage apparatus 101 includes a plurality of storage controllers 110.
  • Each storage controller 110 includes a host interface (host I / F) 124 that connects to a host device and a disk interface (disk I / F) 123 that connects to a recording device.
  • Examples of the host interface 124 include devices that support protocols such as FC (Fibre Channel), iSCSI (Internet Small Computer System Interface), and FCoE (Fibre Channel over Ethernet); examples of the disk interface 123 include devices that support protocols such as FC, SAS (Serial Attached SCSI), SATA (Serial Advanced Technology Attachment), and PCI (Peripheral Component Interconnect) Express.
  • The storage controller 110 includes hardware resources such as a processor 121 and a memory (DRAM) 125, and under the control of the processor it reads from and writes to final storage media such as the SSD 111 and the HDD 112 in response to read/write requests from the host device 103. It also has an NVM module 126 used as a cache device, which can be controlled from the processor 121 via the internal SW 122.
  • The storage controller 110 has a RAID (Redundant Arrays of Inexpensive Disks) parity generation function and a data restoration function using RAID parity, and manages a plurality of SSDs 111 and a plurality of HDDs 112 as RAID groups in arbitrary units.
  • The storage controller 110 also has a function of monitoring and managing the failures, usage status, operating status, and so on of the recording devices.
  • The storage apparatus 101 is connected to the management apparatus 104 via a network.
  • An example of this network is a LAN (Local Area Network). Although omitted from FIG. 1 for simplicity, this network is connected to each storage controller 110 in the storage apparatus 101; it may also be the same network as the SAN 102.
  • The management device 104 is a computer having hardware resources such as a processor, a memory, a network interface, and a local input/output device, and software resources such as a management program.
  • The management device 104 acquires information from the storage apparatus using this program and displays a management screen.
  • The system administrator uses the management screen displayed on the management apparatus to monitor the storage apparatus 101 and control its operation.
  • The SSD 111 stores data transferred in response to a write request from the storage controller, retrieves stored data in response to a read request, and transfers the data to the storage controller.
  • At this time, the disk interface 123 designates the logical storage location of a read/write request by a logical address (hereinafter, LBA: Logical Block Address).
  • The plurality of SSDs 111 are managed as a plurality of RAID groups and are configured so that lost data can be restored when data is lost.
  • A plurality of HDDs (Hard Disk Drives) 112 are also provided in the storage apparatus 101 and, like the SSDs 111, are connected to the plurality of storage controllers 110 in the same storage apparatus via the disk interface 123.
  • The HDD 112 stores data transferred in response to a write request from the storage controller 110, retrieves stored data in response to a read request, and transfers it to the storage controller 110.
  • The plurality of HDDs 112 are likewise managed as a plurality of RAID groups and are configured so that lost data can be restored when data is lost.
  • The storage controller 110 is connected via the host interface 124 to the SAN 102, which connects to the host device 103. Although omitted in FIG. 1 for simplicity, a connection path for mutually communicating data and control information between the storage controllers 110 is also provided.
  • The host device 103 corresponds to, for example, a computer or file server that forms the core of a business system.
  • The host device 103 includes hardware resources such as a processor, a memory, a network interface, and a local input/output device, and software resources such as a device driver, an operating system (OS), and application programs.
  • The host apparatus 103 executes various programs under processor control to communicate with the storage apparatus 101 and to issue data read/write requests.
  • It also acquires management information, such as the usage and operating status of the storage apparatus 101, by executing various programs under processor control.
  • It can further designate and change the management unit of the recording devices, the recording-device control method, data compression settings, and the like.
  • FIG. 2-A shows the transition of the write data management state when a write request is issued from the host device 103 in the storage apparatus of this embodiment.
  • The host apparatus 103 recognizes a virtual volume (indicated as "virtual Vol" in the figure) 200 as a storage area, and accesses data by designating an address in the virtual volume 200.
  • The virtual volume 200 is a virtual space that the storage apparatus 101 provides to the host apparatus 103.
  • Write data is compressed inside the storage apparatus 101 and stored in the final storage medium (SSD 111 or HDD 112), but a host accessing the virtual volume 200 cannot tell that the data is compressed and stored (the data changes due to data compression are concealed).
  • In FIG. 2-A, an example in which the storage apparatus 101 has one virtual volume 200 is described, but the present invention is not limited to this example.
  • The storage apparatus 101 may manage a plurality of virtual volumes.
  • The plurality of managed volumes may also include volumes that are not compressed.
  • Hereinafter, the virtual volume 200, which conceals data compression from the host device 103, will mainly be described.
  • The storage apparatus 101 of the present invention logically manages the storage areas of each physical SSD 111 or HDD 112 as a PDEV 205 (Physical Device), and manages each PDEV 205 in association with one virtual PDEV 204 whose capacity is virtually expanded.
  • The storage apparatus 101 configures and manages an RG 203 (RAID group) from a plurality of virtual PDEVs 204, and manages the RG 203 and the virtual volume 200 in association with each other.
  • FIG. 2-A shows an example in which one RG is associated with one virtual volume 200 (virtual volume 200 and RAID group 0), but the present invention is not limited to this example.
  • One RG may be associated with a plurality of virtual volumes, or one virtual volume may be associated with a plurality of RGs.
  • The storage apparatus 101 manages the area designated as the write destination in the virtual volume 200 as being cached in the LBA0 space provided by the NVM module 126.
  • The LBA0 space is a virtual logical space that the NVM module 126 provides to the storage apparatus 101, in which the data compressed and stored by the NVM module 126 appears as if stored uncompressed. This space is accessible from the processor 121 of the storage controller 110.
  • After the storage apparatus 101 receives the write data from the host apparatus 103, the data is transferred to the NVM module 126.
  • The NVM module 126 of this embodiment compresses the data and records it inside the NVM module 126.
  • The storage apparatus 101 then regards the write data as stored in the cache area (the LBA0 space provided by the NVM module 126) and notifies the host apparatus 103 that the write has completed.
  • The storage apparatus 101 transfers the compressed write data recorded in the LBA0 space to the SSD 111 or HDD 112 serving as the final storage medium at an arbitrary timing. At this time, the storage apparatus 101 must acquire the compressed data from the NVM module 126. As shown in FIG. 2-A, the storage apparatus 101 of this embodiment acquires compressed data using the LBA1 space 202 provided by the NVM module 126: it issues to the NVM module 126 a command that associates the compressed data stored behind the uncompressed LBA0 area with the LBA1 space 202.
  • The NVM module 126, on receiving this association command for the LBA1 space 202, associates the compressed data of the designated LBA0 area with the LBA1 space.
  • The storage apparatus 101 then acquires the compressed data from the NVM module 126 by designating addresses in the LBA1 space, as sketched below.
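  • A minimal sketch of this map-then-read flow, assuming hypothetical wrapper functions (nvm_map_lba1 and nvm_read are illustrative, not a documented API):

```c
#include <stdint.h>

/* Assumed wrappers around the NVM module commands described later. */
extern uint32_t nvm_map_lba1(uint64_t lba0, uint32_t len_sectors, uint64_t lba1);
extern int nvm_read(uint64_t lba, uint32_t sectors, void *buf);

/* Obtain the compressed image of an LBA0 area for destaging. */
int fetch_compressed(uint64_t lba0, uint32_t len_sectors,
                     uint64_t lba1, void *buf)
{
    /* 1. Map the compressed data behind [lba0, lba0+len) into LBA1. */
    uint32_t comp_sectors = nvm_map_lba1(lba0, len_sectors, lba1);

    /* 2. Read it through LBA1: the data comes back still compressed. */
    if (nvm_read(lba1, comp_sectors, buf) != 0)
        return -1;

    /* 3. The caller then writes buf to the final storage medium. */
    return (int)comp_sectors;
}
```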
  • When transferring compressed data to the final storage medium, the storage apparatus 101 identifies, from the address in the virtual volume associated with the data, the virtual PDEV 204 and the address within the virtual PDEV 204 at which the data is to be stored; it then determines the address of the PDEV 205 associated with that address in the virtual PDEV 204 and transfers the data to the physical device.
  • Note that the LBA1 space 202 for acquiring the compressed data is not strictly necessary.
  • Instead, the storage apparatus 101 may issue a read command that includes an LBA0 address together with an instruction to transfer the compressed data without decompressing it, thereby reading the compressed data from the NVM module 126 using the LBA0 space.
  • As described above, the storage apparatus 101 compresses the data acquired from the host apparatus 103 and stores it in the NVM module 126, which serves as the cache.
  • Hereinafter, this operation is referred to as the host write operation.
  • Next, the data transfers that occur in the host write operation will be described.
  • The first data transfer in the host write operation occurs when the write data is acquired from the host device.
  • This transfer is from the host interface 124 to the DRAM 125 of the storage controller (311).
  • The storage apparatus 101 performs this transfer by issuing a command to the host interface 124.
  • Next, the storage apparatus 101 issues a command to the NVM module 126 and transfers the write data stored in the DRAM 125 to the NVM module 126 (312).
  • The NVM module 126 compresses the write data with its internal compression hardware (compression circuit) and stores the result in the DRAM (data buffer) 416 inside the NVM module 126 (313).
  • The NVM module 126 then notifies the storage apparatus that storage of the write data is complete.
  • Thereafter, the compressed data stored in the DRAM 416 may be transferred to the NVM (FM) 420 in the NVM module 126 and recorded there, or may be kept in the DRAM 416; whether the transfer from the DRAM 416 to the NVM (FM) 420 is necessary depends on the control method of the NVM module 126.
  • The storage apparatus 101, having received the write-data storage completion from the NVM module 126, notifies the host apparatus 103 of the completion of the write command.
  • The above are the data transfers that occur in the host write operation in this embodiment; a condensed sketch follows.
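  • A condensed sketch of the sequence just described, with assumed wrapper names (the step numbers match the figure references above):

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed wrappers, not a documented API. */
extern int host_if_receive(void *dram_buf, size_t bytes);          /* 311 */
extern int nvm_write_lba0(uint64_t lba0, const void *src,
                          uint32_t len_sectors);                   /* 312, 313 */
extern void notify_host_write_complete(void);

int host_write(uint64_t lba0, void *dram_buf, uint32_t len_sectors)
{
    /* host I/F 124 -> DRAM 125 of the storage controller */
    if (host_if_receive(dram_buf, (size_t)len_sectors * 512) != 0)
        return -1;
    /* DRAM 125 -> NVM module 126; compressed internally (313) */
    if (nvm_write_lba0(lba0, dram_buf, len_sectors) != 0)
        return -1;
    notify_host_write_complete();   /* write completion to host 103 */
    return 0;
}
```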
  • After the host write operation, the storage apparatus 101 generates RAID parity for the write data at an arbitrary timing. Hereinafter, this operation is referred to as the parity generation operation. Next, the data transfers that occur in the parity generation operation will be described.
  • As noted, during the parity generation operation the storage apparatus 101 generates RAID parity for the write data.
  • Here, parity is not generated from the compressed write data; parity is generated from the uncompressed write data.
  • As one conceivable way of achieving this, write data could be stored uncompressed when recorded in the NVM module 126 and compressed only after parity generation.
  • However, many nonvolatile memories, such as NAND flash memory and ReRAM, have a limited number of writes, and reducing the amount of data written to the NVM (FM) 420 through compression extends the device life of the NVM module 126.
  • Compression also effectively expands the capacity of the NVM module 126, enlarging the cache area of the storage apparatus at a lower apparatus cost.
  • For these reasons, the NVM module 126 compresses data when storing it in the DRAM 416 or NVM inside the NVM module 126, and decompresses the data when generating parity.
  • This behavior is not indispensable: the data may instead be recorded in the NVM (FM) 420 or the DRAM 416 without being compressed, and compressed together with the generated parity after parity generation.
  • In the NVM module 126 according to the embodiment of the present invention, the compressed data recorded in the DRAM 416 or the NVM (FM) 420 is decompressed, and the decompressed data is provided to the parity generation circuit (317). With this function, the NVM module 126 generates parity over the uncompressed data while still aiming for longer device life and lower cost.
  • The parity generated by the parity generation circuit is transferred to the compression circuit (318) and then transferred to the DRAM 416 as compressed data (319).
  • The compressed parity stored in the DRAM 416 may be recorded in the NVM (FM) 420 or kept in the DRAM 416, at the discretion of the NVM module 126. Furthermore, it is not always necessary to compress the parity generated by the parity generation circuit: since parity generally cannot be expected to compress as well as data, it may be handled without compression. A sketch of the parity computation itself follows.
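  • A minimal sketch of RAID 5-style XOR parity over N data stripes, here in software rather than the dedicated circuit described above:

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the RAID 5 parity stripe as the byte-wise XOR of all
 * data stripes (the same XOR operation the parity circuit performs). */
void xor_parity(uint8_t *parity, const uint8_t *const stripes[],
                size_t nstripes, size_t stripe_len)
{
    for (size_t i = 0; i < stripe_len; i++) {
        uint8_t p = 0;
        for (size_t s = 0; s < nstripes; s++)
            p ^= stripes[s][i];
        parity[i] = p;
    }
}
```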
  • The storage apparatus 101 transfers the compressed parity and write data to the final storage medium at an arbitrary timing.
  • Hereinafter, this operation is referred to as the destage operation. Next, the data transfers that occur in the destage operation will be described.
  • During the destage operation, the storage apparatus 101 reads the compressed parity and write data from the NVM module 126. At this time, the NVM module 126 transfers the designated write data and compressed parity to the DRAM 125 of the storage apparatus 101 (322). Thereafter, the storage apparatus 101 transfers the compressed write data and parity to the SSD or HDD (323).
  • The above is the outline of the write-data transfer processing performed by the storage apparatus 101 according to the embodiment of the present invention.
  • Note that the decompressed data may first be recorded in the DRAM 416 and then transferred from there to the parity generation circuit.
  • Likewise, the parity generated by the parity generation circuit need not be transferred directly to the compression circuit; the generated parity may be recorded in the DRAM 416 first and then transferred to the compression circuit.
  • The NVM module 126 includes an FM controller (FM CTL) 410 and a plurality of (for example, 32) FMs 420.
  • The FM controller 410 includes a processor 415, a RAM (DRAM) 413, a data compression/decompression unit 418, a parity generation unit 419, a data buffer 416, an I/O interface (I/F) 411, an FM interface (I/F) 417, and a switch 414 for transferring data between them.
  • The switch 414 interconnects the processor 415, RAM 413, data compression/decompression unit 418, parity generation unit 419, data buffer 416, I/O interface 411, and FM interface 417 in the FM controller 410, and routes and forwards data between these parts by address or ID.
  • The I/O interface 411 is connected to the internal switch 122 of the storage controller 110 in the storage apparatus 101, and is connected to each part of the FM controller 410 via the switch 414.
  • The I/O interface 411 receives read/write requests, together with the requested logical storage location (LBA: Logical Block Address), from the processor 121 of the storage controller 110 in the storage apparatus 101 and processes the requests. For a write request, it also receives the write data, which is then recorded in the FM 420. In addition, the I/O interface 411 receives instructions from the processor 121 of the storage controller 110 and issues interrupts to the processor 415 inside the FM controller 410.
  • The I/O interface 411 also receives control commands for the NVM module 126 from the processor 121 of the storage controller 110, and according to those commands can notify the storage controller 110 of the operating status, usage status, current setting values, and so on of the NVM module 126.
  • The processor 415 is connected to each part of the FM controller 410 via the switch 414 and controls the entire FM controller 410 based on the program and management information recorded in the RAM 413. It also monitors the entire FM controller 410 through periodic information acquisition and an interrupt reception function.
  • The data buffer 416 is implemented with DRAM, for example, and holds temporary data in the middle of data transfer processing in the FM controller 410.
  • The FM interface 417 is connected to the FMs 420 by a plurality of buses (for example, 16).
  • A plurality of (for example, 2) FMs 420 are connected to each bus, and the FMs 420 connected to the same bus are controlled independently using the CE (Chip Enable) signals that are also connected to them.
  • The FM interface 417 operates on the read/write requests instructed by the processor 415. At this time, the processor 415 tells the FM interface 417 the chip, block, and page numbers of the request target. For a read request, it reads the stored data from the FM 420 and transfers it to the data buffer 416; for a write request, it fetches the data to be stored from the data buffer 416 and transfers it to the FM 420.
  • The FM interface 417 includes an ECC generation circuit, an ECC-based data loss detection circuit, and an ECC correction circuit.
  • When writing data to the FM 420, the data is written with ECC added by the ECC generation circuit. When data is read, the data from the FM 420 is inspected by the data loss detection circuit using the ECC, and if data loss is detected, the data is corrected by the ECC correction circuit.
  • The data compression/decompression unit 418 has a data compression function using a reversible (lossless) compression algorithm. It supports a plurality of compression algorithms and also provides a function for changing the compression level.
  • The data compression/decompression unit 418 reads data from the data buffer 416 according to instructions from the processor 415, performs either data compression or its inverse transformation (decompression) with the lossless algorithm, and writes the result back to the data buffer.
  • The data compression/decompression unit 418 may be implemented as a logic circuit, or an equivalent function may be realized by a processor executing a compression/decompression program.
  • The parity generation unit 419 generates the parity, i.e., the redundant data required by RAID technology. Specifically, it can compute the XOR parity used in RAID 5 and 6, as well as the Reed-Solomon codes and the diagonal parity of the EVENODD method used in RAID 6. According to instructions from the processor 415, the parity generation unit 419 reads the parity-generation target data from the data buffer 416 and generates RAID 5 or RAID 6 parity with these functions.
  • The switch 414, I/O interface 411, processor 415, data buffer 416, FM interface 417, data compression/decompression unit 418, and parity generation unit 419 described above may be configured in one semiconductor element as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array), or configured by connecting a plurality of individual dedicated ICs (Integrated Circuits) to one another.
  • A volatile memory such as DRAM is used for the RAM 413.
  • The RAM 413 stores the management information of the FMs 420 used in the NVM module 126, transfer lists containing the transfer control information used by each DMA engine, and the like.
  • Part or all of the data-holding role of the data buffer 416 may also be taken over by the RAM 413, with the RAM 413 used for data storage.
  • The configuration of the NVM module 126 to which the present invention is applied has been described above with reference to FIG. 4.
  • This embodiment describes an NVM module 126 equipped with flash memory, but the nonvolatile memory mounted on the NVM module 126 is not limited to flash memory.
  • A nonvolatile memory such as Phase Change RAM or Resistance RAM may be used instead.
  • A configuration in which part or all of the FMs 420 are replaced by volatile RAM (DRAM or the like) may also be adopted.
  • The nonvolatile memory area in the FM 420 is composed of a plurality of (for example, 4096) blocks (physical blocks) 502, and stored data is erased in units of physical blocks.
  • The FM 420 contains an I/O register 501.
  • The I/O register 501 is a register with a recording capacity equal to or larger than the physical page size (for example, 8 KB).
  • The FM 420 operates according to the read/write request instructions from the FM interface 417.
  • The flow of a write operation is as follows. First, the FM 420 receives a write command with the requested physical block and physical page from the FM interface 417. Next, the write data transferred from the FM interface 417 is stored in the I/O register 501. Thereafter, the data stored in the I/O register 501 is written to the designated physical page.
  • The flow of a read operation is as follows. First, the FM 420 receives a read command with the requested physical block and page from the FM interface 417. Next, the data stored in that physical page of the designated physical block is read into the I/O register 501. Thereafter, the data stored in the I/O register 501 is transferred to the FM interface 417.
  • The physical block 502 is divided into a plurality of (for example, 128) pages 601, and reading and writing of stored data are processed in units of pages.
  • The order of writing to the physical pages 601 within a block 502 is fixed, and writing proceeds in order from the first page. That is, data must be written in the order Page 1, Page 2, Page 3, and so on.
  • Overwriting a written page 601 is prohibited in principle: data cannot be written to a written page 601 again until the data of the entire block 502 to which that page belongs has been erased. These rules are sketched below.
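  • A minimal sketch of these write rules for one block (sizes follow the examples in the text; the structure is illustrative):

```c
#include <stdint.h>

#define PAGES_PER_BLOCK 128   /* example page count from the text */

typedef struct {
    uint32_t next_page;       /* next writable page index */
} fm_block_t;

/* Pages must be programmed strictly in order; returns the page to
 * write, or -1 when the block is full and must be erased first. */
int fm_block_alloc_page(fm_block_t *blk)
{
    if (blk->next_page >= PAGES_PER_BLOCK)
        return -1;                 /* no rewrite without a block erase */
    return (int)blk->next_page++;
}

/* Erase works only on the whole block, making all pages writable again. */
void fm_block_erase(fm_block_t *blk)
{
    blk->next_page = 0;
}
```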
  • The NVM module 126 includes a plurality of FM chips 420, manages the storage area composed of their blocks and pages, and provides a logical storage space to the storage controller 110 (processor 121) to which the NVM module 126 is connected.
  • Here, "providing a storage space" means that an address is assigned to each storage area to be accessed by the storage controller 110, so that the processor 121 of the storage controller 110 to which the NVM module 126 is connected can access the stored data by designating that address.
  • Inside the NVM module 126, the physical storage area composed of the FMs 420 is managed as uniquely associated with an address space used only within the NVM module 126.
  • Hereinafter, this physical-area-designating address space (physical address space) used only within the NVM module 126 is referred to as the PBA (Physical Block Address) space, and the position (address) of each physical storage area in the PBA space (one sector = 512 bytes) is described as a PBA.
  • The NVM module 126 of this embodiment manages the association between these PBAs and the LBAs (Logical Block Addresses), the addresses of the areas of the logical storage space provided to the storage apparatus.
  • A conventional storage device such as an SSD provides one storage space to the host device (such as a host computer) to which it is connected.
  • In contrast, the NVM module 126 of this embodiment has two logical storage spaces and provides both to the storage controller 110 to which the NVM module 126 is connected. The relationship between these two logical storage spaces and the PBA space will be described with reference to FIG. 7.
  • FIG. 7 illustrates the concept of the association between the LBA0 space 701 and the LBA1 space 702, the logical storage spaces provided by the NVM module 126 of this embodiment to the storage controller 110, and the PBA space 703.
  • The NVM module 126 provides two logical storage spaces: the LBA0 space 701 and the LBA1 space 702.
  • Hereinafter, the addresses assigned to storage areas in the LBA0 space 701 are referred to as "LBA0" or "LBA0 addresses", and the addresses assigned to storage areas in the LBA1 space 702 are referred to as "LBA1" or "LBA1 addresses".
  • In this embodiment, the size of the LBA0 space 701 and the size of the LBA1 space 702 are both equal to or smaller than the size of the PBA space; however, the invention remains effective even when the size of the LBA0 space 701 is larger than the size of the PBA space.
  • The LBA0 space 701 is a logical storage space for allowing the processor 121 of the storage controller 110 to access the compressed data recorded in the physical storage area composed of the FMs 420 as uncompressed data.
  • When the processor 121 designates an address (LBA0) in the LBA0 space 701 and issues a write request to the NVM module 126, the NVM module 126 acquires the write data from the storage controller 110 and compresses it in the data compression/decompression unit 418. The NVM module 126 then records the data in the physical storage area of the FM 420 designated by a dynamically selected PBA, and associates the LBA0 with that PBA.
  • Conversely, on a read request the NVM module 126 acquires the (compressed) data from the physical storage area of the FM 420 indicated by the PBA associated with the LBA0, decompresses it in the data compression/decompression unit 418, and transfers the decompressed data to the storage controller 110 as read data.
  • The association between LBA0 and PBA is managed by the LBA0-PBA conversion table described later.
  • The LBA1 space 702 is a logical storage space for allowing the storage controller 110 to access the compressed data recorded in the physical storage area composed of the FMs 420 as it is (without decompression).
  • When the processor 121 of the storage controller 110 designates an LBA1 and issues a write request to the NVM module 126, the NVM module 126 acquires the (already compressed) write data from the storage controller 110, records it in the FM storage area designated by a dynamically selected PBA, and associates the LBA1 with that PBA.
  • Conversely, on a read request the NVM module 126 acquires the compressed data from the physical storage area of the FM 420 indicated by the PBA associated with the LBA1, and transfers the compressed data as-is to the storage controller 110 as read data.
  • The association between LBA1 and PBA is managed by the LBA1-PBA conversion table described later.
  • An area of the PBA space holding compressed data 713 may be associated with both an LBA0-space area and an LBA1-space area at the same time.
  • In FIG. 7, the decompressed image of the compressed data 713 is associated with the LBA0 space as decompressed data 711, while the compressed data 713 itself is directly associated with the LBA1 space as compressed data 712.
  • For example, when the processor 121 designates an LBA0 (say 0x000000011000) and writes data to the NVM module 126, the data is compressed by the data compression/decompression unit 418 in the NVM module 126, placed at a dynamically selected location in the PBA space (specifically, some unwritten page among the plurality of pages of the FMs 420), and managed as associated with LBA0 address 0x000000011000. Thereafter, when the processor 121 issues to the NVM module 126 a request to associate the data at 0x000000011000 with an address in the LBA1 space (say 0x80000000010), this data also becomes associated with the LBA1 space; by then issuing a request (command) to read the data at address 0x80000000010, the processor 121 can read out, still in its compressed state, the data it wrote to LBA0 address 0x000000011000.
  • Thus, by writing data to the NVM module 126 with an LBA0 specified, associating it with an area of the LBA1 space, and issuing a RAID parity generation instruction that designates the LBA1, the storage apparatus 101 of this embodiment enables RAID parity generation for compressed data.
  • The size of the compressed data generated by the NVM module 126 in the embodiment of the present invention is limited to a multiple of 512 bytes (one sector) and does not exceed the size of the uncompressed data. That is, when 4 KB of data is compressed, the minimum size is 512 bytes and the maximum size is 4 KB, as in the sketch below.
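  • A small sketch of this sizing rule (the function name is hypothetical; the 4 KB unit is the compression unit used in this embodiment):

```c
#include <stdint.h>

#define SECTOR 512u
#define UNIT   4096u   /* 4 KB compression unit of this embodiment */

/* Round the raw compressed size up to whole sectors, capped at the
 * uncompressed size (data that does not compress stays at 4 KB). */
static uint32_t compressed_stored_size(uint32_t raw_bytes)
{
    uint32_t padded = (raw_bytes + SECTOR - 1) / SECTOR * SECTOR;
    return padded > UNIT ? UNIT : padded;
}
/* e.g. 700 raw bytes -> 1024 stored; 3900 -> 4096; 4096 -> 4096 */
```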
  • NVM Module Management Information 1: LBA-PBA Conversion Tables. Next, the management information used for control by the NVM module 126 in this embodiment will be described.
  • First, the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820 will be described with reference to FIG. 8.
  • The LBA0-PBA conversion table 810 is stored in the DRAM 413 in the NVM module 126 and contains the fields NVM module LBA0 (811), NVM module PBA (812), and PBA length (813).
  • When a read request arrives from the host device, the processor 415 of the NVM module 126 takes the specified LBA0 and uses this table to obtain the PBA indicating the location where the actual data is stored.
  • Also, the NVM module 126 records update data (write data) in a physical storage area different from the PBA holding the pre-update data, records the PBA and PBA length of the newly recorded update data in the corresponding rows of the LBA0-PBA conversion table, and thereby updates the table. By operating in this way, the NVM module 126 enables (pseudo) overwriting of data in areas of the LBA0 space.
  • The NVM module LBA0 (811) column lists the logical areas of the LBA0 space provided by the NVM module 126, arranged in order in 4 KB units (each address (LBA0) in the LBA0 space is assigned per 512-byte sector).
  • This is because the association between the NVM module LBA0 (811) and the NVM module PBA (812) is managed in units of 4 KB (8 sectors) in this embodiment.
  • However, the association between the NVM module LBA0 (811) and the NVM module PBA (812) may be managed in any unit other than 4 KB.
  • The NVM module PBA (812) is a field storing the head address of the PBA associated with the NVM module LBA0 (811).
  • In this embodiment, the physical storage area of the PBA space is divided and managed in 512-byte (one-sector) units.
  • In FIG. 8, the value "XXX" is associated as the PBA (Physical Block Address) for the NVM module LBA0 (811) value "0x000_0000_0000". This value is an address that uniquely identifies a storage area among the plurality of FMs 420 mounted on the NVM module 126.
  • The PBA length (813) records the actual stored size of the 4 KB of data designated by the NVM module LBA0 (811).
  • The storage size is recorded as a number of sectors.
  • The NVM module 126 in this embodiment compresses the uncompressed data supplied by the processor 121 of the storage controller 110 in 4 KB units.
  • For example, when a write request for 8 KB of data (uncompressed data) starting at LBA0 address 0x000_0000_0000 is received from the processor 121, the 4 KB of data in the address range 0x000_0000_0000 to 0x000_0000_0007 (in the LBA0 space) is compressed as one unit to generate compressed data, then the 4 KB of data in the address range 0x000_0000_0008 to 0x000_0000_000F is compressed as another unit, and each piece of compressed data is written to the physical storage area of the FM 420.
  • However, the present invention is not limited to compressing data in 4 KB units and remains effective with configurations that compress data in other units.
  • The LBA1-PBA conversion table 820 is stored in the DRAM 413 in the NVM module 126 and contains two fields: NVM module LBA1 (821) and NVM module PBA (822).
  • When a read request arrives from the host device, the processor 415 of the NVM module 126 takes the specified LBA1 and uses the LBA1-PBA conversion table 820 to convert it into the PBA indicating the location where the actual data is stored.
  • The NVM module LBA1 (821) column lists the logical areas of the LBA1 space provided by the NVM module 126, arranged in order per sector (a value of 1 in the NVM module LBA1 (821) means one sector, i.e., 512 bytes). This embodiment is described on the premise that the association between the NVM module LBA1 (821) and the NVM module PBA (822) is managed in 512 B units, but this association is not limited to 512 B units and may be managed in any unit. However, since LBA1 is a space that directly maps the PBA physical storage area holding the compressed data, its management unit is preferably equal to the PBA division unit; in this embodiment, LBA1 is divided and managed in 512 B units.
  • The NVM module PBA (822) is a field storing the head address of the PBA associated with the LBA1.
  • In this embodiment, the PBA is divided and managed in 512 B units.
  • In FIG. 8, the PBA value "ZZZ" is associated with the NVM module LBA1 value "0x800_0000_0002".
  • This PBA value is an address that uniquely identifies a storage area on one of the FMs 420 mounted on the NVM module 126. Accordingly, when "0x800_0000_0002" is received as the start address (LBA1) of a read request, "ZZZ" is obtained as the physical read start address inside the NVM module 126.
  • For areas with no associated PBA, a value indicating "unallocated" is stored in the NVM module PBA (822).
  • The above are the contents of the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820 used by the NVM module 126; a C rendering follows.
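  • A hypothetical C rendering of the two tables of FIG. 8 (field names, widths, and the "unallocated" sentinel are assumptions):

```c
#include <stdint.h>

#define PBA_UNALLOCATED UINT64_MAX   /* assumed "unallocated" marker */

typedef struct {        /* one row of the LBA0-PBA conversion table 810 */
    uint64_t lba0;      /* start of a 4 KB (8-sector) LBA0 area */
    uint64_t pba;       /* head PBA of the compressed data */
    uint8_t  pba_len;   /* PBA length 813: compressed size in sectors */
} lba0_pba_entry_t;

typedef struct {        /* one row of the LBA1-PBA conversion table 820 */
    uint64_t lba1;      /* one 512 B sector of the LBA1 space */
    uint64_t pba;       /* associated PBA, or PBA_UNALLOCATED */
} lba1_pba_entry_t;

/* Read-side translation: find the PBA behind an LBA1 sector. */
uint64_t lba1_to_pba(const lba1_pba_entry_t *tbl, uint64_t nrows,
                     uint64_t lba1)
{
    for (uint64_t i = 0; i < nrows; i++)
        if (tbl[i].lba1 == lba1)
            return tbl[i].pba;
    return PBA_UNALLOCATED;
}
```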
  • NVM Module Management Information 3: Block Management Information. Next, the block management information used by the NVM module to which the present invention is applied will be described with reference to FIG. 9.
  • The block management information 900 is stored in the DRAM 413 in the NVM module 126 and includes the fields NVM module PBA 901, NVM chip number 902, block number 903, and invalid PBA amount 904.
  • The NVM module PBA 901 is a field storing a PBA value that uniquely identifies each area in all the FMs 420 managed by the NVM module 126.
  • In this embodiment, the NVM module PBA 901 is divided and managed in block units.
  • FIG. 9 shows an example in which the head address of each block is stored as the NVM module PBA value.
  • For example, the field value "0x000_0000_0000" indicates that the NVM module PBA range from "0x000_0000_0000" to "0x000_0000_0FFF" applies.
  • The NVM chip number 902 is a field storing a number that uniquely identifies an FM chip 420 mounted on the NVM module 126.
  • The block number 903 is a field storing the block number within the FM chip 420 identified by the stored value of the NVM chip number 902.
  • The invalid PBA amount 904 is a field storing the invalid PBA amount of the block identified by the stored value of the block number 903 within the FM chip identified by the stored value of the NVM chip number 902.
  • Here, the invalid PBA amount is the amount of PBA-space area that was once associated with the LBA0 space and/or LBA1 space through the NVM module LBA0 (811) and NVM module LBA1 (821) entries of the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820, and was later released from that association.
  • Conversely, a PBA currently associated with an NVM module LBA0 or LBA1 by the LBA0-PBA conversion table 810 or the LBA1-PBA conversion table 820 is referred to in this specification as a valid PBA.
  • Invalid PBA areas arise inevitably when pseudo-overwriting is attempted on a nonvolatile memory in which data cannot be overwritten in place.
  • Specifically, at the time of a data update the NVM module 126 records the update data in an unwritten PBA (different from the PBA holding the pre-update data) and rewrites the NVM module PBA 812 and PBA length 813 fields of the LBA0-PBA conversion table 810 to the start address and PBA length of the area in which the update data was recorded. At this time, the association in the LBA0-PBA conversion table 810 is released for the PBA area holding the pre-update data.
  • The NVM module 126 also checks the LBA1-PBA conversion table 820, and treats areas that have no association in the LBA1-PBA conversion table either as invalid PBA areas.
  • The NVM module 126 counts the invalid PBA amount for each block, the minimum erase unit of the FM, and preferentially selects blocks with a large invalid PBA amount as garbage collection target areas.
  • FIG. 9 shows, as an example, that block number 0 of NVM chip number 0 managed by the NVM module 126 has 160 KB of invalid PBA area.
  • When the total amount of invalid PBA areas managed by the NVM module 126 exceeds a predetermined garbage collection start threshold (unwritten pages becoming depleted), blocks containing invalid PBA areas are erased to create unwritten PBA area. This operation is called garbage collection.
  • When an erasure-target block contains valid PBA areas at garbage collection time, those valid PBA areas must be copied to another block before the block is erased. Since this data copying involves write operations to the FM, it advances the wear of the FM and consumes resources such as the processor and bus bandwidth of the NVM module 126 for the copy operation, causing a drop in performance. For this reason, the amount of valid PBA area to copy should be as small as possible.
  • Therefore, at garbage collection time the NVM module 126 refers to the block management information 900 and erases blocks in descending order of their stored invalid PBA amount 904 (i.e., those containing many invalid PBA areas first), thereby reducing the amount of valid PBA area that must be copied, as sketched below.
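  • A minimal sketch of this victim-selection rule, mirroring the fields of the block management information 900 (field widths are assumptions):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t pba_head;     /* NVM module PBA 901: block head address */
    uint16_t chip_no;      /* NVM chip number 902 */
    uint16_t block_no;     /* block number 903 */
    uint32_t invalid_kb;   /* invalid PBA amount 904, in KB */
} block_info_t;

/* Pick the block with the most invalid PBA area: erasing it frees the
 * most space while requiring the least copying of valid data. */
size_t pick_gc_victim(const block_info_t *blocks, size_t nblocks)
{
    size_t best = 0;
    for (size_t i = 1; i < nblocks; i++)
        if (blocks[i].invalid_kb > blocks[best].invalid_kb)
            best = i;
    return best;
}
```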
  • In this embodiment, the amount of area released from association with the NVM module LBA0 (811) and LBA1 (821) is managed as a PBA amount (in KB), but the present invention is not limited to this management unit; for example, the number of pages (the minimum write unit) could be managed instead of the invalid PBA amount.
  • The above is the content of the block management information 900 used by the NVM module to which the present invention is applied.
  • NVM Module Control Command 1: Write Command. Next, the commands used by the NVM module 126 to which the present invention is applied will be described.
  • When the NVM module 126 in this embodiment receives a command from the processor 121 of the storage controller 110, it analyzes the content of the received command, performs the prescribed processing, and, after the processing completes, returns one response (response information) to the storage controller.
  • This processing is realized by the processor 415 in the NVM module 126 executing a command processing program stored in the RAM 413.
  • A command contains the set of information the NVM module 126 needs in order to perform the prescribed processing. For example, a write command, which instructs the NVM module 126 to write data, contains the write instruction and the information necessary for the write (write position, data length, and so on).
  • The NVM module 126 supports a plurality of command types. First, the information common to all commands will be described.
  • Each command includes an operation code (Opcode) and a command ID at its head as common information; after the command ID, information (parameters) unique to each command follows to form one complete command, as in the layout sketch below.
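  • A hypothetical C layout of this common framing (field widths are assumptions; only the field order is given by the text):

```c
#include <stdint.h>

typedef struct {
    uint16_t opcode;      /* command type: e.g. 0x01 = write, 0x02 = read */
    uint32_t command_id;  /* unique ID, echoed back in the response */
    /* command-specific parameters follow here */
} nvm_cmd_header_t;

typedef struct {
    uint32_t command_id;  /* ID of the command this response answers */
    uint16_t status;      /* normal completion, or an error cause code */
    /* response-specific fields follow here */
} nvm_resp_header_t;
```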
  • FIG. 10 is a diagram showing the format of the write command of the NVM module 126 and the format of the response information for the write command in this embodiment.
  • In FIG. 10, field 1011 is the Opcode and field 1012 is the command ID.
  • Fields 1013 to 1016 are parameters specific to the write command.
  • Similarly, the command ID and status (Status) are information included in all response information, and information unique to each kind of response information may follow the status.
  • The operation code is information that notifies the NVM module 126 of the command type; the NVM module 126, on acquiring a command, recognizes the notified command type by referring to this field. For example, an Opcode of 0x01 is recognized as a write command and an Opcode of 0x02 as a read command.
  • The command ID is a field storing a unique ID for the command.
  • The ID specified in this field is attached to the response information so that the storage controller 110 can recognize which command the response information belongs to.
  • The storage controller 110 generates an ID that uniquely identifies the command when creating it, creates the command with this ID stored in the command ID field, and transmits the command to the NVM module 126. When the processing for the received command completes, the NVM module 126 includes that command's ID in the response information and returns it to the storage controller 110.
  • The storage controller 110 recognizes the completion of the command by acquiring the ID included in the response information.
  • The status (element 1022 in FIG. 10) included in the response information is a field storing information indicating whether the command processing completed normally. If the command processing did not complete normally (an error), the status stores, for example, a number that identifies the cause of the error.
  • FIG. 10 is a diagram showing the LBA0 write command of the NVM module 126 and the response information to the write command in this embodiment.
  • The LBA0 write command 1010 of the NVM module 126 in this embodiment consists of an operation code 1011, a command ID 1012, an LBA0/1 start address 1013, an LBA0/1 length 1014, a compression necessity flag 1015, and a write data address 1016 as command information.
  • Hereinafter, a command based on the above information is described as an example, but additional information may be present.
  • For example, the present invention remains effective even if information related to DIF (Data Integrity Field) or the like is added to the command.
  • The operation code 1011 is a field that notifies the NVM module 126 of the command type; from this field, the NVM module 126 that acquired the command recognizes it as a write command.
  • The command ID 1012 is a field storing a unique ID for the command.
  • The ID specified in this field is attached to the command's response information so that the storage apparatus can recognize which command the response information belongs to.
  • The storage apparatus 101 assigns an ID that uniquely identifies the command when the command is created.
  • the LBA 0/1 start address 1013 is a field for designating the start address of the write destination logical space.
• In this embodiment, the LBA0 space is defined as the range of addresses 0x000_0000_0000 to 0x07F_FFFF_FFFF, and the LBA1 space is defined as the range from address 0x800_0000_0000 onward. Therefore, the NVM module 126 can determine from the LBA0/1 start address 1013 of the write command whether the write destination is the LBA0 space or the LBA1 space.
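• A minimal sketch of the address-range check this implies, assuming 64-bit sector addresses; the constants are the ranges given above, and addresses between the two ranges are treated as invalid:

    #include <stdbool.h>
    #include <stdint.h>

    #define LBA0_END   0x07FFFFFFFFFULL   /* 0x07F_FFFF_FFFF: last LBA0 address  */
    #define LBA1_BASE  0x80000000000ULL   /* 0x800_0000_0000: first LBA1 address */

    /* Decide which logical space an LBA0/1 start address falls in. */
    static bool is_lba0(uint64_t lba) { return lba <= LBA0_END;  }
    static bool is_lba1(uint64_t lba) { return lba >= LBA1_BASE; }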
  • the LBA 0/1 length 1014 is a field for designating the range (length) of the recording destination LBA 0 or LBA 1 starting from the LBA 0/1 start address 1013, and stores the length represented by the number of sectors.
  • the NVM module 126 performs processing for associating the PBA area storing the write data with the LBA0 or LBA1 area in the range indicated by the LBA0 or LBA1 start address 1013 and the LBA0 / 1 length 1014 described above.
  • the compression necessity flag 1015 is a field for designating whether to compress the write target data indicated by this command.
• When the storage controller 110 creates a write command, if no size reduction from data compression can be expected for the write target data (for example, when the data is already recognized as compressed by image compression), it controls this flag to notify the NVM module 126 that compression is unnecessary.
• In this embodiment, when writing to LBA1, the write target data has already been compressed, so this flag is used to explicitly notify that compression is unnecessary. If it is fixed by setting that transfer data is not compressed when writing to LBA1, the compression necessity flag 1015 may be omitted.
  • the write data address 1016 is a field for storing the start address of the current storage destination of the write target data indicated by this command. For example, when data temporarily stored in the DRAM 125 of the storage apparatus 101 is written to the NVM module 126, the processor of the storage apparatus 101 issues a write command in which the address on the DRAM 125 in which the data is stored is stored in the write data address 1016. Create it.
  • the NVM module 126 acquires write data by acquiring, from the storage apparatus 101, data of an area having a length designated by the LBA 0/1 length 1014 from the address indicated in this field.
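• Putting the fields together, a write command might be assembled as in the following sketch; the struct layout and the concrete values are illustrative assumptions, not the module's wire format:

    #include <stdint.h>

    /* Hypothetical in-memory form of the LBA0/1 write command 1010. */
    typedef struct {
        uint8_t  opcode;           /* 1011: 0x01 = write                        */
        uint32_t command_id;       /* 1012: unique per outstanding command      */
        uint64_t lba01_start;      /* 1013: start address in LBA0 or LBA1 space */
        uint32_t lba01_length;     /* 1014: length in sectors                   */
        uint8_t  compress;         /* 1015: 0 = compression unnecessary         */
        uint64_t write_data_addr;  /* 1016: e.g., a buffer address in DRAM 125  */
    } lba01_write_cmd;

    /* Example: write 16 sectors of already-compressed data to LBA1,
       so compression is explicitly turned off. */
    static const lba01_write_cmd example_cmd = {
        .opcode          = 0x01,
        .command_id      = 7,
        .lba01_start     = 0x80000000000ULL,  /* an LBA1 address      */
        .lba01_length    = 16,
        .compress        = 0,
        .write_data_addr = 0x0010000,         /* hypothetical address */
    };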
  • the write response information 1020 includes a command ID 1021, a status 1022, and a compressed data length 1023.
  • the command ID 1021 is a field for storing a number that can uniquely identify a completed command.
• the status 1022 is a field for notifying the storage device of the completion or error of the command. If the command processing ends in an error, a number identifying the cause of the error is stored.
  • the compressed data length 1023 is a field for recording the data length when the written data is reduced by data compression.
  • the storage apparatus 101 can grasp the data size after compression of the written data by acquiring this field.
• the storage apparatus 101 cannot accurately grasp the actual compressed data size associated with a specific LBA0 area as update writes are performed. For this reason, when the total of the compressed data lengths 1023 acquired by write commands reaches a certain value, the storage apparatus 101 issues a compressed data size acquisition command, described later, in preparation for mapping to LBA1.
• When data is written to the LBA1 space, already-compressed data is recorded, so this field is invalid.
• FIG. 11 is a diagram showing a compressed data size acquisition command of the NVM module 126 and response information to the compressed data size acquisition command in this embodiment.
  • the compressed data size acquisition command 1110 of the NVM module 126 in the present embodiment is constituted by an operation code 1111, a command ID 1012, an LBA 0 start address 1113, and an LBA 0 length 1114 as command information.
• In the present embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous LBA0 write command, description thereof is omitted.
  • the operation code 1111 is a field for notifying the command type to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is a compressed data size acquisition command.
  • the LBA 0 start address 1113 is a field for designating the start address of the LBA 0 area that is the target of acquiring the data size after compression.
  • the LBA 0 length 1114 is a field for designating a range of LBA 0 starting from the LBA 0 start address 1113.
• the NVM module 126 calculates the size of the compressed data associated with the LBA0 area in the range indicated by the LBA0 start address 1113 and the LBA0 length 1114, and notifies the storage apparatus.
  • the address that can be specified as the LBA 0 start address 1113 is limited to a multiple of 8 sectors (4 KB).
• the length that can be designated as the LBA0 length 1114 is also limited to a multiple of 8 sectors (4 KB). If an address that does not match an 8-sector boundary (for example, 0x000_0000_0001) or such a length is specified as the LBA0 start address 1113 or the LBA0 length 1114, an error is returned.
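• The boundary rule can be expressed as a small check, assuming addresses and lengths are given in sectors (512 B each, so 8 sectors = 4 KB):

    #include <stdbool.h>
    #include <stdint.h>

    #define SECTORS_PER_4KB 8u

    /* A compressed data size acquisition command is valid only when both
       the LBA0 start address and the LBA0 length sit on 8-sector (4 KB)
       boundaries; otherwise the NVM module answers with an error status. */
    static bool size_query_aligned(uint64_t lba0_start, uint64_t lba0_len)
    {
        return (lba0_start % SECTORS_PER_4KB == 0) &&
               (lba0_len   % SECTORS_PER_4KB == 0);
    }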
  • the compressed data size acquisition response 1120 includes a command ID 1021, a status 1022, and a compressed data length 1123.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous LBA0 write response, description thereof is omitted.
  • the compressed data length 1123 is a field for storing the size of the compressed data associated with the LBA0 area specified by the compressed data size acquisition command.
• the storage controller 110 acquires the value of this compressed data length, and thereby recognizes the size of the LBA1 area required as the mapping destination of the LBA1 mapping command described later.
  • FIG. 12 is a diagram schematically showing an LBA1 mapping command and response information to the LBA1 mapping command supported by the NVM module 126 in the present embodiment.
  • the LBA1 mapping command 1210 of the NVM module 126 in the present embodiment is configured by an operation code 1211, a command ID 1012, an LBA0 start address 1213, an LBA0 length 1214, and an LBA1 start address 1215 as command information.
• In the present embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous write command, description thereof is omitted.
  • the operation code 1211 is a field for notifying the type of command to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is an LBA1 mapping command.
• the LBA0 start address 1213 is a field for designating the head address of the LBA0 area of the target data whose compressed data is to be mapped to LBA1.
  • the LBA0 length 1214 is a field for designating a range of LBA0 starting from the LBA0 start address 1213 to be mapped to LBA1. As with the compressed data size acquisition command, the LBA 0 start address 1213 and the LBA 0 length 1214 are limited to multiples of 8 sectors (4 KB).
  • the LBA1 start address 1215 is a field for designating the start address of LBA1 to be mapped.
• the storage controller 110 knows the data size to be mapped in advance using the compressed data size acquisition command, reserves an LBA1 area to which this data size can be mapped, stores its head address in the LBA1 start address 1215 field, and issues the command to the NVM module 126.
• the NVM module 126 maps the compressed data associated with the LBA0 space in the range indicated by the LBA0 start address 1213 and the LBA0 length 1214 to an area, starting from the LBA1 start address 1215, corresponding to the compressed data size. More specifically, the PBAs (NVM module PBA 812) associated with that LBA0 range are acquired by referring to the LBA0-PBA conversion table. Then, the acquired PBA addresses are entered in the NVM module PBA 822 fields of the entries (specified by the NVM module LBA1 (821)) of the LBA1 range that has the same size as the total size of the acquired PBAs.
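• The table update can be sketched as follows. This is a deliberate simplification that assumes array-backed conversion tables with one PBA entry per sector, treats both LBA values as offsets within their respective spaces, and packs the compressed data's PBAs densely from the LBA1 start address; the names are illustrative stand-ins for the tables in the text.

    #include <stdint.h>

    #define UNALLOCATED   UINT64_MAX
    #define TABLE_SECTORS 4096u   /* illustrative table size */

    static uint64_t lba0_to_pba[TABLE_SECTORS];  /* stand-in for NVM module PBA 812 */
    static uint64_t lba1_to_pba[TABLE_SECTORS];  /* stand-in for NVM module PBA 822 */

    /* LBA1 mapping: point the LBA1 entries at the PBAs already holding the
       compressed data for the given LBA0 range, packing them densely. */
    static void lba1_map(uint64_t lba0_start, uint64_t lba0_len,
                         uint64_t lba1_start)
    {
        uint64_t dst = lba1_start;
        for (uint64_t i = 0; i < lba0_len; i++) {
            uint64_t pba = lba0_to_pba[lba0_start + i];
            if (pba != UNALLOCATED)        /* compressed data occupies fewer sectors */
                lba1_to_pba[dst++] = pba;
        }
    }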
  • the LBA1 mapping response 1220 includes a command ID 1021 and a status 1022.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous write response, description thereof is omitted.
  • NVM Module Control Command 4 Full Stripe Parity Generation Command
• There are two main parity generation methods in RAID technology. One is a method of generating parity by an operation such as XOR over all of the data necessary for generating the parity; this method is referred to as the "full stripe parity generation method" in this specification. The other is used when update data is written to a RAID-configured storage medium group: the parity corresponding to the update data (updated parity) is generated by an XOR operation over the update data, the pre-update data stored in the storage medium, and the pre-update parity corresponding to that data. This method is called the "update parity generation method" in this specification.
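• As a sketch, both methods reduce to XOR loops for the horizontal (RAID 5 / P) parity; RAID 6 Q parity (Reed-Solomon) or diagonal parity would use different arithmetic. A minimal C illustration, assuming uncompressed stripe buffers of equal length:

    #include <stddef.h>
    #include <stdint.h>

    /* Full stripe parity generation: XOR all n_stripes data stripes together. */
    static void full_stripe_parity(const uint8_t *data[], size_t n_stripes,
                                   size_t stripe_len, uint8_t *parity)
    {
        for (size_t i = 0; i < stripe_len; i++) {
            uint8_t p = 0;
            for (size_t d = 0; d < n_stripes; d++)
                p ^= data[d][i];
            parity[i] = p;
        }
    }

    /* Update parity generation: new_parity = new_data ^ old_data ^ old_parity. */
    static void update_parity(const uint8_t *new_data, const uint8_t *old_data,
                              const uint8_t *old_parity, size_t stripe_len,
                              uint8_t *new_parity)
    {
        for (size_t i = 0; i < stripe_len; i++)
            new_parity[i] = new_data[i] ^ old_data[i] ^ old_parity[i];
    }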
  • the full stripe parity generation command can be used when all the data constituting the RAID parity is stored in the NVM module 126 and mapped in the LBA1 space. Therefore, in the case of a RAID configuration that generates parity for six data, six data must be stored in the NVM module 126.
  • the write data from the higher level apparatus 103 is stored in a compressed state in the NVM module 126.
• However, for parity generation, parity is generated from the uncompressed data. Therefore, the parity generation target data needs to be mapped to the LBA0 space.
  • FIG. 13 is a diagram showing the response information to the full stripe parity generation command and the full stripe parity generation command of the NVM module 126 in the present embodiment.
  • the full stripe parity generation command 1310 includes, as command information, an operation code (Opcode) 1311, a command ID 1012, an LBA0 length 1313, a stripe number 1314, an LBA0 start address 0 to X (1315 to 1317), and an LBA0 start address (for XOR parity) 1318, an LBA0 start address (for RAID 6 parity) 1319.
  • the operation code 1311 is a field for notifying the command type to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is a full stripe parity generation command.
  • the LBA 0 length 1313 is a field for designating the length of the parity to be generated (for RAID parity, the parity and the parity generation source data have the same length).
  • the number of stripes 1314 designates the number of data used for generating parity. For example, when parity is generated for 6 data, 6 is stored in the stripe number 1314.
  • LBA 0 start addresses 0 to X are fields for designating the start address of LBA 0 to which the parity generation source data is associated.
• the number of these fields matches the number specified by the stripe number 1314 (when a command that does not match is issued, the NVM module 126 returns an error). For example, in a configuration in which two parities are created for six data (RAID6 6D+2P), six LBA0 start addresses are designated.
  • LBA 0 start address (for XOR parity) 1318 is a field for designating the storage destination of the generated RAID parity (XOR parity).
  • the parity (RAID 5 parity, RAID 6 P parity, or horizontal parity) generated in the area specified by the LBA 0 length 1313 is mapped from this start address.
  • the LBA 0 start address (for RAID 6) 1319 is a field for designating the storage destination of the parity for RAID 6 to be generated.
  • the parity for RAID 6 is Q parity of Reed-Solomon code or diagonal parity in the EVENODD system.
  • the generated parity is stored in an area in a range specified by the LBA 0 start address (for RAID 6) 1319 and the LBA 0 length 1313.
  • the NVM module 126 of this embodiment acquires a plurality of compressed data from the FM 420 indicated by the PBA associated with the area specified by the LBA 0 start addresses 0 to X (1315 to 1317) described above. Subsequently, the acquired data is decompressed using the data compression / decompression unit 418, and one or two parities are generated from the decompressed data using the parity generation unit 419 inside the NVM module 126. Thereafter, the generated parity is compressed using the data compression / decompression unit 418 and then recorded in the FM 420.
• At this time, in the rows of the LBA0-PBA management information 810 corresponding to the LBA0 start address (for XOR parity) 1318 and the LBA0 start address (for RAID6) 1319, the PBA of the recording destination FM area and the compressed data length are recorded in the NVM module PBA (812) and PBA length (813) fields.
  • the full stripe parity generation response 1320 includes a command ID 1021 and a status 1022.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous LBA0 write response, description thereof is omitted.
• NVM Module Control Command 5 Update Parity Generation Command
• Update parity generation is performed when update data is to be recorded in an area of the final storage medium for which parity has already been created, and requires that three pieces of information are mapped on LBA0: the update data, the old data in the area, and the old parity protecting the old data.
• Therefore, the storage controller 110 of this embodiment reads the compressed data of the old data and the old parity from the RAID-configured final storage medium, writes them to areas in the LBA1 space of the NVM module 126, and then performs update parity generation by issuing the update parity generation command.
  • FIG. 14 is a diagram showing an update parity generation command of the NVM module 126 and response information to the update parity generation command in the present embodiment.
• the update parity generation command 1410 of the NVM module 126 in this embodiment includes, as command information, an operation code 1411, a command ID 1012, an LBA0 length 1413, an LBA0 start address 0 (1414), an LBA0 start address 1 (1415), an LBA0 start address 2 (1416), an LBA0 start address 3 (1417), an LBA0 start address 4 (1418), and an LBA0 start address 5 (1419).
• In the present embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous LBA0 write command, description thereof is omitted.
  • the operation code 1411 is a field for notifying the type of command to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is an update parity generation command.
• the LBA0 length 1413 is a field for designating the length of the parity to be generated (as noted above, RAID parity and its generation source data have the same length).
  • LBA 0 start address 0 (1414) is a field indicating the start address of the LBA 0 area to which new data for parity update is mapped.
  • the storage apparatus 101 uses this field to notify the NVM module 126 that the data in the area from the LBA 0 start address 0 (1414) to the LBA 0 length 1413 is new data.
  • LBA0 start address 1 (1415) is a field indicating the start address of the LBA0 area to which the old data for parity update is mapped.
  • the storage apparatus 101 uses this field to notify the NVM module 126 that the data in the area from the LBA 0 start address 1 (1415) to the LBA 0 length 1413 is old data.
  • LBA0 start address 2 (1416) is a field indicating the start address of the LBA0 area to which the XOR parity before update for parity update is mapped.
  • the storage apparatus 101 uses this field to notify the NVM module 126 that the data in the area from the LBA 0 start address 2 (1416) to the LBA 0 length 1413 is the pre-update XOR parity.
  • LBA 0 start address 3 (1417) is a field indicating the start address of the LBA 0 area to which the parity for RAID 6 before update for parity update is mapped.
• the storage apparatus 101 uses this field to notify the NVM module 126 that the data in the area from the LBA0 start address 3 (1417) to the LBA0 length 1413 is the pre-update parity for RAID 6.
  • LBA0 start address 4 (1418) is a field indicating the start address of the LBA0 area to which the XOR parity newly created by updating is associated.
  • the storage apparatus 101 uses this field to instruct the NVM module 126 to map a new XOR parity from the LBA 0 start address 4 (1418) to the LBA 0 length 1413 area.
  • LBA 0 start address 5 (1419) is a field indicating the start address of the LBA 0 area to which a parity for RAID 6 newly created by update is associated.
  • the storage controller 110 uses this field to instruct the NVM module 126 to map a new parity for RAID 6 from the LBA 0 start address 5 (1419) to the LBA 0 length 1413 area.
  • the processing when the NVM module 126 receives the update parity generation command is the same as the processing performed when the full stripe parity generation command is received.
• the NVM module 126 acquires and decompresses a plurality of compressed data from the storage areas on the FM 420 indicated by the PBAs associated with the areas specified by the LBA0 start addresses 0 to 3 (1414 to 1417), generates one or two parities using the parity generation unit 419, and then compresses the parities. Thereafter, the generated parities are recorded in the FM 420 and mapped to the LBA0 areas specified by the LBA0 start address 4 (1418) and the LBA0 start address 5 (1419).
  • the update parity generation response 1420 includes a command ID 1021 and a status 1022.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous LBA0 write response, description thereof is omitted.
• NVM Module Control Command 6 Compression Information Acquisition Command
• As described above, the NVM module 126, which is a cache apparatus, compresses the write data, generates the parity corresponding to the data, and stores each piece of data, including the parity, in compressed form.
  • the storage apparatus 101 acquires the compressed data from the NVM module 126 and records the compressed data in the final storage medium. At this time, information necessary for decompressing the compressed data (hereinafter referred to as compressed information) is also recorded in the final storage medium.
  • the present invention does not depend on this method, and the NVM module 126 may permanently hold information necessary for decompression.
• When recording the compression information in the final storage medium as in the present embodiment, the storage apparatus 101 needs to acquire the compression information from the NVM module 126, which is a cache apparatus.
  • the compression information acquisition command is used when the storage controller 110 acquires compression information from the NVM module 126.
  • FIG. 15 is a diagram illustrating a compression information acquisition command and response information to the compression information acquisition command of the NVM module 126 in the present embodiment.
  • the compression information acquisition command 1510 of the NVM module 126 in this embodiment is constituted by an operation code 1511, a command ID 1012, an LBA1 start address 1513, an LBA1 length 1514, and a compression information address 1515 as command information.
• In the present embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous write command, description thereof is omitted.
  • the operation code 1511 is a field for notifying the command type to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is a compression information acquisition command.
  • the LBA1 start address 1513 is a field for designating the start address of the area on the LBA1 from which compression information is to be acquired.
  • the LBA1 length 1514 is a field for designating a range of LBA1 starting from the LBA1 start address 1513.
• the compression information address 1515 is a field for designating the storage destination of the compression information that the storage controller 110 acquires from the NVM module 126.
• the NVM module 126 creates the compression information necessary for decompressing the data recorded in the LBA1 area in the range indicated by the LBA1 start address 1513 and the LBA1 length 1514, and transfers it to the compression information address 1515 specified by the storage controller 110.
  • the compression information specifically indicates the structure of the compressed data mapped to LBA1. For example, when four pieces of independently decompressable compressed data are mapped to the designated LBA1 area, the information stores the start position of the four pieces of compressed data and the length of the compressed data.
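• A plausible shape for such compression information, sketched in C; the names and the fixed bound of four pieces follow the example above and are illustrative assumptions only:

    #include <stdint.h>

    /* One independently decompressable piece of compressed data mapped
       into the designated LBA1 area. */
    typedef struct {
        uint64_t start;     /* start position of the piece within the LBA1 area */
        uint32_t length;    /* length of the compressed piece */
    } comp_piece;

    /* Compression information for one LBA1 area, e.g., four pieces. */
    typedef struct {
        uint32_t   n_pieces;
        comp_piece piece[4];
    } comp_info;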
  • the storage apparatus 101 acquires the compression information from the NVM module 126 by the compression information acquisition command, and then records the compression information together with the compressed data on the final storage medium.
• When data is read, the compression information is acquired from the final storage medium together with the compressed data; the compressed data is written to the NVM module 126, and then the compression information is transferred by the compression information transfer command described later, so that the NVM module 126 can decompress the compressed data.
  • the compression information acquisition response 1520 includes a command ID 1021 and a status 1022.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous LBA0 write response, description thereof is omitted.
• NVM Module Control Command 7 Read Command
• FIG. 16 is a diagram showing a read command of the NVM module 126 and response information to the read command in this embodiment.
  • the read command 1610 of the NVM module 126 in this embodiment is constituted by an operation code 1611, a command ID 1012, an LBA 0/1 start address 1613, an LBA 0/1 length 1614, an expansion necessity flag 1615, and a read data address 1616 as command information.
• In this embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous LBA0 write command, description thereof is omitted.
• the operation code 1611 is a field for notifying the command type to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is a read command.
  • the LBA 0/1 start address 1613 is a field for designating the start address of the logical space of the read destination.
  • the LBA 0/1 length 1614 is a field for designating the range of the recording destination LBA 0 or LBA 1 starting from the LBA 0/1 start address 1613.
  • the NVM module 126 obtains data from the PBA associated with the LBA0 or LBA1 area in the range indicated by the LBA0 or LBA1 start address 1613 and the LBA0 / 1 length 1614 described above, and transfers the data to the storage device to perform read processing. I do.
• the decompression necessity flag 1615 is a field for designating whether the read target data indicated by this command needs to be decompressed. When the storage controller 110 creates a read command, it controls this flag to notify the NVM module 126 that decompression is unnecessary. In this embodiment, when reading from the LBA1 space, the read target data must be acquired without being decompressed, so this flag is used to explicitly indicate that decompression is unnecessary. Note that the decompression necessity flag 1615 may be omitted if it is fixed by setting that reads from LBA1 do not decompress the acquired data.
• the read data address 1616 designates the head address (for example, an address in the DRAM 125) of the output destination area for the read target data.
  • data having a length designated by the LBA 0/1 length 1614 is continuously stored from the area of the address designated by the read data address 1616.
  • the read response 1620 includes a command ID 1021 and a status 1022.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous LBA0 write response, description thereof is omitted.
  • NVM Module Control Command 8 Mapping Cancel Command
• the storage controller 110 acquires the write data and parity compressed and recorded in the NVM module 126 in their compressed state; for this purpose, the data is mapped to LBA1. Further, in order to acquire in decompressed form the compressed data that was written by designating LBA1, the data recorded in the NVM module 126 is mapped to LBA0. A mapped area needs to be unmapped when the processing is completed and the mapping becomes unnecessary.
  • the storage apparatus 101 of this embodiment releases the association of LBA0 or LBA1 associated with the PBA using a mapping release command.
  • FIG. 17 is a diagram showing a mapping cancellation command of the NVM module 126 and response information to the mapping cancellation command in this embodiment.
  • the mapping cancellation command 1710 of the NVM module 126 in this embodiment is constituted by an operation code 1711, a command ID 1012, an LBA0 / 1 start address 1713, and an LBA0 / 1 length 1714 as command information.
• In this embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous LBA0 write command, description thereof is omitted.
• the operation code 1711 is a field for notifying the command type to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is a mapping release command.
• the LBA0/1 start address 1713 is a field for designating the start address of the logical space to be unmapped, and addresses in both the LBA0 space and the LBA1 space can be designated. However, if an address in the LBA0 space is specified, it must be on a 4 KB (8-sector) boundary; if an address that is not on a 4 KB boundary is specified, the NVM module 126 returns an error.
  • the LBA 0/1 length 1714 is a field for designating the range of the LBA 0 space or the LBA 1 space starting from the LBA 0/1 start address 1713.
  • the processing when the NVM module 126 receives a mapping release command from the storage controller 110 is as follows.
• the NVM module 126 deletes the association between the PBAs and the LBA0 or LBA1 space in the range indicated by the LBA0/1 start address 1713 and the LBA0/1 length 1714 (hereinafter referred to as the "target LBA0/1 area"). Specifically, for each entry whose NVM module LBA0 (811) or NVM module LBA1 (821) value belongs to the range of the target LBA0/1 area, the NVM module PBA 812 or NVM module PBA 822 field is changed to unallocated.
• the NVM module 126 in the embodiment of the present invention selects blocks having a relatively large invalid PBA amount 904 from among the plurality of blocks (that is, selects blocks in descending order of invalid PBA amount 904) and carries out garbage collection. Garbage collection is a well-known process and is not described here.
  • NVM Module Control Command 9 Compressed Information Transfer Command
• After the storage apparatus 101 stores the data compressed by the NVM module 126 in the final storage medium, it needs to decompress the compressed data and transfer it to the host device in response to a read request from the user. At this time, the storage apparatus 101 acquires the compressed data from the final storage medium, transfers the compressed data to the NVM module 126, and then transfers the compression information necessary for decompressing the compressed data.
  • FIG. 18 is a diagram showing the compression information transfer command of the NVM module 126 and the response information to the compression information transfer command in the present embodiment.
  • the compression information transfer command 1810 of the NVM module 126 in the present embodiment is constituted by an operation code 1811, a command ID 1012, an LBA1 start address 1813, an LBA1 length 1814, and a compression information address 1815 as command information.
• In the present embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous write command, description thereof is omitted.
  • the operation code 1811 is a field for notifying the command type to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is a compressed information transfer command.
  • the LBA1 start address 1813 is a field that specifies the start address of the area on the LBA1 that is the target of the compressed information to be transferred.
• the LBA1 length 1814 is a field for designating a range of LBA1 starting from the LBA1 start address 1813.
  • the compression information address 1815 is a field for designating the storage destination of the compression information transferred from the storage controller 110 to the NVM module 126.
• the NVM module 126 acquires the compression information from the address specified by the compression information address 1815, thereby enabling decompression of the plurality of compressed data in the area specified by the LBA1 start address 1813 and the LBA1 length 1814. Specifically, after the compressed data associated with LBA1 is mapped to LBA0 with the LBA0 mapping command described later, when a read request for that LBA0 area is received from the storage controller, the compressed data is decompressed using the compression information transferred by the compression information transfer command and transferred to the storage controller.
  • the compressed information transfer response 1820 includes a command ID 1021 and a status 1022.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous write response, description thereof is omitted.
  • NVM Module Control Command 10 LBA0 Mapping Command
  • the NVM module 126 records the compressed data written by designating the LBA1 area in the FM.
• For the storage controller 110 to acquire this compressed data in decompressed form, the compressed data is mapped to an LBA0 area different from the LBA1 area that was the write destination of the compressed data.
  • FIG. 19 is a diagram showing the LBA0 mapping command of the NVM module 126 and the response information to the LBA0 mapping command in the present embodiment.
• the LBA0 mapping command 1910 of the NVM module 126 in the present embodiment is configured by an operation code 1911, a command ID 1012, an LBA1 start address 1913, an LBA1 length 1914, and an LBA0 start address 1915 as command information.
• In the present embodiment, an example of a command containing the above information is described, but additional information may be included. Since the command ID 1012 has the same contents as the previous write command, description thereof is omitted.
  • the operation code 1911 is a field for notifying the command type to the NVM module 126, and the NVM module 126 that has acquired the command recognizes that the command notified by this field is an LBA0 mapping command.
• the LBA1 start address 1913 is a field for designating the head address of the LBA1 area of the target data whose compressed data is to be mapped to LBA0.
• the LBA1 length 1914 is a field for designating the range of LBA1, starting from the LBA1 start address 1913, that is to be mapped to LBA0.
  • the LBA 0 start address 1915 is a field for designating the start address of LBA 0 to be mapped.
• the storage apparatus 101 knows the post-decompression data size of the compressed data recorded in LBA1 from the compression information the storage controller 110 acquired from the PDEV, secures an LBA0 area to which this data size can be mapped, and enters its head address in the LBA0 start address 1915 field.
  • the address that can be specified as the LBA 0 start address 1915 is limited to a multiple of 8 sectors (4 KB).
• the NVM module 126 of this embodiment maps the compressed data associated with the LBA1 area in the range indicated by the LBA1 start address 1913 and the LBA1 length 1914 to an area, starting from the LBA0 start address 1915, corresponding to the data size after decompression. More specifically, referring to the LBA1-PBA conversion table, the PBAs associated with that LBA1 range are acquired. Then, referring to the LBA0-PBA conversion table, the acquired PBA addresses are entered in the NVM module PBA (812) fields of the LBA0 range that starts from the LBA0 start address 1915 and has the size after decompression, which is determined from the compression information the NVM module 126 acquired from the storage apparatus by the compression information transfer command.
• the LBA0 mapping response 1920 includes a command ID 1021 and a status 1022.
• In the present embodiment, an example of response information containing the above fields is described, but additional information may be included. Since the command ID 1021 and the status 1022 have the same contents as the previous write response, description thereof is omitted.
  • the storage apparatus 101 manages the virtual volume in association with one or more RAID group areas.
  • the storage apparatus 101 manages one or more virtual PDEVs in association with the RAID group.
  • the storage apparatus 101 manages one PDEV (SSD 111 or HDD 112) in association with each virtual PDEV.
  • the storage apparatus 101 provides a virtual volume (denoted as “virtual Vol” in the drawing) 200 to the host apparatus 103.
  • the example of FIG. 21 shows the association in the storage apparatus of the data area “Data14” recognized by the higher-level apparatus 103.
• Hereinafter, a case where the RAID type of the RAID group associated with the virtual volume 200 is RAID 5 will be described as an example.
  • data areas Data0 to Data14 are fixed-size areas divided by RAID parity calculation units, and these units are hereinafter referred to as RAID stripes.
  • the data areas Data0 to Data14 are storage areas recognized by the host apparatus 103, and are therefore areas in which uncompressed data is stored in the storage apparatus 101.
• For example, for the RAID stripe Data14, a parity for RAID 5 is generated by performing an XOR operation with Data13 and Data12.
  • a set of RAID stripes necessary for generating a RAID parity, such as Data12, Data13, and Data14 is referred to as a RAID stripe column.
• the length of a RAID stripe is, for example, 64 KB or 32 KB.
• each stripe in one virtual volume is assigned a serial number starting from 0; this number is referred to herein as the "stripe number".
• the stripe number of the stripe located at the head of the virtual volume is 0, and stripe numbers 1, 2, ... are assigned in order to the subsequent stripes.
• In FIG. 21, the numbers given after "Data" (Data0 to Data14) are the stripe numbers.
  • each RAID stripe column is also given a number starting from 0 (called a stripe column number) in order from the stripe column located at the head of the RAID group.
• the stripe column number of the stripe column located at the head of the RAID group is 0, and stripe column numbers 1, 2, ... are assigned in order to the subsequent stripe columns.
• the virtual PDEV is a concept defined in the storage apparatus 101 in order to convert virtual volume addresses into PDEV addresses. The storage apparatus 101 treats the virtual PDEV as a storage device that stores the data written from the host apparatus 103 as is (uncompressed).
  • the virtual volume 200 is associated with the RAID group 0, and the RAID group 0 is configured with virtual PDEVs 0 to 3.
• RAID group 0 is configured to protect data with RAID 5 using virtual PDEVs 0 to 3, and each stripe of the virtual volume 200 is associated, by a statically defined and computable correspondence, with one of the stripes in the plurality of virtual PDEVs belonging to RAID group 0. This is the same as the association performed in a storage apparatus adopting a conventional RAID configuration. Further, since each PDEV has a 1:1 relationship with a virtual PDEV, it can also be said that each stripe of the virtual volume 200 is associated with one of the PDEVs belonging to RAID group 0.
  • the correspondence relationship between each virtual PDEV configuring the RAID group 0 and the PDEV installed in the storage apparatus 101 is managed by virtual PDEV information 2230 described later.
  • the correspondence relationship between the storage destination area of the RAID stripe “Data14” in the virtual PDEV2 and the area in the PDEV2 in which the data of the RAID stripe “Data14” is compressed and stored (“compressed D14” in FIG. 21) is as follows. It is managed by the compression management information (“management information 2” in FIG. 21) stored in the PDEV 2.
  • the compression management information (management information 2) is recorded at a predetermined location in the PDEV 2. Details regarding the contents of the compression management information and the recording position in the PDEV 2 will be described later. Although only the compression management information in the PDEV 2 is described here, the compression management information (management information 0, 1, 3 in FIG. 21) is also stored in the other PDEVs.
• Next, the "virtual volume management information", "RAID group management information", and "virtual PDEV information" stored in the DRAM 125 of the storage apparatus 101 so that the processor 121 can access them at high speed will be described with reference to FIG. 22. Note that the management information stored in the DRAM 125 of the storage apparatus 101 is not limited to the above, and other management information may be stored.
  • the virtual volume management information 2210 is management information generated each time the storage apparatus 101 creates one virtual volume, and management information for one virtual volume is stored in one virtual volume management information.
  • the storage apparatus 101 manages the association between the virtual volume and the RAID group using the virtual volume management information 2210, and identifies the RAID group to be referred to for the address requested from the host apparatus.
  • the virtual volume management information 2210 includes items of a virtual volume start address 2211, a virtual volume size 2212, a RAID group number 2213, and a RAID group start address 2214. The present invention is not limited to these four types of items.
  • the virtual volume management information 2210 may include management information other than that shown in FIG.
• the virtual volume start address 2211 is an item for storing the head address of the area in the virtual volume to which the RAID group is associated.
  • the virtual volume size 2212 is an item for storing the area size in the virtual volume associated with the RAID group.
  • the storage apparatus 101 associates the area of the size specified by the virtual volume size 2212 from the start address specified by the virtual volume start address 2211 with the RAID group specified by the RAID group number 2213 described later.
  • the RAID group number 2213 is an item for storing the number of the RAID group associated with the virtual volume.
  • the RAID group start address 2214 is an item for designating an address in the RAID group designated by the RAID group number 2213. In this specification, the RAID group address does not include an area in which parity is stored.
  • the RAID group address will be described below with reference to FIG.
• For example, when the stripe size is 64 KB, the position (RAID group address) where the first stripe "Data0" of virtual PDEV0, the first virtual PDEV constituting RAID group 0, is stored is address 0; the position where the first stripe "Data1" of the next virtual PDEV1 is stored is address 64 KB; the position where the first stripe "Data2" of the next virtual PDEV2 is stored is address 128 KB; and the position where "Data4" of virtual PDEV0 is stored is address 192 KB.
• By referring to the virtual volume management information 2210, the processor 121 can specify the RAID group associated with an access target area.
  • the virtual volume management information for associating the virtual volume with the RAID group does not require as much information as the compression management information described later for associating the areas. The reason for this will be described with reference to FIG.
• Note that the example described here applies when the RAID type of the RAID group is RAID 5; in the case of a RAID group adopting another RAID type, the address on the virtual volume is converted by a different calculation method. In either case, from an address on the virtual volume (an LBA, a stripe number, or the like), the virtual PDEV and the address in the virtual PDEV associated with it can be specified by simple calculation. Similarly, the virtual PDEV and the in-virtual-PDEV address of the stripe storing the parity corresponding to a certain RAID stripe are obtained by simple calculation.
  • the correspondence between the address of the virtual volume and the address in the RAID group can be uniquely specified by calculation considering the RAID configuration. For this reason, a lot of information is not required for the association, and the information can be recorded in the DRAM accessible by the processor 121 at a high speed.
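• As a concrete illustration, a RAID 5 stripe lookup of this kind might look as follows. The parity rotation used here (parity on the last drive for stripe column 0, then rotating) is an assumption made for the sketch; the actual placement follows the layout of the figure.

    #include <stdint.h>

    typedef struct {
        uint32_t pdev;     /* which virtual PDEV holds the stripe */
        uint64_t offset;   /* byte offset of the stripe inside that virtual PDEV */
    } stripe_loc;

    /* Locate a data stripe of a RAID 5 group with n_pdevs virtual PDEVs.
       Each stripe column holds (n_pdevs - 1) data stripes plus one parity. */
    static stripe_loc locate_stripe(uint64_t stripe_no, uint32_t n_pdevs,
                                    uint64_t stripe_size)
    {
        uint64_t per_col = n_pdevs - 1;                  /* data stripes per column */
        uint64_t col     = stripe_no / per_col;          /* stripe column number    */
        uint32_t slot    = (uint32_t)(stripe_no % per_col);
        uint32_t parity  = (uint32_t)((n_pdevs - 1) - (col % n_pdevs)); /* assumed */
        uint32_t pdev    = (slot >= parity) ? slot + 1 : slot;
        stripe_loc loc   = { pdev, col * stripe_size };
        return loc;
    }

• Under these assumptions, stripe number 1 with four virtual PDEVs and 64 KB stripes resolves to virtual PDEV1 at offset 0, consistent with Data1 in the example above.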
  • the above is the virtual volume management information in the embodiment of the present invention.
  • the storage apparatus 101 manages the association between virtual volumes and RAID groups using this management information.
  • the RAID group management information 2220 is information for managing a virtual PDEV that constitutes a RAID group, and the storage apparatus 101 generates one RAID group management information 2220 when one RAID group is defined.
  • One RAID group management information manages one RAID group.
• the RAID group management information 2220 is composed of a RAID configuration virtual PDEV number 2221, a registered virtual PDEV number 2222, and a RAID type 2223.
  • the RAID group management information 2220 of the present invention is not limited to the items shown in FIG.
  • the RAID group management information 2220 may include items other than those shown in FIG.
  • the RAID configuration virtual PDEV number 2221 is an item for storing the number of virtual PDEVs constituting the RAID group. In the example of FIG. 22, it is shown that four units are configured.
  • the registered virtual PDEV number 2222 is an item for storing a number for identifying the virtual PDEV that constitutes the RAID group.
• In the example of FIG. 22, the RAID group is configured with four units: virtual PDEV3, virtual PDEV8, virtual PDEV9, and virtual PDEV15.
  • the RAID type 2223 is an item for storing a RAID type (RAID level).
  • the example of FIG. 22 indicates that the RAID group to which this management information corresponds is configured with RAID5.
  • the above is the RAID group management information in the embodiment of the present invention.
  • the storage apparatus 101 uses this management information to manage a RAID group configured with virtual PDEVs.
  • the virtual PDEV information 2230 is information for managing the association between the virtual PDEV and the PDEV.
  • the virtual PDEV number 2231 is an identification number assigned to each virtual PDEV managed in the storage apparatus 101
  • the PDEV Addr 2232 is an identification number of each PDEV managed in the storage apparatus 101.
• In this embodiment, since the PDEV is a storage medium that conforms to the SAS standard, the PDEV Addr 2232 stores the SAS address assigned to each PDEV.
  • the PDEV can be specified from the virtual PDEV number.
  • the compression management information is information for managing the association between the virtual PDEV area and the PDEV area, and is recorded in each PDEV.
  • the recording position of the compression management information in the PDEV is common to all PDEVs.
  • the head address in the PDEV in which the compression management information is stored is referred to as “head address in the PDEV of the compression management information”.
  • the information on the head address in the PDEV of the compression management information may be stored in the DRAM 125 of the storage controller 110, or the address information is embedded in a program for accessing the compression management information. Also good.
• Since the compression rate achieved when data is compressed depends on the data content, the association between the virtual PDEV area and the PDEV area changes dynamically. For this reason, the compression management information is updated as the recorded data changes.
  • FIG. 23 shows compression management information 2300 used by the storage apparatus 101 according to the embodiment of the present invention.
  • the compression management information 2300 is composed of three fields: a head address 2301 in the PDEV in which compressed data is stored, a length 2302 of the compressed data, and a compression flag 2303.
• the compression management information 2300 in the present invention is not limited to this configuration, and may have fields other than these three.
  • the virtual PDEV capacity is set to 8 TB, which is eight times the 1 TB PDEV capacity, but the present invention is not limited to this value.
  • the virtual PDEV capacity may be set to an arbitrary value according to the assumed compression rate. For example, when the compression rate of stored data can only be expected to be about 50%, it is preferable to set the virtual PDEV capacity to 2 TB with respect to the 1 TB PDEV capacity because the compression management information 2300 can be reduced. Further, the virtual PDEV capacity may be dynamically changed according to the compression rate of the stored data. For example, at first, the virtual PDEV capacity is set to 2 TB with respect to the 1 TB PDEV capacity.
• If, for example, it is found from the compressed data length information obtained from the NVM module 126 that the stored data is compressed to less than 1/8 of its size, the capacity of the virtual PDEV may be dynamically raised to 8 TB. In that case, the compression management information 2300 is also enlarged dynamically. On the other hand, if it is found after operation starts that the stored data cannot be compressed much, the capacity of the virtual PDEV is dynamically reduced.
  • an example of the compression management information 2300 is shown in which the virtual PDEV area is divided and associated with each 4 KB, but the present invention is not limited to this division unit.
  • the head address 2301 in the PDEV in which the compressed data is stored is a field for storing the head address of the area in which the compressed data in the corresponding virtual PDEV area is stored.
• When the data is stored uncompressed, this field stores the start address of the area where the uncompressed data is stored.
• When NULL is stored in the head address 2301 in the PDEV in which the compressed data is stored (described as "unassigned" in FIG. 23), no PDEV area is allocated to the corresponding virtual PDEV area (unassigned).
  • the compressed data and the uncompressed data are managed with the sector length 512B as the minimum unit. The present invention is not limited to this sector length.
  • the compressed data length 2302 is a field for storing the length of the compressed data stored in the PDEV.
  • the unit of the value stored in this field is a sector.
• That is, the compressed data is stored in the area that starts at the address given by the head address 2301 in the PDEV and extends for the number of sectors recorded in the compressed data length 2302 field plus 1.
  • the storage area of the virtual volume is divided into 4 KB units, and compression is performed for each divided 4 KB area.
• In this embodiment, the minimum data size after compression is 512 B, and when there is no compression effect the data is stored uncompressed, so the compressed data length ranges from 512 B to 4096 B (4 KB), that is, from 1 to 8 sectors. However, the field is used with the rule that a value of 0 in the compressed data length 2302 means a length of 1 sector (512 B); in other words, the field stores the number of sectors minus 1. As a result, the compressed data length 2302 field manages data lengths of 512 B to 4 KB in 3 bits.
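• A sketch of the encoding and decoding under that reading (sector size and rounding behavior as stated above):

    #include <stdint.h>

    #define SECTOR_BYTES 512u

    /* Encode a compressed length of 512 B - 4 KB into the 3-bit field:
       values 0-7 stand for 1-8 sectors. */
    static uint8_t encode_comp_len(uint32_t bytes)
    {
        uint32_t sectors = (bytes + SECTOR_BYTES - 1) / SECTOR_BYTES; /* round up */
        return (uint8_t)(sectors - 1);   /* 0 means one sector (512 B) */
    }

    static uint32_t decode_comp_len(uint8_t field)
    {
        return ((uint32_t)field + 1) * SECTOR_BYTES;   /* 1-8 sectors */
    }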
  • the compression flag 2303 is a field indicating that the corresponding virtual PDEV area is compressed and stored. When the value of the compression flag 2303 is 1, it indicates that the data is compressed and stored. On the other hand, when the value of the compression flag 2303 is 0, it indicates that the data is stored uncompressed.
  • the compression management information 2300 has each value of the leading address 2301 in the PDEV, the length 2302 in the PDEV of the compressed data, and the compression flag 2303 for each 4 KB area of the virtual PDEV.
  • a set of the head address 2301 in the PDEV, the length 2302 in the PDEV of the compressed data, and the compression flag 2303 is referred to as a compression information entry.
  • the recording position in the PDEV of the compression information entry that manages the association of the virtual PDEV area is uniquely determined by the virtual PDEV area to be associated.
  • the example of FIG. 23 indicates that the compressed information entry of the 4 KB area with the virtual PDEV area “0x0000_0000_1000” as the head address is fixedly recorded at the recording position “0x00_0000_0008”.
  • the units of the start address of the virtual PDEV area and the address of the recording position are both bytes. Therefore, the address “0x0000 — 0000 — 1000” of the virtual PDEV area represents a position of 4 KB from the top of the virtual PDEV area.
• the recording position is expressed as a relative address whose origin (address 0) is the head address in the PDEV of the compression management information.
• the storage apparatus 101 can store the write data (compressed data) from the host apparatus 103 in an arbitrary area, as long as it is an area (unused area) other than areas in which data is already stored.
• When update data for certain data is written, the compressed update data is written to a location different from the pre-update data (compressed data), and the PDEV area where the pre-update data was stored is thereafter treated as an unused area.
• In this way, the update data may be stored at a position different from the pre-update data; even in this case, only the values in the compression information entry are changed, and the compression information entry of the virtual PDEV area "0x0000_0000_1000" is always recorded at the recording position "0x00_0000_0008". Likewise, the compression information entry of each subsequent 4 KB virtual PDEV area is recorded at a position incremented by 8 B per area.
• In other words, the location of the compression information entry that manages a given virtual PDEV address can be specified by the calculation formula: (virtual PDEV address ÷ 4 KB) × 8 B + (head address in the PDEV of the compression management information).
  • the present invention is not limited to this calculation formula. It is only necessary to uniquely calculate the recording position of the compressed information entry recorded in the PDEV from the virtual PDEV address.
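• The formula above, written out as a small helper (byte units throughout, matching the example in FIG. 23):

    #include <stdint.h>

    #define AREA_BYTES  4096u   /* one compression information entry per 4 KB area */
    #define ENTRY_BYTES 8u      /* entries are spaced 8 B apart */

    /* Position of the compression information entry for a virtual PDEV address
       (mgmt_head is the head address in the PDEV of the compression
       management information). */
    static uint64_t comp_entry_pos(uint64_t vpdev_addr, uint64_t mgmt_head)
    {
        return (vpdev_addr / AREA_BYTES) * ENTRY_BYTES + mgmt_head;
    }

    /* Example from FIG. 23: vpdev_addr 0x0000_0000_1000 with mgmt_head 0
       yields the relative recording position 0x00_0000_0008. */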
  • Such a fixed arrangement of compressed information entries makes it possible to calculate a virtual PDEV area that has become unreadable when the compressed information entry is lost from the recording address of the lost entry. For this reason, after the lost virtual PDEV area is restored by rebuilding by RAID, the contents of the compressed information entry can be regenerated when the restored data is compressed and recorded. Therefore, in the present invention, the reliability of the storage apparatus can be maintained even if the compression management information 2300 is not made redundant.
• When the storage apparatus 101 responds to a read request from the host apparatus 103 for data that was recorded in the final storage medium by the write data compression operation of the storage apparatus 101 described with reference to FIG., the data is decompressed and returned to the host apparatus 103.
  • each process is executed by the processor 121 of the storage apparatus 101.
• the storage controller 110 uses the storage area provided by the NVM module 126 as a cache area for temporarily storing write data from the higher-level device 103 and read data from the SSD 111 or the HDD 112.
• the NVM module 126 provides the LBA0 space and the LBA1 space to the storage controller 110 (its processor 121), and within the provided LBA0 and LBA1 spaces the processor 121 manages the regions used for storing data and the unused regions (referred to as free space). The information used for managing these areas is called cache management information.
  • FIG. 20A shows an example of the cache management information 3000 managed by the storage controller 110.
  • the cache management information 3000 is stored on the DRAM 125.
  • the storage apparatus 101 uses the LBA0 space provided by the NVM module 126 as a cache area for storing write data from the host apparatus 103.
• On the other hand, for caching data read from the final storage medium, the LBA1 space is used. This is because the data read from the final storage medium is compressed data.
  • the cache area allocation unit is the stripe size. In the following description, the stripe size is 64 KB as an example.
• Each row (entry) of the cache management information 3000 indicates that the data of the one-stripe area of the virtual volume specified by VOL# 3010 (an identification number, or virtual volume number, given to the virtual volume) and the address 3020 in the virtual volume is cached in the stripe-size area of the LBA0 space starting from cache LBA0 (3030) and the stripe-size area of the LBA1 space starting from cache LBA1 (3040).
  • When an area is not allocated, the invalid value NULL is stored in cache LBA0 (3030) or cache LBA1 (3040).
  • For example, the area caching the data of the area (stripe) whose VOL# 3010 is 0 and whose address 3020 is 0 is the 64 KB (stripe-size) area starting at cache LBA0 (3030) = 0. Since cache LBA1 (3040) is NULL, no LBA1-space area is allocated to this stripe.
  • The address 3020 stores a stripe number.
  • The bitmap 3050 is 16-bit information indicating in which parts of the one-stripe area specified by cache LBA0 (3030) data is stored. Each bit corresponds to a 4 KB area within the stripe: when a bit is 1, data is stored in the corresponding area; when it is 0, no data is stored there.
  • In the example, the bitmap 3050 of the row (first row) whose cache LBA0 (3030) is 0 is 0x8000, that is, only the first of the 16 bits is 1, indicating that data is stored only in the first 4 KB of the one-stripe area starting at cache LBA0 = 0.
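As an illustration of these bitmap semantics, the following hypothetical Python sketch (names are illustrative) tests whether the 4 KB sub-area containing a given offset within a 64 KB stripe holds data; it assumes, as the example above suggests, that the first of the 16 bits corresponds to the leading 4 KB:

```python
# Hypothetical sketch of the 16-bit bitmap 3050: one bit per 4 KB sub-area
# of a 64 KB stripe, with the first sub-area mapped to the most significant bit.

STRIPE_SIZE = 64 * 1024
SUB_AREA = 4 * 1024
BITS = STRIPE_SIZE // SUB_AREA  # 16

def has_data(bitmap: int, offset_in_stripe: int) -> bool:
    """True if the 4 KB sub-area containing offset_in_stripe holds data."""
    index = offset_in_stripe // SUB_AREA           # 0..15, leading area first
    return bool(bitmap & (1 << (BITS - 1 - index)))

# 0x8000: only the first 4 KB of the stripe holds data, as in the example.
assert has_data(0x8000, 0) is True
assert has_data(0x8000, 4096) is False
```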
  • The attribute 3060 stores “Dirty” or “Clean” as information indicating the state of the data cached in the area specified by cache LBA0 (3030). “Dirty” means that the data in that area has not yet been reflected in the final storage medium (SSD 111 or HDD 112); “Clean” means that the cached data has already been reflected in the final storage medium.
  • The last access time 3070 represents the time when the cached data was last accessed. It is used as reference information when selecting, from among the data stored in the cache area from the higher-level device 103, data to be destaged to the final storage medium (for example, the data with the oldest last access time is selected). The cache management information 3000 may therefore store, in place of the last access time 3070, other information used for selecting the destage target data.
  • Since the storage controller 110 needs to manage unused areas (areas holding no cache data) in the LBA0 and LBA1 spaces of the NVM module 126, it also holds lists of unused areas. These are called the free list 3500, an example of which is shown in FIG. 20-B.
  • The free list 3500 consists of a free LBA0 list 3510 and a free LBA1 list 3520, which store the addresses of unused LBA0 and LBA1 areas, respectively.
  • To secure a cache area for a given row of the cache management information 3000, an LBA0 address is acquired from the free LBA0 list 3510 and stored in cache LBA0 (3030) of that row. Likewise, when an LBA1-space area is needed, an LBA1 address is acquired from the free LBA1 list 3520 and stored in cache LBA1 (3040).
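A minimal sketch of this allocation, using simple list and dictionary stand-ins for the free list 3500 and the cache management information 3000, might look as follows (all structures here are simplifications, not the apparatus's actual formats):

```python
# Simplified stand-ins: securing a cache area pops an unused LBA0 (and,
# when compressed read data must also be cached, an LBA1) from the free
# lists and records them in the row for the stripe.

free_lba0 = [0x00000, 0x10000, 0x20000]   # free LBA0 list 3510 (illustrative)
free_lba1 = [0x00000, 0x08000]            # free LBA1 list 3520 (illustrative)
cache_table = {}                          # (VOL#, stripe#) -> row

def secure_cache_area(vol, stripe, need_lba1=False):
    row = {
        "cache_lba0": free_lba0.pop(0),
        "cache_lba1": free_lba1.pop(0) if need_lba1 else None,  # None = NULL
        "bitmap": 0,
        "attribute": "Clean",
    }
    cache_table[(vol, stripe)] = row
    return row

row = secure_cache_area(vol=0, stripe=0)
assert row["cache_lba1"] is None   # no LBA1-space area allocated yet
```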
  • In the first step S2401 of the decompression read operation, the storage apparatus 101 receives a read request and a read target address from the host apparatus 103.
  • In step S2402 following S2401, the processor 121 checks, using the read address acquired in S2401, whether the read target data exists in the NVM module 126 (the cache), that is, whether there is a cache hit.
  • Specifically, the processor 121 checks whether a value is stored in cache LBA0 (3030) of the row of the cache management information 3000 corresponding to the read address acquired in S2401. If a value is stored, a cache hit is determined; if no value is stored, a cache miss is determined.
  • In step S2404, the processor 121 acquires the RAID group management information 2220 of the RAID group indicated by the value of the RAID group number 2213 and obtains the virtual PDEV numbers 2222 registered in that RAID group. It then calculates, from the read address acquired in S2401, the virtual PDEV number storing the target data and the address within the virtual PDEV. Depending on the read address and the request size, the read request area from the host apparatus may straddle a plurality of virtual PDEVs; in that case, a plurality of virtual PDEVs and addresses within them are calculated in order to respond to the read request.
  • In step S2405 following S2404, the entry of the compression management information 2300 that manages the virtual PDEV number and virtual PDEV address acquired in S2404 is acquired from the PDEV.
  • Specifically, the processor 121 refers to the virtual PDEV information 2230 and identifies the PDEV associated with the virtual PDEV number acquired in S2404.
  • The PDEV is uniquely associated with the virtual PDEV, and the compression management information 2300 is recorded in a specific area of the PDEV. Therefore, as described above, the address at which the target entry of the compression management information 2300 is stored in the PDEV is calculated from the address within the virtual PDEV, and the compression information entry is acquired from the PDEV.
  • In step S2406 following S2405, the processor 121 refers to the compression management information entry acquired in S2405 and identifies the storage area of the compressed data in the PDEV from the start address 2301 in the PDEV at which the compressed data is stored and the length 2302 of the compressed data.
  • In step S2407 following S2406, the processor 121 reads the compressed data from the compressed data storage area identified in S2406. The read data is temporarily stored in the DRAM 125.
  • In step S2408, the processor 121 writes the compressed data to the NVM module 126, the cache device, by designating LBA1. Specifically, the processor 121 designates LBA1 and writes the compressed data using the write command 1010.
  • In step S2409 following S2408, the storage apparatus 101 creates, based on the entry of the compression management information acquired in S2405, the compression information necessary for decompressing the compressed data, and transfers it to the NVM module 126. Specifically, the processor 121 transfers the compression information to the NVM module 126 using the compression information transfer command 1810 shown in FIG. 18.
  • Step S2410 following S2409 is a step of mapping the compressed data written by the storage apparatus 101 in S2408 to LBA0 in order to decompress and acquire it. Specifically, the processor 121 instructs the NVM module 126 to map the compressed data to LBA0 using the LBA0 mapping command shown in FIG. 19.
  • The NVM module 126 that receives the command refers to the compression information for the compressed data associated with LBA1 and associates the compressed data with an LBA0 area corresponding to the post-decompression size of the compressed data.
  • In step S2411, following S2413 or S2410, the processor 121 designates LBA0 and issues a read, thereby obtaining in decompressed form the data that was staged into the cache area by the processing of S2407 to S2409 and mapped to the LBA0 space (or, in the case of a cache hit, the data already associated with LBA0).
  • The NVM module 126 that receives the read command designating LBA0 acquires the compressed data associated with LBA0 from the FM 420, decompresses it with the data compression/decompression unit 418, and returns it to the storage controller 110 (DRAM 125).
  • In step S2412, the processor 121 returns the decompressed data acquired in S2411 to the host apparatus as response data to the read request. Since the cache LBA1 (3040) area is now an unused area, the cache LBA1 (3040) value is returned to the free list, the cache LBA1 (3040) field of the cache management information 3000 is set to NULL, and the processing ends.
  • Step S2413 is performed when a cache hit is determined in step S2403: the processor 121 refers to the cache management information 3000 and obtains the LBA0 (cache LBA0 (3030)) in which the read target area is already stored.
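Putting steps S2401 to S2413 together, the control flow can be illustrated with the following runnable toy model; zlib and the dictionaries are stand-ins for the NVM module's compressor and for the PDEV, entry, and cache structures, and none of the names below are actual interfaces of the apparatus:

```python
# Runnable toy model of the decompression read path (S2401-S2413).
import zlib

pdev = {}         # PDEV address -> compressed bytes (final storage medium stand-in)
entries = {}      # virtual PDEV address -> PDEV address (compression info entry stand-in)
cache_lba0 = {}   # cached compressed bytes, keyed by virtual address for simplicity

def stage_write(vaddr, data):
    """Compress data and store it, recording a compression information entry."""
    cdata = zlib.compress(data)
    paddr = len(pdev)              # pick the next free PDEV slot (simplification)
    pdev[paddr] = cdata
    entries[vaddr] = paddr

def decompression_read(vaddr):
    if vaddr in cache_lba0:        # S2402/S2403: cache hit -> S2413
        cdata = cache_lba0[vaddr]
    else:                          # S2404/S2405: locate the compression info entry
        paddr = entries[vaddr]
        cdata = pdev[paddr]        # S2406/S2407: read the compressed data
        cache_lba0[vaddr] = cdata  # S2408-S2410: stage to cache, map to LBA0
    return zlib.decompress(cdata)  # S2411/S2412: decompress and respond

stage_write(0x1000, b"hello" * 100)
assert decompression_read(0x1000) == b"hello" * 100   # miss path
assert decompression_read(0x1000) == b"hello" * 100   # hit path
```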
  • The write data cache storage operation of this embodiment corresponds to the processes 311 to 314 of the write data compression operation of this embodiment shown in FIG. It is described below with reference to the flowchart of FIG. 25.
  • The first step S2501 of the write data cache storage operation is a step in which the storage apparatus 101 receives write data and a write destination address from the host apparatus. At this time, the write data is temporarily recorded in the DRAM 125 of the storage apparatus 101, as in the data flow 311 shown in FIG. If there is a function for transferring data directly from the host interface 124 to the NVM module 126, the data need not be recorded in the DRAM 125 of the storage apparatus 101.
  • Step S2502 following S2501 is a step of performing a cache hit determination using the write address acquired by the processor 121 in S2501. The same processing as step S2402 of the decompression read operation is performed here.
  • Step S2503 following S2502 branches depending on the determination result of S2502: if the result of S2502 is a cache hit, the process proceeds to S2504; if it is a cache miss, the process proceeds to S2509.
  • Step S2509 following S2503 is a step in which the processor 121 newly secures LBA0 of the NVM module 126 for recording the write data.
  • The securing of LBA0 is the same as the processing performed in S2404 of the decompression read processing, except that there is no need to secure an LBA1-space area here.
  • In step S2505, the processor 121 designates the LBA0 acquired in S2504 or S2509 and writes the data to the NVM module 126 using the write command 1010 shown in FIG. 10.
  • At this time, the write data is transferred from the DRAM 125 of the storage apparatus 101 to the data compression/decompression unit 418 of the NVM module 126 and compressed, as in the data flow 312 shown in FIG., and recorded in the data buffer 416. The compressed data recorded in the data buffer 416 is recorded in the FM 420 at an arbitrary timing, as in the data flow 314.
  • In step S2506 following S2505, the processor 121 obtains the write response shown in FIG. 10 from the NVM module 126 and acquires the compressed size of the data written in S2505 from the compressed data length 1023 field of the write response information 1020.
  • Step S2508 is a step in which the storage apparatus 101 determines whether the total amount of compressed data held in the cache configured by the NVM module 126 for which RAID parity has not been generated is equal to or greater than a threshold. If that total amount exceeds the threshold, the storage apparatus 101 judges that parity needs to be generated for the compressed data held in the cache and moves to the parity generation operation. If it is at or below the threshold, the storage apparatus 101 judges that parity generation is unnecessary and ends the write data cache storage operation. The above is the write data cache storage operation in this embodiment.
  • Next, the RAID parity generation operation of the storage apparatus in this embodiment will be described.
  • The RAID parity generation operation is not limited to being performed only when, in step S2508 of the write data cache storage operation shown in FIG. 25, the total amount of compressed data held in the cache for which RAID parity has not been generated reaches or exceeds the threshold. The storage apparatus 101 may perform it at an arbitrary timing, for example when there are few or no requests from the host apparatus 103.
  • The RAID parity generation operation of this embodiment corresponds to the processes 315 to 320 of the write data compression operation of this embodiment shown in FIG. It is described below with reference to the flowchart of FIG. 26.
  • The first step S2601 of the RAID parity generation processing of the storage apparatus is a step in which the processor 121 selects the parity generation target data from the data recorded in the cache area configured by LBA0 of the NVM module 126. At this time, the processor 121 refers to the last access time 3070 of the cache management information 3000 and selects data for which a long time has elapsed since the last access. The parity generation target may instead be selected according to some other rule; for example, data with a relatively low update frequency may be selected.
  • Step S2602 following S2601 is a step in which the processor 121 secures, on LBA0, the logical space provided by the NVM module 126, a recording destination area for the parity to be generated. Specifically, the processor 121 refers to the free list 3500 and secures an unused LBA0. The secured LBA0 is managed by parity cache area management information (not shown) similar to the cache management information 3000.
  • Step S2603 following S2602 is a step of determining whether to perform full stripe parity generation. If all data belonging to the same stripe column as the data selected in S2601 exists in the cache, the processor 121 proceeds to S2604 to generate full stripe parity. If only part of the data belonging to that stripe column is present, the process proceeds to S2607 to generate updated parity.
  • To search the cache for data belonging to the same stripe column as the data selected in S2601, the processor refers to VOL# 3010 and address 3020 of each row stored in the cache management information 3000 and checks whether they fall within the same stripe column range as the selected data. Taking FIG. 21 as an example, if the data selected in S2601 is Data14, then each row whose VOL# 3010 equals the virtual volume number to which Data14 belongs and whose address 3020, divided by the number of data stripes per stripe column (3), yields the same quotient as Data14's stripe number (14) divided by 3 holds data belonging to the same stripe column. Furthermore, if the values of the respective bitmaps 3050 are the same, it can be determined that all data belonging to the same stripe column as the data selected in S2601 is stored in the cache.
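The stripe column membership test described above can be sketched as follows (hypothetical Python, assuming 3 data stripes per stripe column as in the FIG. 21 example):

```python
# Hypothetical sketch of the same-stripe-column test described above,
# assuming 3 data stripes per stripe column as in the example.

DATA_STRIPES_PER_COLUMN = 3

def same_stripe_column(vol_a: int, stripe_a: int,
                       vol_b: int, stripe_b: int) -> bool:
    """True if two cached stripes belong to the same RAID stripe column."""
    return (vol_a == vol_b and
            stripe_a // DATA_STRIPES_PER_COLUMN
            == stripe_b // DATA_STRIPES_PER_COLUMN)

# Stripe 14 shares a stripe column with stripe 12 (same quotient, 4),
# but not with stripe 15 (quotient 5).
assert same_stripe_column(0, 14, 0, 12)
assert not same_stripe_column(0, 14, 0, 15)
```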
  • Step S2604 following S2603 is a step of instructing the NVM module 126 to generate RAID parity and map it to the LBA0 area secured by the storage apparatus 101 in S2602.
  • Specifically, the processor 121 uses the full stripe parity generation command 1310 shown in FIG. 13 to designate, by the LBA0 start addresses 0 to X (1315 to 1317), the compressed data from which the parity is to be generated, and also designates the mapping locations of the generated parity by the LBA0 start address (for XOR parity) 1318 and the LBA0 start address (for RAID 6 parity) 1319.
  • The NVM module 126 that receives the full stripe parity generation command reads the compressed data recorded in the FM 420 into the data buffer 416 of the NVM module 126 if the areas associated with the designated LBA0 are in the FM 420 (this is unnecessary if the associated areas are already in the data buffer 416 of the NVM module 126).
  • The parity generation unit 419 in the NVM module 126 is then instructed to generate parity for the compressed data in the data buffer 416. Upon receiving the instruction, the parity generation unit 419 obtains the data from the data buffer 416 decompressed by the data compression/decompression unit 418 and generates parity from the decompressed data.
  • The parity generation unit 419 transfers the generated parity to the data compression/decompression unit 418, which compresses it and records it in the data buffer 416 or the FM 420 of the NVM module 126.
  • The PBA of the area in which the generated parity is recorded is associated with the LBA0 designated by the command (the LBA0 start address (for XOR parity) 1318 and the LBA0 start address (for RAID 6 parity) 1319).
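As a rough illustration of full stripe parity generation, the sketch below decompresses the stripes, XORs them byte by byte, and recompresses the parity; zlib stands in for the data compression/decompression unit 418, and this is a simplified model, not the parity generation unit's actual implementation:

```python
# Illustrative full-stripe XOR parity generation over compressed stripes.
import zlib

def full_stripe_parity(compressed_stripes):
    """Return the compressed XOR parity of equally sized stripes."""
    stripes = [zlib.decompress(c) for c in compressed_stripes]
    parity = bytearray(len(stripes[0]))
    for stripe in stripes:
        for i, b in enumerate(stripe):
            parity[i] ^= b
    return zlib.compress(bytes(parity))

a, b, c = (bytes([x]) * 16 for x in (0x0F, 0xF0, 0xAA))
p = zlib.decompress(full_stripe_parity([zlib.compress(d) for d in (a, b, c)]))
assert p == bytes([0x0F ^ 0xF0 ^ 0xAA]) * 16   # byte-wise XOR of the stripes
```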
  • Step S2607 following S2603 is a step in which, in order to generate updated parity, the storage apparatus 101 acquires the compressed data of the old data and the compressed data of the old parity from the final storage media configured as a RAID and writes them by designating LBA1.
  • Specifically, the processor 121 acquires from the free list LBA1 areas for storing the compressed data of the old data and of the old parity, and temporarily stores the acquired LBA1 information.
  • The old data necessary for parity generation resides in the same virtual PDEV as the new data (the data selected in S2601), and the virtual PDEV of the old parity can be obtained by simple calculation from the address (stripe number) of the new data. Since the addresses within the virtual PDEV of the old data and the old parity are the same as the address within the virtual PDEV at which the new data is to be stored, the processor 121 only needs to identify the virtual PDEV address at which the data selected in S2601 should be stored.
  • By the same processing as S2404 to S2407 of the read operation described above, the storage positions in the PDEVs of the compressed data are identified from the virtual PDEV addresses of the old data and the old parity required for parity generation, and the data are read from the PDEVs.
  • The processor 121 then writes the old compressed data and the old parity to the secured LBA1 using the write command 1010 shown in FIG. 10.
  • Step S2608 following S2607 is a step of mapping the compressed old data and old parity recorded in the LBA1 areas in S2607 to LBA0-space areas.
  • Specifically, the processor 121 acquires from the free list 3500 LBA0 areas to which the post-decompression size of each piece of compressed data can be mapped. It then transfers to the NVM module 126 a plurality of the LBA0 mapping commands shown in FIG. 19, each designating an LBA0 and an LBA1, thereby mapping decompressed images of the compressed data recorded in the LBA1 areas written in S2607 to the LBA0 areas.
  • Step S2609 following S2608 is a step of generating updated parity using the data (update data) selected in S2601 and the old data and old parity mapped to LBA0 in S2608.
  • Specifically, the processor 121 uses the updated parity generation command 1410 shown in FIG. 14 to designate the areas of the update data, the old data, and the old parity by LBA0, and also designates the storage location of the updated parity by LBA0.
  • The flow of processing performed by the NVM module 126 upon receiving the updated parity generation command is substantially the same as the processing performed upon receiving the full stripe parity generation command described above.
  • Step S2605, following S2604 or S2609, is a step of obtaining the correct post-compression data size of the parity generated in S2604 or S2609.
  • Specifically, the processor 121 creates a compressed data size acquisition command 1110 in which the LBA0 storing the generated parity is designated in the LBA0 start address 1113 field of the command parameters, and issues it to the NVM module 126. The processor 121 then acquires the post-compression data size of the parity from the compressed data size acquisition response 1120.
  • In step S2606, it is determined whether destaging is necessary. The processor 121 determines whether the compressed data on the cache for which parity has been generated should be recorded in the final storage medium. This determination is made, for example, based on the free area in the cache: if the free area in the cache is at or below a threshold, the storage apparatus 101 starts the destage processing to create free space; if it is determined that there is sufficient free area in the cache, the parity generation processing ends.
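A minimal sketch of this destage-necessity decision follows; the threshold value is illustrative, not taken from the patent:

```python
# Minimal sketch: start destaging when the free area in the cache falls
# to or below an (assumed) threshold fraction of the cache capacity.

FREE_AREA_THRESHOLD = 0.10   # assumed: destage when <= 10 % of cache is free

def destage_needed(free_bytes: int, cache_bytes: int) -> bool:
    return free_bytes <= cache_bytes * FREE_AREA_THRESHOLD

assert destage_needed(free_bytes=50, cache_bytes=1000)
assert not destage_needed(free_bytes=500, cache_bytes=1000)
```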
  • The destage operation is not limited to being executed only when destaging is determined to be necessary in step S2606 of the RAID parity generation operation shown in FIG. 26.
  • The storage apparatus 101 may perform the destage operation of this embodiment at an arbitrary timing, for example at an arbitrary time when there are few or no requests from the host apparatus.
  • The destage operation of this embodiment corresponds to the processes 321 to 323 of the write data compression operation of this embodiment shown in FIG. It is described below with reference to the flowchart of FIG. 27.
  • The first step S2701 of the destage operation of the storage apparatus is a step of selecting the data to be destaged from the NVM module 126, the cache device. Specifically, the processor 121 selects the area to be destaged from the LBA0 space.
  • The destage target may be selected by referring to the last access time 3070 of the cache management information 3000 and targeting data that has not recently been accessed from the higher-level device 103, or a method may be adopted of targeting data determined, on the basis of other statistical information managed by the storage apparatus 101, to be sequential write data. Note that the parity generated by the processing of FIG. 26 is also a destage target.
  • In step S2702 following step S2701, the storage apparatus 101 acquires from the NVM module 126 the post-compression data size of the data in the LBA0-space area selected in S2701.
  • Specifically, the processor 121 transfers the compressed data size acquisition command 1110 shown in FIG. 11 to the NVM module 126, acquires the compressed data length 1123 from the compressed data size acquisition response 1120, and thereby grasps the size of the compressed data to be acquired in the destage operation.
  • In step S2703, the storage apparatus 101 maps the compressed data of the LBA0 area determined in S2701 to an LBA1 area. Specifically, the processor 121 transfers to the NVM module 126 an LBA1 mapping command 1210 describing an LBA1 area to which the compressed data length acquired in step S2702 can be mapped.
  • In step S2704, the storage apparatus 101 acquires the compressed data from the LBA1 area mapped in step S2703. Specifically, the processor 121 describes the LBA1 mapped in S2703 in the read command 1610 shown in FIG. 16 and transfers it to the NVM module 126 to acquire the compressed data.
  • Step S2704' following step S2704 is a step of identifying the storage destination address of the write target data. Since the address in the storage destination virtual volume of each piece of write target data in the cache area (the LBA0 space of the NVM module 126) is stored in the address (3020) field of the cache management information 3000, the processor 121 uses it to calculate the virtual PDEV associated with this address and the address within that virtual PDEV. The calculation method is as described above.
  • Step S2705 subsequent to step S2704 ' is a step of recording the compressed data acquired in step S2704 on the PDEV.
  • First, the virtual PDEV that is the storage destination of the write data is identified. Specifically, the processor 121 refers to the virtual PDEV information 2230 and identifies the PDEV associated with the write data storage destination virtual PDEV. A free area of the identified PDEV is then selected, the selected free area is determined as the storage location of the compressed data, and the compressed data acquired in S2704 is recorded in the PDEV area of the determined storage location.
  • The storage apparatus 101 manages, in the storage controller 110 (for example, in the DRAM 125), information about the free areas (areas not associated with virtual PDEVs) of each PDEV (SSD or HDD); this information is used when the processor 121 selects a free area in the PDEV. Alternatively, since any area other than the areas registered in the compression management information 2300 (the areas designated by the start address 2301 in the PDEV at which compressed data is stored) is a free area, a method may be employed in which the processor 121 reads the compression management information 2300 from the storage destination PDEV and identifies a free area based on it.
  • Step S2706 following step S2705 is a step of releasing the LBA1 area mapped for obtaining compressed data in S2703.
  • Specifically, the processor 121 releases the LBA1 using the mapping release command 1710 shown in FIG. 17. The processor 121 further stores the released LBA1 information in the free LBA1 list 3520 and deletes it from the cache management information 3000.
  • In step S2707 following step S2706, the compression management information 2300 is updated and recorded in the PDEV.
  • Specifically, the processor 121 reads the compression management information 2300 and records, in the start address 2301 field (the start address in the PDEV at which the compressed data is stored) of the compression information entry corresponding to the destaged virtual PDEV area, the address of the PDEV area in which the compressed data was stored in S2705, thereby updating the entry. The updated compression management information 2300 is then recorded in the PDEV.
  • When the compression management information 2300 is updated, it is not necessary to read and update all the information stored in it; only the necessary area may be read and updated.
  • The operation of regenerating an entry of the compression management information 2300 in this embodiment is performed when the loss of an entry of the compression management information 2300 is detected. Entry loss is detected, for example, during periodic monitoring of the entries in the storage apparatus, or when an entry of the compression management information 2300 is acquired by a read or destage operation.
  • The storage apparatus 101 is characterized in that the compression management information 2300 can be regenerated by the partial recovery operation of the compression management information 2300 and the rebuild processing described later. With this function, the storage apparatus 101 can maintain its reliability without holding the compression management information 2300 redundantly.
  • First, the partial recovery operation of the compression management information 2300 will be described with reference to FIG. 28. For simplicity of description, the case where only one entry of the compression management information 2300 is lost is described below; however, this processing is applicable even when a plurality of entries of the compression management information 2300 are lost.
  • S2801, the first step of the partial recovery operation of the compression management information 2300, is a step of calculating the virtual PDEV area (the address within the virtual PDEV) managed by the lost entry of the compression management information 2300.
  • The storage apparatus 101 fixedly assigns the recording position of each entry of the compression management information 2300 in the PDEV according to the virtual PDEV area the entry manages. Therefore, the processor 121 can identify, from the address within the PDEV of the lost entry, the virtual PDEV area that the entry managed. For example, if the compression management information 2300 has the contents shown in FIG. 23 and the compression information entry stored at recording position (relative address in the PDEV) 0x00_0000_0008 cannot be read, it can be seen that the unreadable entry is the compression information entry for the 4 KB area starting at address 0x0000_0000_1000 of the virtual PDEV.
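This inverse calculation, from the PDEV address of a lost entry back to the virtual PDEV area it managed, can be sketched as follows (same assumed constants as the earlier sketch: 4 KiB units and 8 B entries):

```python
# Sketch of the inverse of the fixed-placement formula: from the PDEV
# address of a lost entry, recover the 4 KB virtual PDEV area it managed.

MGMT_UNIT = 4 * 1024
ENTRY_SIZE = 8

def managed_virtual_area(entry_address: int, mgmt_head_address: int) -> int:
    """Return the start of the 4 KB virtual PDEV area managed by the entry."""
    return ((entry_address - mgmt_head_address) // ENTRY_SIZE) * MGMT_UNIT

# The unreadable entry at relative position 0x00_0000_0008 managed the 4 KB
# area starting at virtual PDEV address 0x0000_0000_1000, as in the text.
assert managed_virtual_area(0x00_0000_0008, 0) == 0x0000_0000_1000
```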
  • The data stored in the 4 KB area starting at address 0x0000_0000_1000 of the virtual PDEV is regenerated using data read from the other PDEVs constituting the RAID group. The processing of rewriting it to the PDEV and recreating the compression information entry based on it is performed from S2802 onward.
  • In step S2802 following step S2801, the RAID group to which the virtual PDEV that stored the lost compression management information 2300 belongs is identified. Specifically, the processor 121 searches the RAID group management information 2220 and identifies the corresponding RAID group.
  • In step S2803, in order to restore the virtual PDEV area managed by the lost entry, the processor 121 obtains the data necessary for restoring the data of the virtual PDEV area identified in S2801 from each of the PDEVs.
  • This processing is described below using as an example the case, described above, where the compression information entry for the 4 KB area starting at virtual PDEV address 0x0000_0000_1000 has been lost.
  • From the virtual PDEVs other than the virtual PDEV that stored the lost compression management information 2300 (hereinafter, these virtual PDEVs are referred to as the "other virtual PDEVs"), the compressed data corresponding to the virtual PDEV area identified in S2801 (that is, the 4 KB area starting at virtual PDEV address 0x0000_0000_1000) is read. To do so, the processor 121 reads the compression management information 2300 of each of the other virtual PDEVs, acquires the PDEV address and the compressed data length associated with the 4 KB area starting at virtual PDEV address 0x0000_0000_1000 of each other virtual PDEV, and reads the compressed data from the PDEV associated with that other virtual PDEV using the acquired PDEV address and compressed data length.
  • In step S2804 following step S2803, the processor 121 records the plurality of compressed data acquired in S2803 (the data necessary for restoring the data of the virtual PDEV area identified in S2801) in the NVM module 126 and maps their decompressed images to LBA0.
  • For this, the same processing as steps S2408 to S2410 in FIG. 24 may be performed. Note that LBA0- and LBA1-space areas must be secured before the data is recorded in the NVM module 126; this is the same as the processing performed in S2404 of FIG. 24.
  • In step S2805, the processor 121 restores the data of the virtual PDEV area identified in step S2801, using the RAID function, from the plurality of data of the RAID stripe column mapped to LBA0 of the NVM module 126 in step S2804.
  • For this restoration, the full stripe parity generation command 1310 may be used. The addresses of the data mapped into the LBA0 space in step S2804 are stored in the LBA0 start addresses (1314 to 1316), an LBA0-space area for storing the data to be restored is secured, and a command in which the start address of the secured LBA0-space area is stored in the LBA0 start address (for XOR parity) 1317 is created and issued to the NVM module 126.
  • In step S2806 following step S2805, the processor 121 maps the compressed data of the decompressed data generated in S2805 to LBA1 and acquires the compressed data. Here, the same processing as steps S2702 to S2704 in FIG. 27 is performed.
  • Step S2807 following step S2806 is a step of recording the compressed data acquired in S2806 on the PDEV.
  • As in the destage operation, the processor 121 identifies the PDEV associated with the virtual PDEV that is the storage destination of the restored data, selects a free area in the identified PDEV, determines the selected free area as the storage location of the compressed data, and records the compressed data acquired in S2806 in the PDEV area of the determined storage location.
  • Step S2808 following step S2807 is a step of updating the compression management information 2300 and recording it in the PDEV.
  • Specifically, the processor 121 records, in the start address 2301 field (the start address in the PDEV at which the compressed data is stored) of the compression management information 2300 entry for the restored virtual PDEV area, the address of the PDEV area to which the compressed data was written in S2807, and updates the entry. The lost entry is then restored by recording the updated compression management information entry in the PDEV.
  • By this entry regeneration operation of the compression management information 2300, the storage apparatus 101 can regenerate each entry of the compression management information 2300 from the data even if the compression management information 2300 is lost.
  • The storage apparatus 101 performs the rebuild processing shown in FIG. 29 when one of the PDEVs constituting a RAID group fails and becomes inaccessible.
  • In the first step S2901, the processor 121 identifies the virtual PDEV associated with the failed PDEV (hereinafter referred to as the failed virtual PDEV).
  • In step S2902 following step S2901, the RAID group to which the failed virtual PDEV belongs is identified. Specifically, the processor 121 searches the RAID group management information 2220 and identifies the corresponding RAID group.
  • In step S2903 following step S2902, in order to restore the failed virtual PDEV area, the processor 121 obtains the data necessary for recovering the data of each area of the failed virtual PDEV identified in S2901 from the plurality of PDEVs associated with the plurality of virtual PDEVs other than the failed virtual PDEV in the RAID group identified in S2902.
  • Specifically, the processing described in step S2803 of FIG. 28 is performed for all areas of the failed virtual PDEV (all addresses from virtual PDEV address 0 to the maximum address).
  • That is, the compression management information 2300 is read from the (plurality of) virtual PDEVs other than the failed virtual PDEV (hereinafter, the "other virtual PDEVs"), and the data at the other virtual PDEVs' addresses 0x0000_0000_0000, 0x0000_0000_1000, ... are read in order. However, an area for which the start address 2301 in the PDEV at which the compressed data is stored is unassigned (NULL) in the compression management information 2300 is not associated with any PDEV area and therefore need not be read.
  • Steps S2904 to S2908 described below perform the same processing as S2804 to S2808 in FIG. 28. The difference from the processing of FIG. 28 is that the processing of FIG. 28 is performed on only part of the areas of the virtual PDEV (the specific area of the virtual PDEV managed by the lost entry), whereas the processing here is performed on all areas of the failed virtual PDEV (excluding areas with which no PDEV area is associated).
  • In step S2904 following step S2903, the processor 121 records the compressed data acquired in S2903 in the NVM module 126 and maps their decompressed images to LBA0; the same processing as step S2804 in FIG. 28 is performed.
  • In step S2905, the processor 121 restores the data of the failed virtual PDEV area identified in S2901, using the RAID function, from the plurality of data of the RAID stripe column mapped to LBA0 of the NVM module 126 in step S2904. The restored data is compressed and recorded in the LBA0 space of the NVM module 126. The same processing as step S2805 in FIG. 28 is performed.
  • In step S2906 following step S2905, the processor 121 maps the compressed data of the decompressed data generated in S2905 to LBA1 and acquires the compressed data; the same processing as step S2806 in FIG. 28 is performed.
  • Step S2907 following step S2906 is a step of recording the compressed data acquired in S2906 in a new PDEV.
  • The storage apparatus 101 holds one or more spare PDEVs for failed PDEVs; such a PDEV is used as a substitute for the failed PDEV and is hereinafter referred to as an alternative PDEV.
  • The processor 121 selects the virtual PDEV corresponding to the alternative PDEV, registers the virtual PDEV number of that virtual PDEV in the virtual PDEV number column 2222 of the RAID group management information 2220, and deletes the virtual PDEV number corresponding to the failed PDEV. The selected virtual PDEV is thus added, as a substitute for the failed virtual PDEV, to the RAID group with which the failed virtual PDEV was associated.
  • The processor 121 records the compressed data acquired in step S2906 in the alternative PDEV area. The compressed data can be stored in the alternative PDEV by performing the same processing as step S2705 in FIG. 27.
  • In step S2908 following step S2907, the processor 121 generates compression management information that manages the association between the areas of the substitute virtual PDEV and the areas of the alternative PDEV in which data was stored in step S2907, and records the generated compression management information 2300 in the PDEV. This completes the data restoration for the failed PDEV.
  • In this way, when a PDEV fails, the data is recovered using the decompressed data of the other virtual PDEV areas of the RAID group to which the failed virtual PDEV belongs. The recovered data is then compressed and recorded in the alternative PDEV. By regenerating, at this time, the compression management information that associates the substitute virtual PDEV with the areas of the alternative PDEV, the compression management information lost by the PDEV failure is regenerated.
  • In FIG. 29, steps S2903 to S2908 are performed for all areas of the failed virtual PDEV, but it is not necessary to process the entire area at once in each step; the steps may be repeated for each partial area (for example, per stripe, or per 4 KB, the data compression unit of the storage apparatus 101 according to the embodiment of the present invention).
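The per-area rebuild loop can be illustrated with the following simplified sketch, which recovers one 4 KB area of the failed member by XORing the decompressed surviving members of the stripe; zlib again stands in for the module's compressor, and the single-parity layout is an assumption for illustration:

```python
# Simplified per-area rebuild (S2903-S2908 for one 4 KB area): decompress
# the surviving members of the stripe, XOR them to recover the lost area,
# then recompress before recording to the alternative PDEV.
import zlib

def rebuild_area(surviving_compressed):
    """Recover one 4 KB area of the failed member from the survivors."""
    recovered = bytearray(4096)
    for cdata in surviving_compressed:
        block = zlib.decompress(cdata)
        for i, b in enumerate(block):
            recovered[i] ^= b
    return zlib.compress(bytes(recovered))   # recompress before storing

d0 = bytes([0x11]) * 4096
d1 = bytes([0x22]) * 4096
parity = bytes(a ^ b for a, b in zip(d0, d1))
lost = rebuild_area([zlib.compress(d0), zlib.compress(parity)])
assert zlib.decompress(lost) == d1   # d1 is recovered from d0 and the parity
```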
  • As described above, the storage apparatus according to the embodiment of the present invention conceals the change in data caused by compression from a host device such as a server and provides a storage area (virtual uncompressed volume) in which data appears to be recorded in an uncompressed state. To do so, it maintains management information called compression management information.
  • The compression management information manages the association between the virtual uncompressed volume and the physical recording destination of the compressed data, and is indispensable for responding to data read requests from the server. From the viewpoint of the reliability of the storage apparatus, therefore, the loss of compression management information is equivalent to the loss of retained data, and the compression management information must be retained with at least the same level of reliability as the data.
  • In the storage apparatus of this embodiment, the compression management information that manages the association of the compressed data with its recording destination on the final storage media is divided per final storage medium, so that each piece manages only the correspondence relating to one final storage medium, and each piece of compression management information is recorded in a specific area of the final storage medium it manages.
  • Even if the compression management information is lost for some reason and a failure occurs that makes it inaccessible, the data managed by the compression management information that has become inaccessible can be regenerated by RAID technology, and the regenerated data (recovery data) can be compressed and written to the final storage medium. Compression management information corresponding to the recovery data written to the final storage medium can then be created and written to the final storage medium, thereby restoring the compression management information. Therefore, in the storage apparatus of the present invention, the compression management information need not be stored redundantly, and the consumption of storage area by the compression management information can be reduced.
  • In the embodiment described above, the storage apparatus forms a virtual volume to which the storage areas of a RAID group configured using the storage areas of a plurality of virtual PDEVs are statically assigned, and provides this virtual volume to the host apparatus. Alternatively, a volume formed using so-called Thin Provisioning technology (also referred to as Dynamic Provisioning technology), which dynamically allocates physical storage areas, may be provided to the host apparatus.
  • Dynamic Provisioning is a function that can define a volume larger than the storage capacity of the final storage media (SSD 111 or HDD 112) installed in the storage apparatus (hereinafter, such a volume is referred to as a "DP volume"). With this function, the user does not necessarily have to install, in the initial state, final storage media of the same capacity as the defined volume (DP volume) in the storage apparatus; final storage media can be added as the amount of stored data grows.
  • The DP volume is one of the volumes virtually created by the storage apparatus and is created with an arbitrary capacity designated by the user or the host apparatus. In the initial state, no storage area is allocated to the DP volume; when data is written from the host apparatus 103, storage areas are allocated as needed.
  • The virtual volume 200 of the embodiment described above is managed in units of fixed-size storage areas (each such storage area is called a Dynamic Provisioning page, or DP page), and these DP pages may be assigned to the DP volume.
  • The storage area of the virtual volume 200 is a storage area treated as if pre-compression data were stored in it. Therefore, a DP volume using storage areas allocated from the virtual volume 200 also conceals from the higher-level device 103 that the data is compressed and stored.
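As a toy illustration of on-demand allocation to a DP volume, the sketch below assigns a DP page carved from the virtual volume 200 to an address range only on first write; the page size and structures are illustrative assumptions, not values from the patent:

```python
# Toy sketch of on-demand DP page allocation: a DP volume starts with no
# storage, and a DP page cut out of the virtual volume 200 is assigned to
# an address range only when that range is first written.

DP_PAGE_SIZE = 42 * 1024 * 1024   # assumed page size, not from the patent

class DPVolume:
    def __init__(self, capacity, free_pages):
        self.capacity = capacity      # may exceed installed media capacity
        self.free_pages = free_pages  # DP pages carved from virtual volume 200
        self.page_map = {}            # page index in DP volume -> DP page

    def write(self, address, data):
        index = address // DP_PAGE_SIZE
        if index not in self.page_map:            # allocate on first write only
            self.page_map[index] = self.free_pages.pop(0)
        # ... the data would then be stored via the page's virtual volume area

vol = DPVolume(capacity=800 * 2**40, free_pages=[0, 1, 2])
vol.write(0, b"x")
assert vol.page_map == {0: 0}   # exactly one page allocated so far
```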
  • The storage controller 110 may increase or decrease the size of the virtual PDEVs constituting the virtual volume based on the compression information acquired from the NVM module 126, such as the compressed data length or the compression ratio (the ratio between the pre-compression and post-compression data amounts). As the size of the virtual PDEVs increases or decreases, the number of DP pages that can be cut out of the virtual volume 200 also increases or decreases.
  • The storage apparatus manages this compression-ratio-dependent increase or decrease in the number of DP pages, and when the amount of remaining DP pages falls below a certain level, the storage apparatus 101 or the management terminal of the storage apparatus 101 issues a notification that final storage media must be added. The user may add final storage media to the storage apparatus 101 upon receiving the notification.
  • By providing the host apparatus 103 with a DP volume, a volume of a predetermined fixed size, the host apparatus 103 and the users using it need not be aware of increases or decreases in the storage area even when such changes occur, and when the usable storage area (DP pages) increases owing to an improved compression ratio, there is the advantage that the increased storage area can be used effectively.
  • 101: Storage device 102: SAN 103: Host device 104: Management device 110: Storage controller 111: SSD 112: HDD 121: Processor 122: Internal SW 123: Disk interface 124: Host interface 125: DRAM 126: NVM module 410: FM controller 411: I/O interface 413: RAM 414: Switch 416: Data buffer 417: FM interface 418: Data compression/decompression unit 419: Parity generation unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

 This storage device has a plurality of final storage media and a cache device provided with a data compression function and a parity generation function. The storage device stores write data from a host device in the final storage medium after compressing the data, and, for the host device, provides a virtual non-compressed volume for concealing that the data is stored after being compressed. The storage device divides the area of the virtual non-compressed volume into stripe units, and manages each stripe in correlation with one of the plurality of final storage media constituting a RAID group. When storing the data of each stripe in the final storage medium, the storage device generates a parity from the data of each stripe and compresses the generated parity and the data of each stripe by using the cache device, and stores the parity and the data of each stripe that have been compressed by the cache device in each of the final storage media constituting the RAID group.

Description

US Patent Application Publication No. 2009/0216945 (Patent Document 1)
 On the other hand, with a lossless compression algorithm the compression ratio changes depending on the data content. For example, comparing a case where the information in the compression target data is complex (rich) with a case where it is monotonous (sparse), the compressed data size is smaller in the monotonous case. Thus, the post-compression data size depends on the content of the data to be compressed and can be obtained only by the heuristic method of actually compressing the data. Accordingly, the association between the virtual uncompressed volume and the physical recording destination of the compressed data changes dynamically every time the recorded data changes.
 For this reason, the storage apparatus divides the virtual uncompressed volume into fixed-size areas to manage the correspondence, and updates the information managing the dynamically changing correspondence every time a data update and the accompanying data compression are completed. Hereinafter, the information managing this correspondence is referred to as compression management information.
 Compression management information is generally larger than the other management information managed by the storage. As an example, consider providing an 800 TB virtual uncompressed volume using a 100 TB physical area. In a configuration in which the 800 TB area of the virtual uncompressed volume is divided into 4 KB areas and the association of each area is managed with 8 B of information (for example, 6 B of start position information + 2 B of length information), 800 TB ÷ 4 KB × 8 B = 1600 GB of compression management information is required.
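For illustration, one such 8 B entry could be packed and unpacked as follows (hypothetical Python; the big-endian layout is an assumption, while the 6 B + 2 B split follows the example above):

```python
# Hypothetical packing of one 8 B compression information entry: 6 B of
# start-position information plus 2 B of length information.
import struct

def pack_entry(start: int, length: int) -> bytes:
    """Pack a (start, length) pair into the assumed 8 B entry format."""
    return start.to_bytes(6, "big") + struct.pack(">H", length)

def unpack_entry(raw: bytes):
    start = int.from_bytes(raw[:6], "big")
    (length,) = struct.unpack(">H", raw[6:8])
    return start, length

raw = pack_entry(0x0000_0000_1000, 0x0800)
assert len(raw) == 8 and unpack_entry(raw) == (0x0000_0000_1000, 0x0800)
```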
 Much of a storage apparatus's management information is generally stored in DRAM, which the processor controlling the storage apparatus can access at high speed. However, storing gigabytes of compression management information in DRAM, whose bit cost and power consumption are high, increases the data retention cost.
 The compression management information manages the virtual uncompressed volume and the physical recording destination of the compressed data, and is indispensable for responding to data read requests from the server. From the viewpoint of the reliability of the storage apparatus, therefore, the loss of compression management information is equivalent to the loss of retained data, and the compression management information must be retained with at least the same level of reliability as the data.
 To achieve the above object, the storage apparatus of the present invention has a function of providing a virtual uncompressed volume to a host apparatus such as a server in order to conceal data changes caused by compression. The storage apparatus also divides the area of the virtual uncompressed volume into stripe units and manages each stripe in association with one of the plurality of final storage media constituting a RAID group. When storing the data of each stripe in the final storage media, it generates parity from the data of the stripes, compresses the generated parity and the data of each stripe, and stores the compressed parity and stripe data in the final storage media associated with the stripes.
 The storage apparatus further divides the compression management information, which manages the association of the compressed data with its recording destination on the final storage media, per recording medium, and records the compression management information managing the correspondence relating to one recording medium in a specific area of that recording medium.
 According to the present invention, the compression management information can be retained in a storage apparatus with the same reliability as the data. In addition, the processing load associated with updating the compression management information can be reduced, improving the performance of the storage apparatus.
FIG. 1 is a diagram showing a schematic configuration of a computer system centered on a storage apparatus according to an embodiment of the present invention.
FIG. 2-A is a conceptual diagram showing the logical space configuration of the storage apparatus according to the embodiment of the present invention.
FIG. 2-B is another conceptual diagram showing the logical space configuration of the storage apparatus according to the embodiment of the present invention.
FIG. 3-A is a diagram showing the data flow when the storage apparatus receives a write command from the host apparatus.
FIG. 3-B is a diagram showing the data flow when the storage apparatus receives a write command from the host apparatus.
FIG. 4 is a diagram showing the internal configuration of the NVM module.
FIG. 5 is a diagram showing the internal configuration of the FM.
FIG. 6 is a diagram showing the internal configuration of a physical block.
FIG. 7 is a diagram showing the concept of associating the LBA0 and LBA1 spaces, the logical spaces the NVM module provides to the storage controller, with the PBA space, the address space for designating physical areas.
FIG. 8 is a diagram showing the contents of the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820 managed by the NVM module.
FIG. 9 is a diagram showing block management information used by the NVM module.
FIG. 10 is a diagram showing the write command received by the NVM module and the response information to it.
FIG. 11 is a diagram showing the compressed data size acquisition command received by the NVM module and the response information to it.
FIG. 12 is a diagram showing the LBA1 mapping command received by the NVM module and the response information to it.
FIG. 13 is a diagram showing the full stripe parity generation command received by the NVM module and the response information to it.
FIG. 14 is a diagram showing the updated parity generation command received by the NVM module and the response information to it.
FIG. 15 is a diagram showing the compression information acquisition command received by the NVM module and the response information to it.
FIG. 16 is a diagram showing the read command received by the NVM module and the response information to it.
FIG. 17 is a diagram showing the mapping release command received by the NVM module and the response information to it.
FIG. 18 is a diagram showing the compression information transfer command received by the NVM module and the response information to it.
FIG. 19 is a diagram showing the LBA0 mapping command received by the NVM module and the response information to it.
FIG. 20-A is a diagram showing an example of cache management information.
FIG. 20-B is a diagram showing an example of the free list.
FIG. 21 is a conceptual diagram showing the correspondence among virtual volumes, RAID groups, and PDEVs in the storage apparatus according to the embodiment of the present invention.
FIG. 22 is a diagram showing an example of management information for managing the correspondence among virtual volumes, RAID groups, and PDEVs in the storage apparatus according to the embodiment of the present invention.
FIG. 23 is a diagram showing the configuration of the compression management information used by the storage apparatus according to the embodiment of the present invention.
FIG. 24 is a flowchart of the decompression read process.
FIG. 25 is a flowchart of the write data cache storage process.
FIG. 26 is a flowchart of parity generation processing.
FIG. 27 is a flowchart of the destage process.
FIG. 28 is a flowchart of the partial recovery operation of the compression management information 2300.
FIG. 29 is a flowchart of the rebuild process.
Next, embodiments of the present invention will be described with reference to the drawings. The present invention is not limited to the embodiments described below. A NAND flash memory (hereinafter, FM) is described as an example of the semiconductor recording element, but the present invention is not limited to FM and covers all nonvolatile memories. This embodiment describes a mode in which data compression is performed by a dedicated hardware circuit; however, the present invention is not limited to this embodiment, and data may instead be compressed by data compression processing executed on a general-purpose processor. Likewise, this embodiment describes a mode in which parity generation is performed by a dedicated hardware circuit; however, the present invention is not limited to this embodiment, and RAID parity may instead be generated by parity generation processing executed on a general-purpose processor.
(1-1) Configuration of Storage Device
FIG. 1 is a diagram showing a schematic configuration of a computer system centered on a storage apparatus according to an embodiment of the present invention. The NVM module 126 shown in FIG. 1 is a semiconductor recording device that uses FM as its recording medium.
The storage apparatus 101 includes a plurality of storage controllers 110. Each storage controller 110 includes a host interface (host I/F) 124 for connecting to a host apparatus and a disk interface (disk I/F) 123 for connecting to recording devices. Examples of the host interface 124 include devices supporting protocols such as FC (Fibre Channel), iSCSI (internet Small Computer System Interface), and FCoE (Fibre Channel over Ethernet); examples of the disk interface 123 include devices supporting protocols such as FC, SAS (Serial Attached SCSI), SATA (Serial Advanced Technology Attachment), and PCI (Peripheral Component Interconnect)-Express. Furthermore, the storage controller 110 includes hardware resources such as a processor 121 and a memory (DRAM) 125, and, under the control of the processor, issues read/write requests to final storage medium devices such as the SSDs 111 and HDDs 112 in response to read/write requests from the host apparatus 103. The storage controller also contains an NVM module 126 used as a cache device, which the processor 121 can control via the internal switch (SW) 122.
The storage controller 110 also has a RAID (Redundant Arrays of Inexpensive Disks) parity generation function and a data restoration function using RAID parity, and manages pluralities of SSDs 111 and HDDs 112 as RAID groups in arbitrary units. In addition, the storage controller 110 has functions for monitoring and managing failures, usage status, operating status, and the like of the recording devices.
The storage apparatus 101 is connected to a management apparatus 104 via a network, for example a LAN (Local Area Network). Although omitted from FIG. 1 for simplicity, this network connects to each storage controller 110 inside the storage apparatus 101. This network may also be the same network as the SAN 102.
The management apparatus 104 is a computer provided with hardware resources such as a processor, memory, network interface, and local input/output devices, and with software resources such as a management program. The management apparatus 104 acquires information from the storage apparatus by means of this program and displays a management screen.
A system administrator uses the management screen displayed on the management apparatus 104 to monitor the storage apparatus 101 and to control its operation.
There are a plurality of SSDs 111 (for example, 16) in the storage apparatus 101, connected via the disk interfaces 123 to the storage controllers 110, of which there are likewise a plurality in the storage apparatus. An SSD 111 stores data transferred in response to a write request from a storage controller, and in response to a read request retrieves the stored data and transfers it to the storage controller. At this time, the disk interface 123 designates the logical storage location of the read/write request by a logical address (hereinafter, LBA: Logical Block Address). The plurality of SSDs 111 are managed as a plurality of RAID groups, configured so that lost data can be restored when data loss occurs.
There are also a plurality of HDDs (Hard Disk Drives) 112 (for example, 120) in the storage apparatus 101, connected, like the SSDs 111, via the disk interfaces 123 to the plurality of storage controllers 110 in the same storage apparatus. An HDD 112 stores data transferred in response to a write request from the storage controller 110, and in response to a read request retrieves the stored data and transfers it to the storage controller 110. At this time, the disk interface 123 designates the logical storage location of the read/write request by a logical address (LBA). The plurality of HDDs 112 are likewise managed as a plurality of RAID groups, configured so that lost data can be restored when data loss occurs.
The storage controller 110 connects via the host interface 124 to the SAN 102, which connects to the host apparatus 103. Although omitted from FIG. 1 for simplicity, a connection path over which the storage controllers 110 mutually communicate data and control information is also provided.
The host apparatus 103 corresponds to, for example, a computer or a file server forming the core of a business system. The host apparatus 103 includes hardware resources such as a processor, memory, network interface, and local input/output devices, and software resources such as device drivers, an operating system (OS), and application programs. By executing various programs under processor control, the host apparatus 103 communicates with the storage apparatus 101 and issues data read/write requests. It also acquires management information such as the usage status and operating status of the storage apparatus 101 by executing various programs under processor control, and can designate and change the management units of the recording devices, the recording device control method, data compression settings, and the like.
This concludes the description of the configuration of the computer system including the NVM module 126 to which the present invention is applied.
(1-3) Logical Configuration of Storage Device
Next, the logical space configuration of the storage apparatus of this embodiment will be described with reference to FIG. 2-A. FIG. 2-A shows the transitions in the management state of write data when a write request is issued from the host apparatus 103 to the storage apparatus of this embodiment.
The host apparatus 103 recognizes a virtual volume 200 (denoted "virtual Vol" in the figure) as its storage area, and accesses data by designating an address within the virtual volume 200.
The virtual volume 200 is a virtual space that the storage apparatus 101 provides to the host apparatus 103. When the host apparatus 103 writes data to the virtual volume 200, the write data is compressed inside the storage apparatus 101 and stored in a final storage medium (SSD 111 or HDD 112); however, the host apparatus 103 cannot recognize that the data is stored in the virtual volume 200 in compressed form (the changes to the data caused by compression are concealed). FIG. 2-A shows an example in which the storage apparatus 101 has a single virtual volume 200, but the present invention is not limited to this example: the storage apparatus 101 may manage a plurality of virtual volumes, and some of the managed volumes may be volumes to which compression is not applied. The embodiments of the present invention, however, focus on a virtual volume 200 that conceals data compression from the host apparatus 103.
The storage apparatus 101 of the present invention logically manages the storage area of each physical SSD 111 or HDD 112 as a PDEV 205 (Physical Device), and manages each PDEV 205 in association with one virtual PDEV 204 whose capacity is virtually expanded. The storage apparatus 101 composes and manages an RG 203 (RAID group) from a plurality of virtual PDEVs 204, and manages this RG 203 in association with the virtual volume 200. FIG. 2-A shows an example in which one RG is associated with one virtual volume 200 (virtual volume 200 and RAID group 0), but the present invention is not limited to this example: one RG may be associated with a plurality of virtual volumes, and one virtual volume may be associated with a plurality of RGs.
The storage apparatus 101 manages the area designated as the write destination within the virtual volume 200 as being cached in the LBA0 space provided by the NVM module 126. The LBA0 space is a virtual logical space that the NVM module 126 provides to the storage apparatus 101; it makes the data that the NVM module 126 has compressed and stored accessible to the storage apparatus 101 (the processor 121 of the storage controller 110) as if it were stored uncompressed.
Write data, after the storage apparatus 101 receives it from the host apparatus 103, is transferred to the NVM module 126. The NVM module 126 of this embodiment then compresses the data and records it inside the NVM module 126.
Once recording to the NVM module 126 is complete, the storage apparatus 101 judges that the write data has been stored in the cache area (the LBA0 space provided by the NVM module 126) and notifies the host apparatus 103 that the write is complete.
The storage apparatus 101 also transfers, at an arbitrary timing, the compressed form of the write data recorded in the LBA0 space to the SSDs 111 or HDDs 112 serving as the final storage media. At this time the storage apparatus 101 needs to obtain the compressed data from the NVM module 126. As shown in FIG. 2-A, the storage apparatus 101 of this embodiment obtains the compressed data by using the LBA1 space 202 provided by the NVM module 126. For this purpose, the storage apparatus 101 issues to the NVM module 126 a command that associates the compressed form of the data stored in the uncompressed area of the LBA0 space with the LBA1 space 202.
On receiving the association command for the LBA1 space 202, the NVM module 126 associates the compressed data mapped to the designated LBA0 area with the LBA1 space. The storage apparatus 101 then obtains the compressed data from the NVM module 126 by designating addresses in the LBA1 space.
Subsequently, from the address in the virtual volume with which the compressed data to be transferred to the final storage medium is associated, the storage apparatus 101 identifies the virtual PDEV 204 that is to store the data and the address within that virtual PDEV 204. It then determines the address of the PDEV 205 associated with the address in the virtual PDEV 204 and transfers the data to the physical device.
The above is an overview of the logical configuration of the storage apparatus of this embodiment. As shown in FIG. 2-B, the LBA1 space 202 for obtaining compressed data may be omitted in the present invention. For example, the storage apparatus 101 may issue a read command containing an LBA0 address and an instruction to transfer the compressed data without decompressing it, and read the compressed data from the NVM module 126 through the LBA0 space.
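As a non-authoritative illustration of the address resolution just described, the following Python sketch traces a virtual volume address down to a PDEV address. The data structures, stripe size, and round-robin placement rule are assumptions introduced only for this sketch; the embodiment does not specify them.

    # Illustrative resolution: virtual volume address -> RAID group ->
    # virtual PDEV -> PDEV. All structures and sizes are hypothetical.

    STRIPE_SIZE = 64 * 1024  # hypothetical stripe size per virtual PDEV

    def resolve(volume_addr, vpdev_maps):
        """vpdev_maps: one dict per virtual PDEV, mapping a virtual PDEV
        address to the PDEV address actually allocated for it."""
        n = len(vpdev_maps)
        stripe_no, offset = divmod(volume_addr, STRIPE_SIZE)
        vpdev_idx = stripe_no % n                      # round-robin placement
        vpdev_addr = (stripe_no // n) * STRIPE_SIZE + offset
        pdev_addr = vpdev_maps[vpdev_idx].get(vpdev_addr)  # None if unallocated
        return vpdev_idx, pdev_addr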
(1-4) Write Data Transfer
Write data transfer in the storage apparatus according to the embodiment of the present invention will be described with reference to FIG. 3-A. The storage apparatus 101 compresses the data obtained from the host apparatus 103 and stores it in the NVM module 126 serving as the cache. Hereinafter this operation is referred to as the host write operation. The data transfers that occur in the host write operation are described next.
The first data transfer in the host write operation takes place when the write data is obtained from the host apparatus. This is a transfer from the host interface 124 to the DRAM 125 of the storage controller (311). The storage apparatus 101 performs this transfer by issuing a command to the host interface 124.
Next, the storage apparatus 101 issues a command to the NVM module 126 and transfers the write data stored in the DRAM 125 to the NVM module 126 (312). The NVM module 126 compresses the write data with its internal compression hardware (compression circuit) and stores it in the DRAM (data buffer) 416 inside the NVM module 126 (313). When storage of the compressed data in this DRAM is complete, the NVM module 126 notifies the storage apparatus that storage of the write data is complete. The compressed data stored in the DRAM 416 may later be transferred from the DRAM 416 to the NVM (FM) 420 in the NVM module 126 and recorded there, or it may continue to be held in the DRAM 416; whether a transfer from the DRAM 416 to the NVM (FM) 420 is needed depends on the control method within the NVM module 126.
On receiving notice from the NVM module 126 that storage of the write data is complete, the storage apparatus 101 notifies the host apparatus 103 of the completion of the write command. The above are the data transfers that occur in the host write operation in this embodiment.
After the host write operation, the storage apparatus 101 generates RAID parity for the write data at an arbitrary timing. Hereinafter this operation is referred to as the parity generation operation. The data transfers that occur in the parity generation operation are described next.
The storage apparatus 101 generates RAID parity for the write data during the parity generation operation. In the present invention, parity is generated not for the compressed write data but for the uncompressed write data. With this parity generation scheme, one conceivable method is to store the write data uncompressed when recording it in the NVM module 126 and to compress the data after parity generation. However, many nonvolatile memories such as NAND flash memory and RRAM have a limited number of write cycles, and reducing the amount of data written to the NVM (FM) 420 through compression extends the device life of the NVM module 126. Compressing stored data also expands the effective capacity of the NVM module 126, so the cache area of the storage apparatus can be expanded at lower device cost.
To obtain these effects, the NVM module 126 compresses data and stores it in the DRAM 416 or the NVM within the NVM module 126, and decompresses it at parity generation time. This behavior is not essential: the data may instead be recorded uncompressed in the NVM (FM) 420 or DRAM 416 and compressed, together with the generated parity, after parity generation.
In the NVM module 126 according to the embodiment of the present invention, the compressed data recorded in the DRAM 416 or NVM (FM) 420 is decompressed and the decompressed data is supplied to the parity generation circuit (317). With this function the NVM module 126 generates parity for uncompressed data while still aiming at the longer device life and lower cost of the apparatus.
The parity generated by the parity generation circuit is transferred to the compression circuit (318) and, now as compressed data, transferred to the DRAM 416 (319). The compressed parity stored in the DRAM 416 may be recorded in the NVM (FM) 420 or may continue to be held in the DRAM 416, at the discretion of the NVM module 126. It is also not strictly necessary to compress the parity generated by the parity generation circuit: since parity generally cannot be expected to show the data reduction effect of compression that data shows, a control policy of not compressing it may be adopted.
The above are the data transfers that occur in the parity generation operation.
After the parity generation operation, the storage apparatus 101 transfers the compressed parity and write data to the final storage media at an arbitrary timing. Hereinafter this operation is referred to as the destage operation. The data transfers that occur in the destage operation are described next.
In the destage operation, the storage apparatus 101 reads the compressed parity and write data from the NVM module 126. The NVM module 126 transfers the designated compressed write data and parity to the DRAM 125 of the storage apparatus 101 (322). The storage apparatus 101 then transfers the compressed write data and parity to the SSDs or HDDs (323).
The above is an overview of the write data transfer processing performed by the storage apparatus 101 according to the embodiment of the present invention. In the present invention, transferring data decompressed by the decompression circuit directly to the parity generation circuit is not essential; as shown in FIG. 3-B, the decompressed data may be recorded in the DRAM 416 and then transferred to the parity generation circuit. Similarly, the parity generated by the parity generation circuit need not be transferred directly to the compression circuit; the generated parity may be recorded in the DRAM 416 and then transferred to the compression circuit.
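The three phases above can be summarized schematically in code. The Python sketch below shows the control flow only; the method names (write_lba0, generate_full_stripe_parity, read_lba1, and so on) are hypothetical stand-ins suggested by the NVM module commands described later, not an actual API of the embodiment.

    # Schematic of the three phases of the write path (hypothetical names).

    def host_write_phase(nvm, dram, lba0, write_data):
        dram.store(write_data)              # (311) host I/F -> controller DRAM 125
        nvm.write_lba0(lba0, dram.load())   # (312) DRAM -> NVM module, compressed
                                            # (313) into data buffer 416
        # completion is then reported to the host apparatus 103

    def parity_phase(nvm, data_extents, parity_extent):
        # Internally: decompress (317), generate parity, recompress (318, 319).
        nvm.generate_full_stripe_parity(data_extents, parity_extent)

    def destage_phase(nvm, dram, disk, lba1_extents):
        for ext in lba1_extents:
            dram.store(nvm.read_lba1(ext))  # (322) compressed data -> DRAM 125
            disk.write(dram.load())         # (323) -> SSD 111 / HDD 112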
(1-5) Configuration of NVM Module
Next, the internal configuration of the NVM module 126 will be described with reference to FIG. 4.
The NVM module 126 internally includes an FM controller (FM CTL) 410 and a plurality of (for example, 32) FMs 420.
The FM controller 410 internally includes a processor 415, a RAM (DRAM) 413, a data compression/decompression unit 418, a parity generation unit 419, a data buffer 416, an I/O interface (I/F) 411, an FM interface (I/F) 417, and a switch 414 for transferring data among them.
The switch 414 connects the processor 415, RAM 413, data compression/decompression unit 418, parity generation unit 419, data buffer 416, I/O interface 411, and FM interface 417 within the FM controller 410, and routes and transfers the data passing between these parts by address or ID.
The I/O interface 411 connects to the internal switch 122 of the storage controller 110 in the storage apparatus 101, and connects to each part of the FM controller 410 via the switch 414. The I/O interface 411 receives read/write requests, together with the logical storage locations (LBA: Logical Block Address) targeted by those requests, from the processor 121 of the storage controller 110 in the storage apparatus 101, and processes the requests. On a write request it further receives the write data and records the write data to the FMs 420. The I/O interface 411 also receives instructions from the processor 121 of the storage controller 110 and issues interrupts to the processor 415 inside the FM controller 410. Furthermore, the I/O interface 411 receives control commands for the NVM module 126 from the processor 121 of the storage controller 110 and, in response to such commands, can notify the storage controller 110 of the operating status, usage status, current settings, and the like of the NVM module 126.
The processor 415 is connected to each part of the FM controller 410 via the switch 414, and controls the entire FM controller 410 based on the programs and management information recorded in the RAM 413. It also monitors the entire FM controller 410 through periodic information acquisition and an interrupt reception function.
The data buffer 416 is implemented using DRAM, for example, and stores temporary data partway through data transfer processing in the FM controller 410.
The FM interface 417 connects to the FMs 420 by a plurality of buses (for example, 16). A plurality of (for example, 2) FMs 420 are connected to each bus, and the FMs 420 sharing a bus are controlled independently by using the CE (Chip Enable) signals also connected to the FMs 420.
The FM interface 417 operates in response to read/write requests directed by the processor 415. At this time the processor 415 specifies the request target to the FM interface 417 as chip, block, and page numbers. For a read request it reads the stored data from the FM 420 and transfers it to the data buffer 416; for a write request it fetches the data to be stored from the data buffer 416 and transfers it to the FM 420.
The FM interface 417 also includes an ECC generation circuit, an ECC-based data loss detection circuit, and an ECC correction circuit; when writing data to the FM 420 it appends an ECC to the data before writing. When data is read, the ECC-based data loss detection circuit checks the data read from the FM 420, and when data loss is detected, the ECC correction circuit corrects the data.
The data compression/decompression unit 418 has a data compression function using lossless compression algorithms. It supports several compression algorithms and also has a function for changing the compression level. Following instructions from the processor 415, the data compression/decompression unit 418 reads data from the data buffer 416, performs either a data compression operation or a data decompression operation (the inverse transform of compression) using a lossless compression algorithm, and writes the result back to the data buffer. The data compression/decompression unit 418 may be implemented as a logic circuit, or the same function may be realized by having a processor execute a compression/decompression program.
The parity generation unit 419 has functions for generating parity, the redundant data required by RAID technology; specifically, it can generate the XOR used in RAID 5 and RAID 6, the Reed-Solomon code used in RAID 6, and the diagonal parity computed by the EVENODD method. Following instructions from the processor 415, the parity generation unit 419 reads the data for which parity is to be generated from the data buffer 416 and generates RAID 5 or RAID 6 parity using the parity generation functions described above.
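As a concrete illustration of the XOR parity used for RAID 5, the short sketch below computes a parity block from the data blocks of one stripe. This is the standard XOR definition of RAID 5 parity, not an implementation of the parity generation unit 419 itself.

    # RAID 5 parity is the byte-wise XOR of the data blocks in one stripe.
    def xor_parity(blocks):
        parity = bytearray(len(blocks[0]))
        for block in blocks:
            for i, b in enumerate(block):
                parity[i] ^= b
        return bytes(parity)

    # Any single lost block can be restored as the XOR of the parity
    # and the surviving blocks of the same stripe.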
The switch 414, I/O interface 411, processor 415, data buffer 416, FM interface 417, data compression/decompression unit 418, and parity generation unit 419 described above may be implemented within a single semiconductor element as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array), or may be configured as a plurality of individual dedicated ICs (Integrated Circuits) connected to one another.
For the RAM 413, a volatile memory such as DRAM is used. The RAM 413 stores management information for the FMs 420 used in the NVM module 126, transfer lists containing the transfer control information used by each DMA, and the like. Part or all of the role of the data buffer 416 that stores data may be subsumed into the RAM 413, so that the RAM 413 is used for data storage as well.
The configuration of the NVM module 126 to which the present invention is applied has been described above with reference to FIG. 4. Although this embodiment describes an NVM module 126 equipped with flash memory as shown in FIG. 4, the nonvolatile memory mounted in the NVM module 126 is not limited to flash memory; it may be a nonvolatile memory such as Phase Change RAM or Resistance RAM. A configuration in which part or all of the FMs 420 are volatile RAM (DRAM or the like) is also possible.
Next, the FM 420 will be described with reference to FIG. 5. The nonvolatile memory area in the FM 420 is composed of a plurality of (for example, 4096) blocks (physical blocks) 502, and stored data is erased in units of physical blocks. The FM 420 also contains an I/O register 501, a register with a recording capacity equal to or larger than the physical page size (for example, 8 KB).
The FM 420 operates according to read/write request instructions from the FM interface 417. The flow of a write operation is as follows. The FM 420 first receives from the FM interface 417 a write command together with the physical block and physical page targeted by the request. Next, the write data transferred from the FM interface 417 is stored in the I/O register 501. The data stored in the I/O register 501 is then written to the designated physical page.
The flow of a read operation is as follows. The FM 420 first receives from the FM interface 417 a read command together with the physical block and page targeted by the request. Next, the data stored in the designated physical page of the designated physical block is read out and placed in the I/O register 501. The data stored in the I/O register 501 is then transferred to the FM interface 417.
Next, the physical block 502 will be described with reference to FIG. 6. The physical block 502 is divided into a plurality of (for example, 128) pages 601, and reading of stored data and writing of data are processed in units of pages. The order of writing to the physical pages 601 within a block 502 is fixed: writing proceeds in order from the first page, that is, data must be written in the order Page 1, Page 2, Page 3, and so on. Overwriting an already written page 601 is in principle prohibited; to overwrite data on a written page 601, data can be written to that page 601 only after all the data in the block 502 to which the page 601 belongs has been erased.
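A minimal sketch of the constraints just described (page-sequential writes, overwrite only after a block-unit erase) might look as follows; the class is purely illustrative and models behavior, not any actual FM command set.

    # Toy model of one physical block: pages written strictly in order,
    # overwrite possible only after erasing the whole block.
    class PhysicalBlock:
        PAGES = 128                           # example page count from the text

        def __init__(self):
            self.pages = [None] * self.PAGES
            self.next_page = 0                # Page 1 must be written first

        def program(self, page_no, data):
            if page_no != self.next_page:
                raise ValueError("pages must be written in ascending order")
            self.pages[page_no] = data
            self.next_page += 1

        def erase(self):                      # erase acts on the whole block
            self.pages = [None] * self.PAGES
            self.next_page = 0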
The configuration of the NVM module 126 to which the present invention is applied, and the computer system in which the NVM module 126 is used, have been described above. Next, the storage areas that the NVM module 126 provides to the storage apparatus in this embodiment will be described.
(1-6) Overview of Correspondence between LBA and PBA of NVM Module
Next, the storage spaces that the NVM module 126 provides to the storage apparatus 101 in this embodiment will be described. The NVM module 126 of this embodiment carries a plurality of FMs (chips) 420, manages a storage area composed of a plurality of blocks and a plurality of pages, and provides a logical storage space to the storage controller 110 (its processor 121) to which it is connected. Here, "providing a storage space" means that an address is assigned to each storage area that the storage controller 110 is allowed to access, and that the processor 121 of the storage controller 110 to which the NVM module 126 is connected can refer to and update the data stored in the area identified by an address by issuing an access request (command) designating that address. The physical storage area composed of the FMs 420 is managed in unique association with an address space used only inside the NVM module 126. Hereinafter, this address space for designating physical areas (physical address space), used only inside the NVM module 126, is called the PBA (Physical Block Address) space, and the position (address) of each physical storage area in the PBA space (a sector; in the embodiments of the present invention, one sector is 512 bytes) is written PBA (Physical Block Address). The NVM module 126 of this embodiment manages the association between these PBAs and the LBAs (Logical Block Addresses), which are the addresses of the areas of the logical storage spaces provided to the storage apparatus.
A conventional storage device such as an SSD provides a single storage space to the host device (a host computer or the like) to which it is connected. In contrast, the NVM module 126 of this embodiment is characterized by having two logical storage spaces and providing both of them to the storage controller 110 to which it is connected. The relationship between these two logical storage spaces (LBA) and the PBA space is described with reference to FIG. 7.
FIG. 7 is a diagram showing the concept of the association between the LBA0 space 701 and LBA1 space 702, the logical storage spaces that the NVM module 126 of this embodiment provides to the storage controller 110, and the PBA space 703.
The NVM module 126 provides two logical storage spaces, the LBA0 space 701 and the LBA1 space 702. Hereinafter, an address assigned to a storage area in the LBA0 space 701 is called "LBA0" or an "LBA0 address", and an address assigned to a storage area in the LBA1 space 702 is called "LBA1" or an "LBA1 address". In the embodiments of the present invention, the sizes of the LBA0 space 701 and the LBA1 space 702 are both no larger than the size of the PBA space, but the present invention is effective even when the size of the LBA0 space 701 is larger than the size of the PBA space.
The LBA0 space 701 is a logical storage space for allowing the processor 121 of the storage controller 110 to access, as uncompressed data, the compressed data recorded in the physical storage area composed of the FMs 420. When the processor 121 designates an address (LBA0) in the LBA0 space 701 and issues a write request to the NVM module 126, the NVM module 126 obtains the write data from the storage controller 110, compresses it with the data compression/decompression unit 418, records the data in the physical storage area on the FMs 420 designated by a PBA that the NVM module 126 selects dynamically, and associates the LBA0 with that PBA. When the processor 121 designates an LBA0 and issues a read request to the NVM module 126, the NVM module 126 obtains the data (compressed data) from the physical storage area of the FM 420 indicated by the PBA associated with the LBA0, decompresses it with the data compression/decompression unit 418, and transfers the decompressed data to the storage controller 110 as read data. This association between LBA0 and PBA is managed in the LBA0-PBA conversion table described later.
The LBA1 space 702 is a logical storage space for allowing the storage controller 110 to access the compressed data recorded in the physical storage area composed of the FMs 420 as compressed data (without decompression). When the processor 121 of the storage controller 110 designates an LBA1 and issues a write request to the NVM module 126, the NVM module 126 obtains the data (already compressed write data) from the storage controller 110, records the data in the FM storage area designated by a PBA that the NVM module 126 selects dynamically, and associates the LBA1 with that PBA. When the processor 121 designates an LBA1 and issues a read request, the NVM module 126 obtains the data (compressed data) from the physical storage area of the FM 420 indicated by the PBA associated with the LBA1 and transfers the compressed data as-is to the storage controller 110 as read data. This association between LBA1 and PBA is managed in the LBA1-PBA conversion table described later.
As shown in FIG. 7, an area in the PBA space, that is, the physical storage area in which compressed data 713 is recorded, may be associated simultaneously with both an area of the LBA0 space and an area of the LBA1 space. For example, the decompressed form of the compressed data 713 is mapped onto the LBA0 space as decompressed data 711, while the compressed data 713 itself is mapped onto the LBA1 space as compressed data 712. For example, when the processor 121 designates an LBA0 (say, 0x000_0000_1000) and writes data to the NVM module 126, the data is compressed by the data compression/decompression unit 418 in the NVM module 126, and the compressed data is placed in the PBA space dynamically selected by the NVM module 126 (specifically, in one of the unwritten pages among the pages of the FMs 420). The data is then managed as being associated with address 0x000_0000_1000 of the LBA0 space. If the processor 121 subsequently issues to the NVM module 126 a request to associate the data mapped at 0x000_0000_1000 with an address of the LBA1 space (say, 0x800_0000_0010), this data also becomes associated with the LBA1 space; when the processor 121 then issues to the NVM module 126 a request (command) to read the data at LBA1 address 0x800_0000_0010, it can read, in compressed form, the data that it wrote to LBA0 address 0x000_0000_1000.
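This double mapping can be sketched with two dictionaries standing in for the conversion tables of FIG. 8. The addresses and granularities follow the example in the text (4 KB LBA0 units, 512-byte LBA1 sectors); everything else is a simplification for illustration only.

    # Two dictionaries standing in for the conversion tables of FIG. 8.
    lba0_to_pba = {}   # 4 KB logical unit -> (pba, length in 512 B sectors)
    lba1_to_pba = {}   # one 512 B sector -> one PBA sector

    def record_lba0_write(lba0, pba, nsectors):
        lba0_to_pba[lba0] = (pba, nsectors)      # e.g. lba0 = 0x000_0000_1000

    def map_compressed_to_lba1(lba0, lba1):
        pba, nsectors = lba0_to_pba[lba0]        # the same physical extent ...
        for i in range(nsectors):                # ... now also reachable,
            lba1_to_pba[lba1 + i] = pba + i      # compressed, via LBA1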
The storage apparatus 101 in this embodiment associates data written to the NVM module 126 through LBA0 with an area of the LBA1 space and, by designating LBA1 addresses, instructs generation of the RAID parity corresponding to that data, thereby enabling RAID parity generation for the compressed data.
Note that the size of the compressed data generated by the NVM module 126 in the embodiments of the present invention is limited to a multiple of 512 bytes (one sector) and is made never to exceed the size of the uncompressed data. That is, when 4 KB of data is compressed, the minimum size is 512 bytes and the maximum size is 4 KB.
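This size rule reduces to rounding the compressor output up to a whole number of sectors and capping it at the uncompressed size, as in this small sketch:

    SECTOR = 512

    def stored_size(compressed_len, uncompressed_len=4096):
        """Size recorded for one 4 KB compression unit, in bytes."""
        rounded = ((compressed_len + SECTOR - 1) // SECTOR) * SECTOR
        return min(rounded, uncompressed_len)   # never exceeds the 4 KB input

    # stored_size(700) -> 1024 (2 sectors); stored_size(4000) -> 4096 (no gain)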
(1-7) NVM Module Management Information 1: LBA-PBA Conversion Table
Next, the management information that the NVM module 126 of this embodiment uses for control will be described.
As the management information used by the NVM module 126, the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820 are described first with reference to FIG. 8.
The LBA0-PBA conversion table 810 is stored in the DRAM 413 in the NVM module 126 and consists of the fields NVM module LBA0 (811), NVM module PBA (812), and PBA length (813). After receiving the LBA0 designated by a read request from the higher-level device, the processor 415 of the NVM module 126 uses that LBA0 to obtain the PBA indicating the location where the actual data is stored.
On an update write, the NVM module 126 records the update data (write data) in a physical storage area different from the PBA in which the pre-update data is recorded, writes the PBA at which the update data was recorded, together with the PBA length, into the corresponding entries, and thereby updates the LBA0-PBA conversion table. By operating in this way, the NVM module 126 makes it possible to (pseudo-)overwrite data in areas of the LBA0 space.
The NVM module LBA0 (811) column lists the logical areas of the LBA0 space provided by the NVM module 126 in order, in units of 4 KB (each address (LBA0) of the LBA0 space is assigned per sector (512 bytes)). The LBA0-PBA conversion table 810 in this embodiment is intended to manage the association between NVM module LBA0 (811) and NVM module PBA (812) in units of 4 KB (8 sectors). However, this association between NVM module LBA0 (811) and NVM module PBA (812) may be managed in any unit other than 4 KB.
The NVM module PBA (812) is a field storing the start address of the PBA associated with the NVM module LBA0 (811). In this embodiment, the physical storage area of the PBA space is divided into, and managed in, 512-byte (one-sector) units. In the example of FIG. 8, the value "XXX" is associated as the PBA (Physical Block Address) corresponding to NVM module LBA0 (811) "0x000_0000_0000". This value is an address that uniquely identifies a storage area among the plurality of FMs 420 mounted in the NVM module 126. Thus, when "0x000_0000_0000" is received as the start address (LBA0) of a read request, "XXX" is obtained as the start address (PBA) of the physical storage area (read source) in the NVM module 126. When no PBA is associated with the LBA0 identified by NVM module LBA0 (811), a value indicating "unallocated" (such as NULL or 0xFFFFFFFF) is stored in NVM module PBA (812).
The PBA length 813 records the actual stored size of the 4 KB of data designated by NVM module LBA0 (811). The stored size is recorded as a number of sectors. In the example shown in FIG. 8, the 4 KB of data starting at LBA0 "0x000_0000_0000" (8 sectors of the LBA0 space) is recorded with a PBA length of "2", that is, a length of 512 B x 2 = 1 KB. Combined with the information in NVM module PBA (812), this indicates that the 4 KB of data starting at LBA0 "0x000_0000_0000" is compressed and stored in the 1 KB area from PBA "XXX" to "XXX+1". The NVM module 126 in this embodiment compresses the uncompressed data written under instruction from the processor 121 of the storage controller 110 in 4 KB units. For example, when the processor 121 issues a write request for 8 KB of data (uncompressed) starting at LBA0 space address 0x000_0000_0000, the module compresses the 4 KB of data in the address range 0x000_0000_0000 to 0x000_0000_0007 (of the LBA0 space) as one unit to generate compressed data, then compresses the 4 KB of data in the address range 0x000_0000_0008 to 0x000_0000_000F as another unit to generate compressed data, and writes each piece of compressed data to the physical storage area of the FMs 420. However, the present invention is not limited to a mode in which data is compressed in 4 KB units, and remains effective in configurations where data is compressed in other units.
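A minimal sketch of this table behavior, including the out-of-place update write described earlier, might look as follows. The 4 KB management unit and the sector-counted PBA length follow the text; the allocator and compressor are hypothetical parameters.

    UNIT_SECTORS = 8     # 4 KB management unit of the LBA0-PBA table
    table = {}           # unit-aligned LBA0 -> (pba, pba_len in sectors)

    def lookup(lba0):
        return table.get(lba0)         # None plays the role of "unallocated"

    def update_write(lba0, data, compress, allocate_pba):
        cdata = compress(data)                 # one 4 KB unit in, <= 4 KB out
        nsectors = (len(cdata) + 511) // 512   # PBA length, counted in sectors
        pba = allocate_pba(nsectors)           # always a fresh, unwritten area
        table[lba0] = (pba, nsectors)          # old PBA association is released
        return pba, nsectors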
Next, the LBA1-PBA conversion table 820 is described. The LBA1-PBA conversion table 820 is stored in the DRAM 413 in the NVM module 126 and consists of two pieces of information: NVM module LBA1 (821) and NVM module PBA (822). After receiving the LBA1 designated by a read request from the higher-level device, the processor 415 of the NVM module 126 converts the received LBA1, using the LBA1-PBA conversion table 820, into the PBA indicating the location where the actual data for that LBA1 is stored.
The NVM module LBA1 (821) column lists the logical areas of the LBA1 space provided by the NVM module 126 in order, sector by sector (the value 1 in NVM module LBA1 (821) means one sector, 512 bytes). This reflects the premise that the NVM module 126 in this embodiment manages the association between NVM module LBA1 (821) and NVM module PBA (822) in 512 B units; however, this association is not limited to being managed in 512 B units and may be managed in any unit. That said, LBA1 is a space that directly maps the physical storage areas (PBAs) where compressed data is stored, so its management unit is desirably equal to the PBA division management size; in this embodiment it is therefore divided and managed in 512 B units.
The NVM module PBA (822) is a field storing the start address of the PBA associated with the LBA1. In this embodiment, PBAs are divided into, and managed in, 512 B units. In the example of FIG. 8, the PBA value "ZZZ" is associated with NVM module LBA1 "0x800_0000_0002". This PBA value is an address that uniquely identifies a storage area on one of the FMs 420 mounted in the NVM module 126. Thus, when "0x800_0000_0002" is received as the start address (LBA1) of a read request, "ZZZ" is obtained as the physical start address of the read source in the NVM module 126. When no PBA is associated with the LBA1 identified by NVM module LBA1 (821), a value indicating "unallocated" is stored in NVM module PBA (822).
The above are the contents of the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820 used by the NVM module 126.
(1-9) NVM Module Management Information 3: Block Management Information
Next, the block management information used by the NVM module to which the present invention is applied will be described with reference to FIG. 9.
The block management information 900 is stored in the DRAM 413 in the NVM module 126 and consists of the following items: NVM module PBA 901, NVM chip number 902, block number 903, and invalid PBA amount 904.
The NVM module PBA 901 is a field storing a PBA value that uniquely identifies each area in all the FM 420 chips managed by the NVM module 126. In this embodiment, the NVM module PBA 901 is partitioned and managed in block units. FIG. 9 shows an example in which the head address of each block is stored as the NVM module PBA value. For example, the field "0x000_0000_0000" corresponds to the NVM module PBA range "0x000_0000_0000" to "0x000_0000_0FFF".
The NVM chip number 902 is a field storing a number that uniquely identifies an FM chip 420 mounted on the NVM module 126. The block number 903 is a field storing the block number within the FM chip 420 designated by the value stored in the NVM chip number 902.
The invalid PBA amount 904 is a field storing the invalid PBA amount of the block designated by the block number 903 within the FM chip designated by the NVM chip number 902. The invalid PBA amount is the amount of area (in the PBA space) that was once associated with the LBA0 space and/or the LBA1 space identified by the NVM module LBA0 (811) and the NVM module LBA1 (821) in the LBA0-PBA conversion table 810 and the LBA1-PBA conversion table 820, but whose association was later released. Conversely, a PBA that is associated with an NVM module LBA0 or LBA1 by the LBA0-PBA conversion table 810 or the LBA1-PBA conversion table 820 is referred to in this specification as a valid PBA.
Invalid PBA areas arise inevitably when overwriting is emulated on a nonvolatile memory in which data cannot be overwritten in place. Specifically, when data is updated, the NVM module 126 records the update data in an unwritten PBA (different from the PBA holding the pre-update data) and rewrites the NVM module PBA 812 and PBA length 813 fields of the LBA0-PBA conversion table 810 with the head address and PBA length of the PBA area in which the update data was recorded. At this point, the association of the PBA area holding the pre-update data is released from the LBA0-PBA conversion table 810. The NVM module 126 then also checks the LBA1-PBA conversion table 820, and treats an area that has no association in the LBA1-PBA conversion table either as an invalid PBA area. The NVM module 126 counts the amount of invalid PBA for each block, the minimum erase unit of the FM, and preferentially selects blocks with a large invalid PBA amount as garbage collection targets. The example of FIG. 9 shows that block number 0 of NVM chip number 0 managed by the NVM module 126 contains 160 KB of invalid PBA area.
In this embodiment, when the total amount of invalid PBA area managed by the NVM module 126 reaches or exceeds a predetermined garbage collection start threshold (i.e., unwritten pages are close to exhaustion), blocks containing invalid PBA areas are erased to create unwritten PBA area. This operation is called garbage collection. If a block to be erased contains valid PBA areas, those valid areas must be copied to another block before the block is erased. Because this data copy involves write operations to the FM, it accelerates wear of the FM, and because the copy consumes resources of the NVM module 126 such as processor time and bus bandwidth, it also degrades performance. The amount of valid PBA area copied should therefore be as small as possible. At garbage collection time, the NVM module 126 of this embodiment refers to the block management information 900 and erases blocks in descending order of the value stored in the invalid PBA amount 904 (i.e., blocks containing the most invalid PBA area first), thereby reducing the amount of valid PBA area that must be copied.
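As a concrete illustration of this victim-selection policy, the following C sketch picks the block with the largest invalid PBA amount; the structure and function names are assumptions introduced for illustration.

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint64_t pba_base;     /* NVM module PBA 901: first PBA of the block */
        uint16_t chip_no;      /* NVM chip number 902 */
        uint32_t block_no;     /* block number 903 */
        uint32_t invalid_kb;   /* invalid PBA amount 904, in KB */
    } block_info_t;

    /* Return the index of the block holding the most invalid data, i.e. the
     * block whose erasure requires copying the least valid data. */
    static size_t pick_gc_victim(const block_info_t *blocks, size_t nblocks)
    {
        size_t victim = 0;
        for (size_t i = 1; i < nblocks; i++)
            if (blocks[i].invalid_kb > blocks[victim].invalid_kb)
                victim = i;
        return victim;
    }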
In this embodiment, the amount of area whose association with the NVM module LBA0 (811) and LBA1 (821) has been released is managed as a PBA amount (in KB), but the present invention is not limited to this management unit. For example, the number of pages, the minimum write unit, could be managed instead of the invalid PBA amount.
The above describes the contents of the block management information 900 used by the NVM module to which the present invention is applied.
(1-10) NVM Module Control Command 1: Write Command
Next, the commands used by the NVM module 126 to which the present invention is applied will be described.
When the NVM module 126 in this embodiment receives a command from the processor 121 of the storage controller 110, it analyzes the contents of the received command, performs the corresponding processing, and returns one response (response information) to the storage controller when the processing completes. This processing is realized by the processor 415 in the NVM module 126 executing a command processing program stored in the RAM 413. A command contains the set of information the NVM module 126 needs to perform the requested processing; for example, a write command instructing the NVM module 126 to write data contains an indication that the command is a write command together with the information needed for the write (such as the write position and data length of the write data). The NVM module 126 supports several types of commands, but first the information common to all commands will be described.
Each command begins with two pieces of common information: an operation code (Opcode) and a command ID. Information (parameters) specific to each command is appended after the command ID to form one complete command. For example, FIG. 10 shows the format of the write command of the NVM module 126 in this embodiment and the format of the response information for that write command; element (field) 1011 in FIG. 10 is the Opcode and element 1012 is the command ID, while elements 1013 through 1016 are parameters specific to the write command. In the response information returned when processing of a command completes, the command ID and status (Status) are common to all responses, and information specific to each response may be appended after the status.
The operation code (Opcode) is information that notifies the NVM module 126 of the type of command; the NVM module 126 that receives a command recognizes the type of the command by referring to this information. For example, it recognizes a command with an Opcode of 0x01 as a write command and a command with an Opcode of 0x02 as a read command.
The command ID is a field storing an ID unique to the command; this ID is placed in the response information for the command so that the storage controller 110 can recognize which command the response corresponds to. When creating a command, the storage controller 110 generates an ID that uniquely identifies the command, stores it in the command ID field, and sends the command to the NVM module 126. When the NVM module 126 completes the processing corresponding to the received command, it includes that command's ID in the response information and returns it to the storage controller 110. On receiving the response information, the storage controller 110 recognizes completion of the command by reading the ID contained in it. The status (element 1022 in FIG. 10) contained in the response information is a field storing information indicating whether the command completed normally. If the command did not complete normally (an error occurred), the status stores, for example, a number identifying the cause of the error.
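To make the common header concrete, a minimal C sketch of these shared fields follows; the struct layout and names are illustrative assumptions (the patent specifies the fields, not a wire format).

    #include <stdint.h>

    /* Fields common to every command. */
    typedef struct {
        uint8_t  opcode;       /* e.g. 0x01 = write, 0x02 = read, per the text */
        uint32_t command_id;   /* unique ID generated by the storage controller */
        /* command-specific parameters follow here */
    } cmd_header_t;

    /* Fields common to every response. */
    typedef struct {
        uint32_t command_id;   /* echoes the ID of the completed command */
        uint8_t  status;       /* normal completion, or a number identifying the error cause */
        /* response-specific fields may follow here */
    } resp_header_t;

    /* The controller matches a response to its outstanding command by ID. */
    static int response_matches(const cmd_header_t *cmd, const resp_header_t *resp)
    {
        return cmd->command_id == resp->command_id;
    }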
FIG. 10 shows the LBA0 write command of the NVM module 126 in this embodiment and the response information to that write command. The LBA0 write command 1010 of the NVM module 126 in this embodiment consists of the following command information: operation code 1011, command ID 1012, LBA0/1 start address 1013, LBA0/1 length 1014, compression necessity flag 1015, and write data address 1016. This embodiment describes a command made up of the above information, but additional information may also be present. For example, the present invention remains effective even if information related to a DIF (Data Integrity Field) or the like is attached to the command.
The operation code 1011 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is a write command.
The command ID 1012 is a field storing an ID unique to the command; the ID specified in this field is attached to the response information so that the storage apparatus can recognize which command the response corresponds to. The storage apparatus 101 assigns an ID that uniquely identifies the command when the command is created, and when it receives the response information from the NVM module 126, it recognizes completion of the command by reading the ID contained in the response.
The LBA0/1 start address 1013 is a field designating the head address of the write destination in the logical space. In the embodiment of the present invention, the LBA0 space is defined as the address range 0x000_0000_0000 to 0x07F_FFFF_FFFF and the LBA1 space as the range starting at address 0x800_0000_0000, so when an address in the range 0x000_0000_0000 to 0x07F_FFFF_FFFF is stored in the LBA0/1 start address 1013 of a write command, the NVM module 126 recognizes that an LBA0-space address has been specified, and when an address in the range 0x800_0000_0000 to 0x8FF_FFFF_FFFF is specified, it recognizes that an LBA1-space address has been specified. Methods other than the one described above may also be adopted for recognizing which address space the specified address belongs to; for example, the LBA0 and LBA1 spaces could be distinguished by the contents of the Opcode 1011.
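The address-space recognition described above amounts to a simple range test, sketched below in C for illustration; the enum and function names are assumptions.

    #include <stdint.h>

    enum lba_space { SPACE_LBA0, SPACE_LBA1, SPACE_INVALID };

    /* LBA0: 0x000_0000_0000 .. 0x07F_FFFF_FFFF
     * LBA1: 0x800_0000_0000 .. 0x8FF_FFFF_FFFF */
    static enum lba_space classify_lba(uint64_t addr)
    {
        if (addr <= 0x07FFFFFFFFFULL)
            return SPACE_LBA0;
        if (addr >= 0x80000000000ULL && addr <= 0x8FFFFFFFFFFULL)
            return SPACE_LBA1;
        return SPACE_INVALID;
    }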
The LBA0/1 length 1014 is a field designating the range (length) of the write destination LBA0 or LBA1 starting at the LBA0/1 start address 1013, expressed as a number of sectors. The NVM module 126 associates the PBA area in which the write data is stored with the LBA0 or LBA1 area in the range indicated by the LBA0/1 start address 1013 and the LBA0/1 length 1014.
The compression necessity flag 1015 is a field designating whether the write target data indicated by this command needs to be compressed. When the storage controller 110 creates a write command and no size reduction from data compression can be expected for the write target data (for example, when the data is already known to be compressed, e.g. by image compression), it sets this flag to notify the NVM module 126 that compression is unnecessary. In this embodiment, the flag is used when writing to LBA1 to state explicitly that compression is unnecessary because the write target data is already compressed. If it is a fixed rule that transferred data is never compressed when writing to LBA1, this compression necessity flag 1015 may be omitted.
The write data address 1016 is a field storing the head address of the current location of the write target data indicated by this command. For example, when data temporarily stored in the DRAM 125 of the storage apparatus 101 is to be written to the NVM module 126, the processor of the storage apparatus 101 creates a write command in which the address on the DRAM 125 where the data resides is stored in the write data address 1016. The NVM module 126 acquires the write data by fetching, from the storage apparatus 101, the data in the area of the length designated by the LBA0/1 length 1014 starting at the address indicated in this field.
The write response information 1020 consists of a command ID 1021, a status 1022, and a compressed data length 1023. This embodiment describes response information made up of the above items, but additional information may also be present.
The command ID 1021 is a field storing a number that uniquely identifies the completed command.
The status 1022 is a field for notifying the storage apparatus of completion of the command or of an error. In the case of an error, it stores, for example, a number identifying the cause of the error.
The compressed data length 1023 is a field recording the data length to which the written data was reduced by compression. By reading this field, the storage apparatus 101 can determine the post-compression size of the data it wrote. However, as update writes accumulate, the storage apparatus 101 loses an accurate picture of the actual compressed data size associated with a given LBA0 area. For this reason, when the total of the compressed data lengths 1023 obtained from write commands reaches a certain value, the storage apparatus 101 issues the compressed data size acquisition command described later in order to map the data to LBA1.
In this embodiment, when the write destination is LBA1, already-compressed data is being recorded, so this field is invalid.
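Purely as an illustration of fields 1011 through 1016 and 1021 through 1023, a C sketch of the command and response layouts follows; the field widths and struct packing are assumptions (the figures specify the fields, not their encoding).

    #include <stdbool.h>
    #include <stdint.h>

    /* LBA0 write command 1010 (fields 1011 to 1016). */
    typedef struct {
        uint8_t  opcode;            /* 1011 */
        uint32_t command_id;        /* 1012 */
        uint64_t lba_start;         /* 1013: LBA0/1 start address */
        uint32_t lba_len_sectors;   /* 1014: length in sectors */
        bool     compress;          /* 1015: false when the data is already compressed */
        uint64_t write_data_addr;   /* 1016: e.g. an address in the DRAM 125 */
    } write_cmd_t;

    /* Write response 1020 (fields 1021 to 1023). */
    typedef struct {
        uint32_t command_id;        /* 1021 */
        uint8_t  status;            /* 1022 */
        uint32_t compressed_len;    /* 1023: invalid for writes to LBA1 */
    } write_resp_t;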
(1-11) NVM Module Control Command 2: Compressed Data Size Acquisition Command
FIG. 11 shows the compressed data size acquisition command of the NVM module 126 in this embodiment and the response information to that command. The compressed data size acquisition command 1110 of the NVM module 126 in this embodiment consists of the following command information: operation code 1111, command ID 1012, LBA0 start address 1113, and LBA0 length 1114. This embodiment describes a command made up of the above information, but additional information may also be present. The command ID 1012 has the same contents as in the LBA0 write command described earlier, so its description is omitted.
The operation code 1111 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is a compressed data size acquisition command.
The LBA0 start address 1113 is a field designating the head address of the LBA0 area whose post-compression data size is to be obtained. The LBA0 length 1114 is a field designating the range of LBA0 starting at the LBA0 start address 1113. The NVM module 126 calculates the size of the compressed data associated with the LBA0 area in the range indicated by the LBA0 start address 1113 and the LBA0 length 1114, and notifies the storage apparatus. The addresses that can be specified in the LBA0 start address 1113 are limited to multiples of 8 sectors (4 KB); likewise, the lengths that can be specified in the LBA0 length 1114 are limited to multiples of 8 sectors (4 KB). If an address that does not fall on an 8-sector boundary (for example, 0x000_0000_0001) or a misaligned length is specified in the LBA0 start address 1113 or the LBA0 length 1114, an error is returned.
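The alignment rule above reduces to two modulo checks, sketched here in C for illustration; the function name is an assumption.

    #include <stdbool.h>
    #include <stdint.h>

    #define LBA0_ALIGN_SECTORS 8   /* 8 sectors = 4 KB */

    /* Both parameters must be multiples of 8 sectors, or the module returns an error. */
    static bool size_query_aligned(uint64_t lba0_start, uint64_t lba0_len)
    {
        return (lba0_start % LBA0_ALIGN_SECTORS) == 0 &&
               (lba0_len   % LBA0_ALIGN_SECTORS) == 0;
    }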
The compressed data size acquisition response 1120 consists of a command ID 1021, a status 1022, and a compressed data length 1123. This embodiment describes response information made up of the above items, but additional information may also be present. The command ID 1021 and the status 1022 have the same contents as in the LBA0 write response described earlier, so their descriptions are omitted.
The compressed data length 1123 is a field storing the size of the compressed data associated with the LBA0 area designated by the compressed data size acquisition command. By obtaining the value of this compressed data length, the storage controller 110 learns the area size required on LBA1 as the mapping destination for the LBA1 mapping command described later.
(1-12) NVM Module Control Command 3: LBA1 Mapping Command
In the NVM module 126 of this embodiment, data written with an LBA0 area designated is compressed by the NVM module 126 and recorded in the FM 420. To retrieve compressed data recorded in the FM 420, that compressed data must be mapped onto the LBA1 space. The LBA1 mapping command is used for this purpose.
FIG. 12 schematically shows the LBA1 mapping command supported by the NVM module 126 in this embodiment and the response information to that command. The LBA1 mapping command 1210 of the NVM module 126 in this embodiment consists of the following command information: operation code 1211, command ID 1012, LBA0 start address 1213, LBA0 length 1214, and LBA1 start address 1215. This embodiment describes a command made up of the above information, but additional information may also be present. The command ID 1012 has the same contents as in the write command described earlier, so its description is omitted.
The operation code 1211 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is an LBA1 mapping command.
The LBA0 start address 1213 is a field designating the head address of the LBA0 area of the target data whose compressed data is to be mapped to LBA1. The LBA0 length 1214 is a field designating the range of LBA0, starting at the LBA0 start address 1213, that is to be mapped to LBA1. As with the compressed data size acquisition command, the LBA0 start address 1213 and the LBA0 length 1214 are limited to multiples of 8 sectors (4 KB).
The LBA1 start address 1215 is a field designating the start address of the LBA1 range to be mapped. The storage controller 110, which already knows the size of the data to be mapped from the compressed data size acquisition command, reserves an LBA1 area large enough for that data size, stores its head address in the LBA1 start address 1215 field, and issues the command to the NVM module 126.
The NVM module 126 of this embodiment maps the compressed data associated with the LBA0 space in the range indicated by the LBA0 start address 1213 and the LBA0 length 1214 onto the area extending from the LBA1 start address 1215 for the size of the compressed data. More specifically, it refers to the LBA0-PBA conversion table and obtains the PBAs (NVM module PBA 812) associated with the LBA0 space in the range indicated by the LBA0 start address 1213 and the LBA0 length 1214. It then refers to the LBA1-PBA conversion table and, starting at the LBA1 start address 1215, writes the obtained PBA addresses into the PBA 822 fields of the LBA1 range (the entries identified by the NVM module LBA1 (821)) equal in size to the total size of the obtained PBAs.
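A minimal sketch of that table-to-table copy follows, under the same assumed structures as the earlier lookup sketches; note that only PBA pointers move, not the compressed data itself.

    #include <stdint.h>

    #define LBA0_UNIT_SECTORS 8    /* one LBA0 table entry per 4 KB (8 sectors) */

    typedef struct {
        uint64_t pba_start;        /* NVM module PBA (812) */
        uint16_t pba_len_sectors;  /* PBA length (813), in 512 B sectors */
    } lba0_pba_entry_t;

    /* Copy the PBAs of a (4 KB-aligned) LBA0 range into consecutive
     * 512 B-granularity LBA1 entries starting at lba1_index. */
    static uint64_t map_lba0_range_to_lba1(const lba0_pba_entry_t *lba0_table,
                                           uint64_t *lba1_table, uint64_t lba1_index,
                                           uint64_t lba0_start, uint64_t lba0_len)
    {
        for (uint64_t s = lba0_start; s < lba0_start + lba0_len; s += LBA0_UNIT_SECTORS) {
            const lba0_pba_entry_t *e = &lba0_table[s / LBA0_UNIT_SECTORS];
            for (uint16_t i = 0; i < e->pba_len_sectors; i++)
                lba1_table[lba1_index++] = e->pba_start + i;
        }
        return lba1_index;   /* first LBA1 entry after the mapped range */
    }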
The LBA1 mapping response 1220 consists of a command ID 1021 and a status 1022. This embodiment describes response information made up of the above items, but additional information may also be present. The command ID 1021 and the status 1022 have the same contents as in the write response described earlier, so their descriptions are omitted.
(1-13) NVM Module Control Command 4: Full Stripe Parity Generation Command
There are broadly two methods of generating parity in RAID technology. One generates parity by performing a parity computation such as XOR over all of the data needed to generate the parity; this method is called the "full stripe parity generation method" in this specification. In the other, when update data is written to a RAID-configured group of storage media, the parity corresponding to the update data (the post-update parity) is generated by performing an XOR operation or the like over the update data, the pre-update data stored on the storage medium, and the pre-update parity corresponding to that pre-update data; this method is called the "update parity generation method" in this specification.
The full stripe parity generation command can be used when all of the data making up a RAID parity stripe is stored in the NVM module 126 and mapped onto the LBA0 space. Accordingly, in a RAID configuration that generates parity over six pieces of data, all six pieces of data must be stored in the NVM module 126.
As described earlier, in the storage apparatus 101 according to the embodiment of the present invention, write data from the higher-level device 103 is stored in the NVM module 126 in compressed form, but parity is generated from the data in its uncompressed state. The parity generation target data must therefore be mapped onto the LBA0 space.
FIG. 13 shows the full stripe parity generation command of the NVM module 126 in this embodiment and the response information to that command. The full stripe parity generation command 1310 consists of the following command information: operation code (Opcode) 1311, command ID 1012, LBA0 length 1313, number of stripes 1314, LBA0 start addresses 0 through X (1315 to 1317), LBA0 start address (for XOR parity) 1318, and LBA0 start address (for RAID6 parity) 1319. This embodiment describes a command made up of the above information, but additional information may also be present.
The operation code 1311 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is a full stripe parity generation command.
The LBA0 length 1313 is a field designating the length of the parity to be generated (for RAID parity, the parity and its source data have the same length). The number of stripes 1314 designates the number of pieces of data used to generate the parity; for example, when parity is generated over six pieces of data, 6 is stored in the number of stripes 1314.
The LBA0 start addresses 0 through X (1315 to 1317) are fields designating the start addresses in LBA0 with which the parity source data is associated. The number of these fields must match the number designated by the number of stripes 1314 (if a command in which they do not match is issued, the NVM module 126 returns an error). For example, in a configuration that creates two parities over six pieces of data (RAID6 6D+2P), six LBA0 start addresses are designated.
The LBA0 start address (for XOR parity) 1318 is a field designating the storage destination of the RAID parity (XOR parity) to be generated. In a RAID5 configuration, the generated parity (the RAID5 parity, or the RAID6 P parity / horizontal parity) is mapped to the area of the range designated by the LBA0 length 1313 starting at this address.
The LBA0 start address (for RAID6) 1319 is a field designating the storage destination of the RAID6 parity to be generated. As noted earlier, the parity for RAID6 is the Q parity of a Reed-Solomon code or the diagonal parity of the EVENODD scheme. The generated parity is stored in the area of the range designated by the LBA0 length 1313 starting at this LBA0 start address (for RAID6) 1319.
The NVM module 126 of this embodiment acquires the pieces of compressed data from the FM 420 locations indicated by the PBAs associated with the areas designated by the LBA0 start addresses 0 through X (1315 to 1317). It then decompresses the acquired data using the data compression/decompression unit 418 and generates one or two parities from the decompressed data using the parity generation unit 419 inside the NVM module 126. The generated parity is compressed using the data compression/decompression unit 418 and then recorded in the FM 420. Finally, to associate the PBA of the FM area where the parity was recorded with the LBA0 start address (for XOR parity) 1318 and the LBA0 start address (for RAID6) 1319, the PBA of the recording destination FM area and the post-compression data length are recorded in the NVM module PBA (812) and PBA length (813) fields of the corresponding rows of the LBA0-PBA management information 810 (the rows whose NVM module LBA0 (811) equals the value of the LBA0 start address (for XOR parity) 1318 or the LBA0 start address (for RAID6) 1319).
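For reference, the XOR (P) parity computation at the core of full stripe generation is sketched below in C; the RAID6 Q parity (Reed-Solomon or EVENODD diagonal) is omitted, and the function name is an assumption.

    #include <stddef.h>
    #include <stdint.h>

    /* XOR each byte position across all stripes; the parity and every stripe
     * share the same length, matching the LBA0 length 1313. */
    static void full_stripe_xor(const uint8_t *const *stripes, size_t nstripes,
                                size_t len, uint8_t *parity)
    {
        for (size_t i = 0; i < len; i++) {
            uint8_t p = 0;
            for (size_t s = 0; s < nstripes; s++)
                p ^= stripes[s][i];
            parity[i] = p;
        }
    }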
The full stripe parity generation response 1320 consists of a command ID 1021 and a status 1022. This embodiment describes response information made up of the above items, but additional information may also be present. The command ID 1021 and the status 1022 have the same contents as in the LBA0 write response described earlier, so their descriptions are omitted.
(1-14) NVM Module Control Command 5: Update Parity Generation Command
Update parity generation is performed when update data is to be recorded in an area of the final storage medium for which parity has already been created, and three pieces of information are mapped on LBA0: the update data, the old data of the area to be updated by the update data, and the old parity protecting that old data. When generating an update parity, the storage controller 110 of this embodiment reads the compressed old data and old parity from the RAID-configured final storage media and writes them to areas in the LBA1 space of the NVM module 126. It then maps each piece of compressed data in the LBA1 space onto the LBA0 space, so that the update data, the old data of the area to be updated, and the old parity protecting the old data are all present in the LBA0 space, and then performs update parity generation by issuing the update parity generation command.
FIG. 14 shows the update parity generation command of the NVM module 126 in this embodiment and the response information to that command. The update parity generation command 1410 of the NVM module 126 in this embodiment consists of the following command information: operation code 1411, command ID 1012, LBA0 length 1413, LBA0 start address 0 (1414), LBA0 start address 1 (1415), LBA0 start address 2 (1416), LBA0 start address 3 (1417), LBA0 start address 4 (1418), and LBA0 start address 5 (1419). This embodiment describes a command made up of the above information, but additional information may also be present. The command ID 1012 has the same contents as in the LBA0 write command described earlier, so its description is omitted.
The operation code 1411 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is an update parity generation command.
The LBA0 length 1413 is a field designating the length of the parity to be generated (the RAID parity and its source data have the same length).
LBA0 start address 0 (1414) is a field indicating the start address of the LBA0 area to which the new data for the parity update is mapped. Using this field, the storage apparatus 101 notifies the NVM module 126 that the data in the area of LBA0 length 1413 starting at LBA0 start address 0 (1414) is the new data.
LBA0 start address 1 (1415) is a field indicating the start address of the LBA0 area to which the old data for the parity update is mapped. Using this field, the storage apparatus 101 notifies the NVM module 126 that the data in the area of LBA0 length 1413 starting at LBA0 start address 1 (1415) is the old data.
LBA0 start address 2 (1416) is a field indicating the start address of the LBA0 area to which the pre-update XOR parity for the parity update is mapped. Using this field, the storage apparatus 101 notifies the NVM module 126 that the data in the area of LBA0 length 1413 starting at LBA0 start address 2 (1416) is the pre-update XOR parity.
LBA0 start address 3 (1417) is a field indicating the start address of the LBA0 area to which the pre-update RAID6 parity for the parity update is mapped. Using this field, the storage apparatus 101 notifies the NVM module 126 that the data in the area of LBA0 length 1413 starting at LBA0 start address 3 (1417) is the pre-update RAID6 parity.
LBA0 start address 4 (1418) is a field indicating the start address of the LBA0 area with which the XOR parity newly created by the update is to be associated. Using this field, the storage apparatus 101 instructs the NVM module 126 to map the new XOR parity to the area of LBA0 length 1413 starting at LBA0 start address 4 (1418).
LBA0 start address 5 (1419) is a field indicating the start address of the LBA0 area with which the RAID6 parity newly created by the update is to be associated. Using this field, the storage controller 110 instructs the NVM module 126 to map the new RAID6 parity to the area of LBA0 length 1413 starting at LBA0 start address 5 (1419).
The processing performed when the NVM module 126 receives an update parity generation command is similar to the processing performed when it receives a full stripe parity generation command. The module acquires the pieces of compressed data from the FM 420 storage areas indicated by the PBAs associated with the areas designated by LBA0 start addresses 0 through 3 (1414 to 1417) and decompresses them, generates one or two parities using the parity generation unit 419 inside the NVM module 126, and then compresses the parities. The generated parities are then recorded in the FM 420 and mapped to the LBA0 areas designated by LBA0 start address 4 (1418) and LBA0 start address 5 (1419).
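For the XOR case, the update computation reduces to new parity = new data XOR old data XOR old parity, sketched below in C for illustration; the function name is an assumption.

    #include <stddef.h>
    #include <stdint.h>

    /* Recompute the XOR parity from the new data, old data, and old parity. */
    static void update_xor_parity(const uint8_t *new_data, const uint8_t *old_data,
                                  const uint8_t *old_parity, uint8_t *new_parity,
                                  size_t len)
    {
        for (size_t i = 0; i < len; i++)
            new_parity[i] = new_data[i] ^ old_data[i] ^ old_parity[i];
    }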
The update parity generation response 1420 consists of a command ID 1021 and a status 1022. This embodiment describes response information made up of the above items, but additional information may also be present. The command ID 1021 and the status 1022 have the same contents as in the LBA0 write response described earlier, so their descriptions are omitted.
(1-15) NVM Module Control Command 6: Compression Information Acquisition Command
In the storage apparatus 101 of this embodiment, after the NVM module 126 serving as the cache device generates the parity corresponding to the data and compresses each piece of data including the parity, the storage apparatus 101 acquires the compressed data from the NVM module 126 and records it on the final storage medium. At that time, the information needed to decompress the compressed data (hereinafter, compression information) is also recorded on the final storage medium. The present invention does not depend on this scheme; the NVM module 126 could instead permanently hold the information needed for decompression.
When the compression information is recorded on the final storage medium as in this embodiment, the storage apparatus 101 needs to acquire the compression information from the NVM module 126 serving as the cache device. The compression information acquisition command is used when the storage controller 110 acquires the compression information from the NVM module 126.
FIG. 15 shows the compression information acquisition command of the NVM module 126 in this embodiment and the response information to that command. The compression information acquisition command 1510 of the NVM module 126 in this embodiment consists of the following command information: operation code 1511, command ID 1012, LBA1 start address 1513, LBA1 length 1514, and compression information address 1515. This embodiment describes a command made up of the above information, but additional information may also be present. The command ID 1012 has the same contents as in the write command described earlier, so its description is omitted.
The operation code 1511 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is a compression information acquisition command.
The LBA1 start address 1513 is a field designating the start address of the area on LBA1 for which compression information is to be acquired.
The LBA1 length 1514 is a field designating the range of LBA1 starting at the LBA1 start address 1513.
The compression information address 1515 is a field designating the storage destination for the compression information that the storage controller 110 acquires from the NVM module 126.
The NVM module 126 creates the compression information needed to decompress the data recorded in the LBA1 area of the range indicated by the LBA1 start address 1513 and the LBA1 length 1514, and transfers it to the compression information address 1515 designated by the storage controller 110. Concretely, the compression information describes the structure of the compressed data mapped onto LBA1. For example, if four independently decompressable pieces of compressed data are mapped into the designated LBA1 area, the compression information stores the start positions of those four pieces of compressed data and their lengths.
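One way such compression information could be laid out is sketched below in C; this layout (a count followed by start/length descriptors) is an assumption consistent with the description above, not a format defined by the embodiment.

    #include <stdint.h>

    /* One descriptor per independently decompressable compressed unit. */
    typedef struct {
        uint64_t start_offset;     /* where the unit begins within the LBA1 range */
        uint32_t compressed_len;   /* length of the compressed unit in bytes */
    } comp_unit_desc_t;

    typedef struct {
        uint32_t nunits;           /* e.g. 4 when four compressed units are mapped */
        comp_unit_desc_t unit[];   /* flexible array member: nunits descriptors */
    } compression_info_t;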
After acquiring the compression information from the NVM module 126 with the compression information acquisition command, the storage apparatus 101 of this embodiment records the compression information on the final storage medium together with the compressed data. When the compressed data is later to be decompressed and read, the storage apparatus acquires the compression information together with the compressed data from the final storage medium, writes the compressed data to the NVM module 126, and then transfers the compression information using the compression information transfer command described later, thereby enabling the NVM module 126 to decompress the data.
The compression information acquisition response 1520 consists of a command ID 1021 and a status 1022. This embodiment describes response information made up of the above items, but additional information may also be present. The command ID 1021 and the status 1022 have the same contents as in the LBA0 write response described earlier, so their descriptions are omitted.
(1-16) NVM Module Control Command 7: Read Command
FIG. 16 shows the read command of the NVM module 126 in this embodiment and the response information to that command. The read command 1610 of the NVM module 126 in this embodiment consists of the following command information: operation code 1611, command ID 1012, LBA0/1 start address 1613, LBA0/1 length 1614, decompression necessity flag 1615, and read data address 1616. This embodiment describes a command made up of the above information, but additional information may also be present. The command ID 1012 has the same contents as in the LBA0 write command described earlier, so its description is omitted.
The operation code 1611 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is a read command.
The LBA0/1 start address 1613 is a field designating the head address of the read source in the logical space.
The LBA0/1 length 1614 is a field designating the range of the read source LBA0 or LBA1 starting at the LBA0/1 start address 1613. The NVM module 126 performs the read by acquiring the data from the PBAs associated with the LBA0 or LBA1 area in the range indicated by the LBA0/1 start address 1613 and the LBA0/1 length 1614, and transferring it to the storage apparatus.
The decompression necessity flag 1615 is a field designating whether the read target data indicated by this command needs to be decompressed. When the storage controller 110 creates a read command, it can set this flag to notify the NVM module 126 that decompression is unnecessary. This field need not be included in the read command. In this embodiment, when reading from the LBA1 space, the read target data must be acquired deliberately without decompression, so the flag is used to state explicitly that decompression is unnecessary. If it is a fixed rule that data read from LBA1 never requires decompression, this decompression necessity flag 1615 may be omitted.
The read data address 1616 designates the head address of the output destination area for the read target data (for example, an address in the DRAM 125). The read data of the length designated by the LBA0/1 length 1614 is stored contiguously starting at the address designated by the read data address 1616.
The read response 1620 consists of a command ID 1021 and a status 1022. This embodiment describes response information made up of the above items, but additional information may also be present. The command ID 1021 and the status 1022 have the same contents as in the LBA0 write response described earlier, so their descriptions are omitted.
(1-17) NVM Module Control Command 8: Unmapping Command
The storage controller 110 according to the embodiment of the present invention maps data onto LBA1 in order to retrieve, still in compressed form, the write data and parity recorded compressed in the NVM module 126. Likewise, to decompress and retrieve the compressed information, it maps data recorded in the NVM module 126 via LBA1 onto LBA0. Areas mapped in this way must be unmapped once the processing is finished and they are no longer needed. The storage apparatus 101 of this embodiment uses the unmapping command to release the association of an LBA0 or LBA1 with a PBA.
FIG. 17 shows the unmapping command of the NVM module 126 in this embodiment and the response information to that command. The unmapping command 1710 of the NVM module 126 in this embodiment consists of the following command information: operation code 1711, command ID 1012, LBA0/1 start address 1713, and LBA0/1 length 1714. This embodiment describes a command made up of the above information, but additional information may also be present. The command ID 1012 has the same contents as in the LBA0 write command described earlier, so its description is omitted.
The operation code 1711 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the command is an unmapping command.
The LBA0/1 start address 1713 is a field designating the head address of the logical space to be unmapped; addresses in both the LBA0 space and the LBA1 space can be designated. However, when an LBA0-space address is designated, the address must fall on a 4 KB (8-sector) boundary; if an address not on a 4 KB (8-sector) boundary is designated, the NVM module 126 returns an error. The LBA0/1 length 1714 is a field designating the range of the LBA0 space or the LBA1 space starting at the LBA0/1 start address 1713.
 The processing performed when the NVM module 126 receives a mapping release command from the storage controller 110 is as follows. The NVM module 126 deletes the associations of the PBAs associated with the LBA0 or LBA1 space in the range indicated by the LBA0/1 start address 1713 and the LBA0/1 length 1714 described above (hereinafter the "target LBA0/1 area"). Specifically, it refers to the LBA0-PBA conversion table 810 or the LBA1-PBA conversion table 820 and, for each entry whose NVM module LBA0 (811) or NVM module LBA1 (821) value belongs to the target LBA0/1 area, updates the NVM module PBA 812 or NVM module PBA 822 field to "unallocated".
 At this time, the NVM module detects PBAs whose associations with both LBA0 and LBA1 have been released, and reflects that information in the block management information 900 (that is, it adds the amount of area that has become invalid PBA to the invalid PBA amount 904). The NVM module 126 in the embodiment of the present invention selects, from among the blocks, those with a relatively large invalid PBA amount 904 (that is, it selects blocks in descending order of invalid PBA amount 904) and performs garbage collection on them; since garbage collection is a well-known process, its description is omitted here.
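 To make the bookkeeping above concrete, the following is a minimal Python sketch of how mapping-release handling might update the two address-translation tables and the per-block invalid-PBA counters. The dictionary-based tables, the 4 KB granularity, the 2 MB block size, and all names (lba0_to_pba, block_invalid, and so on) are illustrative assumptions, not the patent's implementation.

```python
# Minimal sketch of mapping-release handling (illustrative, not the patent's code).
# Tables map 4 KB-aligned logical addresses to PBAs; absence means "unallocated".

PAGE = 4096               # assumed management granularity

lba0_to_pba = {}          # stands in for LBA0-PBA conversion table 810
lba1_to_pba = {}          # stands in for LBA1-PBA conversion table 820
block_invalid = {}        # block number -> invalid PBA amount (cf. field 904)

def block_of(pba):
    """Illustrative: derive the FM block number containing a PBA."""
    BLOCK_SIZE = 2 * 1024 * 1024   # assumed 2 MB erase block
    return pba // BLOCK_SIZE

def unmap(table, other_table, start, length):
    """Release the LBA->PBA associations in [start, start+length)."""
    for addr in range(start, start + length, PAGE):
        pba = table.pop(addr, None)
        if pba is None:
            continue
        # A PBA becomes invalid only once neither LBA0 nor LBA1 references it.
        if pba not in other_table.values():
            blk = block_of(pba)
            block_invalid[blk] = block_invalid.get(blk, 0) + PAGE

def gc_victim():
    """Pick the block with the largest invalid-PBA amount for garbage collection."""
    return max(block_invalid, key=block_invalid.get) if block_invalid else None

# Example: release an 8 KB LBA0 range, then pick a garbage-collection victim.
lba0_to_pba[0] = 0
lba0_to_pba[4096] = 4096
unmap(lba0_to_pba, lba1_to_pba, 0, 2 * PAGE)
print(gc_victim())   # -> 0
```

 A real controller would more likely keep per-PBA reference counts than scan the other table, but that level of detail is not specified in the text.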
 (1-18) NVM Module Control Command 9: Compression Information Transfer Command
 In this embodiment, after the storage apparatus 101 stores data compressed by the NVM module 126 in the final storage medium, it must, in response to a read request from the host apparatus, decompress the compressed data and transfer it to the host apparatus. At that time, the storage apparatus 101 reads the compressed data from the final storage medium, transfers it to the NVM module 126, and then also transfers the compression information needed to decompress the compressed data.
 FIG. 18 shows the compression information transfer command of the NVM module 126 and the response information to that command in this embodiment. The compression information transfer command 1810 of the NVM module 126 in this embodiment consists of an operation code 1811, a command ID 1012, an LBA1 start address 1813, an LBA1 length 1814, and a compression information address 1815 as command information. This embodiment describes a command made up of the above fields, but additional information beyond these may be included. Since the command ID 1012 has the same contents as in the write command described earlier, its description is omitted.
 The operation code 1811 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the notified command is a compression information transfer command.
 The LBA1 start address 1813 is a field that designates the start address of the area on LBA1 to which the transferred compression information applies.
 The LBA1 length 1814 is a field that designates the range of LBA1 starting from the LBA1 start address 1813.
 The compression information address 1815 is a field that designates the storage location of the compression information that the storage controller 110 transfers to the NVM module 126.
 The NVM module 126 acquires the compression information from the address designated by the compression information address 1815, which makes it possible to decompress the pieces of compressed data in the area designated by the LBA1 start address 1813 and the LBA1 length 1814. Specifically, after the compressed data associated with LBA1 has been mapped to LBA0 by the LBA0 mapping command described later, when the NVM module receives a read request for LBA0 from the storage controller, it decompresses the compressed data using the compression information transferred by the compression information transfer command and transfers the result to the storage controller.
 The compression information transfer response 1820 consists of a command ID 1021 and a status 1022. This embodiment describes response information made up of the above fields, but additional information beyond these may be included. Since the command ID 1021 and the status 1022 have the same contents as in the write response described earlier, their description is omitted.
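 As a concrete illustration of the command format just described, the sketch below models the compression information transfer command and its response as Python dataclasses. The text specifies only the field list, so the class names, field types, and the status encoding noted in the comments are assumptions for illustration.

```python
from dataclasses import dataclass

# Illustrative models of the command/response formats described in FIG. 18.
# Field names follow the text; types and widths are assumptions.

@dataclass
class CompressionInfoTransferCommand:
    opcode: int                 # operation code 1811 (identifies the command type)
    command_id: int             # command ID 1012
    lba1_start: int             # LBA1 start address 1813 (sector units assumed)
    lba1_length: int            # LBA1 length 1814
    compression_info_addr: int  # compression information address 1815 (source buffer)

@dataclass
class CompressionInfoTransferResponse:
    command_id: int             # command ID 1021 (echoes the command)
    status: int                 # status 1022 (e.g., 0 = success; assumed encoding)
```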
 (1-19) NVM Module Control Command 10: LBA0 Mapping Command
 In this embodiment, the NVM module 126 records in the FM the compressed data written by designating an area of LBA1. For the storage controller 110 to retrieve this compressed data in decompressed form, the data is mapped to an LBA0 different from the LBA1 that was the write destination of the compressed data.
 FIG. 19 shows the LBA0 mapping command of the NVM module 126 and the response information to that command in this embodiment. The LBA0 mapping command 1910 of the NVM module 126 in this embodiment consists of an operation code 1911, a command ID 1012, an LBA1 start address 1913, an LBA1 length 1914, and an LBA0 start address 1915 as command information. This embodiment describes a command made up of the above fields, but additional information beyond these may be included. Since the command ID 1012 has the same contents as in the write command described earlier, its description is omitted.
 The operation code 1911 is a field that notifies the NVM module 126 of the command type; the NVM module 126 that receives the command recognizes from this field that the notified command is an LBA0 mapping command.
 The LBA1 start address 1913 is a field that designates the start address of the LBA1 area holding the compressed data to be mapped.
 The LBA1 length 1914 is a field that designates the range of LBA1, starting from the LBA1 start address 1913, that is to be mapped.
 The LBA0 start address 1915 is a field that designates the start address of the LBA0 area to be mapped. From the compression information that the storage controller 110 acquires from the PDEV, the storage apparatus 101 knows the post-decompression size of the data recorded at LBA1; it reserves an LBA0 area to which this data size can be mapped and writes the start address of that area in the LBA0 start address 1915 field. The address that can be designated as the LBA0 start address 1915 is limited to a multiple of 8 sectors (4 KB).
 The NVM module 126 of this embodiment maps the compressed data associated with the LBA1 area in the range indicated by the LBA1 start address 1913 and the LBA1 length 1914 onto the area extending from the LBA0 start address 1915 for the post-decompression data size. More specifically, it refers to the LBA1-PBA conversion table and acquires the PBAs associated with the LBAs in the range indicated by the LBA1 start address 1913 and the LBA1 length 1914. Then, referring to the LBA0-PBA conversion table, it writes the acquired PBA addresses into the PBA 822 fields of the LBA0 range that starts at the LBA0 start address 1915 and matches the post-decompression size derived from the compression information that the NVM module 126 acquired from the storage controller by the compression information transfer command.
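 The table manipulation just described can be pictured with the following minimal Python sketch. The dictionary-based tables, the sector/page constants, and the compression_info lookup (LBA1 address to decompressed size) are illustrative stand-ins for tables 810/820 and the transferred compression information, not the module's actual firmware.

```python
# Illustrative sketch of LBA0-mapping-command handling (not the patent's code).

SECTOR = 512
PAGE = 4096

def map_lba0(lba1_to_pba, lba0_to_pba, compression_info,
             lba1_start, lba1_length, lba0_start):
    """Expose compressed data written at LBA1 through LBA0 for decompressed reads."""
    if lba0_start % PAGE != 0:
        raise ValueError("LBA0 start address must lie on a 4 KB boundary")
    lba0 = lba0_start
    for lba1 in range(lba1_start, lba1_start + lba1_length, SECTOR):
        pba = lba1_to_pba.get(lba1)
        if pba is None:
            continue
        # Reserve as many LBA0 pages as the decompressed size requires and
        # point them at the PBA that holds the compressed data.
        decompressed = compression_info[lba1]          # bytes after decompression
        for off in range(0, decompressed, PAGE):
            lba0_to_pba[lba0 + off] = pba
        lba0 += decompressed
```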
 The LBA0 mapping response 1920 consists of a command ID 1021 and a status 1022. This embodiment describes response information made up of the above fields, but additional information beyond these may be included. Since the command ID 1021 and the status 1022 have the same contents as in the write response described earlier, their description is omitted.
 (1-20) Overview of Data Management in the Storage Apparatus:
 Next, data management in the storage apparatus 101 according to the embodiment of the present invention will be described. The storage apparatus 101 manages a virtual volume in association with the areas of one or more RAID groups. It also manages one or more virtual PDEVs in association with each RAID group. Furthermore, the storage apparatus 101 manages each virtual PDEV in association with exactly one PDEV (SSD 111 or HDD 112). These associations are described with reference to FIG. 21.
 The storage apparatus 101 according to the embodiment of the present invention provides a virtual volume (denoted "virtual Vol" in the figure) 200 to the host apparatus 103. The example of FIG. 21 shows how the data area "Data14" recognized by the host apparatus 103 is associated inside the storage apparatus. The following description takes as an example the case where the RAID type of the RAID group associated with the virtual volume 200 is RAID 5.
 In FIG. 21, the data areas Data0 to Data14 are fixed-size areas partitioned at the RAID parity calculation unit; hereinafter this unit is called a RAID stripe. Since the data areas Data0 to Data14 are storage areas recognized by the host apparatus 103, they are areas in which data not compressed inside the storage apparatus 101 is stored. For example, the RAID stripe Data14 is XORed with Data13 and Data12 to generate the RAID 5 parity. Hereinafter, a set of RAID stripes needed to generate a RAID parity, such as Data12, Data13, and Data14, is called a RAID stripe column. The RAID stripe length is, for example, 64 KB or 32 KB, and the RAID stripe column length is the product of the RAID stripe length and the number of RAID stripes constituting the column. For example, when the RAID stripe length is 32 KB and the number of RAID stripes constituting a RAID stripe column is 3, the RAID stripe column length is 32 KB × 3 = 96 KB. In this case, a RAID parity is generated for every 96 KB of data area.
 Each stripe in one virtual volume is assigned a serial number starting from 0; in this specification this number is called the "stripe number". The stripe located at the head of the virtual volume has stripe number 0, and subsequent stripes are numbered 1, 2, and so on. In FIG. 21, the number appended after "Data", as in Data0 to Data14, is the stripe number.
 Each RAID stripe column is likewise assigned a number starting from 0 (called the stripe column number) in order from the stripe column located at the head of the RAID group. The stripe column at the head of the RAID group has stripe column number 0, and subsequent stripe columns are numbered 1, 2, and so on.
 Next, the correspondence between virtual volumes and RAID groups and the correspondence between RAID groups and virtual PDEVs will be described, but first the virtual PDEV itself is explained. A virtual PDEV is a concept defined inside the storage apparatus 101 in order to convert a virtual volume address into a PDEV address; the storage apparatus 101 treats a virtual PDEV as a storage device that stores the data written from the host apparatus 103 as-is (in uncompressed form).
 In FIG. 21, the virtual volume 200 is associated with RAID group 0, and RAID group 0 consists of virtual PDEV0 to virtual PDEV3. RAID group 0 is configured to protect data by RAID 5 across virtual PDEV0 to virtual PDEV3, and each stripe of the virtual volume 200 is associated, by a statically defined and computable correspondence, with one of the stripes in the virtual PDEVs belonging to RAID group 0. This is the same kind of association performed in storage apparatuses adopting a conventional RAID configuration. Moreover, since PDEVs correspond 1:1 with virtual PDEVs, each stripe of the virtual volume 200 can also be said to be associated with one of the PDEVs belonging to RAID group 0.
 As an example, consider the stripe "Data14" of the virtual volume 200. By the computable correspondence, the stripe "Data14" of the virtual volume 200 is associated with the area storing the fourth stripe from the head of virtual PDEV2 (here the head stripe of virtual PDEV2 is counted as the 0th stripe), that is, "Data14" inside "virtual PDEV2" in FIG. 21. For example, if the RAID stripe length is 32 KB, the 32 KB area of virtual PDEV2 whose start address is 32 KB × 4 = 128 KB is associated as the storage destination of the RAID stripe "Data14" of the virtual volume 200.
 The correspondence between each virtual PDEV constituting RAID group 0 and the PDEVs installed in the storage apparatus 101 is managed by virtual PDEV information 2230, described later. The correspondence between the storage destination area of the RAID stripe "Data14" inside virtual PDEV2 and the area inside PDEV2 where the data of the RAID stripe "Data14" is compressed and stored ("compressed D14" in FIG. 21) is managed by compression management information ("management information 2" in FIG. 21) stored inside PDEV2. The compression management information (management information 2) is recorded at a predetermined location in PDEV2; the contents of this compression management information and its recording position in PDEV2 are detailed later. Although only the compression management information in PDEV2 is described here, compression management information (management information 0, 1, and 3 in FIG. 21) is likewise stored in the other PDEVs.
 The above is the overview of data management in this embodiment.
 (1-21) Storage Apparatus Management Information 1: Virtual Volume Management Information
 Next, the "virtual volume management information", "RAID group management information", and "virtual PDEV information" that the storage apparatus 101 stores in its DRAM 125 so that the processor 121 can access them at high speed are described with reference to FIG. 22. The management information stored in the DRAM 125 of the storage apparatus 101 is not limited to the above; other management information may also be stored.
 First, the virtual volume management information 2210 is described. The virtual volume management information 2210 is management information generated each time the storage apparatus 101 creates one virtual volume; one piece of virtual volume management information stores the management information for one virtual volume. Using the virtual volume management information 2210, the storage apparatus 101 manages the association between virtual volumes and RAID groups and identifies, for an address requested by the host apparatus, the RAID group to be referenced.
 The virtual volume management information 2210 consists of the items in-virtual-volume start address 2211, in-virtual-volume size 2212, RAID group number 2213, and in-RAID-group start address 2214. The present invention is not limited to these four items; the virtual volume management information 2210 may include management information other than that shown in FIG. 22.
 The in-virtual-volume start address 2211 is an item that stores the start of the address range in the virtual volume to which a RAID group is associated.
 The in-virtual-volume size 2212 is an item that stores the size of the area in the virtual volume to which the RAID group is associated. The storage apparatus 101 associates the area of the size designated by the in-virtual-volume size 2212, starting from the start address designated by the in-virtual-volume start address 2211, with the RAID group designated by the RAID group number 2213 described below.
 The RAID group number 2213 is an item that stores the number of the RAID group associated with the virtual volume.
 The in-RAID-group start address 2214 is an item that designates an address inside the RAID group designated by the RAID group number 2213. In this specification, RAID group addresses do not include the areas in which parity is stored. RAID group addressing is explained below with reference to FIG. 21. When the stripe size is 64 KB, in RAID group 0 of FIG. 21, the head stripe "Data0" of virtual PDEV0, the first virtual PDEV constituting RAID group 0, is at address 0, and the position where the head stripe "Data1" of the next virtual PDEV1 is stored is address 64 KB. The position where the head stripe "Data2" of the next virtual PDEV2 is stored is address 128 KB. Furthermore, since stripes in which parity is stored are excluded from RAID group addresses, the position where "Data4" of virtual PDEV0 is stored is address 192 KB.
 In the example shown in FIG. 22, the virtual volume management information 2210 indicates that the 0 to 20 (= 0 + 20) TB area of the virtual volume is associated with the 0 to 20 TB area of RAID group number 0. It further indicates that the area from 20 TB to 100 (= 20 + 80) TB of the virtual volume is associated with the 10 to 90 (= 10 + 80) TB area of RAID group number 1. Thus, when an access request designating an access target area of the virtual volume (the access target area is designated by an LBA and an access data length) is received from the host apparatus 103, the processor 121 can refer to the virtual volume management information 2210 and identify the RAID group to which the access target area is associated.
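 The lookup just described amounts to a range search over the table rows. The following minimal Python sketch reproduces the FIG. 22 example; the tuple layout and function name are assumptions for illustration.

```python
# Illustrative lookup over virtual volume management information 2210.
# Each row associates a virtual-volume range with a range of a RAID group.

TB = 1024 ** 4

# (in-vol start 2211, in-vol size 2212, RAID group number 2213, in-group start 2214)
virtual_volume_info = [
    (0,       20 * TB, 0,  0),        # vol 0..20 TB   -> RAID group 0, 0..20 TB
    (20 * TB, 80 * TB, 1, 10 * TB),   # vol 20..100 TB -> RAID group 1, 10..90 TB
]

def resolve(vol_addr):
    """Return (raid_group_number, in_raid_group_address) for a volume address."""
    for start, size, rg, rg_start in virtual_volume_info:
        if start <= vol_addr < start + size:
            return rg, rg_start + (vol_addr - start)
    raise ValueError("address not mapped to any RAID group")

print(resolve(25 * TB))   # -> RAID group 1, at 15 TB (10 TB base + 5 TB offset)
```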
 As described above, the virtual volume management information that associates virtual volumes with RAID groups does not require as much information for associating areas as the compression management information described later. The reason is explained with reference to FIG. 21.
 In the example of FIG. 21, to identify the recording destination area of the RAID stripe "Data14" in the virtual volume, the virtual PDEV can be identified as the remainder of the stripe number divided by the number of drives in the RAID configuration, plus 1. In the case of FIG. 21, the remainder of 14 ÷ 4 is 2, and adding 1 gives 3; from this it can be calculated that the stripe is recorded on the third virtual PDEV of RAID group 0. The start address of the recording location of the RAID stripe Data14 inside virtual PDEV2 can be identified from the stripe number divided by the number of data stripes contained in a RAID stripe column. For example, in the case of FIG. 21, the number of data stripes contained in a RAID stripe column is 3; hence, from 14 ÷ 3 = 4, the recording location of the RAID stripe "Data14" starts after the fourth stripe stored in virtual PDEV2.
 The example described here applies when the RAID type of the RAID group is RAID 5; for RAID groups adopting other RAID types, the in-virtual-PDEV address associated with an address on the virtual volume (LBA, stripe number, and so on) can likewise be identified by a different calculation method. Furthermore, the virtual PDEV and in-virtual-PDEV address to which the stripe storing the parity corresponding to a given RAID stripe is associated can similarly be obtained by a simple calculation.
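 The Data14 calculation above can be written out as a short Python function; the function name is an invented convenience, but the arithmetic follows the rule the text gives for RAID 5 with 4 drives and 3 data stripes per column.

```python
# Illustrative computation of the RAID 5 placement rule described above:
# virtual PDEV index from (stripe number mod number of drives), and the
# offset within that PDEV from (stripe number / data stripes per column).

def locate_stripe(stripe_no, n_drives, stripe_size):
    n_data = n_drives - 1                      # RAID 5: one parity per column
    pdev = stripe_no % n_drives                # 14 % 4 = 2 -> virtual PDEV2
    row = stripe_no // n_data                  # 14 // 3 = 4 -> after the 4th stripe
    return pdev, row * stripe_size             # byte offset inside the virtual PDEV

pdev, offset = locate_stripe(14, 4, 32 * 1024)
print(pdev, offset)                            # -> 2, 131072 (= 32 KB x 4 = 128 KB)
```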
 As described above, the correspondence between a virtual volume address and an address within a RAID group can be uniquely identified by a calculation that takes the RAID configuration into account. The association therefore does not require much information and can be recorded in the DRAM, which the processor 121 can access at high speed.
 The above is the virtual volume management information in the embodiment of the present invention. The storage apparatus 101 uses this management information to manage the association between virtual volumes and RAID groups.
 (1-22) Storage Apparatus Management Information 2: RAID Group Management Information
 Next, the RAID group management information 2220 is described.
 The RAID group management information 2220 is information for managing the virtual PDEVs that constitute a RAID group; when one RAID group is defined, the storage apparatus 101 generates one piece of RAID group management information 2220. One piece of RAID group management information manages one RAID group.
 The RAID group management information 2220 consists of the RAID configuration count 2221, the registered virtual PDEV numbers 2222, and the RAID type 2223. The RAID group management information 2220 of the present invention is not limited to the items shown in FIG. 22 and may include other items.
 The RAID configuration virtual PDEV count 2221 is an item that stores the number of virtual PDEVs constituting the RAID group. The example of FIG. 22 shows a configuration of four virtual PDEVs.
 The registered virtual PDEV number 2222 is an item that stores the numbers identifying the virtual PDEVs constituting the RAID group. The example of FIG. 22 shows that the RAID group consists of four virtual PDEVs: virtual PDEV3, virtual PDEV8, virtual PDEV9, and virtual PDEV15.
 The RAID type 2223 is an item that stores the RAID type (RAID level). The example of FIG. 22 shows that the RAID group to which this management information corresponds is configured as RAID 5.
 The above is the RAID group management information in the embodiment of the present invention. The storage apparatus 101 uses this management information to manage RAID groups composed of virtual PDEVs.
 Next, the virtual PDEV information 2230 is described. The virtual PDEV information 2230 is information for managing the association between virtual PDEVs and PDEVs. The virtual PDEV number 2231 is the identification number assigned to each virtual PDEV managed inside the storage apparatus 101, and the PDEV Addr 2232 is the identification number of each PDEV managed inside the storage apparatus 101. For example, when the PDEVs are storage media conforming to the SAS standard, the PDEV Addr 2232 stores the SAS address assigned to each PDEV. With the virtual PDEV information 2230, a PDEV can be identified from a virtual PDEV number.
 (1-23) Storage Apparatus Management Information 3: Compression Management Information
 Next, the compression management information stored in the PDEVs is described. The compression management information is information for managing the association between virtual PDEV areas and PDEV areas, and is recorded inside each PDEV. In the storage apparatus 101 according to the embodiment of the present invention, the recording position of the compression management information within the PDEV is common to all PDEVs. Hereinafter, the start address within the PDEV at which the compression management information is stored is called the "in-PDEV start address of the compression management information". The information on this address may be stored in the DRAM 125 of the storage controller 110, or the address information may be embedded in the program that accesses the compression management information.
 Because the compression ratio achieved when data is compressed depends on the data content, the association between virtual PDEV areas and PDEV areas changes dynamically. The compression management information is therefore modified as the recorded data changes.
 FIG. 23 shows the compression management information 2300 used by the storage apparatus 101 according to the embodiment of the present invention. The compression management information 2300 consists of three fields: the in-PDEV start address 2301 of the stored compressed data, the compressed data length 2302, and the compression flag 2303. The compression management information 2300 in the present invention is not limited to this configuration and may contain additional fields beyond these three.
 In the example of FIG. 23, the virtual PDEV capacity is set to 8 TB, eight times the 1 TB PDEV capacity, but the present invention is not limited to this value. The virtual PDEV capacity may be set to an arbitrary value according to the expected compression ratio. For example, when the stored data can be expected to compress only to about 50%, setting the virtual PDEV capacity to 2 TB for a 1 TB PDEV is preferable because it reduces the compression management information 2300. The virtual PDEV capacity may also be changed dynamically according to the compression ratio of the stored data. For example, operation may start with a virtual PDEV capacity of 2 TB for a 1 TB PDEV; if it later turns out, for example from compressed data length information obtained from the NVM module 126, that the stored data can be expected to compress to less than 1/8, the virtual PDEV capacity may be dynamically raised to 8 TB. In that case the compression management information 2300 is also grown dynamically. Conversely, if it becomes clear after operation starts that the stored data does not compress well, the virtual PDEV capacity is dynamically reduced.
 The example of FIG. 23 shows compression management information 2300 in which the virtual PDEV area is partitioned and associated in 4 KB units, but the present invention is not limited to this partition unit. When the host is expected to frequently issue requests for data larger than 4 KB, partitioning in units larger than 4 KB is preferable because it reduces the amount of compression management information 2300.
 The in-PDEV start address 2301 of the stored compressed data is a field that stores the start address of the area where the compressed data for the corresponding virtual PDEV area is stored. When data is stored uncompressed, for example because compression yields no benefit, this field stores the start address of the area where the uncompressed data is stored. When NULL is stored in this field (shown as "unallocated" in FIG. 23), no PDEV area has been allocated to the corresponding virtual PDEV area (it is unallocated). In the storage apparatus 101 according to the embodiment of the present invention, compressed and uncompressed data are managed with a sector length of 512 B as the minimum unit; the present invention is not limited to this sector length.
 The compressed data length 2302 is a field that stores the length of the compressed data stored in the PDEV, in units of sectors. This expresses that the compressed data is stored in the area starting at the in-PDEV start address 2301 of the stored compressed data and extending for the number of sectors recorded in the compressed data length 2302 field plus 1. In the storage apparatus 101 according to the embodiment of the present invention, the storage area of the virtual volume is divided into 4 KB units, and compression is performed for each of these 4 KB areas. The minimum size after compression is 512 B, and when compression yields no benefit the data is stored uncompressed, so the range of values the compressed data length can take is 512 B to 4096 B (4 KB), that is, 1 to 8 sectors. The field is used with the rule that when the value of the compressed data length 2302 is 0, the compressed data length is 1 sector (512 B); the compressed data length 2302 field thus manages data lengths of 512 B to 4 KB in 3 bits.
 The compression flag 2303 is a field indicating whether the data of the corresponding virtual PDEV area is stored compressed. When the value of the compression flag 2303 is 1, the data is stored compressed; when it is 0, the data is stored uncompressed.
 The compression management information 2300 holds, for each 4 KB area of the virtual PDEV, the values of the in-PDEV start address 2301, the in-PDEV length 2302 of the compressed data, and the compression flag 2303. Hereinafter, the set of the in-PDEV start address 2301, the in-PDEV length 2302 of the compressed data, and the compression flag 2303 is called a compression information entry. The size of a compression information entry is 60 bits for the in-PDEV start address 2301 of the stored compressed data, 3 bits for the in-PDEV length 2302 of the compressed data, and 1 bit for the compression flag 2303, for a total of 64 bits = 8 B.
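 The 64-bit entry layout, including the rule that a length field value of 0 means 1 sector, can be sketched as follows in Python. The exact bit order within the 64 bits is an assumption for illustration; the text specifies only the field widths.

```python
# Illustrative packing of one 8 B compression information entry:
# 60-bit in-PDEV start address (2301), 3-bit length (2302), 1-bit flag (2303).
# The length field stores (sectors - 1), so value 0 means 1 sector (512 B)
# and value 7 means 8 sectors (4 KB), per the rule described above.

def pack_entry(start_addr, length_bytes, compressed):
    sectors = (length_bytes + 511) // 512
    assert 1 <= sectors <= 8, "compressed length must be 512 B .. 4 KB"
    assert start_addr < (1 << 60)
    return (start_addr << 4) | ((sectors - 1) << 1) | (1 if compressed else 0)

def unpack_entry(entry):
    start_addr = entry >> 4
    sectors = ((entry >> 1) & 0b111) + 1   # 3-bit field; 0 -> 1 sector
    compressed = bool(entry & 1)
    return start_addr, sectors * 512, compressed

e = pack_entry(0x1000, 1536, True)         # 3 sectors of compressed data
print(unpack_entry(e))                     # -> (4096, 1536, True)
```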
 The recording position within the PDEV of the compression information entry that manages a given virtual PDEV area is uniquely determined by that virtual PDEV area. For example, FIG. 23 shows that the compression information entry for the 4 KB area whose start address is virtual PDEV address "0x0000_0000_1000" is always recorded at recording position "0x00_0000_0008". In FIG. 23, the units of the virtual PDEV area start addresses and the recording position addresses are both bytes; the virtual PDEV address "0x0000_0000_1000" therefore represents the position 4 KB from the head of the virtual PDEV area. The recording positions are expressed as addresses relative to the in-PDEV start address of the compression management information (taken as address 0). Hereinafter, unless otherwise noted, the description assumes that the in-PDEV start address of the compression management information is 0, that is, that the compression management information is recorded in the head area of each PDEV.
 When storing write data (compressed data) from the host apparatus 103 in a PDEV, the storage apparatus 101 according to the embodiment of the present invention can store the compressed data in any area other than areas where data is already stored (that is, in any unused area). When update data for some existing data (pre-update data) is received from the host apparatus 103, the compressed update data is written to a location different from (the compressed data of) the pre-update data, and the PDEV area where the pre-update data was stored is then treated as an unused area.
 Thus, update data may be stored at a position different from the pre-update data, but even in that case only the values inside the compression information entry change; the compression information entry for the virtual PDEV area "0x0000_0000_1000" is always recorded at recording position "0x00_0000_0008". Likewise, each time the virtual PDEV area address increases by 4 KB, the PDEV area in which that area's compression information entry is recorded advances by 8 B. The location of the compression information entry that manages a virtual PDEV address can therefore be identified by the formula:
 (virtual PDEV address ÷ 4 KB) × 8 B + in-PDEV start address of the compression management information
 The present invention is not limited to this formula; it suffices that the recording position of the compression information entry recorded in the PDEV can be uniquely calculated from the virtual PDEV address.
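 Written out in Python, the formula and the FIG. 23 example look as follows; the function name is an invented convenience, while the 8 B entry size and 4 KB area size follow the text.

```python
# The fixed-placement formula above: the entry for the 4 KB virtual PDEV
# area at a given address lives at a position derived only from that address.

PAGE = 4096        # 4 KB virtual PDEV areas
ENTRY_SIZE = 8     # 8 B compression information entries

def entry_position(virtual_pdev_addr, mgmt_info_base=0):
    """In-PDEV recording position of the compression information entry."""
    return (virtual_pdev_addr // PAGE) * ENTRY_SIZE + mgmt_info_base

print(hex(entry_position(0x0000_0000_1000)))   # -> 0x8, matching FIG. 23
```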
 This fixed placement of compression information entries makes it possible, when a compression information entry is lost, to calculate from the recording address of the lost entry which virtual PDEV area has become unreadable. Therefore, after the lost virtual PDEV area is restored by a RAID rebuild, the contents of the compression information entry can be regenerated when the restored data is compressed and recorded. For this reason, in the present invention the reliability of the storage apparatus can be maintained even without making the compression management information 2300 redundant.
 (1-24) Decompression Read Operation of the Storage Apparatus
 Next, the decompression read operation of the storage apparatus 101 in this embodiment is described with reference to FIG. 24. In response to a read request from the host apparatus 103, the storage apparatus 101 according to the embodiment of the present invention decompresses the data recorded on the final storage medium by the write data compression operation of the storage apparatus 101 described with reference to FIG. 3-A, and returns it to the host apparatus 103. Hereinafter, unless otherwise noted, each process is executed by the processor 121 of the storage apparatus 101.
 Before describing the decompression read operation, the management information for the cache area managed by the storage controller 110 according to the embodiment of the present invention is described. The storage controller 110 uses the storage area provided by the NVM module 126 as a cache area for temporarily storing write data from the host apparatus 103 and read data from the SSDs 111 and HDDs 112. The NVM module 126 provides the LBA0 space and LBA1 space to (the processor 121 of) the storage controller 110, but the processor 121 manages which parts of the provided LBA0 and LBA1 spaces are in use for storing data and which are not (the latter are called free areas). The information used to manage these areas is called cache management information.
 FIG. 20-A shows an example of the cache management information 3000 managed by the storage controller 110. The cache management information 3000 is stored on the DRAM 125. In the storage apparatus 101 according to the embodiment of the present invention, as a rule, the LBA0 space provided by the NVM module 126 is used as the cache area for storing write data from the host apparatus 103. When storing data read from the final storage medium, the LBA1 space is used, because the data read from the final storage medium is compressed. The allocation unit of the cache area is the stripe size. In the following description, a stripe size of 64 KB is used as an example.
 Each row (entry) of the cache management information 3000 indicates that the area for caching the data of one stripe of a virtual volume, identified by VOL# 3010 (the identification number assigned to the virtual volume, that is, the virtual volume number) and the in-virtual-volume address 3020, is the stripe-size area of LBA0 space addresses starting at cache LBA0 (3030) together with the stripe-size area of LBA1 space addresses starting at cache LBA1 (3040). When no cache area is allocated to that one-stripe area of the virtual volume, an invalid value (NULL) is stored in cache LBA0 (3030) or cache LBA1 (3040). In the example of FIG. 20-A, the area for caching the data of the area (stripe) with VOL# 3010 = 0 and address 3020 = 0 is the 64 KB (stripe-size) area of LBA0 starting at 0; since cache LBA1 (3040) is NULL, no LBA1 space area is allocated. The address 3020 stores the stripe number.
 The bitmap 3050 is 16 bits of information indicating in which parts of the one-stripe area identified by cache LBA0 (3030) data is stored. Each bit represents a 4 KB area within the stripe; when a bit is 1, data is stored in the corresponding area, and when it is 0, no data is stored there. In the example of FIG. 20-A, the bitmap 3050 corresponding to the row whose cache LBA0 (3030) is 0 (the first row) is "0x8000", that is, the leading bit of the 16 bits is 1. This indicates that, of the one-stripe area starting at cache LBA0 (3030) = 0, data is stored in the first 4 KB.
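 A small Python sketch of interpreting and updating this bitmap is shown below. The most-significant-bit-first ordering follows the "0x8000 marks the first 4 KB" example above; the function names are invented for illustration.

```python
# Illustrative interpretation of the 16-bit bitmap 3050: one bit per 4 KB
# sub-area of a 64 KB stripe, most-significant bit first (so 0x8000 marks
# the first 4 KB, as in the FIG. 20-A example).

STRIPE = 64 * 1024
SUB = 4 * 1024
NBITS = STRIPE // SUB          # 16 sub-areas per stripe

def cached_offsets(bitmap):
    """Byte offsets within the stripe whose 4 KB sub-areas hold cached data."""
    return [i * SUB for i in range(NBITS) if bitmap & (1 << (NBITS - 1 - i))]

def mark_cached(bitmap, offset):
    """Set the bit for the 4 KB sub-area containing the given stripe offset."""
    return bitmap | (1 << (NBITS - 1 - offset // SUB))

print(cached_offsets(0x8000))  # -> [0]: only the first 4 KB is cached
```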
 The attribute 3060 stores "Dirty" or "Clean" as information representing the state of the data cached in the area identified by cache LBA0 (3030). When "Dirty" is stored in the attribute 3060, the data in the area identified by cache LBA0 (3030) has not yet been reflected to the final storage medium (SSD 111 or HDD 112); when "Clean" is stored, the cached data has already been reflected to the final storage medium. The last access time 3070 represents the time at which the cached data was last accessed. It is used as reference information when selecting, from among the pieces of data stored in the cache area from the host apparatus 103, the data to destage to the final storage medium (for example, selecting the data with the oldest last access time). Accordingly, the cache management information 3000 may store, in the last access time 3070 field, other information used for selecting destage target data.
 The storage controller 110 also needs to manage the unused areas (where no cache data is stored) within the LBA0 and LBA1 storage spaces of the NVM module 126, and therefore holds lists of unused areas. These are called the free list 3500; an example is shown in FIG. 20-B. The free list 3500 comprises a free LBA0 list 3510 and a free LBA1 list 3520, each of which (3510, 3520) stores the addresses of unused LBA0/LBA1 areas.
 When a write request for the area with virtual volume address (stripe number) N is received from the host apparatus 103, a cache area can be secured by obtaining an LBA0 address from the free LBA0 list 3510 and storing the obtained LBA0 address in cache LBA0 (3030) of the row of the cache management information 3000 whose address 3020 is N.
 Also, when compressed data is read from the final storage medium and stored in the NVM module 126, an area in the LBA1 space must be allocated; an LBA1 address is therefore obtained from the free LBA1 list 3520 and stored in cache LBA1 (3040) of the cache management information 3000.
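 The two allocation paths just described can be pictured with the following minimal Python sketch. The list-based free lists, the dictionary keyed by (volume number, stripe number), and the sample addresses are all illustrative assumptions standing in for free list 3500 and cache management information 3000.

```python
# Illustrative sketch of cache-area allocation from the free list 3500.

free_lba0 = [0x00000, 0x10000, 0x20000]   # free LBA0 list 3510 (sample addresses)
free_lba1 = [0x00000, 0x08000]            # free LBA1 list 3520 (sample addresses)

cache_mgmt = {}   # (VOL# 3010, address 3020) -> row dict, standing in for info 3000

def _row(vol_no, stripe_no):
    return cache_mgmt.setdefault((vol_no, stripe_no),
                                 {"lba0": None, "lba1": None, "bitmap": 0})

def allocate_lba0(vol_no, stripe_no):
    """Secure an LBA0 cache area for one stripe on a host write."""
    row = _row(vol_no, stripe_no)
    if row["lba0"] is None:
        row["lba0"] = free_lba0.pop(0)
    return row["lba0"]

def allocate_lba1(vol_no, stripe_no):
    """Secure an LBA1 cache area before staging compressed data from a PDEV."""
    row = _row(vol_no, stripe_no)
    if row["lba1"] is None:
        row["lba1"] = free_lba1.pop(0)
    return row["lba1"]

print(allocate_lba0(0, 5), allocate_lba1(0, 5))
```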
 The decompression read operation is now described. In S2401, the first step of the decompression read operation of the storage apparatus 101, the storage apparatus 101 receives a read request and a read target address from the host apparatus 103.
 In step S2402 following S2401, the processor 121 uses the read address obtained in S2401 to check whether the read target data exists in the NVM module 126 (cache), that is, whether there is a cache hit. The processor 121 checks whether a value is stored in cache LBA0 (3030) of the row of the cache management information 3000 corresponding to the read address obtained in S2401; if a value is stored, it judges a cache hit, and if not, it judges a cache miss.
 Step S2403 following S2402 branches on the condition judged in S2402. If a cache miss was judged in S2402, the processor 121 performs the processing from S2404 onward; if a cache hit was judged, the processor 121 performs the processing of S2413.
 In step S2404 following S2403, since a cache miss was judged in S2402, the virtual PDEV to which the read target address is associated is obtained. More specifically, the processor 121 refers to the virtual volume management information 2210 and obtains the RAID group number 2213 of the RAID group corresponding to the read target address and the in-RAID-group start address 2214. It also secures cache areas (LBA0, LBA1) corresponding to the read target address and stores them in cache LBA0 (3030) and cache LBA1 (3040) of the cache management information 3000.
 The processor 121 then obtains the RAID group management information 2220 of the RAID group indicated by the value of the RAID group number 2213, and obtains the virtual PDEV numbers 2222 registered in that RAID group. From the read address obtained in S2401, it calculates the virtual PDEV number and in-virtual-PDEV address where the target data is stored. Depending on the read address and request size, the read request area from the host apparatus may span multiple virtual PDEVs; in that case, multiple virtual PDEVs and in-virtual-PDEV addresses are calculated in order to respond to the read request.
 In step S2405 following S2404, the entry of the compression management information 2300 that manages the virtual PDEV number and in-virtual-PDEV address obtained in S2404 is obtained from the PDEV. The processor 121 refers to the virtual PDEV information 2230 and identifies the PDEV associated with the virtual PDEV number obtained in S2404. In the storage apparatus 101, a PDEV is uniquely associated with each virtual PDEV, and the compression management information 2300 is recorded in a specific area within the PDEV. Therefore, as described above, the address at which the entry of the compression management information 2300 is recorded in the PDEV is calculated from the in-virtual-PDEV address, and the compression information entry is read from the PDEV.
 In step S2406, which follows S2405, the processor 121 refers to the compression management information entry obtained in S2405 and identifies the storage area of the compressed data recorded in the PDEV from the start address 2301 within the PDEV at which the compressed data is stored and the length 2302 of the compressed data.
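 Because each entry's recording position in the PDEV is fixed by the virtual PDEV region it manages, the entry address can be computed arithmetically. A minimal sketch follows (Python); the 8-byte entry size is an assumption consistent with the worked FIG. 23 example given later in this description, and `table_base` is a hypothetical offset of the specific area holding the table:

```python
ENTRY_SIZE = 8   # assumed bytes per compression management entry
UNIT = 4096      # one entry manages one 4 KB virtual PDEV region

def entry_addr(vpdev_addr, table_base=0):
    """Relative PDEV address of the compression management entry that
    manages the 4 KB virtual PDEV region containing vpdev_addr."""
    return table_base + (vpdev_addr // UNIT) * ENTRY_SIZE

# e.g. entry_addr(0x1000) == 0x8: the region at virtual PDEV address
# 0x1000 is managed by the entry recorded at relative address 0x8.
```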
 In step S2407, which follows S2406, the processor 121 reads the compressed data from the compressed data storage area identified in S2406. The read data is temporarily stored in the DRAM 125.
 In step S2408, which follows S2407, the processor 121 writes the compressed data to the NVM module 126, which serves as the cache device, by designating an LBA1 address. The processor 121 uses the write command 1010 to designate the LBA1 and write the compressed data.
 In step S2409, which follows S2408, the storage apparatus 101 creates the compression information necessary for decompressing the compressed data, based on the compression management information entry obtained in S2405, and transfers it to the NVM module 126. The processor 121 transfers the compression information to the NVM module 126 using the compression information transfer command 1810 shown in FIG. 18.
 Step S2410, which follows S2409, maps the compressed data to LBA0 so that the storage apparatus 101 can later read the compressed data written in S2408 in decompressed form. The processor 121 instructs the NVM module 126 to map the compressed data to LBA0 using the LBA0 mapping command shown in FIG. 19. On receiving the command, the NVM module 126 refers to the compression information associated with the compressed data at LBA1 and associates the compressed data with an LBA0 area whose size corresponds to the decompressed size of that data.
 Step S2411, which follows S2413 or S2410, is a step in which the processor 121 reads, by designating LBA0, the data that was staged into the cache area by the processing of S2407 to S2409 and mapped to the LBA0 space in S2410 (or, in the case of a cache hit, the data already associated with LBA0), thereby obtaining it in decompressed form. On receiving the read command designating LBA0, the NVM module 126 reads the compressed data associated with that LBA0 from the FM 420, decompresses it with the compression/decompression unit 418, and returns it to the storage controller 110 (DRAM 125).
 In step S2412, which follows step S2411, the processor 121 returns the decompressed data obtained in S2411 to the server as the response data for the read request. To return the cache LBA1 (3040) area to the unused state, the processor returns the value of cache LBA1 (3040) to the free list, sets the cache LBA1 (3040) value in the cache management information 3000 to NULL, and ends the processing.
 Step S2413, to which processing transitions when step S2403 determines that there was no cache miss (that is, a cache hit), is a step in which the processor 121 refers to the cache management information 3000 and obtains the LBA0 (cache LBA0 (3030)) at which the read target area is already stored.
 The above is the decompression read operation in this embodiment.
 (1-25) Write Data Cache Storage Operation of the Storage Apparatus
 Next, the write data cache storage operation of the storage apparatus will be described. The write data cache storage operation of this embodiment corresponds to processes 311 to 314 of the write data compression operation of this embodiment shown in FIG. 3-A. The operation is described below with reference to the flow of FIG. 25.
 The first step of the write data cache storage operation, S2501, is a step in which the storage apparatus 101 receives write data and the write destination address from the host device. At this point the write data is temporarily recorded in the DRAM 125 of the storage apparatus 101, as shown by data flow 311 in FIG. 3-A. If the host interface 124 has a function for transferring data directly to the NVM module 126, the data need not be recorded in the DRAM 125 of the storage apparatus 101.
 Step S2502, which follows S2501, is a step in which the processor 121 performs a cache hit determination using the write address obtained in S2501. The same processing as in step S2402 of the decompression read operation is performed here.
 Step S2503, which follows S2502, branches according to the determination result of S2502. If the result of S2502 is a cache hit, the processing proceeds to S2504; if the result of S2502 is a cache miss, the processing proceeds to S2509.
 In step S2504, which follows S2503, the processor 121 acquires the LBA0 of the area already staged in the NVM module 126.
 Step S2509, which follows S2503, is a step in which the processor 121 newly secures an LBA0 of the NVM module 126 for recording the write data. Securing the LBA0 is performed in the same way as in S2404 of the decompression read processing, except that no LBA1 space area needs to be secured here.
 In step S2505, which follows S2504 or S2509, the processor 121 designates the LBA0 obtained in S2504 or S2509 and writes the data to the NVM module 126 using the write command 1010 shown in FIG. 10. At this point the write data is transferred from the DRAM 125 of the storage apparatus 101 to the data compression/decompression unit 418 of the NVM module 126, compressed, and then recorded in the data buffer 416 within the NVM module 126, as shown by data flow 312 in FIG. 3-A. The compressed data recorded in the data buffer 416 is written to the FM 420 at an arbitrary timing, as shown by data flow 314.
 In step S2506, which follows S2505, the processor 121 obtains from the NVM module 126 the write response shown in FIG. 10, and obtains the post-compression size of the data written in S2505 from the compressed data length 1023 field of the write response information 1020.
 In step S2507, which follows S2506, the processor 121 updates the bitmap 3050, the attribute 3060, and the last access time 3070 of the cache management information 3000.
 Step S2508, which follows S2507, is a step in which the storage apparatus 101 determines whether the total amount of compressed data held in the cache constituted by the NVM module 126 for which RAID parity has not yet been generated has reached a threshold. If that total amount has reached the threshold, the storage apparatus 101 determines that parity must be generated for the compressed data held in the cache and transitions to the parity generation operation. If that total amount is below the threshold, the storage apparatus 101 determines that parity generation is unnecessary and ends the write data cache storage operation. The above is the write data cache storage operation in this embodiment.
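 A minimal sketch of the S2508 check (Python; the field names `compressed_len` and `parity_generated` are hypothetical stand-ins for fields tracked via the cache management information 3000):

```python
def next_action(cache_rows, threshold_bytes):
    """Sum the compressed bytes cached without RAID parity and decide
    whether to transition to the parity generation operation (FIG. 26)."""
    pending = sum(r.compressed_len for r in cache_rows
                  if not r.parity_generated)
    return "parity_generation" if pending >= threshold_bytes else "done"
```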
 (1-26) RAID Parity Generation Operation of the Storage Apparatus
 Next, the RAID parity generation operation of the storage apparatus in this embodiment will be described. The RAID parity generation operation of this embodiment is not limited to being executed only when step S2508 of the write data cache storage operation shown in FIG. 25 finds that the total amount of compressed data held in the cache for which RAID parity has not been generated has reached the threshold. The storage apparatus 101 may perform the RAID parity generation operation at an arbitrary timing, for example when there are few or no requests from the host device 103.
 The RAID parity generation operation of this embodiment corresponds to processes 315 to 320 of the write data compression operation of this embodiment shown in FIG. 3-A. The operation is described below with reference to the flow of FIG. 26.
 The first step of the RAID parity generation processing, S2601, is a step in which the processor 121 selects the data to be the target of parity generation from among the data recorded in the cache area constituted by LBA0 of the NVM module 126. At this time, the processor 121 refers to the last access time 3070 of the cache management information 3000 and selects data for which a long time has elapsed since it was last accessed. Data may instead be selected by some other rule, for example by choosing data with a relatively low update frequency.
 Step S2602, which follows S2601, is a step in which the processor 121 secures, on LBA0 (the logical space provided by the NVM module 126), an area in which the parity about to be generated will be recorded. The processor 121 refers to the free list 3500 and secures an unused LBA0. The selected LBA0 is managed by cache area management information for parity (not shown), which is similar to the cache management information 3000.
 Step S2603, which follows S2602, determines whether full stripe parity generation is to be performed. If all the data belonging to the same stripe column as the data selected in S2601 exists in the cache, the processor 121 proceeds to S2604 to perform full stripe parity generation. If only part of the data belonging to the same stripe column as the data selected in S2601 is present, the processor proceeds to S2607 to perform update parity generation.
 To search the cache for data belonging to the same stripe column as the data selected in S2601, the processor refers to VOL# 3010 and address 3020 of each row stored in the cache management information 3000 and checks whether they fall within the range of the same stripe column as the selected data. Taking FIG. 21 as an example, suppose the data selected in S2601 is Data14. The processor refers to VOL# 3010 and address 3020 of each row in the cache management information 3000; a row belongs to the same stripe column if its VOL# 3010 equals the virtual volume number to which Data14 belongs and its address 3020 divided by 3 equals Data14's stripe number (14) divided by 3. Furthermore, if the values of the respective bitmaps 3050 of those rows are the same, it can be determined that the data belonging to the same stripe column as the data selected in S2601 is stored in the cache.
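 A minimal sketch of this stripe-column membership test (Python; the field names are hypothetical, and the divisor 3 reflects the three data stripes per stripe column of the FIG. 21 example):

```python
DATA_PER_COLUMN = 3   # data stripes per stripe column in the FIG. 21 example

def same_stripe_column(row, selected):
    """True if a cache management row holds data of the same stripe column
    (same virtual volume, same column index) as the selected data."""
    return (row.vol == selected.vol and
            row.stripe_no // DATA_PER_COLUMN ==
            selected.stripe_no // DATA_PER_COLUMN)

def full_stripe_possible(cache_rows, selected):
    # Full stripe parity generation (S2604) requires every data stripe of
    # the column to be cached (with matching bitmap 3050 values).
    peers = [r for r in cache_rows if same_stripe_column(r, selected)]
    return len(peers) == DATA_PER_COLUMN
```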
 Step S2604, which follows S2603, is a step in which the storage apparatus 101 instructs the NVM module 126 to map the RAID parity to the LBA0 area secured in S2602. Using the full stripe parity generation command 1310 shown in FIG. 13, the processor 121 designates the compressed data from which parity is to be generated by LBA0 start addresses 0 to X (1315 to 1317), and also designates the mapping locations of the generated parity by the LBA0 start address (for XOR parity) 1318 and the LBA0 start address (for RAID 6 parity) 1319.
 On receiving the full stripe parity generation command, the NVM module 126 reads the compressed data recorded in the FM 420 into the data buffer 416 within the NVM module 126 if the area associated with the LBA0 is on the FM 420 (this is unnecessary if the area associated with the LBA0 is already in the data buffer 416). The NVM module then instructs its parity generation unit 419 to generate parity for the compressed data in the data buffer 416. On receiving the instruction, the parity generation unit 419 has the data compression/decompression unit 418 fetch and decompress the data in the data buffer 416, and generates parity from the decompressed data. The parity generation unit 419 then transfers the generated parity to the data compression/decompression unit 418 for compression and records it in the data buffer 416 or the FM 420 within the NVM module 126. Finally, the PBA associated with the area (data buffer 416 or FM 420) in which the generated parity was recorded is associated with the LBA0 designated by the LBA0 start address (for XOR parity) 1318 and the LBA0 start address (for RAID 6 parity) 1319.
 Step S2607, which follows S2603, is a step in which, in order to generate an update parity, the storage apparatus 101 obtains the compressed data of the old data and the compressed data of the old parity from the RAID-configured final storage media and writes them by designating LBA1. The processor 121 obtains from the free list the LBA1 addresses for storing the compressed old data and the compressed old parity, and temporarily stores the obtained LBA1 information.
 Next, the virtual PDEVs in which the old data and old parity required for parity generation are stored, and the intra-virtual-PDEV addresses, must be identified. The virtual PDEV in which the old data is stored is the same as the virtual PDEV in which the new data (the data selected in S2601) is to be stored. As described earlier, the virtual PDEV of the old parity can be obtained by a simple calculation from the address (stripe number) of the new data. Because the intra-virtual-PDEV addresses of the old data and the old parity are the same as the intra-virtual-PDEV address at which the new data (the data selected in S2601) is to be stored, the processor 121 only needs to identify the intra-virtual-PDEV address at which the data selected in S2601 is to be stored. Subsequently, by the same processing as S2404 to S2407 of the read operation described above, the processor identifies the storage positions of the compressed data within the PDEVs from the intra-virtual-PDEV addresses of the old data and old parity required for parity generation, and reads the data from the PDEVs. The processor 121 then writes the old compressed data and the old parity to the secured LBA1 using the write command 1010 shown in FIG. 10.
 Step S2608, which follows S2607, is a step of mapping the compressed-state old data and old parity recorded in the LBA1 areas in S2607 onto areas of the LBA0 space. The processor 121 obtains from the free list 3500 LBA0 areas large enough for the decompressed sizes of the respective compressed data to be mapped. The processor then transfers to the NVM module 126 several LBA0 mapping commands, shown in FIG. 19, each designating an LBA0 and an LBA1, thereby mapping the decompressed images of the compressed data recorded in the LBA1 areas written in S2607 onto the LBA0 areas.
 Step S2609, which follows S2608, is a step of generating the update parity using the data (update data) selected in S2601 and the old compressed data and old parity mapped to LBA0 in S2608. Using the update parity generation command 1410 shown in FIG. 14, the processor 121 designates the areas of the compressed data, the old compressed data, and the old parity by LBA0, and also designates the storage destination of the update parity by LBA0. The flow of the processing performed by the NVM module 126 on receiving the update parity generation command is broadly the same as the flow performed on receiving the full stripe parity generation command, described above.
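 Conceptually, for RAID 5 the update parity is the bytewise XOR of the new data, the old data, and the old parity, computed over their decompressed images. A minimal illustrative sketch (Python; the NVM module performs this internally on decompressed buffers, so this is not the module's actual code):

```python
def update_parity(new: bytes, old: bytes, old_parity: bytes) -> bytes:
    """RAID 5 update parity: new_parity = new XOR old XOR old_parity,
    computed over equal-length decompressed buffers."""
    assert len(new) == len(old) == len(old_parity)
    return bytes(a ^ b ^ c for a, b, c in zip(new, old, old_parity))
```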
 Step S2605, which follows S2604 or S2609, is a step of obtaining the exact post-compression data size of the parity generated in S2604 or S2609. In this step, the processor 121 creates a compressed data size acquisition command 1110 whose LBA0 start address 1113 parameter designates the LBA0 at which the generated parity is stored, and issues it to the NVM module 126. The processor 121 then obtains the post-compression size of the parity from the compressed data size acquisition response 1120.
 Step S2606, which follows S2605, determines whether destaging is necessary. The processor 121 determines whether the compressed data on the cache for which parity has been generated should be recorded in the final storage media. This determination can be based, for example, on the amount of free cache space: if the free space in the cache is at or below a threshold, the storage apparatus 101 starts the destage processing to create free space; if it determines that there is sufficient free space in the cache, the parity generation processing ends.
 The above is the parity generation operation in this embodiment. Although the description so far has assumed the RAID 5 parity generation operation, the same applies to RAID 6.
 (1-27) Destage Operation of the Storage Apparatus
 Next, the destage operation of the storage apparatus in this embodiment will be described. The destage operation of this embodiment is not limited to being executed only when step S2606 of the RAID parity generation operation shown in FIG. 26 determines that destaging is necessary. The storage apparatus 101 may perform the destage operation at an arbitrary timing, for example when there are few or no requests from the host device.
 The destage operation of this embodiment corresponds to processes 321 to 323 of the write data compression operation of this embodiment shown in FIG. 3-A. The operation is described below with reference to the flow of FIG. 27.
 The first step of the destage operation, S2701, is a step of selecting the data to be destaged from the NVM module 126, which serves as the cache device. At this time, the processor 121 selects the area to be destaged from LBA0. The destage target may be chosen by referring to the last access time 3070 of the cache management information 3000 and selecting data that has not recently been accessed by the host device 103, or by other methods based on statistical information managed by the storage apparatus 101, such as selecting data judged to be sequential write data. Note that the parity generated by the processing of FIG. 26 is also subject to destaging here.
 In step S2702, which follows step S2701, the storage apparatus 101 obtains from the NVM module 126 the post-compression size of the data in the LBA0 space area selected in S2701. The processor 121 transfers the compressed data size acquisition command 1110 shown in FIG. 11 to the NVM module 126 and obtains the compressed data length 1123 in the compressed data size acquisition response 1120, thereby learning the compressed data size to be obtained in the destage operation.
 In step S2703, which follows step S2702, the storage apparatus 101 maps the compressed data in the LBA0 area selected in S2701 onto an LBA1 area. The processor 121 transfers to the NVM module 126 an LBA1 mapping command 1210 describing an LBA1 area large enough for the compressed data length obtained in step S2702 to be mapped.
 In step S2704, which follows step S2703, the storage apparatus 101 obtains the compressed data from the LBA1 area mapped in S2703. The processor 121 writes the LBA1 mapped in S2703 into the read command 1610 shown in FIG. 16 and transfers it to the NVM module 126 to obtain the compressed data.
 Step S2704', which follows step S2704, is a step of identifying the storage destination address of the write target data. Because the address, in the storage destination virtual volume, of each piece of write target data on the cache area (the LBA0 space of the NVM module 126) is stored in the address field (3020) of the cache management information 3000, the processor 121 uses it to calculate the virtual PDEV associated with that address and the address within that virtual PDEV. The calculation method is as described earlier.
 Step S2705, which follows step S2704', is a step of recording the compressed data obtained in step S2704 in a PDEV. Since the storage destination virtual PDEV of the write data was identified in step S2704', in step S2705 the processor 121 refers to the virtual PDEV information 2230 and identifies the PDEV associated with that virtual PDEV. The processor then selects a free area in the identified PDEV, determines the selected free area as the storage destination of the compressed data, and records the compressed data obtained in S2704 in that PDEV area.
 The storage apparatus 101 according to the embodiment of the present invention manages, within the storage controller 110 (for example in the DRAM 125), information about the free areas of each PDEV (SSD or HDD), that is, areas not associated with any virtual PDEV. The processor 121 uses this information when selecting a free area in a PDEV. Alternatively, because any area other than the areas registered in the compression management information 2300 (the areas identified by the start address 2301 within the PDEV at which compressed data is stored) is a free area, the processor 121 may read the compression management information 2300 from the storage destination PDEV of the compressed data and identify free areas based on it.
 Step S2706, which follows step S2705, is a step of releasing the LBA1 area mapped for obtaining the compressed data in S2703. The processor 121 releases the LBA1 using the mapping release command 1710 shown in FIG. 17. The processor 121 also stores the information of the released LBA1 in the free LBA1 list 3520 and deletes it from the cache management information 3000.
 Step S2707, which follows step S2706, updates the compression management information 2300 and records it in the PDEV. The processor 121 reads the compression management information 2300 and records, in the field for the start address 2301 within the PDEV at which compressed data is stored, of the compression information entry corresponding to the destaged virtual PDEV area, the address of the PDEV area in which the compressed data was stored in S2705, thereby updating the entry. The processor then records the updated compression management information 2300 in the PDEV. When updating the compression management information 2300, it is not necessary to read and update all the information stored in it; only the necessary area need be read and updated.
 The above is the destage operation in this embodiment.
 (1-28) Entry Regeneration Operation for the Compression Management Information of the Storage Apparatus
 Next, the partial recovery operation for the compression management information 2300 in this embodiment will be described. The entry regeneration operation for the compression management information 2300 is performed when the loss of an entry of the compression management information 2300 is detected. In this embodiment, entry loss can be detected, for example, during the storage apparatus's periodic monitoring of entries, or when an entry of the compression management information 2300 is obtained during a read or destage operation.
 The storage apparatus 101 according to the embodiment of the present invention is characterized in that the compression management information 2300 can be regenerated by this partial recovery operation and by the rebuild processing described later. With this function, the storage apparatus 101 can maintain its reliability without holding the compression management information 2300 redundantly.
 The partial recovery operation of the compression management information 2300 is described with reference to FIG. 28. For simplicity, the following describes the case where only one entry of the compression management information 2300 has been lost; the processing is applicable, however, even when multiple entries of the compression management information 2300 have been lost.
 S2801, the first step of the partial recovery operation of the compression management information 2300, is a step of calculating the virtual PDEV area (intra-virtual-PDEV address) that was managed by the lost entry of the compression management information 2300. As described above, the storage apparatus 101 according to the embodiment of the present invention fixes the recording position of each entry of the compression management information 2300 within the PDEV according to the virtual PDEV area that the entry manages. The processor 121 can therefore identify, from the intra-PDEV address of the lost entry, the virtual PDEV area that was managed by that entry. For example, if the compression management information 2300 with the contents shown in FIG. 23 is stored in a PDEV and the compression management information entry whose recording position (relative address within the PDEV) is 0x00_0000_0008 can no longer be read, it is known to be the compression management information entry for the 4 KB area starting at address 0x0000_0000_1000 of the virtual PDEV area. In this case, the partial recovery operation according to the embodiment of the present invention regenerates, in S2802 and the subsequent steps, the data that was stored in the 4 KB area starting at address 0x0000_0000_1000 of the virtual PDEV area using data read from the other PDEVs constituting the RAID group, writes it back to the PDEV, and recreates the compression management information entry based on it.
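 A minimal sketch of the S2801 calculation, the inverse of the entry address calculation sketched earlier (Python; the 8-byte entry size and 4 KB management unit are assumptions consistent with the worked example above):

```python
ENTRY_SIZE = 8   # assumed bytes per compression management entry
UNIT = 4096      # each entry manages one 4 KB virtual PDEV region

def managed_region(entry_rel_addr):
    """Virtual PDEV region managed by the entry recorded at the given
    relative address within the PDEV."""
    start = (entry_rel_addr // ENTRY_SIZE) * UNIT
    return start, UNIT

# managed_region(0x8) -> (0x1000, 4096): the unreadable entry at relative
# address 0x00_0000_0008 managed the 4 KB region at 0x0000_0000_1000.
```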
 Step S2802, which follows step S2801, identifies the RAID group to which the virtual PDEV whose compression management information 2300 was lost belongs. The processor 121 searches the RAID group management information 2220 and identifies the corresponding RAID group.
 In step S2803, which follows step S2802, the processor 121 obtains from the PDEVs the data necessary for recovering the data of the virtual PDEV area identified in S2801, in order to restore the virtual PDEV area that the lost entry managed. This processing is explained using the example described above, in which the compression management information entry for the 4 KB area starting at address 0x0000_0000_1000 of the virtual PDEV area has been lost. In this case, for the virtual PDEVs other than the one whose compression management information 2300 was lost among the virtual PDEVs constituting the RAID group identified in S2802 (hereinafter these virtual PDEVs are called the "other virtual PDEVs"), the compressed data corresponding to the virtual PDEV area identified in S2801 (that is, the 4 KB area starting at address 0x0000_0000_1000) is read. To do so, the processor 121 reads the compression management information 2300 from each of the other virtual PDEVs, obtains the PDEV address and the compressed data length associated with the 4 KB area starting at address 0x0000_0000_1000 of each other virtual PDEV, and uses the obtained PDEV address and compressed data length to read the compressed data from the PDEV associated with that other virtual PDEV.
 In step S2804, which follows step S2803, the processor 121 records the multiple pieces of compressed data obtained in S2803 (the data necessary for recovering the data of the virtual PDEV area identified in S2801) in the NVM module 126 and maps their decompressed images to LBA0. The same processing as steps S2408 to S2410 of FIG. 24 may be performed here. Before recording the data in the NVM module 126, LBA0 and LBA1 space areas must be secured, but this is the same as the processing performed in S2404 of FIG. 24 and elsewhere, so the description is omitted.
 In step S2805, which follows step S2804, the processor 121 restores the data of the virtual PDEV area identified in S2801 using the RAID function, from the multiple pieces of data of the RAID stripe column mapped to LBA0 of the NVM module 126 in step S2804. The full stripe parity generation command 1310 can be used for this. As parameters of the full stripe parity generation command 1310, the addresses of the data mapped into the LBA0 space in step S2804 are stored in the LBA0 start addresses (1314 to 1316). An LBA0 space area for storing the restored data is then secured, a command storing the start address of the secured LBA0 space area in the LBA0 start address (for XOR parity) 1317 is created, and the command is issued to the NVM module 126.
 In step S2806, which follows step S2805, the processor 121 maps the compressed form of the restored data generated in S2805 to LBA1 and obtains the compressed data. The same processing as steps S2702 to S2704 of FIG. 27 is performed here.
 Step S2807, which follows step S2806, is a step of recording the compressed data obtained in S2806 in a PDEV. In this processing, as in step S2705 of FIG. 27, the processor 121 identifies the PDEV associated with the storage destination virtual PDEV of the restored data, selects a free area in the identified PDEV, determines the selected free area as the storage destination of the compressed data, and records the compressed data obtained in S2806 in that PDEV area.
 Step S2808, which follows step S2807, is a step of updating the compression management information 2300 and recording it in the PDEV. The processor 121 records, in the field for the start address 2301 within the PDEV at which compressed data is stored, of the compression management information 2300 entry for the restored virtual PDEV area, the address of the PDEV area to which the compressed data was written in S2807, thereby updating the entry. The lost entry is then restored by recording the updated compression management information entry in the PDEV.
 Through the entry regeneration operation for the compression management information 2300 described above, the storage apparatus 101 can regenerate each entry of the compression management information 2300 from the data even when the compression management information 2300 has been lost.
 The above is the entry regeneration operation for the compression management information in this embodiment.
 (1-29) Rebuild Operation of the Storage Apparatus
 Next, the rebuild operation in this embodiment will be described. The storage apparatus 101 according to the embodiment of the present invention performs the rebuild processing shown in FIG. 29 when one of the PDEVs constituting a RAID group fails and becomes inaccessible. In the first step of the rebuild processing, S2901, the processor 121 identifies the virtual PDEV associated with the failed PDEV (hereinafter referred to as the failed virtual PDEV).
 Step S2902, which follows step S2901, identifies the RAID group to which the failed virtual PDEV belongs. The processor 121 searches the RAID group management information 2220 and identifies the corresponding RAID group.
 In step S2903, which follows step S2902, in order to restore the failed virtual PDEV area, the processor 121 obtains the data necessary for recovering the data of each area of the failed virtual PDEV identified in S2901 from the PDEVs associated with the virtual PDEVs other than the failed virtual PDEV in the RAID group identified in S2902. Here, the processing described in step S2803 of FIG. 28 is performed for the entire area of the failed virtual PDEV (everything from address 0 of the virtual PDEV to its maximum address). That is, the compression management information 2300 is read from the virtual PDEVs other than the failed virtual PDEV (hereinafter called the "other virtual PDEVs"), and the data at the other virtual PDEVs' addresses 0x0000_0000_0000, 0x0000_0000_1000, and so on is read out sequentially. However, an area whose start address 2301 within the PDEV at which compressed data is stored is unassigned (NULL) in the compression management information 2300 has no PDEV area associated with it and therefore need not be read.
 Steps S2904 to S2908, described below, also perform the same processing as S2804 to S2808 of FIG. 28. The difference from the processing of FIG. 28 is that while the processing of FIG. 28 is performed only for a part of a virtual PDEV (the specific area within the virtual PDEV that the lost entry managed), the processing of FIG. 29 is performed for the entire area of the failed virtual PDEV (excluding areas with which no PDEV area is associated).
 In step S2904, which follows step S2903, the processor 121 records the compressed data obtained in S2903 in the NVM module 126 and maps the decompressed images to LBA0. The same processing as step S2804 of FIG. 28 is performed here.
 In step S2905, which follows step S2904, the processor 121 restores the data of the failed virtual PDEV area identified in S2901 using the RAID function, from the multiple pieces of data of the RAID stripe column mapped to LBA0 of the NVM module 126 in step S2904. The data restored at this time is compressed and recorded in the LBA0 space of the NVM module 126. The same processing as step S2805 of FIG. 28 is performed here.
 In step S2906, which follows step S2905, the processor 121 maps the compressed form of the restored data generated in S2905 to LBA1 and obtains the compressed data. The same processing as step S2806 of FIG. 28 is performed here.
 Step S2907, which follows step S2906, is a step of recording the compressed data obtained in S2906 in a new PDEV. The storage apparatus 101 holds one or more PDEVs as spares for a failed PDEV and uses one as a replacement for the failed PDEV; hereinafter this PDEV is referred to as the replacement PDEV. The processor 121 selects the virtual PDEV corresponding to the replacement PDEV, registers the virtual PDEV number of that virtual PDEV in the virtual PDEV number column 2222 of the RAID group management information 2220, and deletes the number of the virtual PDEV corresponding to the failed PDEV. This adds the selected virtual PDEV to the RAID group with which the failed virtual PDEV was associated, as a replacement for the failed virtual PDEV. The processor 121 records the compressed data obtained in S2906 in the area of the replacement PDEV. Storing the compressed data in the replacement PDEV can be done by the same processing as step S2705 of FIG. 27.
 In step S2908, which follows step S2907, the processor 121 generates the compression management information that manages the association between the areas of the replacement virtual PDEV and the replacement PDEV areas in which the data was stored in S2907, and records this generated compression management information 2300 in the PDEV. This completes the data restoration performed when a PDEV fails.
 As stated above, in the storage apparatus 101 according to the embodiment of the present invention, when a PDEV fails, the data is recovered using the decompressed data of the other virtual PDEV areas of the RAID group to which the failed virtual PDEV belongs. The recovered data is then compressed and recorded in the replacement PDEV. At this time, by regenerating the compression management information that associates the areas of the replacement virtual PDEV with those of the replacement PDEV, the compression management information lost through the PDEV failure is regenerated.
 In the processing of FIG. 29, steps S2903 to S2908 are performed for the entire area of the failed virtual PDEV, but it is not necessary to process the entire area at once in each step; the processing of each step may instead be executed per partial area (per stripe, or per 4 KB, which is the data compression unit of the storage apparatus 101 according to the embodiment of the present invention, and so on).
 The above concludes the description of each process in the storage apparatus according to the embodiment of the present invention. In the case of a storage apparatus that compresses data and stores it in the final storage media, like the storage apparatus according to the embodiment of the present invention, a storage area that appears to record data in uncompressed form (a virtual uncompressed volume) is provided to a host device such as a server, concealing the changes to the data caused by compression. In this case, information (compression management information) is needed to manage the association between the areas of the virtual uncompressed volume provided to the host device and the physical areas in which the compressed data is recorded.
 The compression management information manages the virtual uncompressed volume and the physical recording destinations of the compressed data, and it is indispensable for responding to data read requests from the server. From the viewpoint of the reliability of the storage apparatus, therefore, loss of the compression management information is equivalent to loss of the stored data. For this reason, the compression management information must be retained with at least the same level of reliability as the data itself.
 One conceivable way to record the compression management information on inexpensive storage media while maintaining the same level of reliability as the data is to protect the compression management information with RAID (mirroring, parity), as is done for the data. However, with this method, whenever recorded data is updated, the parity of the compression management information must be updated as well as that of the data, which degrades the performance of the storage apparatus. At the same time, redundant information such as mirror data and parity is required, so a large amount of storage area is consumed for the compression management information, increasing the cost of the storage apparatus.
 In the storage apparatus of the present invention, the compression management information that manages the association between compressed data and its recording destinations on the final storage media is divided per final storage medium, and the compression management information managing only the correspondences related to one final storage medium is recorded in a specific area of that medium. With this recording scheme, even if the compression management information is lost for some reason and becomes inaccessible, the data that the inaccessible compression management information managed can be regenerated by RAID technology; the regenerated data (recovery data) is compressed and written to the final storage medium, and at the same time compression management information corresponding to the recovery data written to the final storage medium is created and written to the final storage medium, thereby recovering the compression management information. Therefore, in the storage apparatus of the present invention, the compression management information need not be stored redundantly, and the amount of storage area consumed by the compression management information can be reduced.
 Although an embodiment of the present invention has been described above, this is an illustration for explaining the present invention and is not intended to limit the present invention to the embodiment described above. The present invention can also be implemented in various other forms.
 For example, in the embodiment described above, the storage apparatus forms a virtual volume to which the storage area of a RAID group configured using the storage areas of multiple virtual PDEVs is statically allocated, and provides this virtual volume to the host device. As another embodiment, however, a volume formed using so-called Thin Provisioning technology (also called Dynamic Provisioning technology), which dynamically allocates physical storage areas, may be adopted as the volume provided to the host device.
 Dynamic Provisioning is a function that makes it possible to define a volume larger than the storage capacity of the final storage media (SSDs 111 or HDDs 112) installed in the storage apparatus (hereinafter such a volume is called a "DP volume"). With this function, the user does not necessarily need to install, in the initial state, final storage media with the same capacity as the defined volume (DP volume); final storage media can be added as needed after operation of the DP volume begins.
 A DP volume is one of the volumes virtually created by the storage apparatus and is created with an arbitrary capacity designated by the user or the host device. In the initial state, no storage area is allocated to the DP volume; when data is written from the host device 103, only as much storage area as is needed is allocated. As the storage areas to be allocated to the DP volume, for example, the virtual volume 200 of the embodiment described above may be managed in fixed-size storage areas (each such area is called a Dynamic Provisioning page, or DP page) and allocated to the DP volume. As stated above, the storage area of the virtual volume 200 is a storage area treated as holding data in its pre-compression form. Therefore, a DP volume that uses storage areas allocated from the virtual volume 200 likewise conceals from the host device 103 the fact that the data is stored in compressed form.
 Further, as described above, the storage controller 110 may increase or decrease the size of the virtual PDEVs that constitute the virtual volume, based on compression information acquired from the NVM module 126, such as the compressed data length or the compression ratio (the ratio of the data amount before compression to the data amount after compression). As the size of the virtual PDEVs increases or decreases, the number of DP pages that can be carved out of the virtual volume 200 increases or decreases accordingly. The storage apparatus manages this compression-ratio-driven change in the number of DP pages, and when the number of remaining DP pages falls below a certain level, it notifies the host device 103 or the management terminal of the storage apparatus 101 that additional final storage media are required. Upon receiving the notification, the user can add final storage media to the storage apparatus 101.
 When a DP volume is provided to the host device 103, the host device 103 is presented with a volume of a predetermined, fixed size. Therefore, even if the amount of usable storage area (DP pages) increases or decreases as the compression ratio fluctuates, neither the host device 103 nor the user operating it needs to be aware of the change; moreover, when an improved compression ratio increases the usable storage area (DP pages), the additional area can be put to effective use.
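 The compression-ratio-driven bookkeeping described above can be pictured with the following sketch; the watermark value and the print standing in for the notification to the host device or the management terminal are placeholders.

    LOW_WATERMARK = 100  # assumed threshold for "remaining DP pages too low"

    def remaining_dp_pages(physical_capacity, compression_ratio, used_pages, page_size):
        # compression_ratio = (bytes before compression) / (bytes after compression),
        # so a ratio of 2.0 lets the media effectively hold twice their raw capacity.
        effective_capacity = physical_capacity * compression_ratio
        return int(effective_capacity // page_size) - used_pages

    def check_pool(physical_capacity, compression_ratio, used_pages, page_size):
        free = remaining_dp_pages(physical_capacity, compression_ratio, used_pages, page_size)
        if free < LOW_WATERMARK:
            # Stand-in for notifying the host device 103 or the management terminal.
            print("final storage media must be added:", free, "DP pages left")
        return free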
101: Storage device
102: SAN
103: Host device
104: Management device
110: Storage controller
111: SSD
112: HDD
121: Processor
122: Internal SW
123: Disk interface
124: Host interface
125: DRAM
126: NVM module
410: FM controller
411: I/O interface
413: RAM
414: Switch
416: Data buffer
417: FM interface
418: Data compression/decompression unit
419: Parity generation unit

Claims (13)

  1.  A storage apparatus comprising a controller having one or more processors and a cache device, and a plurality of storage media, the plurality of storage media constituting one or more RAID groups, wherein:
     one of the RAID groups is composed of (n + m) storage media (n ≥ 1, m ≥ 1) among the plurality of storage media;
     the processor manages a virtual volume provided to a host device in association with the RAID group;
     the processor divides the storage area of the virtual volume into stripes of a predetermined size, and manages each of n stripes, selected in order starting from the stripe located at the head of the storage area of the virtual volume, in association with one of the (n + m) storage media constituting the RAID group; and
     when storing the data stored in the n stripes onto the storage media, the processor:
     causes the cache device to generate m parities from the data stored in the n stripes;
     causes the cache device to compress the data stored in the n stripes and the generated m parities;
     stores each piece of the compressed data of the n stripes onto the storage medium associated with the corresponding stripe; and
     stores each of the m compressed parities onto the m storage media, among the (n + m) storage media constituting the RAID group, with which none of the n stripes is associated.
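 Purely as an illustrative aid, the following sketch models the full-stripe write path recited in claim 1 for the special case m = 1 (a single XOR parity, as in RAID 5), with zlib standing in for the cache device's compression unit; every identifier is an assumption made for illustration, not part of the disclosure itself.

    import zlib

    def xor_block(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def full_stripe_write(stripes, media):
        # stripes: n equal-length byte blocks (one per data stripe).
        # media:   (n + 1) lists standing in for the storage media; the last
        #          is the medium to which no data stripe is mapped.
        assert stripes and len(media) == len(stripes) + 1
        # 1) the cache device generates the parity from the n data stripes
        parity = stripes[0]
        for s in stripes[1:]:
            parity = xor_block(parity, s)
        # 2) the cache device compresses the data stripes and the parity
        blobs = [zlib.compress(s) for s in stripes] + [zlib.compress(parity)]
        # 3) each compressed stripe goes to its associated medium, and the
        #    compressed parity to the remaining medium
        for medium, blob in zip(media, blobs):
            medium.append(blob)

    # e.g. full_stripe_write([b"A" * 512, b"B" * 512, b"C" * 512], [[], [], [], []])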
  2.  The storage apparatus according to claim 1, wherein:
     the cache device comprises a storage medium composed of nonvolatile semiconductor memory (NVM), a data compression unit, and a parity generation unit;
     upon receiving from the host device write data to be stored in the n stripes, the processor transmits the write data to the cache device by issuing a write command to the cache device;
     the cache device, having received the write command, stores the received write data in the nonvolatile semiconductor memory in a compressed state;
     when causing the cache device to generate the m parities, the processor issues a parity generation command to the cache device;
     the cache device, having received the parity generation command, reads and decompresses the write data stored in the nonvolatile semiconductor memory, generates parity from the decompressed write data, and stores the generated parity in the nonvolatile semiconductor memory in a compressed state; and
     the processor, by issuing a read command to the cache device, reads the compressed write data and the compressed parity that the cache device holds in the nonvolatile semiconductor memory, and stores each of the compressed write data and the compressed parity onto the respective storage media.
  3.  The storage apparatus according to claim 1, wherein, upon receiving from the host device write data together with a write request specifying a storage position on the virtual volume, the processor:
     1) identifies, from the storage position on the virtual volume, the storage medium in which the write data is to be stored;
     2) calculates a virtual physical address on the identified storage medium, the virtual physical address being the address at which the data of the storage area on the virtual volume would be stored on the storage medium in an uncompressed state;
     3) selects an unused area on the storage medium;
     4) stores, in the unused area, the write data compressed by the cache device; and
     5) stores compression management information, which is information for managing the correspondence between the virtual physical address and the address of the unused area in which the write data is stored, on the storage medium in which the write data is stored.
  4.  The storage apparatus according to claim 3, wherein, upon receiving from the host device a read request specifying a storage position on the virtual volume, the processor:
     identifies, based on the storage position on the virtual volume, the storage medium storing the data specified by the read request and the virtual physical address on that storage medium;
     reads the compression management information from the identified storage medium;
     reads the data in its compressed state, using the address on the storage medium at which the data specified by the read request is stored, which address is contained in the read compression management information, and stores the compressed data in the cache device;
     decompresses the compressed data in the cache device; and
     returns the decompressed data to the host device.
  5.  The storage apparatus according to claim 3, wherein, before storing write data onto the storage medium, the processor:
     stores the write data in the cache device;
     reads, in a compressed state, the pre-update data of the write data from the storage medium in which the write data is to be stored, and stores it in the cache device;
     reads, in a compressed state, the parity corresponding to the write data from the storage medium in which that parity is stored, and stores it in the cache device; and
     generates, in the cache device, a post-update parity from the pre-update data, the parity, and the write data.
  6.  The storage apparatus according to claim 5, wherein the processor:
     calculates the virtual physical address on the storage medium in which the write data is to be stored;
     reads the compression management information from the storage medium in which the write data is to be stored;
     reads the pre-update data in its compressed state, using the address on the storage medium at which the pre-update data of the write data is stored, which address is contained in the read compression management information, and stores it in the cache device;
     decompresses the compressed pre-update data in the cache device;
     identifies the storage medium in which the parity corresponding to the write data is stored;
     calculates the virtual physical address on the storage medium in which the parity is stored;
     reads the compression management information from the storage medium in which the parity is stored;
     reads the parity in its compressed state, using the address on the storage medium at which the parity is stored, which address is contained in the read compression management information, and stores it in the cache device;
     decompresses the compressed parity in the cache device; and
     generates, in the cache device, a post-update parity from the decompressed pre-update data, the decompressed parity, and the write data.
  7.  The storage apparatus according to claim 4, wherein:
     the compression management information is stored at a predetermined position on the storage medium;
     the compression management information consists of entries stored in order of virtual physical address starting from the predetermined position on the storage medium, each entry comprising the address of the area on the storage medium associated with a virtual physical address and the length, in the compressed state, of the data stored in that area; and
     when reading the compression management information, the processor reads, from the compression management information, the entry corresponding to the access target data, based on the virtual physical address on the storage medium at which the access target data is stored.
  8.  The storage apparatus according to claim 7, wherein, when reading of an entry of the compression management information from the storage medium fails, the processor:
     identifies, based on the address on the storage medium at which the entry was stored, the virtual physical address with which the information recorded in the entry was associated;
     reads, from the other storage media belonging to the same RAID group as the storage medium, the data necessary to regenerate the data stored at the identified virtual physical address, and regenerates the data corresponding to the virtual physical address from the read data;
     stores the regenerated data in an unused area of the storage medium in which the entry was stored; and
     records, in the entry of the compression management information, the address of the unused area in which the regenerated data is stored and the length of the regenerated data in its compressed state.
  9.  The storage apparatus according to claim 8, wherein, when regenerating the data, the processor:
     reads, based on the compression management information stored in each of the other storage media belonging to the same RAID group as the storage medium, the compressed data stored at the addresses on the other storage media corresponding to the identified virtual physical address, and stores it in the cache device; and
     regenerates the data by decompressing the compressed data in the cache device and generating parity from the decompressed data.
  10.  The storage apparatus according to claim 7, wherein, when one storage medium constituting the RAID group becomes inaccessible:
     the processor regenerates the data that was stored on the one storage medium from the other storage media belonging to the same RAID group as the one storage medium, and stores the regenerated data on an alternative storage medium; and
     the processor creates the compression management information corresponding to the regenerated data and stores it on the alternative storage medium.
  11.  A method of controlling a storage apparatus that comprises a controller having one or more processors and a cache device, and a plurality of storage media, the plurality of storage media constituting one or more RAID groups, wherein:
     one of the RAID groups is composed of (n + m) storage media (n ≥ 1, m ≥ 1) among the plurality of storage media; and
     the storage apparatus manages a virtual volume provided to a host device in association with the RAID group, divides the storage area of the virtual volume into stripes of a predetermined size, and manages each of n stripes, selected in order starting from the stripe located at the head of the storage area of the virtual volume, in association with one of the (n + m) storage media constituting the RAID group;
     the method comprising, when the processor stores the data stored in the n stripes onto the storage media:
     causing the cache device to generate m parities from the data stored in the n stripes;
     causing the cache device to compress the data stored in the n stripes and the generated m parities;
     storing each piece of the compressed data of the n stripes onto the storage medium associated with the corresponding stripe; and
     storing each of the m compressed parities into storage areas of the m storage media, among the (n + m) storage media constituting the RAID group, with which none of the n stripes is associated.
  12.  The method of controlling a storage apparatus according to claim 11, wherein, upon receiving from the host device write data together with a write request specifying a storage position on the virtual volume, the processor:
     1) identifies, from the storage position on the virtual volume, the storage medium in which the write data is to be stored;
     2) calculates a virtual physical address on the identified storage medium, the virtual physical address being the address at which the data of the storage area on the virtual volume would be stored on the storage medium in an uncompressed state;
     3) selects an unused area on the storage medium;
     4) stores, in the unused area, the write data compressed by the cache device; and
     5) stores compression management information, which is information for managing the correspondence between the virtual physical address and the address of the unused area in which the write data is stored, on the storage medium in which the write data is stored.
  13.  The method of controlling a storage apparatus according to claim 12, wherein, upon receiving from the host device a read request specifying a storage position on the virtual volume, the processor:
     identifies, based on the storage position on the virtual volume, the storage medium storing the data specified by the read request and the virtual physical address on that storage medium;
     reads the compression management information from the identified storage medium;
     reads the data in its compressed state, using the address on the storage medium at which the data specified by the read request is stored, which address is contained in the read compression management information, and stores the compressed data in the cache device;
     decompresses the compressed data in the cache device; and
     returns the decompressed data to the host device.
PCT/JP2014/062959 2014-05-15 2014-05-15 Storage device WO2015173925A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2014/062959 WO2015173925A1 (en) 2014-05-15 2014-05-15 Storage device

Publications (1)

Publication Number Publication Date
WO2015173925A1 (en)

Family

ID=54479494

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/062959 WO2015173925A1 (en) 2014-05-15 2014-05-15 Storage device

Country Status (1)

Country Link
WO (1) WO2015173925A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5778411A (en) * 1995-05-16 1998-07-07 Symbios, Inc. Method for virtual to physical mapping in a mapped compressed virtual storage subsystem
WO2013186828A1 (en) * 2012-06-11 2013-12-19 株式会社 日立製作所 Computer system and control method

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017109931A1 (en) * 2015-12-25 2017-06-29 株式会社日立製作所 Computer system
JPWO2017109931A1 (en) * 2015-12-25 2018-08-16 株式会社日立製作所 Computer system
US10628088B2 (en) 2015-12-25 2020-04-21 Hitachi, Ltd. Computer system
JP2019053485A (en) * 2017-09-14 2019-04-04 Necプラットフォームズ株式会社 Storage control device, storage control system, storage control method, and, storage control program
JP2021515298A (en) * 2018-02-26 2021-06-17 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation Virtual storage drive management in a data storage system
JP7139435B2 (en) 2018-02-26 2022-09-20 インターナショナル・ビジネス・マシーンズ・コーポレーション Virtual storage drive management in data storage systems
US20200081780A1 (en) * 2018-09-11 2020-03-12 Silicon Motion, Inc. Data storage device and parity code processing method thereof
CN116107516A (en) * 2023-04-10 2023-05-12 苏州浪潮智能科技有限公司 Data writing method and device, solid state disk, electronic equipment and storage medium
CN116107516B (en) * 2023-04-10 2023-07-11 苏州浪潮智能科技有限公司 Data writing method and device, solid state disk, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
JP6212137B2 (en) Storage device and storage device control method
US9880766B2 (en) Storage medium storing control program, method of controlling information processing device, information processing system, and information processing device
EP2042995B1 (en) Storage device and deduplication method
JP6429963B2 (en) Storage device and storage device control method
US7831764B2 (en) Storage system having plural flash memory drives and method for controlling data storage
US9946616B2 (en) Storage apparatus
US7761655B2 (en) Storage system and method of preventing deterioration of write performance in storage system
US7386758B2 (en) Method and apparatus for reconstructing data in object-based storage arrays
US10061710B2 (en) Storage device
US9304685B2 (en) Storage array system and non-transitory recording medium storing control program
WO2014102882A1 (en) Storage apparatus and storage control method
US20150378613A1 (en) Storage device
KR20170125178A (en) Raid storage device and management method thereof
US20100100664A1 (en) Storage system
WO2015173925A1 (en) Storage device
JP6513888B2 (en) Computer system having data volume reduction function, and storage control method
JPWO2017068904A1 (en) Storage system
CN112596673A (en) Multi-active multi-control storage system with dual RAID data protection
CN118051179A (en) Techniques for partition namespace storage using multiple partitions
JP7093799B2 (en) Storage system and restore control method
JP2021114164A (en) Storage device and storage control method
JP6817340B2 (en) calculator
WO2015118680A1 (en) Storage device
JP6605762B2 (en) Device for restoring data lost due to storage drive failure
JP6693181B2 (en) Storage control device, storage control method, and storage control program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14891797

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14891797

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP