WO2015198412A1 - Storage system - Google Patents
Storage system
- Publication number
- WO2015198412A1 (PCT/JP2014/066768)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- volume
- command
- storage device
- storage
- write
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2056—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
- G06F11/2069—Management of state, configuration or failover
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2056—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
- G06F11/2071—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring using a plurality of controllers
- G06F11/2074—Asynchronous techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2056—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
- G06F11/2071—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring using a plurality of controllers
- G06F11/2079—Bidirectional techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/10—Program control for peripheral devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0617—Improving the reliability of storage systems in relation to availability
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
- G06F3/0635—Configuration or reconfiguration of storage systems by changing the path, e.g. traffic rerouting, path reconfiguration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0659—Command handling arrangements, e.g. command buffers, queues, command scheduling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0662—Virtualisation aspects
- G06F3/0665—Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0866—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
- G06F12/0868—Data transfer between cache memory and other subsystems, e.g. storage devices or host systems
Definitions
- the present invention relates to a storage system.
- a configuration has been proposed in which a plurality of physical storage devices are provided as a single virtual storage subsystem to a host system such as a host computer that uses the storage device.
- a plurality of (for example, two) physical storage devices are connected to each other via a network
- a plurality of volumes of the plurality of physical storage devices are represented as volumes in a single virtual storage subsystem.
- The method of processing an I/O request received from a host system such as the host computer differs depending on the state of the system or the state of the volume.
- For example, because of an access path failure between the host system and a storage device, device B may receive an I/O request that had until then been received by device A.
- The processing to be performed by the storage device differs depending on whether device A or device B receives the I/O request.
- However, the state after the I/O request is processed needs to be the same regardless of which device receives it.
- In addition, various states may exist, such as a state in which the volume cannot be duplicated, but the conventional technology does not consider what processing should be performed in such cases.
- An object of the present invention is to provide an appropriate control method for a duplicated volume in a virtual storage subsystem.
- a storage system includes at least a first storage device and a second storage device.
- The first volume of the first storage device and the second volume of the second storage device are managed as a volume pair to which write data from the host computer is written in duplicate, and the host computer is configured to be able to access both the first volume and the second volume.
- Attribute information related to the write order and/or write permission is set for each volume. When the storage apparatus receives an access command for a volume from the host computer, it determines, according to the attribute set for the volume, whether I/O to the first volume and/or the second volume is necessary.
- FIG. 1 is a configuration diagram of a storage system according to an embodiment. The remaining drawings show, in order: the concept of virtual storage; the contents of the volume management table; the contents of the pair management table; the overall flow of I/O processing; flowchart (1) of the processing for a WRITE SAME command; flowchart (2) of the processing for a WRITE SAME command; a flowchart of the processing for a WRFBA command; flowchart (3) of the processing for a WRITE SAME command; and flowchart (4) of the processing for a WRITE SAME command.
- FIG. 1 is a configuration diagram of a computer system according to an embodiment of the present invention.
- the computer system includes a host computer 100 (hereinafter abbreviated as “host 100”) and a storage system 300.
- the storage system 300 includes at least two storage apparatuses 400-1 and 400-2.
- the host 100 is connected to both the storage apparatuses 400-1 and 400-2 via a SAN (Storage Area Network) 210.
- SAN Storage Area Network
- the hardware configuration of the storage apparatuses 400-1 and 400-2 will be described.
- the hardware configurations of the storage apparatuses 400-1 and 400-2 are the same. Therefore, hereinafter, when the storage apparatus 400-1 and the storage apparatus 400-2 are referred to without being distinguished, they are referred to as “storage apparatus 400”.
- the hardware configurations of the storage apparatuses 400-1 and 400-2 are not necessarily the same.
- The storage apparatus 400 includes a storage controller 500 and one or more drives 900. The storage controller 500 has one or more channel adapters (CHA) 530 (530a and 530b), one or more disk adapters (DKA) 540, one or more memory packages (memory) 510, and one or more MP packages (MPPK) 520.
- The CHA 530 has at least one interface (I/F) 531 for connecting to the host 100 or another storage apparatus 400, and is a component that relays control information and data transmitted to and received from the host 100 and other storage apparatuses 400. In the storage apparatus 400 of the present embodiment, communication with the host 100 uses commands conforming to the SCSI (Small Computer System Interface) standard (hereinafter referred to as SCSI commands). The I/F 531 therefore basically has a function of transmitting and receiving SCSI commands and their accompanying information (write data, read data, response information, etc.) to and from the host 100.
- SCSI Small Computer System Interface
- the CHA 530 has a buffer 532 in addition to the I / F 531.
- the buffer 532 is a storage area for temporarily storing control information and data relayed by the CHA 530, and is configured using various volatile memories and nonvolatile memories.
- In FIG. 1, only two CHAs 530 (CHA 530a and CHA 530b) are shown in one storage apparatus 400, but the number of CHAs 530 is not limited to this. Likewise, the number of I/Fs 531 is not limited to the number shown in the figure.
- Among the CHAs 530, those connected to the host 100 are denoted CHA 530a.
- Those connected to another storage apparatus 400 (for example, the storage apparatus 400-2) are denoted CHA 530b.
- The hardware configurations of CHA 530a and CHA 530b are the same.
- In FIG. 1, only one transmission line is shown connecting the host 100 to the storage apparatus 400-1, and only one connecting the host 100 to the storage apparatus 400-2.
- However, the host 100 and the storage apparatus 400 may be connected by a plurality of transmission lines.
- Similarly, only one transmission line is shown connecting the storage apparatus 400-1 and the storage apparatus 400-2, but they may be connected by a plurality of transmission lines.
- the DKA 540 has a plurality of interfaces (not shown) for connecting to the drive 900, and is a component that relays control information and data transmitted and received between the drive 900 and the memory 510 or the MPPK 520.
- the MPPK 520 includes a processor (MP) 521, a local memory (LM) 522, and an internal bus and an internal port (not shown).
- the MPPK 520 is a component for performing various data processing in the storage apparatus 400.
- When the MP 521 executes a program on the LM 522, the processing for the various commands received from the host 100, described below, is realized.
- the LM 522 can be configured using various volatile memories and nonvolatile memories.
- the memory 510 is a component including a cache for temporarily storing write data from the host 100 and data read from the drive 900, and a shared memory for storing various control information.
- An area on the memory 510 that is used as a cache is referred to as a cache area.
- An area used as a shared memory for storing control information is called a shared memory area.
- the memory 510 may include a battery to prevent data loss when a failure such as a power failure occurs, and may be configured to hold data on the memory 510 during a power failure.
- the drive 900 is a storage device in which write data from the host 100 is finally stored.
- As the storage medium of the drive 900, a nonvolatile semiconductor storage medium such as NAND flash memory, MRAM, ReRAM, or PRAM can be used in addition to the magnetic storage medium used in an HDD.
- Write data from the host 100 is finally stored in the drive 900, but the storage apparatus 400 according to the embodiment of the present invention does not directly provide the storage space of the drive 900 to the host 100. Instead, the storage apparatus 400 provides the host 100 with one or more volumes formed by the well-known Thin Provisioning technology. Hereinafter, a volume provided to the host 100 in this way is referred to as a “logical volume”.
- the storage apparatus 400 manages the storage space of the logical volume by dividing it into fixed size areas. This fixed size area is called a “page”.
- the page size is, for example, 42 MB.
- Each page is dynamically assigned any storage area of one or more drives 900.
- a storage area allocated to a page is referred to as a “physical storage area”.
- the logical volume is a volume that can be freely defined when a user issues a logical volume creation instruction to the storage apparatus 400 from a management terminal (not shown) or the host 100 connected to the storage apparatus 400. However, when the logical volume is defined, no storage area of the drive 900 is allocated to each page of the logical volume.
- When the host 100 issues to the logical volume a write request (write command) that designates an address in the storage space of the logical volume, the storage apparatus 400 allocates an unused storage area on the drive 900 (one not yet assigned to any page) to the page corresponding to the designated address.
- Because the storage area of the drive 900 is dynamically allocated to the logical volume (to its pages), the total storage capacity (size) of the drives 900 mounted on the storage apparatus 400 may, at least in the initial state, be less than the total size of the logical volumes. Drives 900 can then be added as necessary after operation of the storage apparatus 400 has started, which reduces the introduction cost.
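The on-demand allocation described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the class `ThinVolume`, the `free_areas` list, and the area names are hypothetical, and the 42 MB page size from the text is taken here as 42 MiB.

```python
PAGE_SIZE = 42 * 1024 * 1024  # example page size from the text (assumed to mean 42 MiB)


class ThinVolume:
    """Hypothetical logical volume whose pages get physical storage on first write."""

    def __init__(self, free_areas):
        self.free_areas = list(free_areas)  # unused physical storage areas on the drives
        self.page_map = {}                  # page number -> physical storage area

    def write(self, address):
        """Return the physical area backing `address`, allocating one if needed."""
        page = address // PAGE_SIZE
        if page not in self.page_map:
            if not self.free_areas:
                # In the text's terms, this is the point where a drive would be added.
                raise RuntimeError("no unused physical storage area left")
            self.page_map[page] = self.free_areas.pop(0)
        return self.page_map[page]


vol = ThinVolume(free_areas=["area-0", "area-1"])
first = vol.write(0)               # first write to page 0 allocates an area
same = vol.write(PAGE_SIZE - 1)    # same page, so no new allocation
second = vol.write(PAGE_SIZE)      # page 1 receives the next unused area
```

Nothing is allocated when the volume is defined; the page map only grows as distinct pages are written, which is why the installed drive capacity can start out smaller than the defined volume size.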
- The storage apparatus 400 employs a so-called write-back cache method. When a data update request such as a WRITE command for a logical volume is received from the host 100, a partial area for storing the write data is secured in the cache area, the write data is stored in the secured partial area, and the host 100 is then notified that the write process has been completed.
- This is processing normally performed in a storage apparatus equipped with a disk cache.
- Once the write data has been stored in the cache area, the state is substantially equivalent to the state in which the data is stored in the logical volume. Therefore, in the following description, a passage described as “writing data to a logical volume” does not necessarily mean that data is written to the drive 900. Even when data is only stored in the cache area and has not yet been written to the drive 900, this is expressed as “writing data to the logical volume”. Conversely, the process of “writing data to a cache (such as a cache slot described later)” can be understood as a process of substantially storing data in a logical volume.
- The write data stored in the cache area is later written (destaged) to the drive 900 at a predetermined timing, asynchronously with the data update request from the host 100.
- Since the destage process itself is well known, its description is omitted in the embodiment of the present invention.
- the storage apparatus 400 reserves an unused area on the cache area as a partial area, and stores the write data in the reserved partial area.
- the minimum size of the partial area to be secured is a fixed size, and is 256 KB as an example.
- this partial area of the minimum size is referred to as a “cache slot” or “slot”.
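A minimal sketch of the write-back behaviour described above, under simplifying assumptions: the class `WriteBackCache`, its dictionaries, and the `"GOOD"` return value are hypothetical, each write is assumed to fit in one slot, and destaging is triggered manually rather than at a predetermined timing.

```python
SLOT_SIZE = 256 * 1024  # example minimum partial-area (slot) size from the text


class WriteBackCache:
    """Hypothetical write-back cache: completion is reported before destaging."""

    def __init__(self):
        self.slots = {}  # slot number -> write data held only in the cache area
        self.drive = {}  # slot number -> data already destaged to the drive

    def write(self, address, data):
        """Secure a slot, store the write data, and report completion to the host."""
        slot = address // SLOT_SIZE
        self.slots[slot] = data
        return "GOOD"  # illustrative completion status; nothing is on the drive yet

    def destage(self):
        """Write cached data to the drive, asynchronously to the host request."""
        self.drive.update(self.slots)
        self.slots.clear()


cache = WriteBackCache()
status = cache.write(0, b"abc")
drive_before_destage = dict(cache.drive)  # still empty at this point
cache.destage()
```

The point of the sketch is the ordering: the host sees completion while `drive` is still empty, which is why the text treats data in a cache slot as already written to the logical volume.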
- the storage system 300 includes at least two storage apparatuses 400, and each storage apparatus 400 can define a logical volume.
- In the storage system 300, a logical volume accessed from the host 100 is operated as a duplicate held in the two storage apparatuses (the storage apparatus 400-1 and the storage apparatus 400-2).
- For example, data written to the logical volume 90-1 in the storage apparatus 400-1 is also written to the logical volume 90-2 of the storage apparatus 400-2 (that is, it is “mirrored”). This duplication processing is performed by the MP 521 of the storage apparatuses 400-1 and 400-2.
- The function, performed by the MP 521 of the storage apparatuses 400-1 and 400-2, of copying the contents of the logical volume is referred to as the “mirroring function”.
- The logical volume to which data is written first is called the primary volume (P-VOL), and the logical volume to which data is written second is called the secondary volume (S-VOL).
- P-VOL primary volume
- S-VOL secondary volume
- FIG. 2 shows an example where the logical volume 90-1 is designated as the P-VOL and the logical volume 90-2 as the S-VOL. Attribute information indicating the order of data writing to the logical volumes 90-1 and 90-2 (that is, whether each is a P-VOL or an S-VOL) is stored in a pair management table T400 described later.
- T400 pair management table
- An S-VOL in which a copy of a P-VOL is stored is referred to as “a volume that is paired with the P-VOL” or “the pair volume of the P-VOL”.
- Likewise, the P-VOL whose copy is stored in an S-VOL is also referred to as “a volume that is paired with the S-VOL” or “the pair volume of the S-VOL”.
- A set consisting of a P-VOL and the S-VOL that is its pair volume is called a “pair” or “volume pair”.
- When the storage apparatus 400-1 receives a WRITE command and the data to be written by that command (referred to as write data), it stores the write data in the P-VOL and then issues to the storage apparatus 400-2 a command (the WRFBA command, described later) for storing the write data in the S-VOL that is the pair volume.
- The WRFBA command and the write data are transmitted to the storage apparatus 400-2 via the I/F (531b) of the storage apparatus 400-1 and the SAN 220.
- The storage apparatus 400-2 that has received the WRFBA command stores the write data in the S-VOL.
- The storage apparatus 400-2 then notifies the storage apparatus 400-1 that the storage of the write data in the S-VOL is complete.
- On receiving this notification, the storage apparatus 400-1 replies to the host 100 that the write process has been completed.
- Conversely, when the storage apparatus 400-2 receives the WRITE command and the write data, it issues to the storage apparatus 400-1 a command (WRFBA command) for storing the write data in the P-VOL.
- The storage apparatus 400-1 that has received this command stores the write data in the P-VOL.
- When the storage of the write data in the P-VOL is completed, the storage apparatus 400-1 notifies the storage apparatus 400-2 of the completion.
- The storage apparatus 400-2 then stores the write data in the S-VOL and, after that storage is completed, returns to the host 100 a response indicating that the write process has been completed.
- In other words, even when the WRITE command is received for the S-VOL, the storage system 300 first stores data in the P-VOL and then stores data in the S-VOL, as in the case of receiving the WRITE command for the P-VOL.
- the processing is performed in such a flow so that the contents of the P-VOL and the S-VOL are kept the same.
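The write ordering described above can be illustrated with a small model. This is a sketch under assumptions: the class `StorageApparatus`, its method names, and the shared `write_log` are hypothetical, the transfer over the SAN is reduced to a direct method call, and error handling is omitted.

```python
class StorageApparatus:
    """Hypothetical model of one storage apparatus holding one side of a volume pair."""

    def __init__(self, role, write_log):
        self.role = role            # "P-VOL" or "S-VOL" (the write-order attribute)
        self.peer = None            # the apparatus holding the pair volume
        self.volume = {}            # address -> data
        self.write_log = write_log  # shared list recording the order of stores

    def store(self, address, data):
        self.volume[address] = data
        self.write_log.append(self.role)

    def handle_wrfba(self, address, data):
        """Command from the peer apparatus: store the data, then report completion."""
        self.store(address, data)
        return "complete"

    def handle_write(self, address, data):
        """WRITE command from the host; the P-VOL is always written first."""
        if self.role == "P-VOL":
            self.store(address, data)
            self.peer.handle_wrfba(address, data)
        else:
            self.peer.handle_wrfba(address, data)
            self.store(address, data)
        return "write complete"


def make_pair():
    log = []
    primary = StorageApparatus("P-VOL", log)
    secondary = StorageApparatus("S-VOL", log)
    primary.peer, secondary.peer = secondary, primary
    return primary, secondary, log


primary, secondary, log_via_primary = make_pair()
primary.handle_write(0x10, b"data")

primary2, secondary2, log_via_secondary = make_pair()
secondary2.handle_write(0x10, b"data")
```

In both runs the log records the P-VOL store before the S-VOL store, and both volumes end up with the same contents, whichever apparatus the host addressed.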
- control to make the contents the same may not be performed.
- In the example described above, the P-VOL is in the storage apparatus 400-1 and the S-VOL is in the storage apparatus 400-2.
- Conversely, the P-VOL may be in the storage apparatus 400-2 and the S-VOL in the storage apparatus 400-1.
- write data from the host 100 is first stored in the P-VOL, and then stored in the S-VOL.
- There is no restriction that all the logical volumes in a storage apparatus 400 must be P-VOLs (or that all must be S-VOLs); P-VOLs and S-VOLs may be mixed in the storage apparatus 400.
- Nor do all the logical volumes in the storage apparatus 400-1 (or 400-2) need to have a pair relationship with logical volumes in the storage apparatus 400-2 (or 400-1); logical volumes that are not in a pair relationship may exist in the storage apparatus 400.
- the processing performed by the storage apparatus 400 when an I / O command for a volume having a pair relationship is received will be mainly described.
- The host 100 can access the same data regardless of which of the paired volumes, the P-VOL (logical volume 90-1) or the S-VOL (logical volume 90-2), it accesses. However, the host 100 does not need to be aware that the logical volume 90-1 and the logical volume 90-2 store the same data. This is because the storage system 300 according to the embodiment of the present invention has a function that causes the host 100 to recognize the logical volume 90-1 of the storage apparatus 400-1 and the logical volume 90-2 of the storage apparatus 400-2 as apparently the same volume. This function is described below.
- The storage apparatus 400 has a function of making one or more virtual storage apparatuses (hereinafter referred to as virtual storages), distinct from physical storage apparatuses such as the storage apparatus 400, appear to exist on the SAN 210.
- This function is called “virtual storage function” in the embodiment of the present invention. An example in which a virtual storage is defined in the storage apparatus 400 will be described with reference to FIG.
- the virtual storage has a device serial number (Ser #).
- the device serial number of the virtual storage is referred to as “virtual product number”.
- one or more logical volumes of the storage apparatus 400 can belong to the virtual storage in accordance with a user instruction.
- FIG. 2 shows a configuration example in which the serial number of the storage apparatus 400-1 is 65000, the serial number of the storage apparatus 400-2 is 66000, and the serial number of the defined virtual storage is 63500.
- the storage apparatus 400-1 has a logical volume 90-1
- the storage apparatus 400-2 has a logical volume 90-2.
- the logical volume of the storage apparatus 400 is given an identifier (identification number) that is unique within the storage apparatus 400, and the number is referred to as “LDEV #” or “R-LDEV #”.
- The R-LDEV # of the logical volume 90-1 is 0x2000 (“0x2000” means “2000” in hexadecimal), and the R-LDEV # of the logical volume 90-2 is 0x5000.
- the logical volume 90-1 and the logical volume 90-2 belong to the virtual storage according to an instruction from the user.
- the logical volume 90-1 and the logical volume 90-2 can be assigned unique numbers in the virtual storage in addition to the R-LDEV #.
- this number is called “virtual LDEV #” or “V-LDEV #”.
- This number is also a number that can be arbitrarily set by the user (however, the V-LDEV # assigned to each logical volume must be unique within the virtual storage).
- the V-LDEV # of the logical volume 90-1 and the logical volume 90-2 that belong to the virtual storage are both set to 0x1000.
- the alternate path software 1111 is running on the host 100.
- The alternate path software 1111 has a function of recognizing that there are a plurality of access paths (called paths) from the host 100 to a logical volume and, when accessing the logical volume, selecting the path to be used from among them.
- To do so, the alternate path software 1111 acquires volume identification information by issuing a command for acquiring it, such as the INQUIRY command defined in the SCSI standard, to each logical volume that can be recognized from the host 100.
- the storage apparatus 400 is configured to return an apparatus serial number and R-LDEV # as volume identification information when receiving an INQUIRY command for a logical volume.
- the logical volume to be inquired by the INQUIRY command belongs to the virtual storage, it has a function of returning the device serial number of the virtual storage and the V-LDEV # as the volume identification information.
- the storage apparatus 400-1 receives an INQUIRY command for the logical volume 90-1
- the storage apparatus 400-2 receives an INQUIRY command for the logical volume 90-2
- the alternate volume software 1111 since the same volume identification information is returned to the alternate path software 1111 from the logical volume 90-1 and the logical volume 90-2, the alternate volume software 1111 has the same volume for the logical volume 90-1 and the logical volume 90-2. Recognize. As a result, the alternate path of the path from the host 100 to the logical volume 90-1 (solid arrow in the figure; this path is hereinafter referred to as “path 1”) is the path from the host 100 to the logical volume 90-2 (see FIG. Middle dotted arrow. Hereinafter, this path is referred to as “path 2”).
- the alternate path software 1111 accesses the logical volume 90-1 from the application program 1112 or the like.
- the alternate path software 1111 issues an access request via the path 2 (that is, issues an access request to the logical volume 90-2).
- Even when the alternate path software 1111 issues an access request to the logical volume 90-2, operation continues without problems, because the logical volume 90-2 stores the same data as the logical volume 90-1.
- When the host 100 accesses a logical volume by issuing a WRITE command or the like, it specifies at least a LUN (logical unit number) and an identifier (port ID) of the I/F 531 as information for uniquely specifying the logical volume. Therefore, the storage apparatus 400 according to the embodiment of the present invention assigns a LUN and a port ID to each logical volume accessed from the host 100, and manages the correspondence between the logical volume and the LUN and port ID assigned to it in the volume management table T300.
- FIG. 3 shows an example of the volume management table T300.
- the volume management table T300 is stored in a shared memory area in the memory 510 of the storage apparatus 400.
- The upper table T300-1 and the lower table T300-2 are both volume management tables. The table T300-1 is the volume management table managed by the storage apparatus 400-1, and the table T300-2 is the volume management table managed by the storage apparatus 400-2.
- Since both tables have the items T301 to T307 described below, the table T300-1 and the table T300-2 are referred to as the “volume management table T300” when they do not need to be distinguished. When they need to be distinguished, the table T300-1 is referred to as the “volume management table T300-1” and the table T300-2 as the “volume management table T300-2”. Only information about the logical volumes in the storage apparatus 400-1 is stored in the table T300-1, and only information about the logical volumes in the storage apparatus 400-2 is stored in the table T300-2.
- Each item of the volume management table T300 will be described.
- Each row of the volume management table T300 represents information about one logical volume.
- LDEV (T303) stores the R-LDEV # of the logical volume.
- R-Ser # (T304) stores the serial number (Ser #) of the storage device in which the logical volume exists (hereinafter, the serial number of the storage device in which the logical volume exists is referred to as the “logical volume serial number”).
- Port ID (T301) and LUN (T302) are columns for storing the port ID and LUN assigned to the logical volume.
- VLDEV # (T305) stores the virtual LDEV # set for the logical volume, and V-Ser # (T306) stores the virtual serial number of the virtual storage to which the logical volume belongs (hereinafter, to keep the notation concise, the virtual serial number of the virtual storage to which the logical volume belongs is referred to as the “virtual volume virtual serial number”).
- the I / O mode (T307) is a kind of attribute information that indicates the status of whether or not the logical volume can be written, and the specific contents will be described later.
- the V-LDEV # of the logical volume stored in the row (R300-1) of the volume management table T300-1 is 0x1000, and the virtual product number is 63500.
- the V-LDEV # of the logical volume stored in the row (R300-2) of the volume management table T300-2 is also 0x1000, and the virtual product number is 63500.
- Suppose the host 100 issues an INQUIRY command to the storage apparatus 400-1 for the logical volume stored in the row (R300-1) of the volume management table T300-1, to inquire about the information on that logical volume (that is, the host 100 designates the port ID (3e243174 aaaaaaaaa) and LUN (0) and issues the INQUIRY command).
- The MP 521 of the storage apparatus 400-1 refers to the volume management table T300-1 and returns the information of the VLDEV # (T305) and V-Ser # (T306) of the logical volume to be inquired.
- the host 100 issues an INQUIRY command for the logical volume stored in the row (R300-2) of the volume management table T300-2 to the storage apparatus 400-2 and inquires about the information of the logical volume.
- the MP 521 of the storage apparatus 400-2 refers to the volume management table T300-2 and returns the information of the VLDEV # (T305) and V-Ser # (T306) of the logical volume to be inquired.
- the host 100 (alternate path software 1111) recognizes that the two logical volumes are the same.
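The INQUIRY handling described above can be sketched as a table lookup. This is only an illustration under stated assumptions: the class `VolumeRow`, the function `inquiry`, the port IDs, and the real LDEV and serial numbers are hypothetical; only the V-LDEV # 0x1000 and the virtual serial number 63500 come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VolumeRow:
    """One row of a volume management table (columns T301 to T307)."""
    port_id: str           # T301
    lun: int               # T302
    r_ldev: int            # T303, real LDEV #
    r_ser: int             # T304, real serial number
    v_ldev: Optional[int]  # T305, None when the volume is not in a virtual storage
    v_ser: Optional[int]   # T306
    io_mode: str           # T307


def inquiry(table: List[VolumeRow], port_id: str, lun: int) -> Tuple[int, int]:
    """Return (serial number, LDEV #) as the volume identification information."""
    for row in table:
        if row.port_id == port_id and row.lun == lun:
            if row.v_ldev is not None and row.v_ser is not None:
                # Volume belongs to a virtual storage: answer with virtual IDs.
                return (row.v_ser, row.v_ldev)
            return (row.r_ser, row.r_ldev)
    raise LookupError("no logical volume for this port ID and LUN")


# Hypothetical rows; the two apparatuses hold different real IDs.
t300_1 = [VolumeRow("port-A", 0, 0x2000, 64000, 0x1000, 63500, "Mirror")]
t300_2 = [VolumeRow("port-B", 0, 0x3000, 65000, 0x1000, 63500, "Mirror")]

same_volume = inquiry(t300_1, "port-A", 0) == inquiry(t300_2, "port-B", 0)
```

Because both lookups return the same pair, path software on the host would treat the two volumes as one volume reachable over two paths.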
- the pair management table T400 is stored in the shared memory area of the storage apparatus 400, like the volume management table.
- the pair management table T400 is a table that both the storage apparatus 400-1 and the storage apparatus 400-2 have.
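- When the pair management tables of the storage apparatus 400-1 and the storage apparatus 400-2 need to be distinguished, they are referred to as the pair management table T400-1 and the pair management table T400-2, respectively.
- When they do not need to be distinguished, they are referred to as the pair management table T400.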
- the pair management table T400 is a table for managing volume pair information, and each row represents information about one volume pair. Each volume pair is managed with an identification number called a pair number, and the pair number (T401) stores the pair number.
- P-VOL # (T404) stores the R-LDEV # of the logical volume that is P-VOL.
- the device serial number of the logical volume specified by P-VOL # (T404) is stored in PDKC # (T403).
- S-VOL (T405) stores the R-LDEV # of the volume (S-VOL) paired with the logical volume specified by P-VOL # (T404), and SDKC # (T406) stores the device serial number of the logical volume specified by S-VOL # (T405).
- Pair status (T402) stores the state of the volume pair. The pair status is described below.
- the process when data is written from the host 100 to the volume pair has been described above.
- the example in which the data written to the P-VOL is always copied to the S-VOL has been described.
- the storage device 400 manages the status of the volume pair, and when performing data access (write, etc.) to the P-VOL or S-VOL, processing is performed according to the status of the volume pair.
- a state in which the contents of the P-VOL and S-VOL are the same is referred to as a “Pair” state.
- When the state (status) of a certain volume pair is Pair, “1” is stored in the pair status (T402) of the pair management table T400.
- When the host 100 issues an update command such as a WRITE command to a volume pair whose pair status is Pair, both the P-VOL and S-VOL are updated; that is, the write data is written to both the P-VOL and the S-VOL (double-written).
- A volume pair can also be in a state in which data written to one volume is not copied to the other; the pair status of the volume pair at that time is called the “Suspend” state.
- When the pair status is Suspend, “0” is stored in the pair status (T402) of the pair management table T400, and an update command such as a WRITE command updates only one of the P-VOL and S-VOL.
- the storage apparatus 400 autonomously changes the pair status from “Pair” to “Suspend”.
- The pair status may also be changed in response to the user of the storage apparatus 400 issuing an instruction to change the pair status (for example, from Pair to Suspend) from the host 100 or the management terminal to the storage apparatus 400.
- The repair flag (T407) stores “1” when the contents of the P-VOL and S-VOL coincide only partially because a failure occurred during data writing to the logical volume.
- the information for example, logical block address (LBA)
- the I / O mode (T307) managed by the volume management table T300 will be described.
- The I/O mode is a kind of attribute information that indicates the write status of the logical volume. When the storage apparatus 400 receives a WRITE command for the P-VOL or S-VOL from the host 100, it performs processing that takes the I/O mode into account in addition to the pair status.
- At least four types of attributes can be set in the I/O mode: “Mirror”, “Local”, “Remote”, and “Block”.
- When the “Mirror” attribute is set for a logical volume, “0001” (binary notation) is stored in the I/O mode (T307) column for that logical volume.
- Likewise, when the “Local”, “Remote”, or “Block” attribute is set, “0010”, “0100”, or “1000”, respectively, is stored in the I/O mode (T307) for the relevant logical volume.
- Only one of “Mirror”, “Local”, “Remote”, and “Block” is set as the I/O mode of a logical volume (for example, the two I/O modes Local and Block are never set at the same time).
- When the I/O mode of a certain logical volume is “Mirror”, the logical volume is in a state where data can be replicated with its pair volume. In this case, the pair status of the volume pair composed of the logical volume and its pair volume is set to “Pair”.
- When the I/O mode of the logical volume is “Remote”, the logical volume is not updated even if a data update request for the logical volume is received. Instead, the data update is performed on the pair volume of the logical volume.
- When the I/O mode of the logical volume is “Block”, the logical volume is not updated even if a data update request for the logical volume is received, and an error is returned to the device (such as the host 100) that issued the data update request.
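The four I/O modes and their effect on an update request can be sketched as follows. This is an illustration under assumptions: the names `IOMode` and `write_targets` are hypothetical, the codes are assumed to map to the modes in the order they are listed (Mirror, Local, Remote, Block), and the behaviour for Local (update only the volume itself) is an assumption inferred from the mode name.

```python
from enum import Enum
from typing import List


class IOMode(Enum):
    # One-hot codes as stored in I/O mode (T307), in binary notation.
    MIRROR = 0b0001
    LOCAL = 0b0010   # assumed code assignment
    REMOTE = 0b0100  # assumed code assignment
    BLOCK = 0b1000   # assumed code assignment


def write_targets(mode: IOMode) -> List[str]:
    """Return which volumes an update request touches: 'self' and/or 'pair'."""
    if mode is IOMode.BLOCK:
        # Block: nothing is updated and the requester gets an error.
        raise PermissionError("I/O mode is Block")
    if mode is IOMode.MIRROR:
        return ["self", "pair"]  # data is replicated with the pair volume
    if mode is IOMode.REMOTE:
        return ["pair"]          # the update goes to the pair volume instead
    return ["self"]              # Local (assumption): update only this volume
```

Because exactly one mode is set per volume, a plain enumeration is sufficient here; a bit-flag type would allow combinations that the description rules out.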
- The I/O mode may change autonomously according to the status of the storage apparatus 400 and the like, but it can also be changed when the user of the storage apparatus 400 issues an instruction from the host 100 or the management terminal.
- The storage apparatus 400 stores management information in tables called the volume management table T300 and the pair management table T400, but this does not mean that the management information must be managed in a data structure called a table.
- The management information may be managed using a data structure other than a table.
- The pair management table T400 stores information (P-VOL # (T404), S-VOL (T405), etc.) for identifying whether a certain logical volume is a P-VOL or an S-VOL.
- Information for identifying whether a certain logical volume is a P-VOL or an S-VOL can be regarded as a kind of attribute information set in the logical volume. Therefore, information indicating whether the logical volume is a P-VOL or an S-VOL may instead be stored in the I/O mode (T307) of the volume management table T300.
- In this specification, the information stored in the I/O mode (T307) of the volume management table T300 and the information for identifying whether the logical volume is a P-VOL or an S-VOL (P-VOL # (T404), S-VOL (T405), and the like) are collectively referred to as “volume attribute information”.
- When the MP 521 of the storage apparatus 400 receives an access command (WRITE command or the like) for a certain logical volume from the host 100, it refers to the volume management table T300 and the pair management table T400 stored in the shared memory area of its own apparatus, and thereby identifies the volume attribute information of the access target logical volume, including whether it is a P-VOL or an S-VOL. The MP 521 then performs processing according to the identified volume attribute information.
- The volume management table T300 and the pair management table T400 referred to at this time are the tables of the own apparatus; the tables of other apparatuses are not referred to (that is, the MP 521 of the storage apparatus 400-1 operates by referring to the volume management table T300 and the pair management table T400 managed in the storage apparatus 400-1, and does not refer to the tables in the storage apparatus 400-2).
- the contents stored in the volume management table T300 and the pair management table T400 managed by the storage device 400-1 and the storage device 400-2 are managed so as not to be in an inconsistent state. For example, when a certain logical volume is managed as a P-VOL in the pair management table T400-1, the logical volume is also managed as a P-VOL in the pair management table T400-2.
- the commands (a) and (b) may be referred to as information collection commands in this specification.
- the information collection command is used when the host 100 wishes to acquire the status of the logical volume.
- The command (a) is a command for the host 100 to inquire about the status of the specified LBA; specifically, it is a command for inquiring whether a physical storage area is allocated to the specified LBA.
- the command (b) is used when the host 100 makes an inquiry about the state of the designated logical volume, for example, the capacity.
- The commands (c) to (f) are commands used by the host 100 to perform I/O processing, in particular updates, on the logical volume. In this specification they are therefore called “I/O commands” or “update commands”.
- The WRITE SAME command in (c) is used when the host 100 writes the same data to an entire predetermined area (one block or more) of the logical volume. Information specifying the area on the logical volume (specifically, the LUN that specifies the logical volume, and the LBA and data length that specify the area on that logical volume) is designated as command parameters, and one block of data is designated as write data (hereinafter, the designated one block of data is referred to as “pattern data”). On receiving this command, the storage apparatus 400 writes the designated pattern data to the entire designated area. For example, when all-zero data (data in which all bits in one block are 0) is specified as pattern data, the storage apparatus 400 writes 0 to the entire designated area.
- the UNMAP command is used when the host 100 wants to unmap (described later) a predetermined area of the logical volume.
- the storage apparatus 400 has a function of setting a page to which a physical storage area is allocated to a state in which no physical storage area is allocated.
- this function is called an unmap function.
- a state in which a page is not assigned a physical storage area is referred to as an “unmapped state”.
- an operation for bringing a page into an unmapped state is called “unmapping” or “page deallocation”.
- the storage apparatus 400 performs unmapping in response to receiving the UNMAP command from the host 100.
- the user or the host 100 issuing the UNMAP command can specify one or more pieces of information of an area (area specified by LUN, LBA and data length) to be unmapped as a parameter of the UNMAP command.
- The minimum unit of the area that can be specified by the UNMAP command is one block (the minimum access unit when the host 100 reads or writes data of a volume of the storage apparatus 400; its size is 512 bytes, as an example).
- The page size of the logical volume (42 MB, for example) is larger than one block. Therefore, a single UNMAP command issued from the host 100 does not necessarily unmap an entire page.
- For this reason, when the storage apparatus 400 according to the embodiment of the present invention receives the UNMAP command, it writes “0” to all areas specified by the parameters of the UNMAP command.
- The storage apparatus 400 also has a function of recognizing, among the areas in a page (precisely, the “physical storage area allocated to the page”, but described as “area in the page” to avoid redundant description), the areas in which “0” is stored (or a function of determining, for each page, whether 0 is stored in all areas in the page). When, after receiving the UNMAP command one or more times and writing “0” to the areas specified by it, the storage apparatus 400 recognizes that “0” is stored in all areas of a page, it performs page deallocation for that page.
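The UNMAP behaviour described above (write zeros, then deallocate a page once it is entirely zero) can be sketched as follows. This is a minimal model under assumptions: the class `ThinVolume` and its methods are hypothetical, and the page is shrunk to 8 blocks so the example stays small; the description gives 42 MB as an example page size.

```python
BLOCK_SIZE = 512     # bytes; the example block size from the description
BLOCKS_PER_PAGE = 8  # hypothetical tiny page, for illustration only
PAGE_BYTES = BLOCK_SIZE * BLOCKS_PER_PAGE


class ThinVolume:
    def __init__(self) -> None:
        # page number -> physical storage area allocated to that page
        self.pages = {}

    def write(self, lba: int, data: bytes) -> None:
        """Write whole blocks starting at lba, allocating pages on demand."""
        if len(data) % BLOCK_SIZE != 0:
            raise ValueError("data must be a whole number of blocks")
        for i in range(len(data) // BLOCK_SIZE):
            page_no, off = divmod(lba + i, BLOCKS_PER_PAGE)
            page = self.pages.setdefault(page_no, bytearray(PAGE_BYTES))
            start = off * BLOCK_SIZE
            page[start:start + BLOCK_SIZE] = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]

    def unmap(self, lba: int, num_blocks: int) -> None:
        """Zero the given blocks, then deallocate pages that became all zero."""
        touched = set()
        for block in range(lba, lba + num_blocks):
            page_no, off = divmod(block, BLOCKS_PER_PAGE)
            page = self.pages.get(page_no)
            if page is None:
                continue  # already in the unmapped state
            start = off * BLOCK_SIZE
            page[start:start + BLOCK_SIZE] = bytes(BLOCK_SIZE)
            touched.add(page_no)
        for page_no in touched:
            if not any(self.pages[page_no]):  # every byte of the page is 0
                del self.pages[page_no]       # page deallocation

    def is_allocated(self, lba: int) -> bool:
        return lba // BLOCKS_PER_PAGE in self.pages
```

For example, after writing non-zero data to all 8 blocks of page 0, `unmap(0, 4)` leaves the page allocated, and a second `unmap(4, 4)` deallocates it.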
- the COMPARE AND WRITE command may also be called an ATS (Atomic Test and Set) command.
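- Information (LUN, LBA, and data length) for specifying an area on the logical volume is specified as a command parameter of the COMPARE AND WRITE command.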
- the storage apparatus 400 that has received the COMPARE AND WRITE command compares the data in the logical volume area specified by the parameter with the transmitted compare data. As a result of the comparison, if the content of the data in the area on the logical volume is the same as the compare data, the write data is written. Otherwise, the write process is not performed.
- the area specified by the parameter is locked from the start of the data comparison process until the writing of the write data is completed. That is, data access to the area by other processing is prohibited.
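The compare-then-write semantics can be sketched as follows. This is a simplified illustration: the class `Volume` and method `compare_and_write` are hypothetical names, offsets are in bytes rather than blocks, and one lock covers the whole volume, whereas the description locks only the area specified by the parameter.

```python
import threading


class Volume:
    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        # Simplification: a single lock for the whole volume.
        self._lock = threading.Lock()

    def compare_and_write(self, offset: int, compare_data: bytes,
                          write_data: bytes) -> bool:
        """Write write_data only if the area currently equals compare_data."""
        if len(compare_data) != len(write_data):
            raise ValueError("compare data and write data must be the same length")
        end = offset + len(compare_data)
        # The lock is held from the start of the comparison until the write
        # finishes, so no other access can slip in between the two steps.
        with self._lock:
            if bytes(self.data[offset:end]) != bytes(compare_data):
                return False  # mismatch: the write is not performed
            self.data[offset:end] = write_data
            return True
```

Holding the lock across both steps is what makes the operation atomic; releasing it between the comparison and the write would allow a second initiator to pass the same comparison.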
- the command (WRITE command) of (f) is used when the host 100 writes designated data to a predetermined area of the logical volume.
- In most cases where the host 100 writes data, this WRITE command is issued to the storage device. However, when a specific pattern of data (for example, all zeros or all ones) is to be written to a given area of the logical volume, for example when all data in a predetermined range of the logical volume is to be erased, the WRITE SAME command of (c) is issued instead.
- Information (LUN, LBA, and data length) for specifying an area on the logical volume is specified as a command parameter. Further, data having the same size as the length of the designated area is transmitted as write data.
- The MP 521 of the storage apparatus 400 first determines whether or not the received command is an information collection command (S1). If the received command is an information collection command, the processing from S3 onward is performed.
- The MP 521 further analyzes the command type. For example, if the command type is GET LBA STATUS, the MP 521 acquires the state of the LBA specified by the command parameter (for example, whether a physical storage area is allocated) and returns the information to the host 100. If the command type is LOG SENSE, information on the logical volume or storage apparatus 400 specified by the command parameter is returned.
- In other words, when the storage apparatus 400 receives an information collection command, it returns the information on the logical volume or storage apparatus 400 specified by the command parameter to the command issuer (host 100 or the like) without considering the I/O mode or pair status of the logical volume.
- If the received command is not an information collection command, the processing from S2 onward is performed.
- The MP 521 determines the type of command, and performs one of S11, S21, S31, and S40 according to the type of command.
- If the command is WRITE SAME, the MP 521 performs the processing from S11 onward.
- If the command is UNMAP, the MP 521 performs the processing from S21 onward.
- If the command is ATS, the MP 521 performs the processing from S31 onward.
- If the command is WRITE, the MP 521 performs the processing from S40 onward.
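The command classification above can be summarised as a small dispatch function. This is only a sketch: `dispatch`, `INFO_COMMANDS`, and `UPDATE_ENTRY_STEP` are hypothetical names, commands are represented by plain strings, and the return value is just the label of the step at which processing continues.

```python
# Information collection commands named in the description.
INFO_COMMANDS = {"GET LBA STATUS", "LOG SENSE"}

# Update commands and the step at which their processing starts.
UPDATE_ENTRY_STEP = {
    "WRITE SAME": "S11",
    "UNMAP": "S21",
    "COMPARE AND WRITE": "S31",  # also called ATS
    "WRITE": "S40",
}


def dispatch(command: str) -> str:
    """Return the label of the step that handles the received command."""
    if command in INFO_COMMANDS:  # S1: is it an information collection command?
        return "S3"               # answered without checking I/O mode or pair status
    step = UPDATE_ENTRY_STEP.get(command)
    if step is None:
        # Commands such as READ are handled elsewhere and omitted here.
        return "other"
    return step
```

A lookup table keeps the classification in one place, which mirrors how the description treats the four update commands as parallel branches.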
- the storage apparatus 400 may receive a command not described above (for example, a READ command that is a command used when reading data stored in a logical volume). In this case, for example, when a READ command is received, the storage apparatus 400 performs a logical volume read process and the like, but these processes are omitted in this specification.
- The MP 521 first refers to the volume management table T300 and converts the port ID and LUN specified in the command into a logical volume number (R-LDEV #). Then, referring to the pair management table T400, it determines whether the logical volume (access target volume) specified by the converted logical volume number is a P-VOL or an S-VOL (S11, S21, S31, S41), because the processing to be performed differs depending on whether the access target volume is a P-VOL or an S-VOL. Note that S41 is not shown in the figure.
- Next, the MP 521 refers to the volume management table T300 and determines whether or not the status (I/O mode) of the access target volume is Block (S12, S13, S22, S23, S32, S33, S42, S43). If the status of the access target volume is Block, the MP 521 returns an error to the host 100 and ends the process. If the status of the access target volume is not Block, the processing of S14, S15, S24, S25, S34, S44 or S45 is performed.
- the MP 521 refers to the volume management table T300 and determines the I / O mode of the access target volume (S140).
- The cases where the I/O mode is Local, Remote, and Mirror are described below.
- Unless otherwise specified, the description assumes that the range of the access target area of the logical volume specified by the WRITE SAME command matches one slot.
- the MP 521 secures a data area for one slot on the LM 522. Then, write data (one block) specified by the WRITE SAME command is written in all the reserved areas (S1421). For example, when the specified write data is all 1 (all bits in one block are “1”), “1” is written in all the bits of the reserved area. Further, an area for one slot is secured also in the buffer 532a of the CHA 530a, and the data stored in the area secured on the LM 522 is copied to the secured area on the buffer 532a.
- The processing here, that is, the processing for writing the one block of data specified by the WRITE SAME command into the entire area for one slot of the buffer 532, is referred to as “data expansion processing”.
- An example has been described in which the data for one slot is created by the data expansion processing in the buffer 532a of the CHA 530a. However, it may be created in an area other than the buffer 532a; for example, the present invention is effective even if the data is created in the LM 522.
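Data expansion amounts to repeating the one-block pattern until it fills a slot. A minimal sketch, assuming a hypothetical function name `expand_pattern` and a hypothetical slot size of 512 blocks (the slot size is not stated in this part of the description):

```python
BLOCK_SIZE = 512       # bytes; the example block size from the description
BLOCKS_PER_SLOT = 512  # hypothetical; chosen only for illustration


def expand_pattern(pattern: bytes, blocks_per_slot: int = BLOCKS_PER_SLOT) -> bytes:
    """Fill one slot with copies of the one-block pattern data."""
    if len(pattern) != BLOCK_SIZE:
        raise ValueError("pattern data must be exactly one block")
    return pattern * blocks_per_slot


# All-ones pattern: every bit of the resulting slot buffer is 1.
slot = expand_pattern(b"\xff" * BLOCK_SIZE, blocks_per_slot=4)
```

The same buffer can then be reused for every slot of a multi-slot WRITE SAME request, since each slot receives identical contents.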
- the MP 521 secures an unused area for one slot in the cache area (S1422).
- This process is a well-known process executed by a storage apparatus equipped with a disk cache.
- In this specification, this processing (processing for securing an unused area for one slot or a plurality of slots on the cache area) is referred to as the “slot securing process” or “cache slot securing process”.
- When the access target area spans a plurality of slots, the MP 521 secures a plurality of slots.
- the MP 521 writes the data stored in the area secured on the buffer 532a in S1421 to the unused area for one slot on the cache area secured in S1422.
- the same data (specified by the WRITE SAME command) is stored in the entire range of the access target area on the logical volume specified by the WRITE SAME command.
- When the processing of S1423 is completed, the MP 521 returns to the host 100 a response indicating that the processing related to the WRITE SAME command has completed normally (S1424).
- When the processing has completed normally, the status “GOOD” is returned to the host 100; when it has not, a status of “CHECK CONDITION” is returned to the host 100. Therefore, in the following, the storage apparatus 400 notifying the host 100 that the processing has completed normally is expressed as “returning a GOOD status”.
- the MP 521 refers to the pair management table T400, and identifies the device serial number and LDEV # of the volume (S-VOL) that is paired with the access target volume. Then, a command instructing to write the data created in S1431 to the S-VOL is issued to the storage apparatus 400 (storage apparatus 400-2) in which the S-VOL exists.
- the command issued here is called a WRFBA command.
- the WRFBA command is valid only between the storage apparatuses 400.
- the contents performed by the storage apparatus 400 that has received the WRFBA command are similar to those in the case of receiving the WRITE command, and the process of storing the specified data in the area (LBA) on the specified logical volume is performed. Therefore, the storage apparatus 400-2 that has received the WRFBA command transmitted from the storage apparatus 400-1 through the process of S1432 performs a process of writing data to the S-VOL. This process will be described later.
- The WRFBA command instructs data writing for one slot. For this reason, when writing to an area corresponding to a plurality of slots is instructed by the WRITE SAME command, the WRFBA command is issued multiple times here to instruct the storage apparatus 400-2 to write the data to the S-VOL. The WRFBA command and the write data are transmitted to the storage apparatus 400-2 via the I/F (531b) and the SAN 220.
- FIG. 25 shows an example of the format of the WRFBA command used in the storage apparatus 400 according to the embodiment of the present invention.
- the WRFBA command includes information on an operation code (Opcode, sometimes called an opcode) 701, write position information 702, an issuer I / O mode 703, and an exclusion reservation flag 704. Further, other information may be included.
- the operation code 701 is a 1-byte code (for example, a value such as 0xDA) indicating that the transmitted command is a WRFBA command.
- the write position information 702 is information for specifying the area of the logical volume to which the write data is to be written, and includes the logical volume LDEV # and the logical block address (LBA).
- the storage apparatus 400 that has received the WRFBA command performs a process of storing write data in an area for one slot from the LBA of the logical volume specified by the write position information 702.
- the issuer I / O mode 703 is information on the I / O mode of a logical volume that is paired with the logical volume specified by the write position information 702. That is, when the storage apparatus 400 that issues the WRFBA command (assumed to be the storage apparatus 400-1) instructs to write to the S-VOL of the storage apparatus 400 (400-2) that is the command issue destination, the issuer I / O mode In 703, a WRFBA command including information on the I / O mode of the P-VOL in itself (storage device 400-1) is created and transmitted to the storage device 400-2. How this information is used will be described later.
- The exclusive reservation flag 704 stores 0 or 1. When 1 is stored, the storage apparatus 400 that has received the WRFBA command performs the data exclusion securing process described later, and then performs the write process to the logical volume. The storage apparatus 400 creates a command with the exclusive reservation flag 704 set to “1” only when the logical volume to be written by the WRFBA command is a P-VOL and the I/O mode of the P-VOL is “Mirror”. Therefore, in S1432, a command in which the exclusive reservation flag 704 is set to 0 is created and transmitted.
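The WRFBA fields can be illustrated with a small encoder and decoder. Only the 1-byte operation code (with 0xDA as an example value) and the list of fields come from the description; the field widths, the byte order, the value used for Mirror, and the names `Wrfba` and `make_wrfba` are assumptions made for this sketch.

```python
import struct
from dataclasses import dataclass

WRFBA_OPCODE = 0xDA  # example value from the description
MIRROR = 0b0001      # I/O mode code for Mirror

# Assumed layout, big-endian: opcode (u8), LDEV # (u16), LBA (u64),
# issuer I/O mode (u8), exclusive reservation flag (u8).
_FORMAT = ">BHQBB"


@dataclass
class Wrfba:
    ldev: int                   # write position information 702: LDEV #
    lba: int                    # write position information 702: LBA
    issuer_io_mode: int         # issuer I/O mode 703
    exclusive_reservation: int  # exclusive reservation flag 704

    def pack(self) -> bytes:
        return struct.pack(_FORMAT, WRFBA_OPCODE, self.ldev, self.lba,
                           self.issuer_io_mode, self.exclusive_reservation)

    @classmethod
    def unpack(cls, raw: bytes) -> "Wrfba":
        opcode, ldev, lba, mode, flag = struct.unpack(_FORMAT, raw)
        if opcode != WRFBA_OPCODE:
            raise ValueError("not a WRFBA command")
        return cls(ldev, lba, mode, flag)


def make_wrfba(target_is_pvol: bool, target_io_mode: int, ldev: int,
               lba: int, issuer_io_mode: int) -> Wrfba:
    """Set flag 704 to 1 only for a write to a P-VOL whose I/O mode is Mirror."""
    flag = 1 if (target_is_pvol and target_io_mode == MIRROR) else 0
    return Wrfba(ldev, lba, issuer_io_mode, flag)
```

With this rule, a command sent from the P-VOL side to the S-VOL (as in S1432) always carries flag 0, while a command sent to a Mirror-mode P-VOL carries flag 1.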
- a response indicating that the processing is completed is returned to the storage device 400-1.
- The MP 521 of the storage apparatus 400-1 receives the response from the storage apparatus 400-2 (S1433); in other words, after issuing the WRFBA command in S1432, the storage apparatus 400-1 waits for the response from the storage apparatus 400-2.
- When writing to an area corresponding to a plurality of slots is instructed, the MP 521 issues the WRFBA command multiple times to the storage apparatus 400-2 in which the S-VOL exists. More precisely, the MP 521 performs S1432 (issue of a WRFBA command) and S1433 (reception of a response from the storage apparatus 400-2) a plurality of times.
- When the response indicating that the processing is completed is received in S1433, the MP 521 returns a GOOD status to the host 100 (S1434). This process is the same as S1424. After returning the GOOD status to the host 100, the MP 521 ends the process.
- When the I/O mode is Mirror, data is written to both the P-VOL and S-VOL. Therefore, data is written to the logical volume in both the storage apparatus 400-1 where the P-VOL exists and the storage apparatus 400-2 where the S-VOL exists.
- the order of data writing is determined such that writing to the P-VOL is performed first and then writing to the S-VOL is performed.
- the range of the access target area of the logical volume specified by the WRITE SAME command is a size corresponding to a plurality of slots.
- the MP 521 locks the entire range of the access target area of the logical volume specified by the WRITE SAME command.
- This processing is referred to as “data exclusion ensuring processing” in this specification.
- Access by other processes (such as a process of reading from or writing to the range) to the entire range of the access target area of the logical volume is prevented until the data exclusive release process described later is performed.
- the P-VOL area is locked by the data exclusion reservation process.
- writing to a logical volume whose I / O mode is Mirror is performed first from the P-VOL. Therefore, even when a write request (such as a WRITE command) to the S-VOL is received from the host 100 immediately after the execution of S1411, for example, a write process to the P-VOL is started first.
- In S1412 to S1414, the same processing as S1421 to S1423 described above is performed. That is, the MP 521 performs data expansion processing, cache slot securing processing, and data write processing to the cache area. However, in S1413, if the range of the access target area of the logical volume is the size of a plurality of slots, a plurality of cache slots are secured.
- the MP 521 (storage apparatus 400-1) releases the lock of the area locked in S1411 (S1417).
- this processing is referred to as “data exclusive release processing”.
- the MP 521 performs the processes of S1424 and S1425 and ends the process. In the slot release process performed in S1425, all the cache slots secured in S1413 are released.
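The Mirror-mode sequence on the P-VOL side (lock, write P-VOL, write S-VOL, unlock, respond) can be sketched as follows. This is a simplified model: `RangeLock` and `mirror_write_pvol` are hypothetical names, offsets are in bytes, cache handling is collapsed into a direct write, and the step that forwards data to the S-VOL between the P-VOL write and the lock release is an assumption based on the statement that the P-VOL is written first and the S-VOL second.

```python
from typing import Callable, List, Tuple


class RangeLock:
    """Tracks locked [start, end) ranges of one logical volume."""

    def __init__(self) -> None:
        self._held: List[Tuple[int, int]] = []

    def acquire(self, start: int, length: int) -> bool:
        end = start + length
        if any(s < end and start < e for s, e in self._held):
            return False  # overlaps a range locked by another process
        self._held.append((start, end))
        return True

    def release(self, start: int, length: int) -> None:
        self._held.remove((start, start + length))


def mirror_write_pvol(lock: RangeLock, pvol: bytearray,
                      send_to_svol: Callable[[int, bytes], None],
                      offset: int, data: bytes) -> str:
    # S1411: data exclusion securing process on the whole target range.
    if not lock.acquire(offset, len(data)):
        raise BlockingIOError("range is locked by another process")
    try:
        # S1412 to S1414, condensed: the P-VOL is written first.
        pvol[offset:offset + len(data)] = data
        # Assumed step: the same data is then written to the S-VOL.
        send_to_svol(offset, data)
    finally:
        # S1417: data exclusive release process.
        lock.release(offset, len(data))
    return "GOOD"  # S1424
```

Keeping the lock until the S-VOL write has finished prevents a concurrent write to the same range from being applied to the two volumes in different orders.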
- Next, the flow of the processing performed in the storage apparatus 400 when it receives the WRITE SAME command and the volume specified as the access target volume in the command is an S-VOL will be described.
- A case where the P-VOL is in the storage apparatus 400-1 and the volume (S-VOL) paired with the P-VOL is in the storage apparatus 400-2 will be described as an example.
- the processes described below are executed by the MP 521 of the storage apparatus 400-2 (storage apparatus in which the S-VOL exists) unless otherwise specified.
- the MP 521 refers to the volume management table T300 and determines the I / O mode of the access target volume (S150).
- The cases where the I/O mode is Local, Remote, and Mirror are described below.
- Unless otherwise specified, the description assumes that the range of the access target area of the logical volume specified by the WRITE SAME command matches one slot.
- When the I/O mode is Local (S150: Local), the MP 521 writes data to the S-VOL by performing the processing of S1521 to S1525.
- the processing of S1521 to S1525 is the same as S1421 to S1425 of FIG.
- When the I/O mode is Remote (S150: Remote), the MP 521 performs data expansion processing (S1531). This process is the same as S1421 and S1431.
- The data for one slot created by the data expansion processing is stored in the buffer 532b, that is, the buffer 532b of the CHA 530b connected to the storage apparatus 400-1 via the SAN 220.
- Alternatively, the created data may be stored in the LM 522 and then read from the LM 522 when it is transmitted to the P-VOL (storage apparatus 400-1).
- the MP 521 refers to the pair management table T400, and identifies the device serial number and R-LDEV # of the volume (P-VOL) that is paired with the access target volume.
- a WRFBA command is created based on these pieces of information, and the WRFBA command and the data created in S1531 are transmitted to the storage apparatus 400 (storage apparatus 400-1) in which the P-VOL exists.
- the MP 521 waits until a response is returned from the storage apparatus 400-1.
- When the I/O mode is Mirror (S150: Mirror), the processing from S1511 is performed.
- The MP 521 performs data expansion processing (S1511; this is the same process as S1531).
- The MP 521 creates a WRFBA command, transmits the WRFBA command and the data created in S1511 to the storage apparatus 400 (storage apparatus 400-1) having the volume (P-VOL) paired with the access target volume (S1512), and waits for a response to be returned from the storage apparatus 400-1 (S1513).
- This process is similar to S1532 and S1533.
- The WRFBA command created in S1512 differs from the one created in S1532 in that “1” is set in the exclusion reservation flag 704.
- When a response is received from the storage apparatus 400 (storage apparatus 400-1) having the P-VOL (S1513), the MP 521 performs the processes of S1514 to S1517 to write data to the S-VOL.
- the processing of S1514 to S1517 is the same processing as S1522 to S1525.
- the MP 521 transmits a data exclusive release command to the storage apparatus 400 (storage apparatus 400-1) having the P-VOL (S1518), and ends the process.
- the storage apparatus 400-1 performs a data exclusion securing process.
- the storage apparatus 400-2 transmits a data exclusive release command to cause the storage apparatus 400-1 to perform the data exclusive release process.
- the data exclusive release command is a command exchanged only between the storage apparatuses 400.
- The data exclusive release command includes information on the access target area of the logical volume whose lock is to be released, and the storage apparatus 400 that has received the command uses that information to release the lock on the access target area of the logical volume.
- the MP 521 refers to the pair management table T400 and determines whether the access target logical volume is a P-VOL or an S-VOL (S1001). When the access target logical volume is P-VOL, the MP 521 determines the I / O mode of the pair volume (S-VOL) of the access target volume (S1011). Since the I / O mode of the pair volume is included in the WRFBA command parameter (issue source I / O mode 703), the MP 521 refers to the issue source I / O mode 703 to make the determination in S1011.
- the MP 521 performs the processing after S1012.
- the exclusive reservation flag 704 included in the WRFBA command is referred to. If the exclusive reservation flag 704 is “1”, the access target area is locked (data exclusive reservation).
- the MP521 performs a cache slot securing process, and in S1015, the MP521 writes the write data to the cache slot. Thereafter, the MP 521 responds to the storage apparatus 400 that issued the WRFBA command that the processing has been completed (S1016). Then, it waits for a new instruction from the storage apparatus 400 that issued the WRFBA command.
- When the MP 521 receives the data exclusive release command from the storage apparatus 400 that issued the WRFBA command in S1017, it performs the data exclusive release (S1018). Thereafter, the MP 521 releases the cache slot secured in S1014 (S1019) and ends the process.
- the processing of S1012 to S1019 is repeatedly executed n times.
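The lock handling on the receiving side (lock on a WRFBA command whose exclusion reservation flag 704 is “1”, unlock on a later data exclusive release command) can be illustrated with a minimal sketch. The class name, method names, slot-number keys, and the "WAIT" return value are hypothetical and are not taken from the specification; cache slot securing and release are reduced to a dictionary.

```python
class PVolReceiver:
    """Hypothetical model of the storage apparatus that receives WRFBA commands."""

    def __init__(self):
        self.locked = set()   # slot numbers currently locked (data exclusion)
        self.cache = {}       # slot number -> write data held in cache

    def on_wrfba(self, slot: int, data: bytes, exclusion_reservation_flag: int) -> str:
        """Lock the area if the flag is 1, then store the write data."""
        if exclusion_reservation_flag == 1:
            if slot in self.locked:
                # Assumption: a competing request must wait for the release.
                return "WAIT"
            self.locked.add(slot)
        self.cache[slot] = data
        return "DONE"         # reported back to the issuing storage apparatus

    def on_data_exclusive_release(self, slot: int) -> None:
        """Release the lock taken by an earlier WRFBA command."""
        self.locked.discard(slot)


receiver = PVolReceiver()
first = receiver.on_wrfba(3, b"new data", exclusion_reservation_flag=1)
second = receiver.on_wrfba(3, b"other", exclusion_reservation_flag=1)
receiver.on_data_exclusive_release(3)
```

In this sketch the second request is refused while the lock is held, which mirrors the point that the lock stays in place until the issuing apparatus has finished its own write and sends the release command.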
- the MP 521 performs the processing in S1022 to S1025. Since the processing of S1022 to S1025 is the same as that of S1014 to S1016 and S1019, respectively, description thereof is omitted here.
- An area for one slot in which all 0s are written is prepared on the buffer 532 (532a and 532b). The MP 521 then determines whether the write data transmitted together with the WRITE SAME command is all zero. If it is, the data expansion process (for example, S1411, S1421, and S1431 of FIG. 7, and S1511, S1521, and S1531 of FIG. 8) is not performed, and the data of the one-slot area prepared on the buffer 532 (that is, all-zero data for one slot) is written to the cache area.
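The all-zero shortcut can be sketched as follows. The slot length, the function name, and the use of a module-level constant to stand in for the prepared area on the buffer 532 are assumptions made only for illustration.

```python
SLOT_SIZE = 256 * 1024          # assumed slot length in bytes
ZERO_SLOT = bytes(SLOT_SIZE)    # stands in for the pre-zeroed one-slot area


def slot_data_for_write_same(pattern: bytes) -> bytes:
    """Return one slot of data for a WRITE SAME pattern.

    An all-zero pattern reuses the prepared zero slot; any other pattern
    goes through data expansion (repetition up to one slot).
    """
    if not pattern or SLOT_SIZE % len(pattern) != 0:
        raise ValueError("pattern length must evenly divide the slot size")
    if not any(pattern):
        return ZERO_SLOT        # no expansion needed
    return pattern * (SLOT_SIZE // len(pattern))
```

Returning the same prepared object for every all-zero request is what saves the expansion work in this sketch.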
- the WRITE SAME propagation command is a command for instructing to write the same data to all the predetermined areas of the logical volume.
- the difference between the WRFBA command and the WRITE SAME propagation command will be described with reference to FIG.
- the upper part of FIG. 26 shows an example in which “1” is written to the entire area of one slot in the logical volume using the WRFBA command from the storage apparatus 400-1 to the storage apparatus 400-2.
- a WRFBA command is transmitted from the storage apparatus 400-1 to the storage apparatus 400-2, and data for one slot is transmitted as write target data.
- the storage apparatus 400-2 that has received the WRFBA command writes the received write data as it is into the area of the logical volume specified by the parameter of the WRFBA command.
- The lower part of FIG. 26 shows an example in which “1” is written to the entire area of one slot in the logical volume by using the WRITE SAME propagation command from the storage apparatus 400-1 to the storage apparatus 400-2.
- As with the WRFBA command, the command itself (the WRITE SAME propagation command) must be transmitted from the storage apparatus 400-1 to the storage apparatus 400-2.
- When the WRITE SAME propagation command is used, however, only one block of data (pattern data) needs to be transmitted as the write target data.
- the storage apparatus 400-2 that has received the WRITE SAME propagation command extends the received pattern data to the length of one slot. This process is the same as the data expansion process described above.
- the data expanded in the storage apparatus 400-2 is written to the logical volume.
- When the write range extends over a plurality of slots, the expanded data is written into each of those slots. That is, the storage apparatus 400 that has received the WRITE SAME propagation command performs the same processing as when the WRITE SAME command is received from the host 100.
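The difference between the two commands comes down to where the data expansion happens and how much write data crosses the link between the storage apparatuses. A minimal sketch, assuming a 512-byte block and 512 blocks per slot (both values, and all function names, are illustrative assumptions rather than figures from the specification):

```python
BLOCK_SIZE = 512         # assumed block length in bytes
BLOCKS_PER_SLOT = 512    # assumed number of blocks in one slot


def expand_pattern(pattern: bytes, num_slots: int = 1) -> bytes:
    """Data expansion: repeat a one-block pattern to fill num_slots slots."""
    if len(pattern) != BLOCK_SIZE:
        raise ValueError("pattern must be exactly one block long")
    return pattern * (BLOCKS_PER_SLOT * num_slots)


def payload_bytes(command: str, num_slots: int = 1) -> int:
    """Bytes of write data sent between the storage apparatuses."""
    if command == "WRFBA":
        # The sender expands first and ships the full slot data.
        return BLOCK_SIZE * BLOCKS_PER_SLOT * num_slots
    if command == "WRITE_SAME_PROPAGATION":
        # Only the pattern block travels; the receiver expands it.
        return BLOCK_SIZE
    raise ValueError(f"unknown command: {command}")


pattern = b"\x01" * BLOCK_SIZE
received = expand_pattern(pattern)   # what the receiver writes for one slot
saving = payload_bytes("WRFBA") - payload_bytes("WRITE_SAME_PROPAGATION")
```

Under these assumed sizes the propagation command sends one block regardless of how many slots the write range covers, while the WRFBA payload grows with the number of slots.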
- FIG. 9 shows the flow of processing performed in the storage apparatus 400 when the storage apparatus 400 receives the WRITE SAME command and the volume specified as the access target volume in the command is a P-VOL. Many of the processes in FIG. 9 are the same as the processes described with reference to FIG. 6, and therefore, differences from FIG. 6 will be mainly described here.
- the MP 521 sequentially executes the processes of S1411 to S1417, S1424, and S1425.
- the only processing different from FIG. 6 is S1415 ', and the subsequent processing is the same as the processing described in FIG.
- the WRITE SAME propagation command is issued to the storage apparatus 400 having the volume (S-VOL) paired with the P-VOL in S1415 ′.
- The write data is transmitted together with the WRITE SAME propagation command.
- Only one block of data (pattern data) needs to be transmitted here.
- the WRITE SAME propagation command of S1415 ' may be issued only once. This is because when the write range is specified in the parameter of the WRITE SAME propagation command, the same range as the access target range specified by the WRITE SAME command can be specified.
- The MP 521 sequentially executes the processes of S1432' to S1434. The processing of FIG. 9 differs from that of FIG. 6 in that S1431 (data expansion processing) of FIG. 6 is not performed, and S1432' (transmission of the WRITE SAME propagation command) is performed in place of S1432 (transmission of a WRFBA command) of FIG. 6. Since the process of S1432' is the same as that of S1415', the description thereof is omitted here.
- FIG. 10 shows the flow of processing performed in the storage apparatus 400 when the storage apparatus 400 receives the WRITE SAME command and the volume specified as the access target volume in the command is an S-VOL. Many of the processes in FIG. 10 are the same as the processes described with reference to FIG. 7, and therefore, differences from FIG. 7 will be mainly described here.
- the MP 521 sequentially executes the processing of S1511 to S1518.
- the only processing different from FIG. 7 is S1512 ', and the subsequent processing is the same as the processing described in FIG.
- In S1512 of FIG. 7 the WRFBA command was issued, whereas in S1512' the WRITE SAME propagation command is issued to the storage apparatus 400 in which the volume (P-VOL) paired with the S-VOL exists.
- As in the process of S1415', only one block of data needs to be transmitted as the write data accompanying the WRITE SAME propagation command.
- the present invention is not limited to the processing procedure described above.
- the data expansion process (S1511) is executed before S1512 ′.
- the data expansion process is performed before data is written to the S-VOL.
- the process of S1511 may be performed before, for example, S1514 (cache slot securing process).
- The MP 521 sequentially executes the processes of S1532' to S1534. The processing of FIG. 10 differs from that of FIG. 7 in that S1531 (data expansion processing) of FIG. 7 is not performed, and S1532' (transmission of the WRITE SAME propagation command) is performed in place of S1532 (transmission of a WRFBA command) of FIG. 7. Since the process of S1532' is the same as S1512', description thereof is omitted here.
- The process of FIG. 11 differs from the process of FIG. 8 in that it includes the data expansion process. If, following the determination in S1001, the access target logical volume is a P-VOL and the I/O mode of the volume (S-VOL) paired with it is Mirror, the data expansion process (S1013) is performed before the slot securing process in S1014.
- the above is the description of the processing related to the WRITE SAME command in the storage apparatus 400 according to the modification.
- In the storage apparatus 400 according to the modification, using the WRITE SAME propagation command instead of the WRFBA command reduces the amount of write data and the number of commands transmitted between the storage apparatus 400-1 and the storage apparatus 400-2. As a result, the performance of the storage system 300 is improved.
- The flow of processing performed in the storage apparatus 400 when it receives the UNMAP command and the volume specified as the access target volume in the command is a P-VOL is described below with reference to the drawings.
- When the storage apparatus 400 according to the embodiment of the present invention receives the UNMAP command, it writes 0 to all the areas on the logical volume specified by the UNMAP command. This is substantially the same as the processing performed when a WRITE SAME command is received. For this reason, on receiving the UNMAP command, the MP 521 converts it into a WRITE SAME command instructing that “0” be written to all the designated areas (S241). After S241, processing proceeds as if a WRITE SAME command had been received.
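The conversion in S241 can be pictured as a simple rewrite of the request. The sketch below is an illustration only: the `WriteSame` record, its field names, and the 512-byte block length are assumptions, not structures defined in the specification.

```python
from dataclasses import dataclass

BLOCK_SIZE = 512  # assumed block length in bytes


@dataclass(frozen=True)
class WriteSame:
    """Hypothetical in-memory form of a WRITE SAME request."""
    lba: int          # first logical block of the target range
    num_blocks: int   # length of the target range in blocks
    pattern: bytes    # one block of pattern data


def unmap_to_write_same(lba: int, num_blocks: int) -> WriteSame:
    """Convert an UNMAP of a block range into a WRITE SAME of zeros."""
    return WriteSame(lba=lba, num_blocks=num_blocks, pattern=bytes(BLOCK_SIZE))


cmd = unmap_to_write_same(lba=1024, num_blocks=2048)
```

Because the resulting pattern is all zero, the converted request naturally takes the path that skips data expansion.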
- the subsequent processing is almost the same as the processing performed when the WRITE SAME command is received.
- the process of S242 is the same as S140 of FIG.
- the processing of S2422 to S2425 is the same as S1432 to S1434 of FIG.
- the processing of S2432 to S2434 is the same as S1422 to S1425 of FIG.
- the processing of S2411 to S2417 is the same as S1411 and S1413 to S1417 of FIG.
- As described above, when the data to be written by the WRITE SAME command is all zero, the MP 521 does not perform the data expansion process. Therefore, when the UNMAP command is received, the MP 521 skips the data expansion process and instead writes the data of the one-slot area on the buffer 532 in which all zeros are stored. Likewise, in the process of S2432 (transmitting a WRFBA command and write data to the S-VOL), the data of that all-zero one-slot area on the buffer 532 is transmitted.
- The flow of processing performed in the storage apparatus 400 when it receives the UNMAP command and the volume specified as the access target volume in the command is an S-VOL is described below with reference to the drawings. This process also largely resembles the processing performed when a WRITE SAME command is received. First, when the storage apparatus 400 receives an UNMAP command, it converts the command into a WRITE SAME command instructing that “0” be written to all designated areas (S251).
- the subsequent processing is almost the same as the processing (FIG. 7) performed when the WRITE SAME command is received.
- The process of FIG. 13 differs from the process of FIG. 7 in that it does not perform the data expansion process.
- the process of S252 is the same as S150 of FIG.
- the processing of S2522 to S2525 is the same as S1522 to S1525 of FIG.
- the processing of S2532 to S2534 (processing performed when the I / O mode is Remote) is the same as S1532 to S1534 of FIG.
- the processing of S2512 to S2518 is the same as S1512 to S1518 of FIG.
- a WRFBA command is issued to the storage apparatus 400 in which the S-VOL (or P-VOL) exists.
- the processing to be performed by the storage apparatus 400 when this command is received is the same as that described in the processing related to the WRITE SAME command (FIG. 8), and thus description thereof is omitted here.
- The MP 521 determines whether all 0s are stored in all areas in the page including the write target area. If so, page allocation is canceled for the page. This determination and page deallocation may always be performed immediately after the processing in FIG. 12, FIG. 13, or FIG. 8, or may be executed asynchronously with that processing. For example, the MP 521 may periodically check whether 0 is stored in all areas of each page and cancel the allocation of such pages.
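The page reclamation check can be sketched as a scan over allocated pages. The mapping from page number to page contents and the function name are hypothetical; a real apparatus would track allocation in its own tables rather than in a Python dictionary.

```python
from typing import Dict, List


def reclaim_zero_pages(pages: Dict[int, bytearray]) -> List[int]:
    """Cancel allocation of every page whose contents are all zero.

    `pages` maps an allocated page number to its data. The function removes
    all-zero pages from the mapping and returns their numbers.
    """
    freed = [page_no for page_no, data in pages.items() if not any(data)]
    for page_no in freed:
        del pages[page_no]   # models returning the physical area to the pool
    return freed


allocated = {0: bytearray(8), 1: bytearray(b"\x00\x01"), 2: bytearray(4)}
freed_pages = reclaim_zero_pages(allocated)
```

The same function could be called right after an UNMAP completes or from a periodic background task, matching the two timings described above.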
- data (all zero) is written to the S-VOL (or P-VOL) by issuing a WRFBA command to the storage apparatus 400 in which the S-VOL (or P-VOL) exists.
- the WRITE SAME propagation command described in [Modification] may be issued.
- The flow of processing performed in the storage apparatus 400 when it receives an ATS command and the volume specified as the access target volume by the received ATS command is a P-VOL is described below with reference to the drawings.
- the processes described below are processes executed by the MP 521 of the storage apparatus 400-1 (storage apparatus in which the P-VOL exists) unless otherwise specified.
- the MP 521 refers to the volume management table T300 and determines the I / O mode of the access target volume (S340). This process is the same as S140.
- Below, the cases where the I/O mode is Local, Remote, and Mirror are described in turn.
- To avoid redundancy, unless otherwise specified the description assumes that the range of the access target area of the logical volume specified by the ATS command matches one slot.
- When the I/O mode is Local, the MP 521 performs a cache slot securing process (S34220). This process is the same as S1422. As in S1422, when the access target area specified by the ATS command spans a plurality of slots, a plurality of slots are secured.
- the MP 521 reads the data in the access target area range of the logical volume specified by the ATS command to the cache area secured in S34220.
- the MP 521 once releases the cache area (S34222). Even if the cache area is released, the data in the area is not deleted from the cache area, and the data in the range of the access target area of the logical volume is stored in the area.
- the MP 521 again secures the cache area released in S34222.
- the securing process here is slightly different from the cache slot securing process performed in S34220, and is called an exclusive securing process.
- A cache area secured by the exclusion securing process becomes inaccessible from other processes. For example, when a WRITE command requesting a data write to the same range as the access target area specified by the ATS command is received, the MP 521 tries to secure a cache area; if the area has been exclusively secured, however, the cache slot securing process for the WRITE command waits until the processing of the ATS command is completed.
- the MP521 compares the data on the cache slot exclusively secured in S34230 with the compare data received together with the ATS command. Although not shown, when the two data are different as a result of the comparison in S34231, the MP 521 returns an error to the host 100 and ends the process related to the ATS command.
- When the processing of S3424 ends, the MP 521 returns a GOOD status to the host 100 (S3425). The MP 521 then releases the secured cache slot (S3426) and ends the processing.
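The essential property of the Local-mode ATS flow is that the comparison and the write happen while no competing write can touch the same area. The sketch below collapses the separate read, release, and exclusive re-securing steps into a single lock for brevity, which is a simplification rather than the procedure of the specification; the class and method names are hypothetical.

```python
import threading


class Volume:
    """Hypothetical volume with an exclusive lock guarding its data."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self._exclusive = threading.Lock()

    def write(self, offset: int, payload: bytes) -> None:
        # A plain WRITE waits here while an ATS holds the exclusive lock.
        with self._exclusive:
            self.data[offset:offset + len(payload)] = payload

    def atomic_test_and_set(self, offset: int, compare: bytes, payload: bytes) -> str:
        """Write payload only if the current contents equal compare."""
        with self._exclusive:
            current = bytes(self.data[offset:offset + len(compare)])
            if current != compare:
                return "ERROR"     # mismatch is reported to the host
            self.data[offset:offset + len(payload)] = payload
            return "GOOD"


vol = Volume(16)
first = vol.atomic_test_and_set(0, bytes(4), b"abcd")    # matches zeros
second = vol.atomic_test_and_set(0, bytes(4), b"zzzz")   # stale compare data
```

Here the second call fails because the first one already replaced the zeros, which is the behavior a host relies on when it uses ATS as a locking primitive.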
- the MP 521 refers to the pair management table T400, and identifies the device serial number and LDEV # of the volume (S-VOL) that is paired with the access target volume. Then, a command called an ATS propagation command is transmitted to the storage apparatus 400 (storage apparatus 400-2) where the S-VOL exists.
- the ATS propagation command is a command that is valid only between the storage apparatuses 400, similarly to the WRFBA command.
- The parameters of the ATS propagation command include information for specifying the area of the access target logical volume. Based on that information, the storage apparatus 400 that has received the ATS propagation command can therefore perform, on the specified area of the logical volume, the same processing as when an ATS command is received (specifically, S34220 to S3424).
- the compare data and the write data are also transmitted to the command issuing destination storage apparatus 400 together with the ATS propagation command in the same manner as when the host 100 issues the ATS command to the storage apparatus 400.
- The storage apparatus 400 that has received the ATS propagation command performs a comparison using the received compare data and, if the comparison succeeds (the compare data and the data stored in the access target range of the logical volume are the same), stores the write data in the logical volume. This process will be described later.
- After transmitting the ATS propagation command to the storage apparatus 400-2 in S3432, the MP 521 (of the storage apparatus 400-1) waits for a response from the storage apparatus 400-2. If the response is received in S3433 and the result is normal (that is, the compare and write succeeded), the MP 521 returns a GOOD status to the host 100 (S3434) and ends the process.
- If it is determined in S340 that the I/O mode is Mirror, the processing from S3411 onward is executed. In S3411, a lock is secured for the entire range of the access target area of the logical volume. This process is the same as S1411. Subsequently, the processing of S34120 to S3414 performs the comparison and the write on the access target area of the apparatus's own logical volume (P-VOL). This process is the same as the processing performed when the I/O mode is determined to be Local (S34220 to S3424).
- To store the write data transmitted together with the ATS command in the access target area of the S-VOL, the MP 521 issues a WRFBA command to the storage apparatus 400 in which the S-VOL exists (storage apparatus 400-2).
- the write data transmitted together with the WRFBA command is write data transmitted to the storage apparatus 400-1 together with the ATS command. Note that compare data is not transmitted.
- the MP 521 waits for a response to be returned from the storage apparatus 400-2.
- When the MP 521 receives a response from the storage apparatus 400-2 in S3416, it performs a data exclusive release process (S3417), returns a GOOD status to the host 100 (S3425), releases the cache slot (S3426), and ends the process.
- the processing described below is processing executed by the MP 521 of the storage apparatus 400-2 (storage apparatus in which the S-VOL exists) unless otherwise specified.
- the MP 521 refers to the volume management table T300 and determines the I / O mode of the access target volume (S350). This process is the same process as S340.
- When the I/O mode is Local, the MP 521 performs the processing from S35220 to S3526. This process is the same as S34220 to S3426.
- When the I/O mode is Remote, the MP 521 performs the processing from S3532 to S3534. This process is almost the same as S3432 to S3434. The difference is the destination of the ATS propagation command: in S3432 it is transmitted to the storage apparatus 400 (400-2) where the S-VOL exists, whereas in S3532 of FIG. 15 the MP 521 transmits it to the storage apparatus 400 (400-1) where the P-VOL exists. Accordingly, in S3533 the response is received from the storage apparatus 400-1 (the storage apparatus where the P-VOL exists) instead of the storage apparatus 400-2.
- If it is determined in S350 that the I/O mode is Mirror, the processing from S3512 onward is executed.
- the processing of S3512 and S3513 is the same as S3532 and S3533, and processing for transmitting the ATS propagation command to the storage apparatus 400-1 in which the P-VOL exists and receiving the processing result is performed.
- the MP 521 performs a cache slot securing process, and then stores the write data received together with the ATS command in the cache area secured in S3514 (S3515).
- The MP 521 returns a GOOD status to the host 100.
- The MP 521 releases the cache slot secured in step S3514. Thereafter, the MP 521 transmits a data exclusive release command to the storage apparatus 400-1, causes the storage apparatus 400-1 to perform a data exclusive release process (S3518), and ends the process.
- the MP 521 determines whether the access target logical volume is a P-VOL or an S-VOL (S2001). If the access target logical volume is a P-VOL (S2001: PVOL), the MP 521 determines the I / O mode of the pair volume (S-VOL) of the access target volume in S2011. Since the I / O mode of the pair volume is included in the parameters of the ATS propagation command (similar to the WRFBA command), the MP 521 can make the determination in S2011 by referring to the parameters of the ATS propagation command.
- the MP 521 performs the processing from S2012 onward.
- S2012 to S2018 are data comparison and write processing for the P-VOL, and the same processing as S3411 to S3414 in FIG. 14 is performed.
- The MP 521 responds to the storage apparatus that issued the ATS propagation command (for example, if the P-VOL exists in the storage apparatus 400-1, the issuer of the ATS propagation command is the storage apparatus 400-2) that the processing has been completed. Thereafter, the MP 521 waits for a data exclusive release command to be transmitted from the storage apparatus that issued the ATS propagation command.
- When the MP 521 receives the data exclusive release command in S2020, it performs the data exclusive release (S2021). Thereafter, the MP 521 releases the cache area that was exclusively secured in S2016 (S2022) and ends the process.
- the storage apparatus provides the host 100 with a volume (logical volume) formed by the Thin Provisioning technology.
- the size of the physical storage area may be smaller than the total capacity of the logical volume (at least in the initial state).
- If the storage apparatus 400 is operated while the size of the physical storage area is smaller than the total capacity of the logical volume, and a large amount of data is written from the host 100 to the logical volume, there may be no physical storage area left to allocate to the pages of the logical volume. In that case, since the storage apparatus 400 cannot store the write data from the host 100, it must report to the host 100 that processing such as the write cannot be executed because the physical storage area has been exhausted.
- The processing performed by the storage apparatus 400 when there is no unused physical storage area, or when the unused physical storage area has decreased (the amount of unused physical storage area has fallen below a predetermined threshold), is described below, taking as an example the processing when the storage apparatus 400 receives a WRITE command.
- FIG. 17 is a diagram showing a flow of processing when the command received from the host 100 is WRITE (processing corresponding to S40 in FIG. 5).
- The following description takes as an example a case in which the storage apparatus 400-1 has a P-VOL and the storage apparatus 400-2 has an S-VOL that is paired with that P-VOL.
- the MP 521 refers to the volume management table T300 and converts the LUN specified by the command into a logical volume number. Then, with reference to the pair management table T400, it is determined whether the logical volume (access target volume) specified by the converted logical volume number is a P-VOL or an S-VOL.
- the MP 521 determines whether or not the status of the access target volume is Block (S42, S43). If the status of the access target volume is Block, the MP 521 returns an error to the host 100 and ends the process. If the status of the access target volume is not Block, the process of S44 or S45 is performed.
- the volume management table T300 is referenced to determine the I / O mode of the access target volume. Then, processing according to the I / O mode is performed. Hereinafter, the processing flow will be described for each I / O mode of the access target volume.
- FIG. 18 shows the flow of processing performed by the storage apparatus 400 when the access target volume is P-VOL and its I / O mode is Mirror.
- the MP 521 performs processing for securing the entire range of the access target area of the logical volume specified by the WRITE command (data exclusion securing processing).
- the MP 521 performs a cache slot securing process.
- The MP 521 determines whether a new unused physical storage area needs to be allocated to the page corresponding to the access target area of the logical volume and, if so, whether an unused physical storage area is available for that allocation (S4203).
- the state where there is no unused physical storage area is referred to as “the state where the unused physical storage area is depleted” (or “the state where the physical storage area is depleted” for short).
- The MP 521 changes the I/O mode and pair status of the access target logical volume (S4221). Specifically, the I/O mode of the P-VOL is changed to “Block” and the pair status is changed to “SUSPEND”. At the same time, the MP 521 instructs the storage apparatus 400-2, which has the pair volume (S-VOL) of the access target logical volume, to change the I/O mode of the S-VOL to “Local” and the pair status to “SUSPEND”.
- In step S4222, the MP 521 returns an error to the host 100.
- the storage apparatus 400 returns a status of “CHECK CONDITION” to the host 100.
- detailed information on errors such as Sense Key and Sense Code is also returned.
- The MP 521 returns to the host 100, as detailed information, the sense key “ABORTED COMMAND”, which indicates that the processing was interrupted.
- ABST Sense Key
- the MP 521 performs a data exclusive release process and a slot release process (S4223), and ends the process.
- The MP 521 executes the processes in and after S4204.
- the MP 521 determines whether the amount of unused physical storage area of the storage apparatus 400 (the storage apparatus having the access target volume) is equal to or less than a predetermined threshold. When the amount of the unused physical storage area is equal to or smaller than the predetermined threshold (S4204: Y), the MP 521 returns an error for notifying the host 100 that the unused physical storage area is equal to or smaller than the predetermined threshold. (S4211).
- In the storage apparatus 400, when the amount of the unused physical storage area is equal to or less than the predetermined threshold, “UNIT ATTENTION” is returned as the Sense Key, and the Additional Sense Code carries detailed information that lets the host 100 recognize that the amount of unused physical storage area is at or below the threshold. In the following, sending the host 100 a response indicating that the unused physical storage area is equal to or less than a predetermined threshold is expressed as “return an error (UNIT ATTENTION)” or “return an error (UA)”.
- Alternatively, it may be determined whether the amount of used physical storage area (physical storage area already allocated to logical volume pages), or the physical storage area usage rate (that is, “amount of used physical storage area ÷ total amount of physical storage area that can be allocated to pages”), exceeds a predetermined upper limit value; if it does, an error indicating that the amount of used physical storage area has reached or exceeded the upper limit value may be returned to the host 100.
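The two checks described above (depletion, then the low-space threshold) and the alternative usage-rate criterion can be sketched as follows. The function names, the page-count units, and the returned labels are assumptions for illustration; the labels are not SCSI status codes.

```python
def classify_free_space(unused_pages: int, threshold_pages: int) -> str:
    """Classify the pool state from the amount of unused physical area."""
    if unused_pages <= 0:
        return "DEPLETED"            # no area can be allocated to a page
    if unused_pages <= threshold_pages:
        return "BELOW_THRESHOLD"     # would be reported as an error (UA)
    return "OK"


def usage_rate(used_pages: int, total_pages: int) -> float:
    """Used physical area divided by the total area allocatable to pages."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return used_pages / total_pages


state = classify_free_space(unused_pages=10, threshold_pages=10)
rate = usage_rate(used_pages=25, total_pages=100)
```

Checking depletion before the threshold matters because a depleted pool also satisfies the threshold condition, and the two cases lead to different responses.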
- the MP 521 performs a cache slot release process (S4213) and ends the process.
- the process of S4205 is performed.
- the MP 521 stores the write data transmitted together with the WRITE command from the host 100 in the cache slot secured in S4202.
- By sending a WRFBA command to the storage apparatus 400-2, which has the pair volume (S-VOL) of the access target volume, the MP 521 instructs the storage apparatus 400-2 to store the write data in the S-VOL, and then waits for a response from the storage apparatus 400-2. If the storage apparatus 400-2 fails to write data to the S-VOL because the unused physical storage area to be allocated to the S-VOL has been exhausted, it returns that fact to the storage apparatus 400-1 as response information. Likewise, when the unused physical storage area to be allocated to the S-VOL is equal to or less than a predetermined threshold, the storage apparatus 400-2 returns that fact as response information to the storage apparatus 400-1.
- When the MP 521 receives the response from the storage apparatus 400-2 in S4207, it refers to the response content and determines whether the unused physical storage area to be allocated to the S-VOL in the storage apparatus 400-2 is exhausted (S4208).
- When the response information from the storage apparatus 400-2 indicates that the unused physical storage area to be allocated to the S-VOL is depleted, the MP 521 performs the processing from S4231 onward.
- The MP 521 changes the I/O mode of the access target logical volume. Specifically, the I/O mode of the P-VOL is changed to “Local” and the pair status is changed to “SUSPEND”. Furthermore, the MP 521 instructs the storage apparatus 400-2, which has the pair volume (S-VOL) of the access target logical volume, to change the I/O mode of the S-VOL to “Block” and the pair status to “SUSPEND”.
- the MP 521 returns an error (ABORT) to the host 100, performs data exclusive release processing and slot release processing in S4233, and ends the processing.
- ABST error
- the processing order of each step is not necessarily limited to that described above.
- the data exclusive release process may be performed before S4231 or S4232.
- the MP 521 performs an exclusive data release process (S4209), returns a GOOD status to the host 100 (S4212), performs a slot release process (S4213), and then ends the process.
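The two depletion cases in Mirror mode (S4221 and S4231) change the pair in opposite directions: the side that ran out of physical storage is blocked and the other side continues alone. A minimal sketch, in which the `PairState` record, its field names, and the initial pair status "PAIR" are assumptions not stated in this section:

```python
from dataclasses import dataclass


@dataclass
class PairState:
    """Hypothetical summary of a volume pair."""
    pvol_io_mode: str = "Mirror"
    svol_io_mode: str = "Mirror"
    pair_status: str = "PAIR"    # assumed name for the normal pair status


def on_depletion(state: PairState, depleted_side: str) -> PairState:
    """Block the depleted volume and let the other one serve I/O alone."""
    if depleted_side == "P-VOL":
        state.pvol_io_mode, state.svol_io_mode = "Block", "Local"
    elif depleted_side == "S-VOL":
        state.pvol_io_mode, state.svol_io_mode = "Local", "Block"
    else:
        raise ValueError("depleted_side must be 'P-VOL' or 'S-VOL'")
    state.pair_status = "SUSPEND"
    return state


after = on_depletion(PairState(), "S-VOL")
```

In both cases the host receives an error for the interrupted write and can retry against the volume whose I/O mode is now Local.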
- In step S4301, the MP 521 performs a cache slot securing process.
- the MP 521 determines whether the physical storage area is exhausted (S4302). If it is determined that the physical storage area is depleted (S4302: Y), the MP 521 returns an error to the host 100 (S4311) and ends the process.
- the detailed information returned in S4311 includes information indicating that the physical storage area has been exhausted. Hereinafter, returning information indicating that the physical storage area has been exhausted to the host 100 in this manner is referred to as “returning an error (Stun)”.
- The physical storage area of the pair volume (for example, the S-VOL) may not yet be exhausted. That is, because it may still be possible to write data to the S-VOL, which stores the same data as the P-VOL, no error (Stun) is returned.
- The MP 521 determines in S4304 whether the amount of unused physical storage area of the storage apparatus 400 (the storage apparatus having the access target volume) has fallen to or below a predetermined threshold. If it has (S4304: Y), the MP 521 returns an error (UA) to the host 100 (S4315). Thereafter, the MP 521 performs a cache slot release process (S4306) and ends the process.
- The MP 521 stores the write data in the cache slot secured in S4301 (S4303) and returns to the host 100 a response indicating that the write ended normally (S4305). Thereafter, the MP 521 performs a cache slot release process (S4306) and ends the process.
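The branch order of this Local-mode write path (S4302, S4304, S4303, S4305) can be sketched as follows. This is only an illustrative Python sketch, not the implementation in the storage apparatus: the function name `write_local`, the page-count parameters, and the choice to skip the data write when UA is returned are assumptions made for the example. Only the step numbers and the Stun / UA / GOOD outcomes come from the text.

```python
def write_local(free_pages: int, threshold: int) -> tuple:
    """Return (status, data_written) for a write to a volume in Local mode.

    free_pages and threshold are hypothetical counters for the unused
    physical storage area; the text does not say how the amount is measured.
    """
    # S4301: the cache slot is secured first (not modelled here).
    if free_pages == 0:
        # S4302: Y -> S4311, report depletion to the host
        return ("Stun", False)
    if free_pages <= threshold:
        # S4304: Y -> S4315, warn the host that free space is low
        # (skipping the write in this branch is an assumption)
        return ("UA", False)
    # S4303 and S4305: store the write data and report normal completion
    return ("GOOD", True)
```

For example, `write_local(0, 10)` yields the Stun outcome, while `write_local(50, 10)` stores the data and reports GOOD.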
- The MP 521 sends a WRFBA command to the storage apparatus (storage apparatus 400-2) having the pair volume (S-VOL) of the access target volume, thereby instructing it to write the write data to the S-VOL (S4401), and then waits for a response from the storage apparatus 400-2.
- When the MP 521 receives the response from the storage apparatus 400-2 in S4402, it refers to the response content and determines whether the physical storage area to be allocated to the S-VOL in the storage apparatus 400-2 is exhausted (S4403). If it is determined to be exhausted (S4403: Y), the MP 521 returns an error (Stun) to the host 100 (S4410) and ends the process.
- The MP 521 determines whether the unused physical storage area in the storage apparatus 400-2 is equal to or less than a predetermined threshold (S4404). If it is (S4404: Y), the MP 521 returns an error (UA) to the host 100 (S4411) and ends the process. If it is not (S4404: N), the MP 521 returns a GOOD status to the host 100 (S4405) and ends the process.
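The Remote-mode path above forwards the write to the storage apparatus holding the pair volume and maps its answer to a host status. A minimal Python sketch under stated assumptions: `write_remote`, the `send_wrfba` callable, and the response strings `"EXHAUSTED"` and `"LOW_SPACE"` are hypothetical names, since the text does not define the format of the response information.

```python
from typing import Callable


def write_remote(send_wrfba: Callable[[bytes], str], data: bytes) -> str:
    """Forward write data to the pair volume and translate the response.

    send_wrfba stands in for the inter-apparatus WRFBA transfer and is
    expected to return "EXHAUSTED", "LOW_SPACE", or any other string for
    success (all three encodings are assumptions for this sketch).
    """
    response = send_wrfba(data)      # S4401, then wait for the answer (S4402)
    if response == "EXHAUSTED":      # S4403: Y
        return "Stun"
    if response == "LOW_SPACE":      # S4404: Y
        return "UA"
    return "GOOD"                    # S4404: N
```

Passing the transfer in as a callable keeps the sketch testable without a second apparatus.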
- FIG. 21 shows the flow of processing performed by the storage apparatus 400 when the access target volume is the S-VOL and its I/O mode is Mirror (hereinafter, the case where the storage apparatus 400-2 performs the processing is described as an example).
- The MP 521 sends a WRFBA command to the storage apparatus 400-1 having the pair volume (P-VOL) of the access target volume, thereby instructing it to store the write data in the P-VOL, and then waits for a response from the storage apparatus 400-1.
- When data storage to the P-VOL fails because the physical storage area to be allocated to the P-VOL has been exhausted, the storage apparatus 400-1 returns that fact to the storage apparatus 400-2 as response information. Likewise, when the unused physical storage area to be allocated to the P-VOL is equal to or less than a predetermined threshold, the storage apparatus 400-1 returns that fact as response information to the storage apparatus 400-2.
- When the MP 521 receives the response from the storage apparatus 400-1 in S6202, it refers to the response content and determines whether the unused physical storage area to be allocated to the P-VOL in the storage apparatus 400-1 is exhausted (S6203).
- When it is determined in S6203 that the physical storage area has been exhausted (S6203: Y), the MP 521 performs the processing from S6221 onward.
- the MP 521 changes the I/O mode of the access target logical volume. Specifically, the S-VOL I/O mode is changed to “Local”. Further, the pair status is changed to “SUSPEND”.
- the MP 521 returns an error (ABORT) to the host 100 and ends the process.
- The MP 521 performs a cache slot securing process (S6204). It then determines whether the unused physical storage area to be allocated to the S-VOL is depleted in the storage apparatus 400-2 (S6205) and, if it is depleted (S6205: Y), performs the processing from S6231 onward.
- The MP 521 changes the I/O mode of the access target logical volume. Specifically, the S-VOL I/O mode is changed to “Block”. In step S6232, the MP 521 returns an error (ABORT) to the host 100, and in step S6233 it performs the cache slot release process.
- the MP 521 transmits a data exclusive release command to the storage apparatus 400-1.
- the MP 521 creates a data exclusive release command including an instruction to turn on the repair flag in the parameter of the data exclusive release command.
- The storage apparatus 400-1 that has received this data exclusive release command turns on the repair flag (T407) of the volume management table T300 (stores 1) and stores, in the position (T408), information on the access target area designated by the WRITE command (or WRFBA command). This is because the data specified by the WRITE command has been written to the P-VOL but not to the S-VOL, so the contents of the P-VOL and the S-VOL are inconsistent.
- The MP 521 determines the amount of the unused physical storage area (S6206). If it is equal to or less than a predetermined threshold (S6206: Y), the MP 521 returns an error (UA) to the host 100 (S6210). Thereafter, the MP 521 releases the slot (S6233), transmits a data exclusive release command (S6234), and ends the process.
- the MP 521 performs a cache slot release process (S6213) and transmits a data exclusive release command to the storage apparatus 400-1 (S6214), and ends the process. Note that the processing in S6214 is different from the processing in S6234, and the parameter of the data exclusive release command does not include an instruction to turn on the repair flag.
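The difference between the two data exclusive release commands (S6214 without the repair-flag instruction, S6234 with it) can be illustrated with a small Python sketch. The names `VolumeRow` and `data_exclusive_release`, the `locked` field, and the (start LBA, block count) layout of the position are assumptions for illustration; the text only states that the repair flag (T407) is set to 1 and that the position (T408) records the access target area.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VolumeRow:
    """One row of a volume management table (field layout assumed)."""
    repair_flag: int = 0                        # T407: 1 means the pair contents differ
    position: Optional[Tuple[int, int]] = None  # T408: (start LBA, block count), assumed
    locked: bool = True                         # data exclusion currently held


def data_exclusive_release(row: VolumeRow, repair: bool,
                           area: Optional[Tuple[int, int]] = None) -> None:
    """Release the data exclusion; record the area to repair if instructed."""
    if repair:
        # Corresponds to a release command whose parameter asks for the repair flag
        row.repair_flag = 1
        row.position = area
    row.locked = False
```

Recording the area alongside the flag is what would let a later repair step copy only the inconsistent range rather than the whole volume.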
- the MP 521 sends a WRFBA command to the storage apparatus (storage apparatus 400-1) having the pair volume (P-VOL) of the access target volume, thereby instructing write data to be written to the P-VOL. After that, it waits for a response from the storage apparatus 400-1.
- When the MP 521 receives a response from the storage apparatus 400-1 in S6402, it refers to the response content and determines whether the physical storage area to be allocated to the P-VOL is exhausted in the storage apparatus 400-1 (S6403). When it is determined to be exhausted (S6403: Y), the MP 521 returns an error (Stun) to the host 100 and ends the process.
- The MP 521 determines whether the unused physical storage area in the storage apparatus 400-1 is equal to or less than a predetermined threshold (S6404). If it is (S6404: Y), the MP 521 returns an error (UA) to the host 100 (S6411) and ends the process. If it is not (S6404: N), the MP 521 returns a GOOD status to the host 100 (S6405) and ends the process.
- When the storage apparatus 400 executes processing related to the WRFBA command and reports that the processing was performed normally or that an error occurred, it returns a status conforming to the SCSI standard. Accordingly, when the processing is performed normally, a GOOD status is returned. When the unused physical storage area of the storage apparatus 400 is exhausted, an error (Stun) is returned, and when the unused physical storage area is equal to or less than a predetermined threshold, an error (UNIT ATTENTION) is returned.
- Since the WRFBA command is a uniquely defined command used only between the storage apparatuses 400, proprietary information may instead be used for the status and error information returned to the storage apparatus 400 that issued the command.
- A determination of whether the physical storage area is exhausted is provided after the cache slot securing process (S1014, S1022). If it is determined that the S-VOL I/O mode is Mirror and the physical storage area is exhausted (S5014: Y), the MP 521 of the storage apparatus 400 that has received the WRFBA command changes the I/O mode of the access target volume (P-VOL) to “Block” (S5221), returns an error (Stun) to the storage apparatus 400 that issued the command (S5222), and finally performs the cache slot release process and the data exclusive release process (S5223) and ends the process.
- When the access target logical volume specified by the received WRFBA command is the P-VOL, the I/O mode of its pair volume (S-VOL) is Mirror (S1011: Mirror), and the physical storage area is not exhausted, write processing to the P-VOL (S1015) is performed. After the P-VOL write processing (S1015) is completed, the MP 521 waits to receive a data exclusive release command from the storage apparatus 400 having the S-VOL (S1017).
- the physical storage area may be exhausted.
- the data exclusive release command received in S1017 includes a notification that data writing to the S-VOL has failed because the physical storage area has been exhausted.
- The MP 521 executes the data exclusive release process (S1018) and, in S5018, changes the I/O mode of the access target volume (P-VOL) to “Local”.
- the MP 521 stores 1 in the repair flag (T407) for the row corresponding to the P-VOL in the volume management table T300, and stores information on the area that needs to be repaired in the position (T408).
- After the determination of whether the physical storage area is depleted (S5015), it is determined whether the amount of unused physical storage area is equal to or less than a predetermined threshold (S5022). When it is not (S5022: N), the MP 521 performs, as in the process described earlier, the write process (S1023), a response (GOOD status) to the storage apparatus 400 that issued the WRFBA command indicating that the processing is complete (S1024), and a cache slot release process (S1025), and ends the process.
- When the amount of unused physical storage area is equal to or less than the predetermined threshold (S5022: Y), the MP 521 returns an error (UNIT ATTENTION) to the storage apparatus 400 (400-2) that issued the WRFBA command, thereby notifying it that the amount of unused physical storage area has become equal to or less than the predetermined threshold (S5024). In this case, the write process (S1023) is not performed.
- the MP 521 performs a cache slot release process (S1025) and ends the process.
- A determination of whether the physical storage area is exhausted is provided after the cache slot allocation processing (S1022). If it is determined that the physical storage area is exhausted (S5102: Y), the MP 521 of the storage apparatus 400 that has received the WRFBA command determines the I/O mode of the P-VOL (S5110). If it is determined in S5110 that the I/O mode is Mirror, the MP 521 changes the I/O mode of the access target volume (S-VOL) to “Block” (S5111) and returns an error (Stun) to the storage apparatus 400 that issued the command (S5112). Finally, the MP 521 performs a cache slot release process (S5113) and ends the process. If it is determined in S5110 that the I/O mode is Remote, the MP 521 executes S5112 and S5113 but does not perform S5111 (the I/O mode change).
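The S5110 to S5113 branch, taken on the S-VOL side when a WRFBA command arrives and the physical storage area is exhausted, might be sketched as below. The function name `on_wrfba_exhausted` and the string encoding of the I/O modes are assumptions for illustration; the cache slot release (S5113) is not modelled.

```python
def on_wrfba_exhausted(pvol_io_mode: str, svol_io_mode: str) -> tuple:
    """Return (new S-VOL I/O mode, status sent back to the WRFBA issuer)."""
    if pvol_io_mode == "Mirror":
        svol_io_mode = "Block"        # S5111: stop accepting I/O on the S-VOL
    elif pvol_io_mode != "Remote":
        # The text only describes the Mirror and Remote cases for S5110
        raise ValueError("unexpected P-VOL I/O mode: " + pvol_io_mode)
    return (svol_io_mode, "Stun")     # S5112: report depletion to the issuer
```

In the Remote case the S-VOL mode is returned unchanged, which mirrors the statement that S5111 is skipped.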
- Following the determination in S5102, it is determined whether the unused physical storage area is equal to or less than a predetermined threshold (S5122). If it is not (S5122: N), the MP 521 performs, as in the process described earlier, the write processing (S1023), a response (GOOD status) to the storage apparatus 400 that issued the WRFBA command indicating that the processing is complete (S1024), and a cache slot release process (S1025), and ends the process.
- When the unused physical storage area is equal to or less than the predetermined threshold (S5122: Y), the MP 521 returns an error (UNIT ATTENTION) to the storage apparatus 400 (400-1) that issued the WRFBA command, thereby notifying it that the unused physical storage area has become equal to or less than the predetermined threshold (S5124). In this case, the write process is not performed.
- the MP 521 performs a cache slot release process (S1025) and ends the process.
- the above is the flow of processing performed when the storage apparatus 400 receives a WRITE command from the host 100.
- the WRITE command is issued from the host 100 to the P-VOL / S-VOL volume pair
- The change in the state of the computer system when the physical storage area is depleted is explained with reference to FIG. 27 and one further figure.
- A figure explaining the change in the state of the computer system when the physical storage area is depleted in the storage apparatus 400 having the P-VOL (while the physical storage area in the storage apparatus 400 having the S-VOL is not depleted).
- the pair status of the volume pair before the host 100 issues the WRITE command is “Pair”. Further, it is assumed that the P-VOL exists in the storage apparatus 400-1 and the S-VOL exists in the storage apparatus 400-2.
- the state of the computer system before the host 100 issues the WRITE command is referred to as [State 1].
- the storage apparatus 400-1 performs processing according to the processing flow described in FIG.
- an error (ABORT) is returned to the host (FIG. 18, S4222).
- the P-VOL I / O mode is changed to Block
- the S-VOL I / O mode is changed to Local
- the pair status is changed to Suspend (FIG. 18, S4221) (FIG. 27 [State 2]).
- the physical storage area is not exhausted, so the write process to the S-VOL (the process shown in FIG. 19) is executed.
- the host 100 can continue its operations.
- the host 100 issues a WRITE command to the P-VOL
- the state of the computer system changes as in [State 1 ′] to [State 3 ′] shown in FIG.
- the pair status of the volume pair before the host 100 issues the WRITE command is “Pair”. Further, it is assumed that the P-VOL exists in the storage apparatus 400-1 and the S-VOL exists in the storage apparatus 400-2.
- [State 1 '] is the state of the computer system before the host 100 issues the WRITE command.
- the storage apparatus 400-1 performs processing according to the processing flow described in FIG.
- the storage apparatus 400-1 returns an error status (ABORT) to the host 100 (FIG. 18, S4232).
- the P-VOL I / O mode is changed to Local
- the S-VOL I / O mode is changed to Block
- the pair status is changed to Suspend (FIG. 18, S4231) (FIG. 27 [State 2']).
- The host 100 can continue business. If the P-VOL I/O mode were changed to Block as in [State 2], the host 100 (alternate path software 1111) would issue an I/O request to the S-VOL. However, in the case of [State 2'], the physical storage area of the storage apparatus 400-2 having the S-VOL is depleted, so the WRITE command cannot be accepted and the host 100 cannot continue business.
- The storage apparatus 400 sets the P-VOL I/O mode to Local when the state of the computer system transitions from [State 1'] to [State 2'].
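The two transitions described above ([State 1] to [State 2] when the P-VOL side is depleted, and [State 1'] to [State 2'] when the S-VOL side is depleted) follow one rule: the depleted volume becomes Block and the surviving volume becomes Local. A minimal Python sketch, assuming a hypothetical function `on_depletion` and a dictionary encoding chosen only for this example:

```python
def on_depletion(depleted: str) -> dict:
    """Return the resulting I/O modes, pair status, and host status.

    depleted is "P-VOL" or "S-VOL", naming the side whose physical
    storage area ran out while the pair status was Pair.
    """
    if depleted not in ("P-VOL", "S-VOL"):
        raise ValueError("depleted must be 'P-VOL' or 'S-VOL'")
    survivor = "S-VOL" if depleted == "P-VOL" else "P-VOL"
    return {
        depleted: "Block",         # the depleted side stops accepting I/O
        survivor: "Local",         # the other side serves I/O on its own
        "pair_status": "Suspend",
        "host_status": "ABORT",    # the host is expected to retry
    }
```

Because the surviving volume is always left in Local mode, a retried WRITE from the host lands on a volume that can still accept it.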
- the host 100 issues an I/O command such as a WRITE command to the P-VOL.
- the host 100 issues an I/O command to the S-VOL.
- the same is done in the computer system.
- When the host 100 retries access to the P-VOL (for example, a WRITE), since the I/O mode of the P-VOL is Remote, the data related to the WRITE command is transferred from the storage apparatus 400-1 to the storage apparatus 400-2 and written to the S-VOL. That is, in addition to being able to continue operations, the host 100 does not need to change access paths.
- The host 100 may also access the S-VOL (storage apparatus 400-2). That is, since the host 100 can use both the access path from the host 100 to the P-VOL and the access path from the host 100 to the S-VOL, load distribution across the access paths is also possible.
- the above is the description of the processing that is performed when the unused physical storage area of the storage apparatus 400 is depleted or the amount of the unused physical storage area falls below a predetermined threshold.
- the case where the storage apparatus 400 has received the WRITE command has been described as an example.
- other update commands (for example, the WRITE SAME, UNMAP, and ATS commands described above)
- a depletion error process may be inserted before the data write process to the volume (for example, S1414 in FIG. 6) and after the process of receiving a response from the storage apparatus 400 having the pair volume (S1416 in FIG. 6).
- the processes of FIGS. 23 and 24 may be performed instead of the process shown in FIG.
- (1) Before S1414 of FIG. 6, the processes of S4203 and S4221 to S4223 of FIG. 18 and the processes of S4204 and S4211 to S4213 of FIG. 18 are inserted.
- (2) After S1416 of FIG. 6, the processes of S4208 and S4231 to S4233 of FIG. 18 are inserted.
- (3) Before S1423 of FIG. 6, the processes of S4302 and S4311 of FIG. 19 and the processes of S4304 and S4315 of FIG. 19 are inserted.
- (4) After S1433 of FIG. 6, the processes of S4403, S4410, S4404, and S4411 of FIG. 20 are inserted.
- (5) After S1513 of FIG. 7, the processes of S6203 and S6221 to S6222 of FIG. 21 are inserted.
- (6) Before S1515 of FIG. 7, the processes of S6205, S6206, S6210, and S6231 to S5234 of FIG. 21 are inserted.
- (7) Before S1523 of FIG. 7, the processes of S4302 and S4311 of FIG. 19 and the processes of S4304 and S4315 of FIG. 19 are inserted.
- (8) After S1533 of FIG. 7, the processes of S6403, S6410, S6404, and S6411 of FIG. 22 are inserted.
- Appropriate processing is thus also performed when the unused physical storage area of the storage apparatus 400 is depleted, or when its amount falls to or below the predetermined threshold, while the storage apparatus 400 is executing processing related to the WRITE SAME command.
- Appropriate processing is also performed when the unused physical storage area is depleted or the amount of unused physical storage area falls below a predetermined threshold.
- attribute information is set for the volumes of two storage apparatuses.
- According to the volume attribute information (Local, Remote, Mirror, Block), it is determined whether to execute I/O processing for the access target volume or for its pair volume. Further, when I/O processing is executed for both the access target volume and the pair volume, it is determined for which volume the I/O processing is executed first.
- The Mirror attribute is set as attribute information for each volume (first volume, second volume) of the two storage apparatuses, so that the host can update data on either the first volume or the second volume, and whichever volume is updated, the contents are reflected in the other volume so that the duplexed state of the first volume and the second volume is maintained.
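The attribute-based decision summarized above can be sketched as a small dispatch function. This is a hedged illustration, not the apparatus's code: the name `write_order` and the labels `"target"` and `"pair"` are assumptions, while the rule that the P-VOL is updated first and the S-VOL second is taken from the description of the P-VOL and S-VOL attributes.

```python
def write_order(io_mode: str, role: str) -> list:
    """Return which volumes receive the update, in order.

    io_mode is the I/O mode of the access target volume; role is "P-VOL"
    or "S-VOL" for that same volume. "target" means the access target
    volume and "pair" means its paired volume in the other apparatus.
    """
    if io_mode == "Local":
        return ["target"]
    if io_mode == "Remote":
        return ["pair"]
    if io_mode == "Block":
        return []                     # I/O is not accepted at all
    if io_mode == "Mirror":
        # The P-VOL is always updated before the S-VOL
        if role == "P-VOL":
            return ["target", "pair"]
        return ["pair", "target"]
    raise ValueError("unknown I/O mode: " + io_mode)
```

For instance, a Mirror-mode write addressed to the S-VOL is first sent to the pair volume (the P-VOL) and only then applied locally.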
- Information such as programs, tables, and files for realizing each function may be stored in a memory, a hard disk, a recording device such as an SSD (Solid State Drive), or a recording medium such as an IC card, an SD card, or a DVD.
- control lines and information lines indicate what is considered necessary for the explanation, and not all the control lines and information lines on the product are necessarily shown.
- 210: SAN, 220: SAN, 300: storage system, 400: storage device, 500: storage controller, 510: memory, 520: MP package, 521: MP, 521: Local memory, 530: Channel adapter, 531: Interface, 532: Buffer, 540: Disk adapter, 900: Drive
Abstract
Description
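FIG. 1 is a configuration diagram of a computer system according to an embodiment of the present invention. The computer system comprises a host computer 100 (hereinafter abbreviated as "host 100") and a storage system 300. The storage system 300 comprises at least two storage apparatuses 400-1 and 400-2. The host 100 is connected to both storage apparatuses 400-1 and 400-2 via a SAN (Storage Area Network) 210.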
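Write data from the host 100 is ultimately stored in the drives 900, but the storage apparatus 400 according to the embodiment of the present invention does not provide the storage space of the drives 900 directly to the host 100. Instead, the storage apparatus 400 provides the host 100 with one or more volumes formed by the well-known Thin Provisioning technique. Hereinafter, a volume provided to the host 100 is called a "logical volume". The storage apparatus 400 manages the storage space of a logical volume by dividing it into fixed-size areas. Each fixed-size area is called a "page". The page size is, as one example, 42 MB. A storage area of one of the one or more drives 900 is dynamically allocated to each page. Hereinafter, the storage area allocated to a page is called a "physical storage area".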
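The on-demand page allocation described in this paragraph can be illustrated with a short Python sketch. The class `ThinVolume`, its fields, and the use of binary megabytes for the 42 MB page size are assumptions made for the example; the actual mapping structures of the storage apparatus 400 are not described in this excerpt.

```python
from typing import Dict

PAGE_SIZE = 42 * 1024 * 1024  # 42 MB page; binary megabytes assumed


class ThinVolume:
    """Toy thin-provisioned logical volume backed by a shared page pool."""

    def __init__(self, free_pages: int) -> None:
        self.free_pages = free_pages        # unused physical storage areas
        self.page_map: Dict[int, int] = {}  # logical page -> physical page
        self._next_physical = 0

    def write(self, offset: int) -> int:
        """Return the physical page backing offset, allocating on first write."""
        page = offset // PAGE_SIZE
        if page not in self.page_map:
            if self.free_pages == 0:
                raise RuntimeError("physical storage area exhausted")
            self.page_map[page] = self._next_physical
            self._next_physical += 1
            self.free_pages -= 1
        return self.page_map[page]
```

A write to a page that already has a physical storage area consumes nothing, so depletion can only be triggered by the first write to a previously untouched page.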
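Next, the flow of processing performed in the storage apparatus 400 when a command is received from the host 100 is described. In the embodiment of the present invention, the flow of processing performed in the storage apparatus 400 is described for the case where the following commands are received from the host 100.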
(a) GET LBA STATUS
(b) LOG SENSE
(c) WRITE SAME
(d) UNMAP
(e) COMPARE AND WRITE
(f) WRITE
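In S3, the MP 521 further analyzes the command type. For example, if the command type is GET LBA STATUS, it obtains the state of the LBA specified by the command parameters (for example, whether a physical storage area is allocated) and returns that information to the host 100. If the command type is LOG SENSE, it returns information on the logical volume or the storage apparatus 400 specified by the command parameters. In other words, when the storage apparatus 400 receives an information-gathering command, it returns the information on the logical volume or storage apparatus 400 specified by the command parameters to the command issuer (the host 100 or the like), without considering the I/O mode or pair status of the logical volume.
With reference to FIG. 6, the flow of processing performed in the storage apparatus 400 is described for the case where the storage apparatus 400 receives a WRITE SAME command and the volume designated as the access target volume by that command is a P-VOL. In the following, as an example, the P-VOL is assumed to exist in the storage apparatus 400-1 and the volume paired with it (S-VOL) in the storage apparatus 400-2. Unless otherwise noted, the processing described below is executed by the MP 521 of the storage apparatus 400-1 (the storage apparatus in which the P-VOL exists).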
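In the processing flow described above, the WRFBA command was used when the data specified by a WRITE SAME command is written from the storage apparatus 400-1 to a volume of the storage apparatus 400-2 (or from the storage apparatus 400-2 to a volume of the storage apparatus 400-1). However, the WRFBA command requires transmitting data of the same size as the data write range, even when the same content is written to the entire range. Therefore, as a variation of the embodiment described above, an example is described that defines and uses a command which, like the WRITE SAME command, writes the same data across the entire data write range while transmitting only a small part of the write data (for example, one block). Hereinafter, this command is called the "WRITE SAME propagation command".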
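The saving this variation aims at can be made concrete by comparing the amount of data each approach transfers between the storage apparatuses. In the Python sketch below, the function names and the 512-byte block size are assumptions for illustration; the text only states that WRFBA needs data as large as the write range, whereas the WRITE SAME propagation command needs only a small part such as one block.

```python
BLOCK_SIZE = 512  # bytes per block; assumed for this example


def wrfba_transfer_bytes(pattern: bytes, num_blocks: int) -> int:
    """Bytes sent when the pattern is expanded to the full write range."""
    return len(pattern * num_blocks)


def propagation_transfer_bytes(pattern: bytes, num_blocks: int) -> int:
    """Bytes sent when only the one-block pattern is transmitted.

    num_blocks travels as a command parameter, not as payload data.
    """
    return len(pattern)


def expand_on_receiver(pattern: bytes, num_blocks: int) -> bytes:
    """What the receiving apparatus would write to its own volume."""
    return pattern * num_blocks
```

For a 2048-block range, the first approach sends 1 MiB of identical blocks while the second sends a single 512-byte block, and both leave the same data on the receiving volume.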
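Next, the processing performed when the storage apparatus 400 receives an UNMAP command is described with reference to FIGS. 12 and 13. In the following, as an example, the P-VOL is assumed to exist in the storage apparatus 400-1 and the volume paired with it (S-VOL) in the storage apparatus 400-2.
First, upon receiving an UNMAP command, the storage apparatus 400 internally converts it into a WRITE SAME command that instructs writing "0" to the entire specified area (S251).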
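The conversion in S251 can be sketched as a function that turns an UNMAP request into an equivalent WRITE SAME request with an all-zero pattern. The dictionary layout, the function name `unmap_to_write_same`, and the 512-byte block size are assumptions for this illustration; the internal command representation of the storage apparatus 400 is not described here.

```python
def unmap_to_write_same(lba: int, num_blocks: int, block_size: int = 512) -> dict:
    """Build a WRITE SAME request that zero-fills the range named by UNMAP."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    return {
        "op": "WRITE SAME",
        "lba": lba,
        "num_blocks": num_blocks,
        "pattern": bytes(block_size),  # one block of zero bytes
    }
```

Reusing the WRITE SAME path means the UNMAP command inherits the same Mirror, Local, and Remote handling without a separate code path.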
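Next, the processing performed when the storage apparatus 400 receives a COMPARE AND WRITE (ATS) command is described. In the following, as an example, the P-VOL is assumed to exist in the storage apparatus 400-1 and the volume paired with it (S-VOL) in the storage apparatus 400-2.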
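According to the claims of this document, in Mirror mode the compare step reads from the volume with the P-VOL attribute, and on a match the write data is stored in the P-VOL first and then in the S-VOL. A minimal Python sketch of that ordering, using in-memory `bytearray` objects as stand-ins for the two volumes; the function name `compare_and_write` and the byte-offset addressing are assumptions for illustration.

```python
def compare_and_write(pvol: bytearray, svol: bytearray, offset: int,
                      compare: bytes, write: bytes) -> bool:
    """Apply write to both volumes only if the P-VOL holds compare at offset."""
    current = bytes(pvol[offset:offset + len(compare)])
    if current != compare:
        return False                              # miscompare: nothing is written
    pvol[offset:offset + len(write)] = write      # the P-VOL is updated first
    svol[offset:offset + len(write)] = write      # then the S-VOL
    return True
```

Comparing against a single volume gives both storage apparatuses one agreed point of truth for the atomic test.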
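So far, the flow of processing performed by the storage apparatus 400 upon receiving a WRITE SAME, UNMAP, or COMPARE AND WRITE command has been described; in principle, that processing assumes that no error occurs during processing. In practice, however, an error may occur during processing depending on the state of the storage apparatus 400, and the processing the storage apparatus 400 performs in that case is described below.
(1) Before S1414 of FIG. 6, the processes of S4203 and S4221 to S4223 of FIG. 18 and the processes of S4204 and S4211 to S4213 of FIG. 18 are inserted.
(2) After S1416 of FIG. 6, the processes of S4208 and S4231 to S4233 of FIG. 18 are inserted.
(3) Before S1423 of FIG. 6, the processes of S4302 and S4311 of FIG. 19 and the processes of S4304 and S4315 of FIG. 19 are inserted.
(4) After S1433 of FIG. 6, the processes of S4403, S4410, S4404, and S4411 of FIG. 20 are inserted.
(5) After S1513 of FIG. 7, the processes of S6203 and S6221 to S6222 of FIG. 21 are inserted.
(6) Before S1515 of FIG. 7, the processes of S6205, S6206, S6210, and S6231 to S5234 of FIG. 21 are inserted.
(7) Before S1523 of FIG. 7, the processes of S4302 and S4311 of FIG. 19 and the processes of S4304 and S4315 of FIG. 19 are inserted.
(8) After S1533 of FIG. 7, the processes of S6403, S6410, S6404, and S6411 of FIG. 22 are inserted.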
210: SAN
220: SAN
300: Storage system
400: Storage apparatus
500: Storage controller
510: Memory
520: MP package
521: MP
521: Local memory
530: Channel adapter
531: Interface
532: Buffer
540: Disk adapter
900: Drive
Claims (15)
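- A storage system comprising a first storage apparatus and a second storage apparatus connected to the first storage apparatus, wherein
the first storage apparatus and the second storage apparatus each have a volume and one or more storage devices and are configured to accept I/O commands conforming to the SCSI standard from a host computer,
attribute information representing a write order and/or whether writing is permitted is set for a first volume in the first storage apparatus and a second volume in the second storage apparatus,
the first volume and the second volume are managed as volumes in a pair relationship in which write data from the host computer is written to both the first volume and the second volume, and
upon receiving an I/O command for the first volume from the host computer, the first storage apparatus determines, based on the attribute information, whether I/O to the first volume and to the second volume is required.
- The storage system according to claim 1, wherein
one of a Mirror attribute, a Local attribute, and a Remote attribute is set as the attribute information for the first volume and the second volume,
second attribute information, being either a P-VOL attribute indicating a volume that is updated first or an S-VOL attribute indicating a volume that is updated second, is also set for the first volume and the second volume,
when the Mirror attribute is set as the attribute information for the first volume, the processing specified by the I/O command is executed on the first volume and the second volume,
when the Local attribute is set as the attribute information for the first volume, the processing specified by the I/O command is executed only on the first volume, and
when the Remote attribute is set as the attribute information for the first volume, the processing specified by the I/O command is executed only on the second volume.
- The storage system according to claim 2, wherein, when the first storage apparatus receives, as the I/O command, a WRITE SAME command for the first volume together with one block of pattern data, and the Mirror attribute is set as the attribute information for the first volume,
the first storage apparatus executes two processes:
a process of storing the received pattern data in all areas of the first volume specified by the WRITE SAME command, and
a process of instructing the second storage apparatus to write the received pattern data to all areas of the second volume specified by the WRITE SAME command.
- The storage system according to claim 3, wherein, when the first storage apparatus receives, as the I/O command, a WRITE SAME command for the first volume together with one block of pattern data,
the first storage apparatus creates write data of a predetermined size in which the pattern data is written throughout, and
the first storage apparatus transmits to the second storage apparatus, one or more times, a WRFBA command instructing that the write data of the predetermined size be written to a predetermined position of the second volume, together with the write data of the predetermined size, thereby causing the received pattern data to be written to all areas of the second volume specified by the WRITE SAME command.
- The storage system according to claim 3, wherein, when the first storage apparatus receives, as the I/O command, a WRITE SAME command for the first volume together with one block of pattern data,
the first storage apparatus transmits to the second storage apparatus a WRITE SAME propagation command designating the area of the second volume specified by the WRITE SAME command, together with the pattern data, thereby causing the received pattern data to be written to all areas of the second volume specified by the WRITE SAME command.
- The storage system according to claim 3, wherein, when the first storage apparatus receives, as the I/O command, a WRITE SAME command for the first volume together with one block of pattern data, and the S-VOL attribute is set for the first volume,
the first storage apparatus, after the received pattern data has been written to all areas of the second volume specified by the WRITE SAME command,
stores the received pattern data in all areas of the first volume specified by the WRITE SAME command.
- The storage system according to claim 3, wherein, when the first storage apparatus receives, as the I/O command, a WRITE SAME command for the first volume together with one block of pattern data, and the P-VOL attribute is set for the first volume,
the first storage apparatus, after storing the received pattern data in all areas of the first volume specified by the WRITE SAME command,
instructs the second storage apparatus to write the received pattern data to all areas of the second volume specified by the WRITE SAME command.
- The storage system according to claim 2, wherein, when the first storage apparatus receives, as the I/O command, an UNMAP command for the first volume,
zero data is stored, based on the attribute information set for the first volume and the second volume, in all areas of the first volume and/or the second volume specified by the UNMAP command.
- The storage system according to claim 8, wherein
the storage system manages the storage space of a volume in units of pages, which are fixed-size areas,
the first storage apparatus is configured to allocate, from among the storage devices, a storage area corresponding to the page size at the time an I/O command for a page of the first volume is received, and
when zero data is stored in all areas within a page, the first storage apparatus changes the page to a state in which no storage area is allocated.
- The storage system according to claim 2, wherein, when the Mirror attribute is set as the attribute information for the first volume
and the first storage apparatus receives, as the I/O command, a COMPARE AND WRITE command for the first volume together with compare data and write data,
the storage system
reads the data of the area specified by the COMPARE AND WRITE command from whichever of the first volume and the second volume has the P-VOL attribute set, and compares it with the compare data, and
when the read data and the compare data are identical, stores the write data in the volume for which the P-VOL attribute is set and then stores the write data in the volume for which the S-VOL attribute is set.
- The storage system according to claim 2, wherein
the storage system manages the storage space of a volume in units of pages, which are fixed-size areas,
the first storage apparatus is configured to allocate, from among the storage devices of the first storage apparatus, a storage area corresponding to the page size at the time an I/O command for a page of the first volume is received,
the second storage apparatus is configured to allocate, from among the storage devices of the second storage apparatus, a storage area corresponding to the page size at the time an I/O command for a page of the second volume is received,
when the storage system receives from the host computer an I/O command for the first volume or the second volume for which the Mirror attribute is set as the attribute information,
if the storage area allocatable to the first volume is depleted in the first storage apparatus, a Block attribute indicating that the I/O command cannot be accepted is set as the attribute information of the first volume, the Local attribute is set as the attribute information of the second volume, and an error is then returned to the host computer, and
if the storage area allocatable to the second volume is depleted in the second storage apparatus, the Block attribute indicating that the I/O command cannot be accepted is set as the attribute information of the second volume, the Local attribute is set as the attribute information of the first volume, and an error is then returned to the host computer.
- The storage system according to claim 11, wherein, when an I/O command for the first volume is accepted from the host computer after the error has been returned to the host computer,
if the Local attribute is set as the attribute information for the first volume, the first storage apparatus determines whether the storage area allocatable to the first volume is depleted,
if the storage area allocatable to the first volume is not depleted, the first storage apparatus performs the processing specified by the I/O command on the first volume, and
if the storage area allocatable to the first volume is depleted, the first storage apparatus returns to the host computer error information indicating that the storage area is depleted.
- A control method for a computer system having a storage system, which comprises a first storage apparatus and a second storage apparatus connected to the first storage apparatus, and a host computer connected to the first storage apparatus and the second storage apparatus, wherein
the first storage apparatus and the second storage apparatus each have a volume and one or more storage devices and are configured to accept I/O commands conforming to the SCSI standard from the host computer,
attribute information representing a write order and/or whether writing is permitted is set for a first volume in the first storage apparatus and a second volume in the second storage apparatus,
the first volume and the second volume are managed as volumes in a pair relationship in which write data from the host computer is written to both the first volume and the second volume,
the host computer is configured to be able to access both the first volume and the second volume, and
upon receiving an I/O command for the first volume from the host computer, the first storage apparatus determines, based on the attribute information, whether I/O to the first volume and to the second volume is required.
- The control method for a computer system according to claim 13, wherein
one of a Mirror attribute, a Local attribute, a Remote attribute, and a Block attribute is set as the attribute information for the first volume and the second volume,
second attribute information, being either a P-VOL attribute indicating a volume that is updated first or an S-VOL attribute indicating a volume that is updated second, is also set for the first volume and the second volume,
when the Mirror attribute is set as the attribute information for the first volume, the processing specified by the I/O command is executed on the first volume and the second volume,
when the Local attribute is set as the attribute information for the first volume, the processing specified by the I/O command is executed only on the first volume, and
when the Remote attribute is set as the attribute information for the first volume, the processing specified by the I/O command is executed only on the second volume.
- The control method for a computer system according to claim 14, wherein
the storage system manages the storage space of a volume in units of pages, which are fixed-size areas,
the first storage apparatus is configured to allocate, from among the storage devices of the first storage apparatus, a storage area corresponding to the page size at the time an I/O command for a page of the first volume is received,
the second storage apparatus is configured to allocate, from among the storage devices of the second storage apparatus, a storage area corresponding to the page size at the time an I/O command for a page of the second volume is received,
in a state in which the host computer recognizes the first volume and the second volume as the same volume, based on identification information of the first volume received from the first storage apparatus and identification information of the second volume received from the second storage apparatus,
when the storage system receives from the host computer an I/O command for the first volume for which the Mirror attribute is set as the attribute information,
if the storage area allocatable to the first volume is depleted in the first storage apparatus, the storage system sets, as the attribute information of the first volume, a Block attribute indicating that the I/O command cannot be accepted, sets the Local attribute as the attribute information of the second volume, and then returns an error to the host computer,
the host computer, in response to the error being returned, issues the I/O command to the second volume instead of the first volume,
the second storage apparatus, having accepted the I/O command for the second volume from the host computer, determines whether the storage area allocatable to the second volume is depleted,
if the storage area allocatable to the second volume is not depleted, the second storage apparatus performs the processing specified by the I/O command on the second volume, and
if the storage area allocatable to the second volume is depleted, the second storage apparatus returns to the host computer error information indicating that the storage area is depleted.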
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2014/066768 WO2015198412A1 (ja) | 2014-06-25 | 2014-06-25 | Storage system |
US15/121,875 US10140035B2 (en) | 2014-06-25 | 2014-06-25 | Method for appropriately controlling duplicated volumes in virtual volume subsystems |
JP2016528801A JP6227776B2 (ja) | 2014-06-25 | 2014-06-25 | Storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2014/066768 WO2015198412A1 (ja) | 2014-06-25 | 2014-06-25 | Storage system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015198412A1 true WO2015198412A1 (ja) | 2015-12-30 |
Family
ID=54937547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2014/066768 WO2015198412A1 (ja) | Storage system | 2014-06-25 | 2014-06-25 |
Country Status (3)
Country | Link |
---|---|
US (1) | US10140035B2 (ja) |
JP (1) | JP6227776B2 (ja) |
WO (1) | WO2015198412A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
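WO2018131067A1 (ja) * | 2017-01-10 | 2018-07-19 | Hitachi, Ltd. | Apparatus for restoring data lost due to a storage drive failure |
CN110795043A (zh) * | 2019-10-29 | 2020-02-14 | 北京浪潮数据技术有限公司 | Distributed storage block zeroing method, apparatus, electronic device, and storage medium |
JP2022003589A (ja) * | 2020-07-09 | 2022-01-11 | Hitachi, Ltd. | System, control method therefor, and program |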
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6814020B2 (ja) * | 2016-10-26 | 2021-01-13 | Canon Inc. | Information processing apparatus, control method therefor, and program |
US10127029B1 (en) * | 2016-12-30 | 2018-11-13 | Veritas Technologies Llc | Operating system installation using logical volumes |
JP2021114264A (ja) * | 2020-01-21 | 2021-08-05 | Fujitsu Limited | Storage control apparatus and storage control program |
US20220100425A1 (en) * | 2020-09-29 | 2022-03-31 | Samsung Electronics Co., Ltd. | Storage device, operating method of storage device, and operating method of computing device including storage device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005018568A (ja) * | 2003-06-27 | 2005-01-20 | Hitachi Ltd | Storage system |
JP2006285388A (ja) * | 2005-03-31 | 2006-10-19 | Hitachi Ltd | Computer system, host computer, and copy pair processing method |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7065589B2 (en) * | 2003-06-23 | 2006-06-20 | Hitachi, Ltd. | Three data center remote copy system with journaling |
US7600087B2 (en) * | 2004-01-15 | 2009-10-06 | Hitachi, Ltd. | Distributed remote copy system |
JP5244332B2 (ja) * | 2006-10-30 | 2013-07-24 | Hitachi, Ltd. | Information system, data transfer method, and data protection method |
US9015576B2 (en) * | 2011-05-16 | 2015-04-21 | Microsoft Technology Licensing, Llc | Informed partitioning of data in a markup-based document |
WO2013070792A1 (en) * | 2011-11-07 | 2013-05-16 | Nexgen Storage, Inc. | Primary data storage system with staged deduplication |
US20130238852A1 (en) | 2012-03-07 | 2013-09-12 | Hitachi, Ltd. | Management interface for multiple storage subsystems virtualization |
US8935496B2 (en) * | 2012-08-31 | 2015-01-13 | Hitachi, Ltd. | Management method of virtual storage system and remote copy system |
US9423978B2 (en) * | 2013-05-08 | 2016-08-23 | Nexgen Storage, Inc. | Journal management |
2014
- 2014-06-25 WO PCT/JP2014/066768 patent/WO2015198412A1/ja active Application Filing
- 2014-06-25 US US15/121,875 patent/US10140035B2/en active Active
- 2014-06-25 JP JP2016528801A patent/JP6227776B2/ja active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005018568A (ja) * | 2003-06-27 | 2005-01-20 | Hitachi Ltd | Storage system |
JP2006285388A (ja) * | 2005-03-31 | 2006-10-19 | Hitachi Ltd | Computer system, host computer, and copy pair processing method |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
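WO2018131067A1 (ja) * | 2017-01-10 | 2018-07-19 | Hitachi, Ltd. | Apparatus for restoring data lost due to failure of a storage drive |
JPWO2018131067A1 (ja) * | 2017-01-10 | 2019-06-27 | Hitachi, Ltd. | Apparatus for restoring data lost due to failure of a storage drive |
CN110795043A (zh) * | 2019-10-29 | 2020-02-14 | Beijing Inspur Data Technology Co., Ltd. | Distributed storage block zeroing method and apparatus, electronic device, and storage medium |
JP2022003589A (ja) * | 2020-07-09 | 2022-01-11 | Hitachi, Ltd. | System, control method therefor, and program |
JP7193602B2 (ja) | 2020-07-09 | 2022-12-20 | Hitachi, Ltd. | System, control method therefor, and program |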
Also Published As
Publication number | Publication date |
---|---|
US20170017421A1 (en) | 2017-01-19 |
JPWO2015198412A1 (ja) | 2017-04-20 |
JP6227776B2 (ja) | 2017-11-08 |
US10140035B2 (en) | 2018-11-27 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP6227776B2 (ja) | Storage system | |
JP4958739B2 (ja) | Storage system for repairing data stored in a failed storage device | |
US8375167B2 (en) | Storage system, control apparatus and method of controlling control apparatus | |
JP4890033B2 (ja) | Storage device system and storage control method | |
US7975115B2 (en) | Method and apparatus for separating snapshot preserved and write data | |
JP5538362B2 (ja) | Storage control device and virtual volume control method | |
US7593973B2 (en) | Method and apparatus for transferring snapshot data | |
US7574577B2 (en) | Storage system, storage extent release method and storage apparatus | |
WO2017216887A1 (ja) | Information processing system | |
US10191685B2 (en) | Storage system, storage device, and data transfer method | |
US7080207B2 (en) | Data storage apparatus, system and method including a cache descriptor having a field defining data in a cache block | |
WO2015125221A1 (ja) | Storage system and migration method | |
CN113360082B (zh) | Storage system and control method therefor | |
US11194481B2 (en) | Information processing apparatus and method for controlling information processing apparatus | |
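JP2010086394A (ja) | Storage system that controls allocation of storage areas to a virtual volume storing specific pattern data | |
JP2020071583A (ja) | Data management device, data management method, and data management program | |
JP6294569B2 (ja) | Storage system and cache control method | |
JP2015114784A (ja) | Backup control device, backup control method, disk array device, and computer program | |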
WO2013088474A2 (en) | Storage subsystem and method for recovering data in storage subsystem | |
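WO2015141219A1 (ja) | Storage system, control device, storage device, data access method, and program recording medium | |
WO2016084156A1 (ja) | Storage system | |
JP6884165B2 (ja) | Storage system including a plurality of storage nodes | |
JP5856665B2 (ja) | Storage system and data transfer method for storage system | |
JP2018151927A (ja) | Storage device, storage system, control method for storage device, and program |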
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14895856; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2016528801; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 15121875; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 14895856; Country of ref document: EP; Kind code of ref document: A1 |