WO2014020766A1 - Storage system

Info

Publication number: WO2014020766A1
Application number: PCT/JP2012/069889
Authority: WO
Inventors: 朋宏吉原; 彰出口; 弘明圷
Original assignee: 株式会社日立製作所
Priority date: 2012-08-03
Filing date: 2012-08-03
Publication date: 2014-02-06
Also published as: JPWO2014020766A1; JP5965486B2

Abstract

In a storage system of one example, a memory stores control information indicative of whether requested data is stored in a cache memory, and management information for managing the use status of a process used in processing a write or read request. A processor allocates an unused process for the management information to a read request. If, on the basis of the control information and a first identifier of a region of a logical volume designated by the read request, the intended data of the read request is determined not to be in the cache, the processor stores the first identifier and an identifier of a region reserved on the cache in association with each other as control information if the class of a part of a plurality of physical storage volumes constituting the logical volume is a first class. If the class of the part is a second class, the processor stores a second identifier of the allocated unused process and an identifier of a region reserved on the cache in association with each other as control information, and stores data read out from the part in the region reserved on the cache.

Description

Storage system

The present invention relates to a storage system, and more particularly to control of a storage system.

International Publication No. 2010/131373 pamphlet (Patent Document 1) discloses a technique in which the processor in charge of I/O for each volume caches the data caching control information held in a shared memory into its local memory (control caching), thereby improving the performance of the storage system.

When the processor updates the control information in the local memory, it also updates the control information in the shared memory synchronously. As a result, another processor that takes over from a failed processor can acquire the latest data caching control information from the shared memory, which suppresses the degradation of storage system performance caused by a drop in the cache hit rate.

In addition, data caching, which enhances the performance of storage systems by caching user data from nonvolatile media into cache memory, is widely used in storage systems.

International Publication No. 2010/131373 Pamphlet

However, updating the control information in the shared memory, which is intended to improve performance, increases the overhead on the shared memory being accessed and on the processor controlling the access. Likewise, data caching, which is also intended to improve performance, increases the overhead on the cache memory being accessed and on the processor controlling the access. In particular, when the medium storing user data is a storage medium capable of high-speed reading, such as a solid state drive (SSD), the increase in processing time caused by updating the control information for caching becomes large relative to the reduction in read time that caching provides.

A storage system according to an aspect of the present invention includes: a processor on which a control program operates; a plurality of physical storage volumes of a first or second type that provide storage resources to a plurality of logical volumes; a cache memory, connected to the processor, that stores a part of the data stored in the plurality of physical storage volumes; and a memory, connected to the processor, that stores cache control information indicating whether target data of a write or read request from a host is stored in the cache memory, and process management information for managing the use status of a plurality of processes used for processing the write or read request. When the processor receives from the host a read request designating any area of a logical volume, the processor allocates to the read request an unused process among the plurality of processes managed by the process management information, and determines whether the target data of the read request is in the cache memory based on the cache control information and a first identifier that specifies the area of the logical volume designated by the read request. When the target data is determined not to be in the cache memory and the part of the plurality of physical storage volumes constituting the logical volume designated by the read request is a physical storage volume of the first type, the processor associates the first identifier with an identifier specifying an area secured on the cache memory and stores them in the memory as the cache control information. When that part is a physical storage volume of the second type, the processor associates a second identifier specifying the process allocated to the read request with the identifier specifying the area secured on the cache memory and stores them in the memory as the cache control information. The data read from the part of the plurality of physical storage volumes in response to the read request is stored in the area secured on the cache memory.
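The read-miss handling summarized above can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the `Memory` class, the `"first"`/`"second"` type labels, the tuple keys, and the `next_cache_area` allocator do not appear in the specification, and releasing a process after the read completes is omitted.

```python
from dataclasses import dataclass, field

FIRST_TYPE = "first"    # placeholder labels for the two volume types
SECOND_TYPE = "second"


@dataclass
class Memory:
    # Cache control information: key -> identifier of a secured cache area.
    cache_control: dict = field(default_factory=dict)
    # Process management information: process identifier -> in use?
    process_in_use: dict = field(default_factory=dict)


def allocate_process(mem):
    """Allocate an unused process and mark it as in use."""
    for pid, used in mem.process_in_use.items():
        if not used:
            mem.process_in_use[pid] = True
            return pid
    raise RuntimeError("no unused process available")


def handle_read(mem, area_id, volume_type, cache, backend, next_cache_area):
    """Process one read request for the logical volume area `area_id`."""
    pid = allocate_process(mem)
    key = ("area", area_id)              # first identifier
    if key in mem.cache_control:         # target data is in the cache
        return cache[mem.cache_control[key]]
    cache_area = next_cache_area()       # secure an area on the cache
    if volume_type == FIRST_TYPE:
        # First type: associate the first identifier with the cache area.
        mem.cache_control[key] = cache_area
    else:
        # Second type: associate the allocated process with the cache area.
        mem.cache_control[("process", pid)] = cache_area
    cache[cache_area] = backend[area_id]  # store the data read from the volume
    return cache[cache_area]
```

In this sketch, data read from a second-type volume is registered under the process identifier only, so a later read of the same logical volume area does not find it through the first identifier.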

One embodiment of the present invention reduces the overhead in the storage system and improves the performance of the storage system.

A block diagram schematically showing the overall configuration of the computer system in the first embodiment.
A diagram showing information stored in the local memory of the storage system in the first embodiment.
A diagram showing information stored in the shared memory of the storage system in the first embodiment.
A diagram schematically showing the configuration of the management computer in the first embodiment.
A diagram showing an example of the performance boost function enabling table in the first embodiment.
A diagram showing an example of the per-volume performance boost function enabling table in the first embodiment.
A diagram showing an example of the media type table in the first embodiment.
A diagram showing an example of the RAID level table in the first embodiment.
A diagram showing an example of the per-volume hit rate table in the first embodiment.
A diagram showing an example of the hit rate threshold table in the first embodiment.
A diagram showing an example of the MP operating rate table in the first embodiment.
A diagram showing an example of the MP operating rate threshold table in the first embodiment.
A diagram showing an example of the CM operating rate table in the first embodiment.
A diagram showing an example of the CM operating rate threshold table in the first embodiment.
A flowchart of the processing of a read command from the host in the first embodiment.
A flowchart of the SM update determination process for control information related to data caching in the first embodiment.
A flowchart of the host data caching process in the first embodiment.
A part of a flowchart of the processing of a write command from the host in the first embodiment.
Another part of the flowchart of the processing of a write command from the host in the first embodiment.
A flowchart of the setting process from the management computer 20 in the first embodiment.
A diagram showing an example of the setting menu screen on the management computer in the first embodiment.
A flowchart of the media type table update process in the first embodiment.
A flowchart of the CMPK operating rate update process in the first embodiment.
A flowchart of the hit rate update process in the first embodiment.
A flowchart of the MP operating rate update process in the first embodiment.
A flowchart of the SM update process at the time of owner movement in the first embodiment.
A diagram showing information stored in the local memory in the second embodiment.
A diagram showing information stored in the shared memory in the second embodiment.
A diagram showing an example of the dynamic mapping table in the second embodiment.
A diagram showing an example of the per-page monitor table in the second embodiment.
A diagram showing an example of the per-page monitor difference table in the second embodiment.
A flowchart of the storage tiering function monitor update process in the second embodiment.
A diagram schematically showing the overall configuration of the computer system in the third embodiment.
A diagram illustrating asynchronous remote copy in the third embodiment.
A diagram showing information stored in the local memory in the third embodiment.
A diagram showing information stored in the shared memory in the third embodiment.
A diagram showing an example of the LM asynchronous remote copy sequence number management table in the third embodiment.
A diagram showing an example of the SM asynchronous remote copy sequence number management table in the third embodiment.
A flowchart of the asynchronous remote copy sequence number update process in the third embodiment.
A flowchart of the asynchronous remote copy sequence number recovery process at the time of MPPK failure in the third embodiment.
A diagram showing information stored in the local memory in the fourth embodiment.
A diagram showing information stored in the shared memory in the fourth embodiment.
A diagram showing an example of the LM local copy difference management table in the fourth embodiment.
A diagram showing an example of the SM local copy difference management table in the fourth embodiment.
A diagram showing an example of the LM local copy difference area thinning operation management table in the fourth embodiment.
A diagram showing an example of the SM local copy difference area thinning operation management table in the fourth embodiment.
A flowchart of the asynchronous local copy difference management information update process in the fourth embodiment.
A flowchart of the local copy difference copy process at the time of MPPK failure in the fourth embodiment.
A diagram showing an example of the setting menu screen on the management computer in the fourth embodiment.
A diagram schematically showing the overall configuration of the computer system in the fifth embodiment.
A diagram showing information stored in the local memory in the fifth embodiment.
A diagram showing an example of the X path utilization rate table in the fifth embodiment.
A diagram showing an example of the X path utilization rate threshold table in the fifth embodiment.
A flowchart of the SM update determination process for control information related to data caching, taking the X path into account, in the fifth embodiment.
A flowchart of the X path operating rate update process in the fifth embodiment.
A diagram schematically showing the overall configuration of the computer system in the sixth embodiment.
A diagram showing information stored in the local memory in the sixth embodiment.
A diagram showing an example of the MP operating rate table in the sixth embodiment.
A diagram showing an example of the MP operating rate threshold table in the sixth embodiment.
A diagram showing an example of the shared memory area management table in the sixth embodiment.
A part of a flowchart of the SM update determination process for control information related to data caching in the sixth embodiment.
Another part of the flowchart of the SM update determination process for control information related to data caching in the sixth embodiment.
A flowchart of the MP operating rate update process in the sixth embodiment.
A diagram showing information stored in the local memory in the seventh embodiment.
A diagram showing an example of the response table in the seventh embodiment.
A diagram showing an example of the CM utilization threshold table in the seventh embodiment.
A flowchart of the hit rate update process in the seventh embodiment.
A diagram showing information stored in the local memory in the first embodiment.
A diagram showing an example of the CM bypass transfer ratio calculation table in the first embodiment.
A diagram showing an example of the CM bypass transfer ratio table in the first embodiment.
A flowchart of the processing of a read command from the host in the first embodiment.
A flowchart of the host data caching determination process in the first embodiment.
A flowchart of the CM bypass transfer ratio calculation process in the first embodiment.
A diagram showing information stored in the local memory in the eighth embodiment.
A diagram showing an example of the job management table in the eighth embodiment.
A diagram showing an example of the job buffer address table in the eighth embodiment.
A diagram showing an example of the buffer transfer ratio calculation table in the eighth embodiment.
A diagram showing an example of the buffer transfer ratio table in the eighth embodiment.
A part of a flowchart of the processing of a read command from the host in the eighth embodiment.
Another part of the flowchart of the processing of a read command from the host in the eighth embodiment.
A flowchart of the buffer transfer determination process in the eighth embodiment.
A flowchart of the buffer ratio calculation process in the eighth embodiment.
A diagram showing an example of LRU replacement management of cache slots and job numbers in the eighth embodiment.

The present invention relates to a technique for improving the performance of a storage system. Embodiments of the present invention will be described below with reference to the accompanying drawings. For clarity of explanation, the following description and the details of the drawings are omitted and simplified as appropriate, and redundant descriptions are omitted as necessary. This embodiment is merely an example for realizing the present invention, and does not limit the technical scope of the present invention.

First Embodiment

The storage system of the present embodiment includes processors that are responsible for input/output (I/O) of different volumes. Each processor is assigned local memory. The storage system of this embodiment has a shared memory that can be accessed by a plurality of processors in charge of different volumes. Local memory and shared memory are typically volatile semiconductor memories.

The data caching control information of the volume handled by the processor is stored in the local memory of the processor (control data caching). Further, the shared memory stores data caching control information for the volume.

The processor refers to and updates the caching control information on the local memory, and performs data caching control of the assigned volume. As a result, the data caching control process can be speeded up.

As described above, the shared memory can be accessed by a plurality of processors in charge of different volumes. When a failure occurs in the processor in charge of any volume, the other processor takes over the charge and loads the corresponding data caching control information from the shared memory into its own local memory. The other processor uses the data caching control information acquired from the shared memory to control data caching of the inherited volume.

In this embodiment, the processor determines, according to a predetermined condition, whether or not to reflect an update of the caching control information in the local memory in the control information in the shared memory. By reflecting in the shared memory only those updates to the local-memory control information that need to be reflected, the overhead of communication between the processor and the shared memory can be reduced, and the performance of the storage system can be improved.

Furthermore, the storage system of this embodiment determines whether or not to cache read data and write data according to a predetermined condition. By selectively caching read data and write data, the cache area is used efficiently, and the overhead on the cache memory and on the processor that performs data caching is reduced, thereby improving the performance of the storage system.

Hereinafter, the present embodiment will be specifically described with reference to FIGS. 1 to 25. FIG. 1 shows an example of a computer system including the storage system 10 of this embodiment, a host computer 180 that performs data processing and operations, and a management computer 20 that manages the storage system 10. The computer system can include a plurality of host computers 180.

The storage system 10 and the host computer 180 are connected to each other via the data network 190. The data network 190 is, for example, a SAN (Storage Area Network). The data network 190 may be an IP network or any other type of data communication network.

The storage system 10, the host computer 180, and the management computer 20 are connected to each other via a management network (not shown). The management network is, for example, an IP network. The management network may be a SAN or any other type of network. The data network 190 and the management network may be the same network.

The storage system 10 accommodates a plurality of storage drives 170. The storage drives 170 include hard disk drives (HDDs) having a nonvolatile magnetic disk and solid state drives (SSDs) equipped with a nonvolatile semiconductor memory (for example, a flash memory). The storage drives 170 store data (user data) sent from the host computer 180. Because the plurality of storage drives 170 make data redundant by RAID operation, data loss can be prevented when a failure occurs in one storage drive 170.

The storage system 10 includes a front-end package (FEPK) 100 for connecting to the host computer 180, a back-end package (BEPK) 140 for connecting to the storage drives 170, a cache memory package (CMPK) 130 on which cache memory is mounted, a microprocessor package (MPPK) 120 on which a microprocessor that performs internal processing is mounted, and an internal network 150 that connects them. As shown in FIG. 1, the storage system 10 of this example includes a plurality of FEPKs 100, a plurality of BEPKs 140, a plurality of CMPKs 130, and a plurality of MPPKs 120.

Each FEPK 100 has, on a substrate, an interface 101 for connecting to the host computer 180 and a transfer circuit 112 for transferring data in the storage system 10. The interface 101 can include a plurality of ports, and each port can be connected to the host computer 180. The interface 101 converts a protocol used for communication between the host computer 180 and the storage system 10, for example, Fibre Channel over Ethernet (FCoE), into a protocol used for the internal network 150, for example, PCI-Express.

Each BEPK 140 has, on a substrate, an interface 141 for connecting to the storage drive 170 and a transfer circuit 142 for transferring data in the storage system 10. The interface 141 can include a plurality of ports, and each port can be connected to the storage drive 170. The interface 141 converts a protocol used for communication with the storage drive 170, for example, FC, into a protocol used in the internal network 150.

Each CMPK 130 has a cache memory 131 that temporarily stores user data read and written from the host computer 180 and a shared memory (SM) 132 that stores control information handled by one or more MPPKs 120 on the substrate. A plurality of MPPKs 120 (microprocessors) in charge of different volumes can access the shared memory 132. Data and programs handled by the MPPK 120 are loaded from a nonvolatile memory (not shown) or the storage drive 170 in the storage system 10. The associated cache memory 131 and shared memory 132 may be mounted on different substrates (in a package).

Each MPPK 120 has one or more microprocessors 121, a local memory (LM) 122, and a bus 123 connecting them. In this example, a plurality of microprocessors 121 are mounted. The number of microprocessors 121 may be one. A plurality of microprocessors 121 can be viewed as one processor. The local memory 122 stores programs executed by the microprocessor 121 and control information used by the microprocessor 121.

As described above, one shared memory 132 stores control information handled by the MPPK 120. The MPPK 120 stores control information required by itself from the shared memory 132 in its own local memory 122 (control caching). Thereby, high speed access to the control information by the microprocessor 121 is realized, and the performance of the storage system 10 can be improved.

When the microprocessor 121 updates the control information in the local memory 122, the microprocessor 121 reflects the update in the control information on the shared memory 132 as necessary. One of the features of this embodiment is the control of this update. The microprocessor 121 reflects the update of the control information in the local memory 122 on the control information in the shared memory 132 when a predetermined condition is satisfied.
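The relationship between the local memory 122 and the shared memory 132 described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `ControlInfoStore` class, the `take_over` function, and the `reflect_to_sm` argument are hypothetical names, and the predetermined condition itself is passed in by the caller rather than modeled here.

```python
class ControlInfoStore:
    """Control information as seen by one MPPK (illustrative model)."""

    def __init__(self, shared_memory):
        self.sm = shared_memory        # dict standing in for shared memory 132
        self.lm = dict(shared_memory)  # control caching into local memory 122

    def update(self, key, value, reflect_to_sm):
        """Always update the local copy; reflect to SM only when required."""
        self.lm[key] = value
        if reflect_to_sm:              # result of the predetermined condition
            self.sm[key] = value


def take_over(shared_memory):
    """Another MPPK takes over after a failure, loading what SM holds."""
    return ControlInfoStore(shared_memory)
```

An update that is not reflected costs no shared memory access, but it is not visible to a processor that later takes over the volume.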

In this configuration example, the microprocessor 121 is assigned responsibility for the volumes that the storage system 10 provides to the host computer 180. The local memory 122 and the shared memory 132 allocated to the microprocessor 121 store data caching control information of the volume for which the microprocessor is responsible for I/O.

Note that the present invention can be applied to control information in general whose loss does not lead to loss of host data when an MP failure occurs, even if the control information in the shared memory 132 has not been updated. Examples of control information other than the data caching control information of this embodiment will be described in other embodiments. Although the present embodiment describes an example in which a microprocessor is responsible for a volume, the target to which a responsible microprocessor is assigned is not limited to a volume, and a responsible microprocessor may exist for each piece of control information.

FIG. 2 is a block diagram showing information stored in the local memory 122. The local memory 122 stores a performance boost function enabling table 210, a per-volume performance boost function enabling table 220, a media type table 230, a RAID level table 240, a per-volume hit rate table 250, a hit rate threshold table 260, and a microprocessor (MP) operating rate table 270.

The local memory 122 further stores a microprocessor (MP) operating rate threshold table 280, a cache memory (CM) operating rate table 290, and a cache memory (CM) operating rate threshold table 300. For example, the microprocessor 121 obtains at least a part of these tables from the storage drive 170 or another nonvolatile storage area in the storage system 10 and stores them in the local memory 122, and newly creates some of the other tables. Details of these tables will be described later.

The local memory 122 further stores a cache directory 310. FIG. 3 is a block diagram showing the cache directory 510 in the shared memory 132. The microprocessor 121 caches the cache directory 510 from the shared memory 132 in its own local memory 122 and reflects the update of the cache directory 310 on the local memory 122 in the cache directory 510 of the shared memory 132 as necessary. The cache directory 510 is backup data for the cache directory 310.

When the microprocessor 121 receives a read command from the host computer 180, the microprocessor 121 refers to the cache directory 310 of the local memory 122 and determines whether the target data is cached in the cache memory 131 (cache hit). As described above, the cache directory 310 provides information for searching the cache data stored in the cache memory 131.

The cache directory 310 includes reference tables GRPP, GRPT1, GRPT2, and a slot control block (SLCB) as a management table. The reference tables GRPP, GRPT1, and GRPT2 are tables that are referred to by the microprocessor 121 when searching for a cache segment, and have a directory structure. The reference table GRPP is located at the top and the reference table GRPT2 is located at the bottom. The upper table includes a pointer to the next table. GRPT2 includes a pointer to the SLCB.

The SLCB is a table for managing control information related to a segment, which is the minimum unit of cache management. It stores information such as whether the data designated by a read command exists in the cache memory 131 and the address of the cached data in the cache memory 131.

One or more segments can be associated with one slot. For example, 64 KB of data can be stored in one segment. The minimum unit of cache management is a segment, but the cache may be managed in slot units. Typically, the transition between dirty data (the state before writing to the physical disk) and clean data (the state after writing to the physical disk) is performed in units of slots. The cache area is reserved and released in slot units or segment units.

When there is a read access from the host computer 180, the microprocessor 121 sequentially traces each hierarchical table based on the logical block address (LBA) included in the access, and thereby determines whether the requested data exists in the cache memory 131 and, if it exists, its address.

When the requested data exists in the cache memory 131, the microprocessor 121 transmits the data to the host computer 180. If the requested data does not exist in the cache memory 131, the microprocessor 121 reads out the data requested by the host computer 180 from the storage drive 170 and stores it in one or more slots on the cache area. Write data is also cached in the same way. Note that retrieval of cache data using a cache directory is a well-known technique, and a detailed description thereof is omitted here.
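The directory search described above can be sketched as follows, as a simplified Python model rather than the actual table layout. The fan-out constants, the way the LBA is split into table indices, the use of nested dictionaries for GRPP, GRPT1, and GRPT2, and the `cache_address` field of the SLCB are all assumptions made for illustration.

```python
BLOCKS_PER_SLOT = 128  # assumed number of logical blocks per slot
GRPT2_ENTRIES = 4      # assumed number of SLCB pointers per GRPT2
GRPT1_ENTRIES = 4      # assumed number of GRPT2 pointers per GRPT1


def split_lba(lba):
    """Derive the GRPP, GRPT1, and GRPT2 indices from an LBA."""
    slot = lba // BLOCKS_PER_SLOT
    i2 = slot % GRPT2_ENTRIES
    i1 = (slot // GRPT2_ENTRIES) % GRPT1_ENTRIES
    i0 = slot // (GRPT2_ENTRIES * GRPT1_ENTRIES)
    return i0, i1, i2


def lookup(grpp, lba):
    """Trace GRPP -> GRPT1 -> GRPT2 -> SLCB; return None on a cache miss."""
    i0, i1, i2 = split_lba(lba)
    grpt1 = grpp.get(i0)
    if grpt1 is None:
        return None
    grpt2 = grpt1.get(i1)
    if grpt2 is None:
        return None
    return grpt2.get(i2)


def register(grpp, lba, cache_address):
    """Create the SLCB for a newly cached slot and link it into the tables."""
    i0, i1, i2 = split_lba(lba)
    slcb = {"cache_address": cache_address}
    grpp.setdefault(i0, {}).setdefault(i1, {})[i2] = slcb
    return slcb


def read(grpp, cache, drive, lba):
    """Return the slot data for an LBA, staging it from the drive on a miss."""
    slcb = lookup(grpp, lba)
    if slcb is None:
        address = len(cache)
        cache.append(drive[lba // BLOCKS_PER_SLOT])
        slcb = register(grpp, lba, address)
    return cache[slcb["cache_address"]]
```

A second read of any LBA in the same slot follows the same pointers to the existing SLCB and is served from the cache without accessing the drive.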

FIG. 4 is a block diagram schematically showing the configuration of the management computer 20. The management computer 20 includes an input interface 22, an input device 28, a display interface 23, a display device 29, a CPU 26, a communication interface 21, a memory 24, and an HDD 25. Typical examples of the input device 28 are a keyboard and a pointer device, but other devices may be used. The display device 29 is typically a liquid crystal display device.

The administrator (user) inputs necessary data with the input device 28 while visually confirming the processing result with the display device 29. Information input by the administrator and a display example by the display device 29 will be described later. In the computer system of FIG. 1, the management system is composed of one management computer 20, but the management system can include a management console in addition to the management computer 20. The management console includes an input device and a display device, and is connected to the management computer 20 via a network.

The administrator accesses the management computer 20 from the management console, instructs the management computer 20 to perform processing, and acquires and displays the processing result of the management computer 20 on the management console. The management system can also include a plurality of computers each having part or all of the functions of the management computer 20. The CPU 26 is a processor that executes a program stored in the memory 24. The communication interface 21 is an interface with the management network, and exchanges data and control commands with the host computer 180 and the storage system 10 for system management.

FIG. 5 shows a configuration example of the performance boost function enabling table 210. The performance boost function enabling table 210 has a column 211 for the performance boost function enable flag. The performance boost function enable flag indicates whether or not the performance boost function of the entire storage system 10 is active. When this flag is 1, the performance boost function of the entire storage system 10 is active.

In the present embodiment, the performance boost function consists of control over reflecting (backing up) updates of the control information stored in the local memory 122 to the shared memory 132, and data caching control. This function will be described later. The data of the performance boost function enabling table 210 is set by the administrator from the management computer 20, for example.

FIG. 6 shows a configuration example of the per-volume performance boost function enabling table 220. The per-volume performance boost function enabling table 220 manages the performance boost function of each volume. It has a logical volume number column 221 and a performance boost function enable flag column 222. The logical volume number is an identifier of the logical volume.

When the performance boost function enable flag is 1, it indicates that the performance boost function for the volume is active. When both the system-wide and the per-volume performance boost function enable flags are ON (1), the performance boost function for the volume is enabled. In this way, control corresponding to volume characteristics is realized by managing and controlling the performance boost function for each volume. The data of the per-volume performance boost function enabling table 220 is set by the administrator from the management computer 20, for example.
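The combination of the two flags can be expressed as a short Python sketch. The function name and the dictionary representation of the per-volume table are assumptions for illustration, as is treating a volume that has no entry as disabled.

```python
def performance_boost_enabled(system_flag, volume_flags, logical_volume_number):
    """Return True only when both the system-wide and per-volume flags are 1.

    system_flag stands in for the flag in table 210, and volume_flags for
    table 220 as a mapping from logical volume number to its enable flag.
    A volume with no entry is treated as disabled (an assumption).
    """
    return system_flag == 1 and volume_flags.get(logical_volume_number, 0) == 1
```

For example, with the system-wide flag set to 1 and `volume_flags = {0: 1, 1: 0}`, the function is enabled for volume 0 and disabled for volume 1.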

FIG. 7 shows a configuration example of the media type table 230. The media type table 230 manages the media type of the RAID group. In the present embodiment, a configuration including a storage area provided by one or a plurality of storage drives 170 and an interface thereof is called a medium. The media type table 230 includes a RAID group number column 231 and a media type column 232.

The RAID group number is an identifier that uniquely identifies a RAID group. In this specification, expressions such as an identifier, a name, and an ID can be used for identification information for identifying a target, and these can be replaced. The data of the media type table 230 is set by the administrator from the management computer 20, for example.

FIG. 8 shows a configuration example of the RAID level table 240. The RAID level table 240 manages the RAID level of the RAID group. It has a RAID group number column 241 and a RAID level column 242. The data of the RAID level table 240 is set by the administrator from the management computer 20, for example.

FIG. 9 shows a configuration example of the per-volume hit rate table 250. The per-volume hit rate table 250 manages the cache hit rate of each volume. It includes a logical volume number column 251, a hit rate column 252, an I/O count column 253, a hit count column 254, and a low hit rate flag column 255.

The I/O count is the number of read commands issued to the logical volume. The hit count is the number of those read commands that resulted in a cache hit. A low hit rate flag of 1 indicates that the hit rate of the entry is equal to or less than the specified threshold. The microprocessor 121 counts read accesses to each volume and the cache hits among them, and updates the fields of the per-volume hit rate table 250.

The unit for the microprocessor 121 to monitor the hit rate may be a unit smaller than the logical volume. For example, a page used in the virtual volume function or the hierarchization function may be used as a unit. Data caching control and caching control information update control, which will be described later, are performed in units of pages.

The calculation of the hit rate may include the write cache hit rate in addition to the read cache hit rate. The microprocessor 121 may manage the read cache hit rate and the write cache hit rate individually. For example, the microprocessor 121 refers to the respective hit rates in the read caching control and write caching control described later.

FIG. 10 shows a configuration example of the hit rate threshold table 260. The hit rate threshold table 260 has a hit rate threshold column 261. When the hit rate is equal to or less than the threshold value registered here, the low hit rate flag of the entry in the per-volume hit rate table 250 is set to 1 (ON flag). The hit rate threshold is set by the administrator from the management computer 20, for example.
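The bookkeeping behind the per-volume hit rate table 250 and the hit rate threshold table 260 can be illustrated with a minimal Python sketch. The class name `VolumeHitStats`, the method `record_read`, and the threshold value 0.25 are assumptions made for illustration only; they do not appear in this specification.

```python
from dataclasses import dataclass


@dataclass
class VolumeHitStats:
    """One entry of a per-volume hit rate table (hypothetical layout)."""
    io_count: int = 0            # read commands issued to the volume
    hit_count: int = 0           # read commands that hit the cache
    low_hit_rate: bool = False   # low hit rate flag (True corresponds to 1 / ON)

    @property
    def hit_rate(self) -> float:
        # A volume with no I/O yet is treated as having a hit rate of 0.
        return self.hit_count / self.io_count if self.io_count else 0.0

    def record_read(self, cache_hit: bool, threshold: float) -> None:
        """Count one read command and refresh the low hit rate flag."""
        self.io_count += 1
        if cache_hit:
            self.hit_count += 1
        # The flag is ON when the hit rate is equal to or less than the threshold.
        self.low_hit_rate = self.hit_rate <= threshold


stats = VolumeHitStats()
for hit in (True, False, False, False):
    stats.record_read(hit, threshold=0.25)
```

After one hit in four reads, the hit rate equals the assumed threshold, so the flag is ON.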

FIG. 11 shows a configuration example of the MP operating rate table 270 that manages the operating rate of the microprocessor 121. The MP operating rate is the processing time of the microprocessor 121 within a unit time, and represents the load on the microprocessor. The MP operating rate table 270 includes a microprocessor number column 271, an operating rate column 272, an overload determination flag column 273, and an operating time column 274. The microprocessor number uniquely identifies the microprocessor within the storage system 10.

Each microprocessor 121 monitors its own operating status and stores the operating rate and the operating time in the operating rate column 272 and the operating time column 274 of its own entry. The operating time is the operating time per unit time (1 second in this example). The operating rate is the value obtained by dividing the operating time by the unit time. The microprocessor 121 compares its own operating rate with a prescribed threshold and, if the operating rate is equal to or greater than the threshold, sets the overload determination flag field of its own entry to 1 (ON flag).

FIG. 12 shows a configuration example of the MP operating rate threshold table 280, which has a column 281 for storing this threshold. In this example, the MP operating rate threshold is common to all the microprocessors, but different thresholds may be used for different microprocessors.
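The operating rate and overload determination described above amount to a division and a comparison. The following Python sketch shows this under stated assumptions: the function names and the threshold value 0.75 are hypothetical, and only the 1-second unit time comes from this example.

```python
UNIT_TIME_SEC = 1.0  # unit time used in this example


def operating_rate(operating_time_sec: float,
                   unit_time_sec: float = UNIT_TIME_SEC) -> float:
    """Operating rate = operating time within the unit time / unit time."""
    return operating_time_sec / unit_time_sec


def is_overloaded(rate: float, threshold: float) -> bool:
    """The overload flag is ON when the rate is equal to or greater than the threshold."""
    return rate >= threshold


rate = operating_rate(0.75)               # busy for 0.75 s out of 1 s
overload_flag = is_overloaded(rate, 0.75)  # 0.75 is an assumed threshold
```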

FIG. 13 shows a configuration example of the CM operation rate table 290 that manages the operation rate of the cache memory. The CM operation rate is the access time to the cache memory 131 within a unit time. The CM operation rate table 290 includes a CMPK number column 291, an operation rate column 292, and an overload determination flag column 293. The CMPK number is an identifier of a CMPK in the storage system 10.

The microprocessor 121 acquires the value of the operating rate from the controller on the CMPK 130 and stores it in the corresponding field of the operating rate column 292. The microprocessor 121 compares the obtained operating rate value with a prescribed threshold value, and when the operating rate value is equal to or greater than the threshold value, sets 1 (ON flag) in the overload determination flag field of the entry.

FIG. 14 shows a configuration example of the CM operation rate threshold table 300, which stores this threshold. In this example, the CM operation rate threshold is common to all CMPKs, but different thresholds may be used for different CMPKs.

Referring to the flowchart of FIG. 15, processing performed by the storage system 10 for a read command received from the host computer 180 will be described. Upon receiving the read command from the host computer 180 (S101), the microprocessor 121 determines whether it has the access right to the logical volume (also referred to as an LDEV (Logical Device)) indicated by the read command (S102). When it does not have the access right (S102: NO), the microprocessor 121 transfers the read command to the MPPK 120 having the access right (S103).

When the microprocessor 121 has an access right (S102: YES), the microprocessor 121 searches the cache directory 310 in the local memory 122 on the same MPPK 120 (S104). When the address (data) specified by the read command is found (S105: YES), the microprocessor 121 reads the read data from the cache memory 131 according to the information in the cache directory 310 and transmits it to the host computer 180 (S106).

If the address (data) specified by the read command is not found (cache miss) (S105: NO), the microprocessor 121 checks the uncached flag in the local memory 122 (S107). The uncached flag indicates whether all the data of the cache directory 510 of the shared memory 132 has been cached in the local memory 122; the flag itself is stored in the local memory 122. When some of the data has not been read, the value is ON. For example, immediately after a failover following a failure, the control information has not yet been read from the shared memory 132 into the local memory 122, so the uncached flag is ON.

When the uncached flag is ON (S107: YES), some data of the cache directory value 510 of the shared memory 132 is not cached. The microprocessor 121 transfers the cache directory (control information) from the shared memory 132 to the local memory 122 via the controller of the CMPK 130 (S108).

The microprocessor 121 searches the cache directory 310 in the local memory 122 (S109). When the data specified by the read command is found (S110: YES), the microprocessor 121 reads the read data from the cache memory 131 according to the information in the cache directory 310 and transmits it to the host computer 180 (S111).

In the case of a cache miss (S110: NO), or when the uncached flag is OFF (S107: NO), the microprocessor 121 secures a slot for the read data in the cache memory 131 and updates the cache directory 310 of the local memory 122 (S112).

The microprocessor 121 determines whether or not the update of the cache directory 310, which is control information related to data caching, is to be reflected in the data 510 of the shared memory 132 (S113). A specific method of this determination will be described in detail later. If it is determined that the control information of the shared memory 132 is to be updated (S114: YES), the microprocessor 121 updates the cache directory 510 of the shared memory 132 (S115) and proceeds to the next step S116.

If it is determined not to update the control information of the shared memory 132 (S114: NO), the microprocessor 121 proceeds to step S116 without updating the control information of the shared memory 132. In step S116, the microprocessor 121 determines whether to cache read data (host data). This determination method will be described later.

When it is determined that the read data is to be stored in the cache memory 131 and then transmitted to the host computer 180 (S117: YES), the microprocessor 121 reads the read data from the storage drive 170 (permanent medium) using the BEPK 140 and the CMPK 130, and stores it in the reserved slot on the cache memory 131. Thereafter, the microprocessor 121 transmits the cached data to the host computer 180 using the CMPK 130 and the FEPK 100 (S118).

If it is determined that the read data is to be transmitted to the host computer 180 without being cached (S117: NO), the microprocessor 121 causes the BEPK 140 and the FEPK 100 to transfer the read data read from the drive 170 (permanent medium) to the host computer 180 without passing through the CMPK 130 (S119).
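The branching of the read command flow of FIG. 15 can be traced with a short Python sketch. This is not the processing itself: the function `read_path` and its boolean parameters are hypothetical stand-ins for the outcome of each determination, and the function only returns the step labels that would be visited.

```python
def read_path(has_access, local_hit, uncached_flag, hit_after_reload,
              update_shared, cache_data):
    """Return the FIG. 15 step labels visited for one read command."""
    steps = ["S101", "S102"]
    if not has_access:
        return steps + ["S103"]        # forward the command to the owning MPPK
    steps += ["S104", "S105"]          # search the local cache directory
    if local_hit:
        return steps + ["S106"]        # serve the data from the cache memory
    steps.append("S107")               # check the uncached flag
    if uncached_flag:
        # Reload the directory from shared memory and search again.
        steps += ["S108", "S109", "S110"]
        if hit_after_reload:
            return steps + ["S111"]
    # Reserve a slot, update the local directory, decide on the shared update.
    steps += ["S112", "S113", "S114"]
    if update_shared:
        steps.append("S115")
    steps += ["S116", "S117"]          # decide whether to cache the host data
    steps.append("S118" if cache_data else "S119")
    return steps


trace = read_path(has_access=True, local_hit=False, uncached_flag=False,
                  hit_after_reload=False, update_shared=True, cache_data=True)
```

Passing the determinations in as booleans keeps the sketch independent of how the cache directory and the tables are actually stored.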

Referring to FIG. 16, the determination (S113) regarding the update of the data caching control information in the shared memory 132 in the flowchart of FIG. 15 will be described. The microprocessor 121 starts this step S113 and determines whether or not the performance boost function of the logical volume designated by the read command is ON, with reference to the performance boost function enable table 210 and the per-volume performance boost function enable table 220 (S122). If either table indicates that the performance boost function is OFF, the performance boost function for the volume is OFF.

If the performance boost function of the logical volume is not ON (S122: NO), the microprocessor 121 determines to update the control information (cache directory) of the shared memory 132 (S128). When the performance boost function of the logical volume is ON (S122: YES), the microprocessor 121 next determines, with reference to the media type table 230, whether or not the media type of the RAID group in which the designated data is stored is SSD (S123).

The microprocessor 121 has configuration management information for each volume in the local memory 122, and can know which RAID group each area of each volume belongs to by referring to that information.

If the media type is SSD (S123: YES), the microprocessor 121 determines not to update the control information (cache directory) of the shared memory 132 (S127). If the media type is not SSD (S123: NO), the microprocessor 121 next determines whether the low hit rate flag of the logical volume in which the designated data is stored is ON, by referring to the per-volume hit rate table 250 using the logical volume number as a key (S124).

If the low hit rate flag is ON (S124: YES), the microprocessor 121 determines not to update the control information (cache directory) of the shared memory 132 (S127). When the low hit rate flag is OFF (S124: NO), the microprocessor 121 next determines whether or not its own overload flag is ON, by referring to the MP operating rate table 270 using the microprocessor number as a key (S125).

If the overload flag is ON (S125: YES), the microprocessor 121 determines not to update the control information (cache directory) of the shared memory 132 (S127). When the overload flag is OFF (S125: NO), the microprocessor 121 next determines whether or not the overload flag of the access destination CMPK 130 is ON, by referring to the CM operation rate table 290 using the CMPK number as a key (S126).

When the overload flag is ON (S126: YES), the microprocessor 121 determines not to update the control information (cache directory) of the shared memory 132 (S127). When the overload flag is OFF (S126: NO), the microprocessor 121 determines to update the control information (cache directory) of the shared memory 132 (S128).
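The determination of steps S122 to S128 is a chain of early exits, which the following Python sketch makes explicit. The function and parameter names are hypothetical; each argument stands for the value that would be looked up in the corresponding table.

```python
def should_update_shared_memory(boost_on: bool, media_type: str,
                                low_hit_rate: bool, mp_overloaded: bool,
                                cm_overloaded: bool) -> bool:
    """Return True when the shared-memory cache directory is to be updated (S128)."""
    if not boost_on:
        return True    # S122: NO  -> update (S128)
    if media_type == "SSD":
        return False   # S123: YES -> omit update (S127)
    if low_hit_rate:
        return False   # S124: YES -> omit update (S127)
    if mp_overloaded:
        return False   # S125: YES -> omit update (S127)
    if cm_overloaded:
        return False   # S126: YES -> omit update (S127)
    return True        # no condition met -> update (S128)


# An HDD-backed volume with the boost function ON and no other condition met.
decision = should_update_shared_memory(True, "HDD", False, False, False)
```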

As described above, when a specified condition is satisfied, the microprocessor 121 determines that the update of the cache directory 310 in the local memory 122 is not to be reflected in the cache directory 510 of the shared memory 132. As a result, the load on the microprocessor 121 and the CMPK 130 can be reduced and the throughput of the system can be improved.

Not reflecting an update of the control information (the cache directory in this example) of the local memory in the shared memory 132 becomes a problem when a failure occurs in the MPPK 120 in charge of that control information. In normal operation, the microprocessor 121 refers to its own local memory 122 and can therefore refer to the latest control information. On the other hand, when a failure occurs in the MPPK 120 in charge, another MPPK 120 takes over its responsibilities (failover).

Since the data on the local memory 122 of the MPPK 120 where the failure has occurred is lost, the succeeding MPPK 120 (its microprocessor 121) can obtain only the old control information stored in the shared memory 132. Therefore, the control information whose update in the shared memory 132 (backup to the shared memory 132) can be omitted is control information whose loss does not lead to loss of user data when the MPPK 120 fails.

The above preferred configuration omits those updates in the shared memory 132 whose loss has only a small effect when a failure occurs in the MPPK 120. Specifically, when the storage drive 170 from which read data is read due to a cache miss is an SSD (S123: YES), the microprocessor 121 determines not to update the shared memory 132 (S127).

If the MPPK 120 fails, the information indicating that the data read from the SSD is cached is lost. However, the SSD has higher access performance than drives 170 of other media types, so the influence of a cache miss due to the lost control information is small, and the system performance improvement obtained by reducing the overhead of the MPPK 120 and the CMPK 130 is greater.

In this configuration, the media type for which updating in the shared memory 132 is omitted is SSD, but this type depends on the system design. The types of media (drives) mounted in the system are not limited to SSDs and HDDs, and different types of drives can be mounted in addition to or instead of these. Among the installed media types, the types that satisfy the condition for omitting the update in the shared memory 132 are selected according to the design. One or more types that have higher access performance than the remaining types, including the type having the highest access performance, are selected.

In this configuration, when the cache hit rate of the logical volume storing the read command designation data is low (S124: YES), the microprocessor 121 determines not to update the shared memory 132 (S127). Even if the cache control information of data of a volume with a low hit rate is lost, the influence on the access performance of the volume is small, and the effect of improving the system performance by reducing the overhead of MPPK120 and CMPK130 is greater.

This configuration further determines whether to update the shared memory 132 based on the current loads of the MPPK 120 and the CMPK 130 (S125, S126). When the load on the MPPK 120 or the CMPK 130 is high, the effect of improving the performance by omitting the update in the shared memory 132 is great.

Thus, in this configuration, when the performance boost function of the target volume is ON and any of the above four conditions is satisfied, the update of the cache control information in the shared memory 132 is omitted. The microprocessor 121 may determine whether to update the shared memory 132 based on conditions different from these. The microprocessor 121 may also require that two or more of the four conditions be satisfied before omitting the control information update in the shared memory 132.

FIG. 17 shows a flowchart of the determination (S116) regarding host data (read data) caching in the flowchart of FIG. 15. The flowchart of this step is substantially the same as the flowchart shown in FIG. 16. Therefore, mainly the points that differ from it will be described.

In FIG. 17, steps S132 to S136 are the same as steps S122 to S126 in the flowchart of FIG. 16. In step S137, the microprocessor 121 determines to transmit the host data (read data) read from the storage drive 170 to the host computer 180 without storing it in the cache memory 131. A transfer in which read data is not cached in the CM is called a non-CM transfer. Non-CM transfer is realized by transferring read data from the transfer circuit 142 of the BEPK 140 to the transfer circuit 112 of the FEPK 100. Specifically, the transfer is from a volatile memory such as a DRAM in the transfer circuit 142 to a volatile memory in the transfer circuit 112.

On the other hand, in step S138, the microprocessor 121 determines to store (cache) the host data read from the storage drive 170 in the cache memory 131.

Thus, by selectively caching read data, the cache area is used efficiently, and the overhead of the cache memory and of the processor that performs data caching is reduced, thereby improving the performance of the storage system. In particular, when the storage drive is an SSD, the increase in processing time for updating control information caused by caching is large relative to the reduction in read time gained by caching, so the performance improvement obtained by omitting the caching process is large.

In this example, the condition for determining whether to cache read data is the same as the condition for determining whether to update the cache control information in the shared memory 132. Thus, by controlling read data caching, system performance can be improved by reducing the overhead of MPPK120 and CMPK130. The determination condition for the cache control and the determination condition for the control information update control may be different.

Next, processing for the write command received from the host computer 180 will be described with reference to the flowcharts shown in FIGS. 18A and 18B. The microprocessor 121 receives a write command from the host computer 180 (S141), and determines whether or not it has an access right to the volume (LDEV) of the designated address (S142).

When the microprocessor 121 does not have the access right (S142: NO), the microprocessor 121 transfers the write command to the other responsible MPPK 120 (S143). When the microprocessor 121 has an access right (S142: YES), the microprocessor 121 searches the cache directory 310 in the local memory 122 on the same substrate (S144).

When the address specified by the write command is found (S145: YES), the microprocessor 121 writes the write data to the cache memory 131 according to the information in the cache directory 310 and notifies the host computer 180 of the completion of the command (S146).

When the address specified by the write command is not found (cache miss) (S145: NO), the microprocessor 121 checks the uncached flag in the local memory 122 (S147). When the uncached flag is ON (S147: YES), the microprocessor 121 transfers the cache directory (control information) from the shared memory 132 to the local memory 122 via the controller of the CMPK 130 (S148).

The microprocessor 121 searches the cache directory 310 in the local memory 122 (S149). When the address specified by the write command is found (S150: YES), the microprocessor 121 writes the write data to the cache memory 131 according to the information in the cache directory 310, and notifies the host computer 180 of the completion of the command (S151).

In the case of a cache miss (S150: NO), or when the uncached flag is OFF (S147: NO), the microprocessor 121 secures a slot for the write data in the cache memory 131 and updates the cache directory 310 of the local memory 122 (S152).

The microprocessor 121 determines whether or not the update of the cache directory 310, which is control information related to data caching, is to be reflected in the data 510 of the shared memory 132 (S153). A specific method of this determination is the same as the method described with reference to FIG. 16. The microprocessor 121 further determines whether to cache the write data (host data) (S154). This determination method is the same as the method described with reference to FIG. 17.

If the microprocessor 121 determines that the write data is to be cached (S155: YES), the microprocessor 121 writes the write data to the newly secured area in the cache memory 131 and notifies the host computer 180 of the completion of the command (S156). In this case, the microprocessor 121 updates the cache directory 510 in the shared memory 132 in synchronization with the update of the cache directory 310 in the local memory 122, regardless of the determination result in step S153.

When the microprocessor 121 determines not to cache the write data (S155: NO), the microprocessor 121 updates the control information in the shared memory 132, or omits the update, based on the determination result in step S153. When the microprocessor 121 determines to update the cache control information (cache directory 510) in the shared memory 132 (S157: YES), the microprocessor 121 reflects the update of the cache directory 310 in the local memory 122 in the cache directory 510 in the shared memory 132 (S158), and proceeds to the next step S159.

When the microprocessor 121 determines not to update the cache control information in the shared memory 132 (S157: NO), the microprocessor 121 identifies the RAID level of the write destination with reference to the RAID level table 240 (S159). If the RAID level is 1 (S159: YES), the microprocessor 121 writes the write data to the storage drive 170 using the BEPK 140 without storing it in the cache memory 131, and notifies the host computer 180 of the completion of the command (S160).

When the RAID level is different from 1 (S159: NO), the microprocessor 121 generates parity and writes the parity and write data to the storage drive 170 by the BEPK 140 without storing the write data in the cache memory 131. Further, the microprocessor 121 notifies the host computer 180 of command completion (S161).
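The handling of a write command after a cache miss (steps S155 to S161) can be summarized as an ordered list of actions. The following Python sketch is illustrative only: the function name `plan_write_miss` and the action strings are hypothetical, and the relative order of the actions within the cached case is an assumption.

```python
def plan_write_miss(cache_write_data: bool, update_shared_directory: bool,
                    raid_level: int) -> list:
    """List the actions taken for a write command after a cache miss."""
    if cache_write_data:
        # S155: YES. Cached write data always has its directory entry
        # backed up to shared memory before completion is reported (S156).
        return ["update shared directory",
                "write data to cache",
                "report completion"]
    actions = []
    if update_shared_directory:
        actions.append("update shared directory")       # S157: YES -> S158
    if raid_level == 1:
        actions.append("write data to drive")           # S159: YES -> S160
    else:
        actions.append("generate parity")               # S159: NO  -> S161
        actions.append("write parity and data to drive")
    actions.append("report completion")
    return actions


plan = plan_write_miss(cache_write_data=False,
                       update_shared_directory=False, raid_level=5)
```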

Thus, in this example, in handling a write command, the storage of the write data in the cache memory 131 must be omitted in order to omit the update of the cache directory 510 in the shared memory 132. This is because, if the cache control information is lost before the cached write data is destaged (written to the drive 170), the write data in the cache memory 131 can no longer be identified.

As described above, in this example, the condition for determining whether or not the write data is cached in step S154 is the same as the determination condition in step S116 in FIG. 15. The condition for determining whether or not the cache control information is updated in the shared memory 132 in step S153 is the same as the determination condition in step S113 in FIG. 15. These conditions may be different.

As described above, by controlling the caching of the write data and the update of the cache control information, the overhead of the MPPK 120 and the CMPK 130 can be reduced and the performance of the storage system 10 can be improved. If the write data is not cached, command completion is reported to the host only after the parity has been generated and the parity and the write data have been written to the storage drive. In addition, the write performance of an SSD is lower than its read performance. For these reasons, a method of caching all write data may be used for write commands. In this case, the determination in S154 is omitted, and the process proceeds to S156.

Next, another example of the read command processing described with reference to FIGS. 15 to 17 will be described. Here, differences from the embodiment of FIGS. 15 to 17 will be mainly described. FIG. 66 shows control information stored in the local memory 122. FIG. 67 shows an example of the CM bypass transfer ratio calculation table 430, and FIG. 68 shows an example of the CM bypass transfer ratio table 440.

FIG. 67 shows a configuration example of the CM bypass transfer ratio calculation table 430. The CM bypass transfer ratio calculation table 430 is a table for calculating the ratio of CM bypass transfer (non-CM transfer) from the cache hit rate of each logical volume and the MP operating rate. The CM bypass transfer ratio calculation table 430 includes a hit rate column 431, a microprocessor operating rate column 432, and a CM bypass transfer ratio column 433.

To reduce the microprocessor overhead of read processing for data that does not hit the cache, a high CM bypass transfer ratio is set when the hit rate is low, and a high CM bypass transfer ratio is set when the microprocessor operating rate is high.

The lower limit of the CM bypass transfer ratio is 0 and the upper limit is 99 or less. The upper limit is 99 or less because the hit rate cannot be calculated if 100% of transfers bypass the CM. The hit rate used in this example is the hit rate calculated with CM bypass transfers excluded.
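A lookup of the kind performed on the CM bypass transfer ratio calculation table 430 can be sketched in Python as follows. The band boundaries and ratio values are invented purely for illustration, since this specification does not give concrete table contents; the sketch only preserves the stated tendencies (lower hit rate or higher operating rate gives a higher ratio) and the range of 0 to 99.

```python
import bisect

# All numbers below are hypothetical example values (percent).
HIT_RATE_BOUNDS = [10, 30, 50]   # upper bounds of hit rate bands; last band is open-ended
MP_RATE_BOUNDS = [50, 80]        # upper bounds of MP operating rate bands
# Rows: hit rate band (low to high). Columns: MP operating rate band (low to high).
RATIO_TABLE = [
    [50, 80, 99],
    [30, 50, 80],
    [10, 30, 50],
    [0, 0, 10],
]


def bypass_ratio(hit_rate_pct: float, mp_rate_pct: float) -> int:
    """Look up the bypass ratio and keep it within the range 0 to 99."""
    row = bisect.bisect_left(HIT_RATE_BOUNDS, hit_rate_pct)
    col = bisect.bisect_left(MP_RATE_BOUNDS, mp_rate_pct)
    return min(max(RATIO_TABLE[row][col], 0), 99)


ratio = bypass_ratio(hit_rate_pct=5, mp_rate_pct=90)  # low hit rate, busy MP
```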

FIG. 68 shows a configuration example of the CM bypass transfer ratio table 440. The CM bypass transfer ratio table 440 is a table for managing the ratio of CM bypass transfer in the read process for each logical volume. The CM bypass transfer ratio table 440 includes a logical volume number column 441 and a CM bypass transfer ratio column 442.

Processing for the read command received from the host computer 180 in this example will be described with reference to the flowchart shown in FIG. 69. Upon receiving the read command from the host computer 180 (S851), the microprocessor 121 determines whether it has the access right to the LDEV indicated by the read command (S852). When it does not have the access right (S852: NO), the microprocessor 121 transfers the read command to the MPPK 120 having the access right (S853).

When the microprocessor 121 has an access right (S852: YES), the microprocessor 121 searches the cache directory 310 in the local memory 122 on the same MPPK 120 (S854). When the address (data) specified by the read command is found (S855: YES), the microprocessor 121 reads the read data from the cache memory 131 according to the information in the cache directory 310 and transmits it to the host computer 180 (S856).

When the address (data) specified by the read command is not found (cache miss) (S855: NO), the microprocessor 121 checks the uncached flag in the local memory 122 (S857). The uncached flag indicates whether all the data of the cache directory 510 of the shared memory 132 has been cached in the local memory 122; the flag itself is stored in the local memory 122. When some of the data has not been read, the value is ON. For example, immediately after a failover following a failure, the control information has not yet been read from the shared memory 132 into the local memory 122, so the uncached flag is ON.

If the uncached flag is ON (S857: YES), some data in the cache directory value 510 of the shared memory 132 is not cached. The microprocessor 121 transfers the cache directory (control information) from the shared memory 132 to the local memory 122 via the controller of the CMPK 130 (S858).

The microprocessor 121 searches the cache directory 310 in the local memory 122 (S859). When the data specified by the read command is found (S860: YES), the microprocessor 121 reads the read data from the cache memory 131 according to the information in the cache directory 310 and transmits it to the host computer 180 (S861).

In the case of a cache miss (S860: NO), or when the uncached flag is OFF (S857: NO), the microprocessor 121 determines in S862 whether or not to cache the read data (host data). This determination method will be described later.

When it is determined that the read data is to be transmitted to the host computer 180 without being cached (S863: NO), the microprocessor 121 reads the read data from the drive 170 (permanent medium) into the memory of the transfer circuit 142 of the BEPK 140 without passing through the CMPK 130, transfers the data from the memory of the transfer circuit 142 to the memory of the transfer circuit 112 of the FEPK 100, and transfers the data from the memory of the transfer circuit 112 to the host computer 180 (S864).

When it is determined that the read data is stored in the cache memory 131 and then transmitted to the host computer 180 (S863: YES), the microprocessor 121 secures a slot for the read data in the cache memory 131, and further, the local memory 122. The cache directory 310 and the cache directory 510 of the shared memory 132 are updated (S865).

The microprocessor 121 reads the read data from the storage drive 170 (permanent medium) using the BEPK 140 and the CMPK 130 and stores the read data in the reserved slot on the cache memory 131. Thereafter, the microprocessor 121 transmits the cache data to the host computer 180 by the CMPK 130 and the FEPK 100 (S866).

Referring to FIG. 70, the determination (S862) regarding the necessity of data caching in the flowchart of FIG. 69 will be described. The microprocessor 121 starts this step S862 (S871) and determines whether or not the performance boost function of the logical volume designated by the read command is ON, with reference to the performance boost function enable table 210 and the per-volume performance boost function enable table 220 (S872). If either table indicates that the performance boost function is OFF, the performance boost function for the volume is OFF.

If the performance boost function of the logical volume is not ON (S872: NO), the microprocessor 121 decides to store the host data (read data) read from the storage drive 170 in the cache memory 131 (S877). When the performance boost function of the logical volume is ON (S872: YES), the microprocessor 121 next determines, with reference to the media type table 230, whether or not the media type of the RAID group in which the designated data is stored is SSD (S873).

When the media type is not SSD (S873: NO), the microprocessor 121 decides to store the host data (read data) read from the storage drive 170 in the cache memory 131 (S877). When the media type is SSD (S873: YES), the microprocessor 121 next determines whether or not the current I/O is a CM bypass transfer target, by referring to the CM bypass transfer ratio table 440 using the number of the logical volume in which the specified data is stored as a key (S874).

As a method of determining whether or not the current I/O is a CM bypass transfer target using a CM bypass transfer ratio having a value from 0 to 99, the microprocessor 121 may generate a random number from 0 to 100 and determine that the current I/O is a CM bypass transfer target when the random number is below the CM bypass transfer ratio. Alternatively, the microprocessor 121 may compute a hash value from 0 to 100 using the read data address as a key and determine that the current I/O is a CM bypass transfer target when the hash value is below the CM bypass transfer ratio. Alternatively, the microprocessor 121 may use a counter that increases by 1 from 0 to 100 (returning to 0 after 100) and determine that the current I/O is a CM bypass transfer target when the counter value is below the CM bypass transfer ratio.
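The three selection methods (random number, address hash, and cyclic counter) can be sketched in Python as follows. The function and class names are hypothetical, and the use of CRC32 as the hash is an assumption; the specification does not name a hash function.

```python
import random
import zlib


def by_random(ratio: int, rng=random) -> bool:
    """Bypass when a random integer from 0 to 100 is below the ratio."""
    return rng.randint(0, 100) < ratio


def by_hash(address: int, ratio: int) -> bool:
    """Bypass when a hash of the read address, mapped to 0..100, is below the ratio."""
    digest = zlib.crc32(address.to_bytes(8, "little"))  # CRC32 is an assumed choice
    return digest % 101 < ratio


class CounterSelector:
    """Counter that runs from 0 to 100 and then returns to 0."""

    def __init__(self) -> None:
        self.value = 0

    def is_bypass(self, ratio: int) -> bool:
        result = self.value < ratio
        self.value = 0 if self.value == 100 else self.value + 1
        return result


selector = CounterSelector()
bypassed = sum(selector.is_bypass(30) for _ in range(101))  # one full counter cycle
```

The hash method sends a given address down the same path every time, whereas the counter method spreads the bypassed I/Os deterministically over each cycle of 101 requests.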

As a result of the determination in S874, if it is determined that the current I/O is a CM bypass transfer target (S875: YES), the microprocessor 121 decides to transmit the host data (read data) read from the storage drive 170 to the host computer 180 without storing it in the cache memory 131 (S876). If it is determined that the current I/O is not a CM bypass transfer target (S875: NO), the microprocessor 121 decides to store the host data read from the storage drive 170 in the cache memory 131 (S877).
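The determination of FIG. 70 can be condensed into a small Python function. This sketch takes S876 to be the bypass decision and S877 to be the caching decision, as in the S875 branches; the function name, parameter names, and return strings are hypothetical.

```python
def decide_host_data_caching(boost_on: bool, media_type: str,
                             is_bypass_target: bool) -> str:
    """Return "bypass" (S876) or "cache" (S877) for one read I/O."""
    if not boost_on:
        return "cache"      # S872: NO
    if media_type != "SSD":
        return "cache"      # S873: NO
    # S874/S875: only SSD-backed data with the boost function ON can bypass the CM.
    return "bypass" if is_bypass_target else "cache"


decision = decide_host_data_caching(True, "SSD", True)
```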

As described above, when the specified condition is satisfied, the microprocessor 121 determines to use the CM bypass transfer that does not require updating the cache directory 310 in the local memory 122 and the cache directory 510 in the shared memory 132. As a result, the load on the microprocessor 121 and the CMPK 130 can be reduced and the throughput of the system can be improved.

Referring to the flowchart in FIG. 71, calculation of the CM bypass transfer ratio will be described. This flow is invoked periodically, for example every second, for each LDEV (logical volume). The microprocessor 121 may perform the calculation for all LDEVs at a fixed period, or may perform it during I/O processing when the ratio of the I/O target LDEV has not been updated for a period such as 1 second.

The microprocessor 121 refers to the per-volume hit rate table 250 using the target LDEV number (logical volume number) as a key and obtains the hit rate from the number of I/Os and the number of hits. It refers to the MP operation rate table 380 using its own MP number as a key and obtains the MP operating rate. It then refers to the CM bypass transfer ratio calculation table 430 using the hit rate and the MP operating rate as keys and obtains the CM bypass transfer ratio (S882).

The microprocessor 121 updates the CM bypass transfer ratio column of the entry for the LDEV number (logical volume number) in the CM bypass transfer ratio table 440 with the CM bypass transfer ratio obtained in S882 (S883), and ends this processing (S884).
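The periodic recalculation of steps S882 and S883 can be sketched in Python as follows. The dictionary layout of the tables, the function `refresh_bypass_ratio`, and the placeholder `toy_lookup` with its thresholds are assumptions for illustration; the actual contents of the calculation table 430 are not given in this specification.

```python
def refresh_bypass_ratio(ldev, hit_table, mp_rate_pct, lookup, ratio_table):
    """Recompute and store the bypass ratio of one LDEV (S882, S883)."""
    io_count, hit_count = hit_table[ldev]
    # Hit rate from the number of I/Os and the number of hits.
    hit_rate_pct = 100.0 * hit_count / io_count if io_count else 0.0
    # Keep the looked-up ratio within the range 0 to 99.
    ratio_table[ldev] = min(max(lookup(hit_rate_pct, mp_rate_pct), 0), 99)
    return ratio_table[ldev]


def toy_lookup(hit_rate_pct, mp_rate_pct):
    """Placeholder for the calculation table; the thresholds are made up."""
    return 99 if hit_rate_pct < 10 and mp_rate_pct > 80 else 0


hit_table = {7: (200, 10)}   # LDEV 7: 200 reads, 10 hits (5% hit rate)
ratio_table = {}
new_ratio = refresh_bypass_ratio(7, hit_table, 90, toy_lookup, ratio_table)
```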

For the write process corresponding to the read process described with reference to FIGS. 66 to 71, the data caching determination may be performed, or a method of caching all data may be employed.

For example, when all the data is cached, the flow is almost the same as the flow of FIG. 69 with S851, S853, S856, S861, and S866 replaced by S141, S143, S146, S151, and S156 in FIG. 18A, respectively. However, the write process differs in that the steps relating to the host data caching determination (S862 to S864) are omitted, and the process proceeds to step S865 when the determination in step S857 or S860 is NO.

Next, the setting process from the management computer 20 will be described with reference to the flowchart of FIG. The management computer 20 operates according to a management program executed thereon. Therefore, the description that uses the management computer 20 as the subject can use the management program or the CPU 26 as the subject. The management computer 20 starts the setting process (S171), and displays a menu for setting data input on the display device 29 (S172). The administrator uses the input device 28 to input necessary setting data (S173 and S174: NO).

When all necessary data has been input (S174: YES), the management computer 20 saves the setting data in response to the selection of the save button. The setting data is transmitted from the management computer 20 to the storage system 10 in response to a request from the storage system 10. The administrator can input again by selecting the cancel button.

FIG. 20 shows an example 2000 of the menu screen. The menu screen 2000 includes a performance boost function setting area 2001 and a performance boost function setting area 2004 for each volume.

The administrator selects either “ENABLE” or “DISABLE” in the performance boost function setting area 2001 with the input device 28, and can thereby enable or disable the performance boost function of the storage system 10 (the control information update control and the user data caching control). This setting is reflected in the performance boost function enabling table 210. When the function is disabled, none of the performance boost functions of the storage system 10 is used.

The per-volume performance boost function setting area 2004 includes a logical volume number column 2005 and a performance boost function setting column 2006. The administrator can select enable / disable of the performance boost function of each logical volume with the input device 28 in the per-volume performance boost function setting area 2004.

This setting is reflected in the performance boost function enabling table 220 for each volume. The performance boost function of this embodiment is used for a volume for which the system performance boost function is enabled and the volume performance boost function is enabled.
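The combined effect of the two settings can be sketched as a single predicate. The function name and the dictionary representation of the per-volume table are assumptions made for this illustration.

```python
def boost_active(system_enabled, volume_enabled, volume_number):
    """True only when both the system-wide and the per-volume settings are enabled.

    system_enabled : bool, standing in for the enabling table 210
    volume_enabled : {volume number: bool}, standing in for the per-volume table 220
    A volume missing from the dictionary is treated as disabled (an assumption).
    """
    return system_enabled and volume_enabled.get(volume_number, False)
```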

FIG. 20 exemplifies the setting screen for the performance boost function. In addition, the management computer 20 displays, for example, a setting screen for the thresholds included in the determination conditions, and transmits the setting data input by the administrator to the storage system 10. Typically, the storage system 10 holds in advance default values for the items that can be set by the administrator, and updates the data of an item set by the administrator with the input data.

Next, the table update in the storage system 10 will be described with reference to FIGS. FIG. 21 is a flowchart for updating the media type table 230. When the number of RAID groups is increased or decreased (S201), the BEPK 140 transmits the information to one of the microprocessors 121. The microprocessor 121 that has received the update information updates the media type table 230 and the RAID level table 240 in the local memory 122, updates these tables in the nonvolatile storage area (S202), and notifies the other MPPKs 120 of these updates.

Referring to FIG. 22, the update of the CM operation rate table 290 will be described. An arbitrary microprocessor 121 of the MPPK 120 performs this processing. Typically, this processing is performed periodically (for example, every second). The microprocessor 121 acquires operation rate information from the CMPK 130 as the access destination (S212). Specifically, the microprocessor 121 requests a value indicating the operating rate (CM operating rate) of the CMPK 130 from a controller (not shown) in the CMPK 130, and acquires it from the controller in the CMPK 130.

The microprocessor 121 updates the field of the operation rate column 292 of the corresponding entry in the CM operation rate table 290 with the value of the operation rate acquired from the CMPK 130. Further, the microprocessor 121 determines whether or not the updated operation rate value is equal to or greater than the threshold value in the CM operation rate threshold table 300 (S214).

When the operating rate is equal to or greater than the threshold (S214: YES), the microprocessor 121 sets the overload flag of the corresponding entry in the CM operating rate table 290 to 1 (ON) (S215). When the operating rate is less than the threshold (S214: NO), the microprocessor 121 sets the overload flag of the corresponding entry to 0 (OFF) (S216). The microprocessor 121 performs steps S212 to S216 for all CMPKs 130 to be accessed (S217).
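Steps S212 to S217 can be sketched as a loop over the CMPKs to be accessed. The table layout, the single threshold value, and the fetch_rate callback that stands in for the query to the controller in the CMPK 130 are all assumptions made for this illustration.

```python
def update_cm_operation_rates(table, threshold, fetch_rate):
    """Refresh the operation rate and overload flag of every CMPK entry.

    table      : {cmpk number: {"rate": number, "overload": 0 or 1}}
    threshold  : value standing in for the CM operation rate threshold table 300
    fetch_rate : callable returning the current operation rate of one CMPK
    """
    for cmpk, entry in table.items():          # S217: repeat for all CMPKs
        entry["rate"] = fetch_rate(cmpk)       # S212: acquire the rate, then update
        # S214 to S216: the flag is ON when the rate is equal to or greater
        # than the threshold, and OFF otherwise.
        entry["overload"] = 1 if entry["rate"] >= threshold else 0
    return table
```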

Referring to FIG. 23, the update of the volume hit rate table 250 will be described. An arbitrary microprocessor 121 of the MPPK 120 performs this processing. Typically, this processing is performed periodically (for example, every second). The microprocessor 121 acquires the number of I / Os and the number of hits for one logical volume in charge from the local memory 122 (S222). This or another microprocessor 121 counts the number of I / Os (for example, the number of read commands) and the number of cache hits to each assigned logical volume since the previous update, and stores them in the local memory 122. The microprocessor 121 acquires these values in step S222.

The microprocessor 121 updates the hit rate, I / O count, and hit count fields of the corresponding entry in the per-volume hit rate table 250 with the acquired value (S223). The microprocessor 121 further compares the hit rate with the threshold values in the hit rate threshold table 260.

If the hit rate is equal to or lower than the threshold (S224: YES), the microprocessor 121 sets the low hit flag of the entry to 1 (ON) (S225). On the other hand, when the hit rate is larger than the threshold (S224: NO), the microprocessor 121 sets the low hit flag of the entry to 0 (OFF) (S226). The microprocessor 121 performs steps S222 to S226 for all the logical volumes in charge (S227).
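The update of one entry in steps S223 to S226 can be sketched as follows. The field names, the percentage scale, and the treatment of a volume with no I / O (hit rate 0) are assumptions for this illustration.

```python
def update_volume_hit_rate(entry, io_count, hit_count, threshold):
    """Update one entry of the per-volume hit rate table (S223 to S226).

    threshold stands in for the value held in the hit rate threshold table 260,
    expressed here as a percentage.
    """
    entry["io_count"] = io_count
    entry["hit_count"] = hit_count
    entry["hit_rate"] = 100.0 * hit_count / io_count if io_count else 0.0
    # The low hit flag is ON when the hit rate is equal to or lower than the
    # threshold, and OFF when it is larger.
    entry["low_hit"] = 1 if entry["hit_rate"] <= threshold else 0
    return entry
```

Note that the comparison direction differs from the overload flags: here the flag is set when the value is at or below the threshold.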

Referring to FIG. 24, the update of the MP operation rate table 270 will be described. Each microprocessor 121 performs this processing. Typically, this processing is performed periodically (for example, every second). The microprocessor 121 monitors its operating time per unit time (1 second in this example) and stores the value in the local memory 122. The microprocessor 121 acquires the value from the local memory 122 (S232).

The microprocessor 121 uses the acquired value to update the operation rate and operation time fields of the corresponding entry (S233). Further, the microprocessor 121 compares the updated operation rate with the threshold value of the MP operation rate threshold value table 280 (S234). When the operating rate is equal to or higher than the threshold (S234: YES), the microprocessor 121 sets the overload flag of the entry to 1 (ON) (S235). When the operating rate is less than the threshold (S234: NO), the microprocessor 121 sets the overload flag of the entry to 0 (OFF) (S236).

Referring to FIG. 25, the movement of ownership of the logical volume from the current MPPK 120 to another MPPK 120 will be described. Before the ownership is transferred, the MPPK 120 reflects the unreflected portion in the cache directory 310 stored in the local memory 122 in the shared memory 132. Accordingly, the next MPPK 120 can perform cache control using the latest cache directory, and the cache hit rate can be increased.

The microprocessor 121 of the current owner MPPK 120 sets the search target in the cache directory 310 to logical address 0 of the logical volume whose ownership is to be moved (S242). The microprocessor 121 searches the cache directory 310 for the address (S243).

If the address exists in a directory for which the shared memory non-reflecting flag is set to ON (S244: YES), the microprocessor 121 updates the directory in the shared memory 132 (S245), and proceeds to step S246. The shared memory non-reflecting flag indicates whether or not the update of the target directory has been reflected in the shared memory 132. If it is ON, the update of the target directory has not been reflected in the shared memory 132.

If the above address exists in a directory where the shared memory non-reflecting flag is set to OFF (S244: NO), the microprocessor 121 proceeds to step S246 without updating the directory on the shared memory 132.

In step S246, the microprocessor 121 determines whether or not the search of the cache directory 310 for the volume has been completed. If all addresses have been searched (S246: YES), the microprocessor 121 ends this process. If an unsearched address remains (S246: NO), the microprocessor 121 changes the target address to the next logical address (S247) and repeats steps S243 to S246.
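The loop of steps S242 to S247 can be sketched as follows. The dictionary layout of the two directories is hypothetical, and clearing the flag after the copy is an assumption that the description does not state.

```python
def flush_directory_before_ownership_move(local_dir, shared_dir):
    """Copy every not-yet-reflected directory entry to shared memory.

    local_dir  : {logical address: {"slot": value, "unreflected": bool}}
    shared_dir : {logical address: slot value}
    Returns the number of entries written to shared memory.
    """
    written = 0
    for address in sorted(local_dir):            # S242, S247: walk the addresses in order
        entry = local_dir[address]               # S243: search for the address
        if entry["unreflected"]:                 # S244: is the non-reflecting flag ON?
            shared_dir[address] = entry["slot"]  # S245: update the shared memory
            entry["unreflected"] = False         # assumption: mark as reflected
            written += 1
    return written                               # S246: all addresses searched
```

Entries whose flag is OFF are skipped, so the cost of the transfer is proportional to the number of unreflected updates rather than to the size of the directory copy in shared memory.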

Second Embodiment This embodiment describes a storage system 10 having a storage tier virtualization function. The storage system 10 of this embodiment constructs a pool including a plurality of pool volumes (real volumes). The pool includes a plurality of media having different performances in the storage system 10 and is hierarchized into a plurality of tiers according to access performance. Each tier is composed of one or a plurality of pool volumes.

The storage system 10 provides the host computer 180 with a virtual volume constructed from the pool. The storage system 10 manages the pool in units of pages with a specific capacity. Each pool volume is divided into a plurality of pages, and data is stored in each page. The storage system 10 allocates one or more pages of the necessary capacity from the pool for writing from the host computer 180 to the virtual volume.

The storage system 10 can make the capacity of the virtual volume 401 recognized by the host computer 180 larger than the real capacity allocated to the virtual volume, and can thereby make the real capacity required to provide the capacity allocated to the host computer 180 smaller (thin provisioning).

The storage system 10 analyzes the I / O load from the host computer 180 on the virtual volume, and automatically places pages with a high I / O load in upper tiers composed of high-performance, expensive media and the other pages in lower tiers composed of low-performance, inexpensive media. Thereby, the cost of the system can be reduced while maintaining the access performance to the virtual volume.

Hereinafter, differences from the first embodiment will be mainly described. FIG. 26 shows information stored in the local memory 122 of this embodiment. The control information in the local memory 122 includes a per-page monitor difference table 320 in addition to the information described in the first embodiment. FIG. 27 shows data stored in the shared memory 132 of this embodiment. The control information of the shared memory 132 includes a dynamic mapping table 520 and a per-page monitor table 530 in addition to the information described in the first embodiment.

FIG. 28 shows an example of the dynamic mapping table 520. The dynamic mapping table 520 is a table for managing entries (storage area entries) for counting the number of accesses in each virtual volume. For example, one page is one entry of the dynamic mapping table 520. Here, this example will be described.

The dynamic mapping table 520 includes a pool number column 521, a virtual volume number column 522, a logical address column 523, a pool volume number column 524, a logical address column 525, and a monitor information index number column 526. The pool number and the virtual volume number are identifiers that uniquely identify the pool and the virtual volume in the storage system 10, respectively. The monitor information index number is an entry identifier in the dynamic mapping table 520.

The logical address column 523 stores the start logical address of each entry in the virtual volume. The logical address column 525 stores the start logical address in the pool volume of each entry. In this example, the entry capacity is constant, but it may not be constant.
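An address lookup through such a table can be sketched as follows. The page size, the sample rows, and the function name are invented for this illustration, and a constant entry capacity is assumed as in the example above.

```python
PAGE_SIZE = 1000  # hypothetical constant entry capacity, in address units

# Each tuple mirrors columns 521 to 526:
# (pool number, virtual volume number, start address in the virtual volume,
#  pool volume number, start address in the pool volume, monitor information index)
MAPPING = [
    (1, 1, 0,    201, 3000, 0),
    (1, 1, 1000, 202, 0,    1),
]


def resolve(virtual_volume, address):
    """Translate a virtual volume address to (pool volume, address, monitor index)."""
    for _pool, vvol, v_start, pvol, p_start, index in MAPPING:
        if vvol == virtual_volume and v_start <= address < v_start + PAGE_SIZE:
            return pvol, p_start + (address - v_start), index
    return None  # no page is allocated for this address
```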

FIG. 29 shows an example of the page-by-page monitor table 530. The per-page monitor table 530 manages the number of I / Os for each page. The microprocessor 121 refers to the table 530 to determine a hierarchy for storing the data of the page.

The per-page monitor table 530 includes a monitor information index number column 531, an I / O counter (current) column 532, and an I / O counter (previous) column 533. The microprocessor 121 monitors access to the page, counts the number of I / Os (access count) within a predetermined monitoring period (for example, 1 second), and stores the count in the page-by-page monitor table 530. The monitoring period continues continuously.

The column 533 of the I / O counter (previous) stores the number of I / Os in the previous monitoring period. An I / O counter (current) column 532 stores the number of I / Os in the current monitoring period. The microprocessor 121 repeatedly updates the value in the column 532 of the I / O counter (current) within the current monitoring period.

In this configuration, the microprocessor 121 uses the per-page monitor difference table 320 in the local memory 122 to count the number of I / Os and reflects the update of the value in the per-page monitor table 530 in the shared memory 132. This point will be described later. When the current monitoring period ends, the microprocessor 121 moves the number of I / Os in the current monitoring period to the field for the number of I / Os in the previous monitoring period.

FIG. 30 shows an example of the monitor difference table 320 for each page. The per-page monitor difference table 320 is used for counting accesses to each page. The per-page monitor difference table 320 includes a monitor information index number column 321 and an I / O difference counter column 322. The microprocessor 121 monitors the access of each page. When there is an access, the microprocessor 121 increments the value of the corresponding field in the column 322 of the I / O difference counter.

When the value of the field in the column 322 of the I / O difference counter reaches a specified value (the maximum value in this example), the microprocessor 121 updates the field in the column 532 of the I / O counter (current) of the corresponding entry in the per-page monitor table 530 by adding that value to the value of the field. The microprocessor 121 returns the value of the field in the column 322 of the I / O difference counter that has reached the maximum value to the initial value (0 value). Thus, the I / O difference counter indicates the difference in the number of I / Os from the previous update of the per-page monitor table 530.

As shown in FIGS. 30 and 29, the I / O difference counter column 322 of the per-page monitor difference table 320 stores 8-bit data, and the I / O counter (current) column 532 of the per-page monitor table 530 stores 32-bit data, which is larger than 8 bits.

A specific method for updating the storage tier virtualization function monitor will be described with reference to the flowchart of FIG. When receiving access to the page, the microprocessor 121 increments the I / O difference counter for the page in the per-page monitor difference table 320 (S302).

The microprocessor 121 determines whether the logical volume performance boost function is ON (S303). This step is the same as step S122 in FIG. If the volume performance boost function is OFF (S303: NO), the microprocessor 121 proceeds to step S307.

When the volume performance boost function is ON (S303: YES), the microprocessor 121 determines whether or not its own overload flag is ON (S304). This step is the same as step S125 in FIG.

If the overload flag is ON (S304: YES), the microprocessor 121 proceeds to step S306. When the overload flag is OFF (S304: NO), the microprocessor 121 determines whether or not the overload flag of the CMPK 130 as the access destination is ON (S305). This step is the same as step S126 in FIG.

When the overload flag of the CMPK 130 is OFF (S305: NO), the microprocessor 121 proceeds to step S307. If the overload flag of the CMPK 130 is ON (S305: YES), the microprocessor 121 proceeds to step S306. In step S306, the microprocessor 121 determines whether or not the value of the I / O difference counter in the per-page monitor difference table 320 is the maximum value.

If the value of the I / O difference counter is less than the maximum value (S306: NO), this flow ends. When the value of the I / O difference counter is the maximum value (S306: YES), the microprocessor 121 adds the value of the I / O difference counter (the maximum value in this case) to the value of the field in the column 532 of the I / O counter (current) of the corresponding entry of the per-page monitor table 530, and thereby updates the field (S307). The microprocessor 121 further sets the value of the field in the column 322 of the I / O difference counter that has reached the maximum value to 0 value (initial value) (S308).

In this example, when the load on the microprocessor 121 and the CMPK 130 is small, the I / O counter in the shared memory 132 is updated in synchronization with the update of the I / O difference counter in the local memory 122. Because these loads are small, the resulting decrease in system performance is not a problem, and an accurate I / O count can be obtained when a failure occurs. One of the load conditions of these two devices may be omitted, or satisfaction of both may be required as the condition for asynchronous update of the I / O counter value. Conditions different from these may also be used.

As described above, the microprocessor 121 counts the number of page I / Os with the counter in the local memory 122, and when the value reaches a specified value, the specified value is reflected in the counter of the shared memory 132. As a result, overhead due to communication between the microprocessor 121 and the CMPK 130 is reduced.
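The flow of steps S302 to S308 can be sketched as follows. The function name and the dictionary representation of the two tables are assumptions. On the synchronous path the sketch adds whatever difference has accumulated, which is an interpretation of step S307, and the 32-bit limit of the shared counter is not modelled.

```python
DIFF_MAX = 0xFF  # maximum value of the 8-bit I/O difference counter


def record_page_io(local_diff, shared_count, index,
                   boost_on, mp_overloaded, cm_overloaded):
    """Count one page access and reflect it to shared memory when required.

    local_diff   : {monitor index: I/O difference counter in local memory}
    shared_count : {monitor index: I/O counter (current) in shared memory}
    Returns True when the shared memory counter was updated.
    """
    local_diff[index] = local_diff.get(index, 0) + 1              # S302
    # S303 to S305: update asynchronously only when the boost function is ON
    # and the microprocessor or the CMPK is overloaded.
    asynchronous = boost_on and (mp_overloaded or cm_overloaded)
    if asynchronous and local_diff[index] < DIFF_MAX:             # S306: NO
        return False
    shared_count[index] = shared_count.get(index, 0) + local_diff[index]  # S307
    local_diff[index] = 0                                         # S308
    return True
```

Under load, the shared memory is touched once per 255 accesses to a page in this sketch. Without load, every access is reflected immediately.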

The number of bits of the counter in the per-page monitor difference table 320 is smaller than the number of bits of the counter in the per-page monitor table 530. By counting only the difference in the local memory in this way, the capacity required in the local memory 122 for counting the number of I / Os can be reduced. When the MPPK 120 fails, information on the I / O count for a certain period is lost. However, because only the unreflected difference is missing from the page I / O count, the loss has no substantial impact on page I / O analysis.

Note that the performance monitoring method of the present embodiment is not limited to the monitoring of the hierarchical virtualization function, but can be applied to other performance monitoring. For example, it can be applied to monitoring of a drive such as an HDD or an SSD. In the above example, the counter is initialized when its value reaches the maximum value, but an I / O may also be counted at the time of initialization. For example, the microprocessor 121 initializes the I / O difference counter and adds the maximum value plus 1 to the value of the I / O counter of the per-page monitor table 530. The same applies to the counting methods in the other embodiments.

Third Embodiment Hereinafter, an example in which the present invention is applied to asynchronous remote copy will be described. In the following, differences from the first embodiment and the second embodiment will be mainly described. FIG. 32 is a block diagram schematically showing the configuration of the computer system of this embodiment. The storage system of this embodiment includes a first storage system 10A and a second storage system 10B. Typically, the first storage system 10A and the second storage system 10B are installed at different sites, and are communicably connected via a data network (eg, SAN) 190A, a data network (eg, SAN) 190B, and a wide area network.

The first storage system 10A and the second storage system 10B have the same configuration as the hardware configuration described with reference to FIG. Specifically, the first storage system 10A includes a plurality of FEPKs 110A, a plurality of MPPKs 120A, a plurality of CMPKs 130A, and a plurality of BEPKs 140A, which are connected via an internal network 150A. The first management computer 20A manages the first storage system 10A.

Similarly, the second storage system 10B includes a plurality of FEPKs 110B, a plurality of MPPKs 120B, a plurality of CMPKs 130B, and a plurality of BEPKs 140B, which are connected via the internal network 150B. The second management computer 20B manages the second storage system 10B.

The first storage system 10A and the second storage system 10B have an asynchronous remote copy function. The primary volume (PVOL) 171P of the first storage system 10A and the secondary volume (SVOL) 171S of the second storage system 10B constitute a copy pair. A volume typically consists of one or more storage areas in one or more RAID groups.

The primary volume 171P is the copy source volume, the secondary volume 171S is the copy destination volume, and the data of the primary volume 171P is copied to the secondary volume 171S. The order of data writing to the primary volume 171P and the order of data copying to the secondary volume 171S match (order guarantee).

In the case of synchronous copy, when the host computer 180 writes to the primary volume 171P, the host computer 180 is notified of I / O success after the copy to the secondary volume 171S is completed (typically after writing to the cache memory). In contrast, asynchronous copy notifies the host computer 180 of I / O success after completion of writing to the primary volume 171P and before completion of copying to the secondary volume 171S.

The storage system of this embodiment uses journal volumes (JVOL) 171JP and 171JS as a buffer for copying from the primary volume 171P to the secondary volume 171S. In the first storage system 10A, the primary volume 171P and the journal volume 171JP are grouped. In the second storage system 10B, the secondary volume 171S and the journal volume 171JS are grouped.

Update data in the primary volume 171P is transmitted to the secondary volume 171S via the journal volumes 171JP and 171JS. This makes it possible to use a wide area network with unstable performance in remote copy data transfer.

Referring to FIG. 33, the flow of data writing from the host computer 180 to the primary volume 171P and copying of the updated data to the secondary volume 171S will be described. The FEPK 110A receives a write command and write data from the host computer 180. The MPPK 120 (the microprocessor 121) analyzes the write command and instructs the FEPK 110A and the BEPK 140A (not shown) to write the write data to the primary volume 171P and the journal volume 171JP.

Specifically, the MPPK 120 instructs the FEPK 110A and the BEPK 140A to transfer the write data to the next transfer destination specified. The final transfer destination is the primary volume 171P and the journal volume 171JP, and the write data is written to the primary volume 171P and the journal volume 171JP, respectively. The order of writing to the journal volume 171JP matches the order of writing to the primary volume 171P.

In this figure, the description of writing the write data to the cache memory 131 is omitted, or the write data is stored in the volume without going through the cache memory 131. The MPPK 120 notifies the host computer 180 of the completion of writing in response to the completion of writing of the write data to the cache memory 131 or the completion of writing to the volume.

The MPPK 120 updates the management data of the journal volume 171JP according to the update of the journal volume 171JP. As shown in FIG. 33, the journal volume 171JP has a management area 611 and a data area 612, each storing journal volume management data and update data. Journal volume management data may be stored outside the journal volume.

Journal volume management data includes a sequence number 601 and pointer 602 pair. A pair of these values is given to each write data (update data). In the example of this figure, the sequence number 601 is any value from 1 to n, and is assigned to each write data in ascending order in the order stored in the data area. The sequence number is cyclic, and 1 is assigned to the data next to the write data to which n is assigned. A pointer 602 indicates a position (address) where write data to which a corresponding sequence number is assigned in the data area 612 is stored.

The management area 611 includes an area where pairs of sequence number 601 and pointer 602 are written, and an unused area 604. The unused area 604 stores an initial value. In this example, the initial value is a zero value. When the microprocessor 121 transfers update data stored in the data area 612 to the second storage system 10B, it updates the value of the area storing the sequence number 601 and the pointer 602 of that data to the initial value (invalid value). The transfer order of update data matches the write order of update data to the journal volume 171JP.

In the management area 611, the position where the next new pair of the sequence number 601 and the pointer 602 is written is determined in advance. For example, the pairs are written in ascending order of addresses in the management area 611. The pair that follows the pair written at the end address is written at the start address.

In the area for storing the sequence number 601 and the pointer 602 (also referred to as a journal area), the sequence number 601 immediately before the area storing the initial value, that is, the leading sequence number in the journal area, indicates the latest update data. On the other hand, the sequence number 601 immediately after the area storing the initial value, that is, the oldest sequence number in the journal area, indicates the oldest update data.
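The behavior of the management area described above can be sketched as a ring of pairs. The class name, the method names, the explicit head and tail indices, and the error raised when no unused slot remains are assumptions made for this illustration. A sequence number of 0 represents the initial (invalid) value.

```python
class JournalManagementArea:
    """Ring of (sequence number, pointer) pairs; sequence number 0 marks an unused slot."""

    def __init__(self, slots, max_seq):
        self.pairs = [(0, 0)] * slots   # the unused area holds the initial zero value
        self.max_seq = max_seq          # n: sequence numbers cycle through 1..n
        self.head = 0                   # slot where the next pair is written
        self.tail = 0                   # slot holding the oldest untransferred pair
        self.last_seq = 0               # last assigned sequence number

    def append(self, pointer):
        """Register newly written update data and return its sequence number."""
        if self.pairs[self.head][0] != 0:
            raise RuntimeError("management area full")
        self.last_seq = self.last_seq % self.max_seq + 1   # n is followed by 1
        self.pairs[self.head] = (self.last_seq, pointer)
        self.head = (self.head + 1) % len(self.pairs)      # wrap to the start address
        return self.last_seq

    def transfer_oldest(self):
        """Return the oldest pair and reset its slot to the initial value."""
        pair = self.pairs[self.tail]
        if pair[0] == 0:
            return None                                    # nothing left to transfer
        self.pairs[self.tail] = (0, 0)
        self.tail = (self.tail + 1) % len(self.pairs)
        return pair
```

Because pairs are consumed from the tail in the order they were appended, the transfer order matches the write order, which is the ordering property the journal relies on.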

As described above, the MPPK 120A of the first storage system 10A transfers the update data stored in the journal volume 171JP to the second storage system 10B in the update order (write order). The MPPK 120B of the second storage system 10B sequentially stores the update data received by the FEPK 110B in the journal volume 171JS. In this figure, caching to the cache memory 131 is omitted. The MPPK 120B writes the update data stored in the journal volume 171JS to the secondary volume 171S in the update order at a specified timing.

As with the journal volume 171JP, the journal volume 171JS of the second storage system 10B includes a management area and a data area, and each stores journal management data and update data.

The MPPK 120B stores the update data in the journal volume 171JS, then writes a new sequence number and pointer, and updates the management data. The configuration of management data is the same as that of the journal volume 171JP. When update data in the journal volume 171JS is written to the secondary volume 171S, the MPPK 120B changes the corresponding sequence number and pointer value to the initial value (invalid value).

FIG. 34 shows control information stored in the local memory 122 in the first storage system 10A. In this embodiment, the LM asynchronous remote copy sequence number management table 330 is stored in the local memory 122. FIG. 35 shows control information stored in the shared memory 132 in the first storage system 10A. In this embodiment, an asynchronous remote copy management table 540 and an SM asynchronous remote copy sequence number management table 530 are stored.

The asynchronous remote copy management table 540 stores management information for pair management. Specifically, it includes management information for managing each pair of primary volume and secondary volume, remote copy path information, and journal volume information grouped with each of the primary volume and secondary volume. The microprocessor 121 refers to the management table 540 and controls execution of remote copy.

FIG. 36 shows an example of the LM asynchronous remote copy sequence number management table 330. The LM asynchronous remote copy sequence number management table 330 manages the latest sequence number of each journal volume in the local memory 122. The microprocessor 121 of the MPPK 120A can determine the sequence number of update data to be newly written to the journal volume 171JP with reference to the LM asynchronous remote copy sequence number management table 330.

The LM asynchronous remote copy sequence number management table 330 has a JVOL number column 331, a sequence number column 332, and a sequence number difference column 333. The JVOL number is an identifier of a journal volume in the first storage system 10A. The sequence number column 332 stores data indicating the leading sequence number in the JVOL. The sequence number difference will be described later.

FIG. 37 shows an example of the SM asynchronous remote copy sequence number management table 530. The SM asynchronous remote copy sequence number management table 530 manages the sequence number of each journal volume in the shared memory 132. The SM asynchronous remote copy sequence number management table 530 includes a JVOL number column 531 and a sequence number column 532.

The sequence number column 532 stores data indicating the head sequence number in JVOL. The value of the sequence number column 532 in one entry matches or is different from the value of the corresponding sequence number column 332 in the local memory 122 (the values of all the entries are different in the examples of FIGS. 36 and 37). Those updates are synchronous or asynchronous.

As shown in FIGS. 36 and 37, in each JVOL entry, the value of the field in the sequence number difference column 333 is the difference between the value of the corresponding field in the sequence number column 332 of the LM asynchronous remote copy sequence number management table 330 and the value of the corresponding field in the sequence number column 532 of the SM asynchronous remote copy sequence number management table 530.

As described above, the field value in the sequence number difference column 333 indicates how far the sequence number in the JVOL has advanced since the previous update of the corresponding field in the sequence number column 532, that is, the difference between the leading sequence number at the time of the previous update, which is stored in the shared memory 132, and the latest leading sequence number.

Each time update data is written to a journal volume, the microprocessor 121 of the MPPK 120A increments the values of the sequence number column 332 and the sequence number difference column 333 in the journal volume entry. Each field of the sequence number column 332 indicates the latest sequence number (the last assigned sequence number) of the corresponding journal volume. The value of each field in the sequence number column 332 returns to the minimum value when incremented from the maximum value.

The number of bits (maximum value) in the sequence number difference column 333 is smaller than the number of bits (maximum value) in the sequence number column 332. When the value of the field in the sequence number difference column 333 reaches the maximum value, the microprocessor 121 reflects the update of the entry in the LM asynchronous remote copy sequence number management table 330 in the corresponding entry in the SM asynchronous remote copy sequence number management table 530.

Specifically, the sequence number of the corresponding entry in the SM asynchronous remote copy sequence number management table 530 is matched with the sequence number of the corresponding entry in the LM asynchronous remote copy sequence number management table 330. The update value in the SM asynchronous remote copy sequence number management table 530 is a value obtained by adding the value of the corresponding field in the sequence number difference column 333 to the value before the update.

In this way, the change in the sequence number is counted in the local memory 122 up to a predetermined number smaller than the maximum sequence number, and only then is the change in the sequence number in the local memory 122 reflected in the sequence number in the shared memory 132. This reduces the number of accesses to the CMPK 130 by the microprocessor 121, and therefore the load on the microprocessor 121 and the CMPK 130 caused by communication between them.
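As a non-limiting illustration, the counting described above may be sketched as follows. The field widths (`SEQ_BITS`, `DIFF_BITS`), the class `JournalCounters`, and the access counter `sm_writes` are assumptions introduced only for this sketch and do not appear in the embodiment. For simplicity the sketch wraps the sequence number to 0 and omits the performance boost and overload checks.

```python
from dataclasses import dataclass

SEQ_BITS = 16                      # assumed width of the sequence number field
DIFF_BITS = 4                      # assumed (smaller) width of the difference field
SEQ_MOD = 1 << SEQ_BITS
DIFF_MAX = (1 << DIFF_BITS) - 1    # maximum value of the difference field


@dataclass
class JournalCounters:
    lm_seq: int = 0      # sequence number column 332 (local memory)
    lm_diff: int = 0     # sequence number difference column 333 (local memory)
    sm_seq: int = 0      # sequence number column 532 (shared memory)
    sm_writes: int = 0   # hypothetical counter of shared-memory accesses

    def on_journal_write(self) -> None:
        """Assign the next sequence number; touch shared memory only
        when the difference field reaches its maximum value."""
        self.lm_seq = (self.lm_seq + 1) % SEQ_MOD   # wraps after the maximum
        self.lm_diff += 1
        if self.lm_diff == DIFF_MAX:
            # New SM value = SM value before the update + accumulated difference
            self.sm_seq = (self.sm_seq + self.lm_diff) % SEQ_MOD
            self.sm_writes += 1
            self.lm_diff = 0                        # initialize the difference
```

Under these assumptions, `sm_seq + lm_diff` always equals `lm_seq` modulo the sequence number range, and the shared memory is accessed once per `DIFF_MAX` journal writes rather than on every write.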

With reference to FIG. 38, the update of the asynchronous remote copy sequence number according to the present embodiment will be described. This processing is executed by the microprocessor 121 of the MPPK 120A in charge of the journal volume 171JP. In this embodiment, the primary volume 171P and the journal volume 171JP constituting the group are assigned to the same MPPK 120.

When update data is written to the journal volume 171JS, the microprocessor 121 refers to the LM asynchronous remote copy sequence number management table 330 and adds a new sequence number and pointer to the management area 611 of the journal volume 171JS. Further, the microprocessor 121 updates (in this example, increments) the sequence number and the sequence number difference value of the entry of the journal volume 171JS in the LM asynchronous remote copy sequence number management table 330 (S412).

The microprocessor 121 determines whether or not the performance boost function of the volume is ON (S413). When the performance boost function is OFF (S413: NO), the microprocessor 121 proceeds to step S417. When the performance boost function is ON (S413: YES), the microprocessor 121 determines whether its own overload flag is ON (S414).

If the overload flag is ON (S414: YES), the microprocessor 121 proceeds to step S416. When the overload flag is OFF (S414: NO), the microprocessor 121 determines whether the overload flag of the CMPK that is the access destination is ON (S415).

If the CMPK overload flag is OFF (S415: NO), the microprocessor 121 proceeds to step S417. If the CMPK overload flag is ON (S415: YES), the microprocessor 121 proceeds to step S416. Details of step S413 to step S415 are as already described in the second embodiment. By controlling the update reflection of the control information in accordance with the load on the microprocessor 121 and / or the CMPK 130, the shared memory can be updated more appropriately while suppressing a decrease in system performance.

In step S416, the microprocessor 121 determines whether the sequence number difference of the journal volume 171JS is the maximum value in the LM asynchronous remote copy sequence number management table 330. If the value is not the maximum value (S416: NO), the microprocessor 121 ends this process.

When the above value is the maximum value (S416: YES), the microprocessor 121 updates the sequence number of the journal volume 171JS in the SM asynchronous remote copy sequence number management table 530. Specifically, the microprocessor 121 updates the current sequence number value to a value obtained by adding the sequence number difference value. In step S417, the microprocessor 121 initializes to 0 the value of the sequence number difference field that has reached the maximum value.

When the sequence number update (performance boost function) in the shared memory 132 using the sequence number difference is not used, the updates of the LM asynchronous remote copy sequence number management table 330 and the SM asynchronous remote copy sequence number management table 530 are synchronized.

When a failure occurs in the MPPK 120A, the LM asynchronous remote copy sequence number management table 330 on the local memory 122 is lost. As described above, this table 330 has information indicating the latest head sequence number of each journal volume. In order to perform normal remote copy, the latest start sequence number in the journal management data is required.

In the first storage system 10 of this embodiment, the MPPK 120A different from the MPPK 120A in which the failure has occurred refers to the management area 611 of the journal volume 171JS and confirms the latest start sequence number indicating the start of the journal area. The asynchronous remote copy sequence number recovery processing when an MPPK failure occurs will be described with reference to the flowchart of FIG.

The microprocessor 121 of the normal MPPK 120A that has taken over the charge selects one journal volume from the SM asynchronous remote copy sequence number management table 530 stored in the shared memory 132, and reads its sequence number (S422). The microprocessor 121 then reads, from the journal volume, the data in the sequence number area that follows the area of that sequence number (S423).

The microprocessor 121 determines whether the sequence number read in step S423 is 0 (an invalid value) (S424). If the sequence number is not 0 (S424: NO), the microprocessor 121 stores the read sequence number in a temporary area (typically, an area in the local memory 122) (S425).

If the sequence number is 0 (S424: YES), the area is an unused area, and the microprocessor 121 uses the sequence number stored in the temporary area to update the sequence number of the corresponding journal volume in the SM asynchronous remote copy sequence number management table 530. When the sequence number in the SM asynchronous remote copy sequence number management table 530 is already the latest head sequence number, no update is necessary. The microprocessor 121 performs the above update for all journal volumes stored in the SM asynchronous remote copy sequence number management table 530.

According to the above flow, the SM asynchronous remote copy sequence number management table 530 is updated to include the latest information, and the other MPPK 120A can take over the responsibility of the MPPK 120A in which the failure has occurred and continue normal asynchronous remote copy.
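The recovery flow above may be sketched, purely for illustration, as a forward scan of the journal management area. The function name `recover_head_sequence` and the modeling of the management area 611 as a list of sequence number slots (with 0 marking an unused slot) are assumptions of this sketch. It also assumes that the sequence number held in the shared memory is present in the management area.

```python
from typing import List


def recover_head_sequence(sm_seq: int, slots: List[int]) -> int:
    """Return the latest head sequence number for one journal volume.

    sm_seq : possibly stale sequence number read from table 530 (S422)
    slots  : assumed model of the sequence number areas in management
             area 611; a value of 0 marks an unused area
    """
    latest = sm_seq                      # temporary area; starts at the SM value
    idx = slots.index(sm_seq) + 1        # area next to the one holding sm_seq (S423)
    while idx < len(slots) and slots[idx] != 0:   # S424: NO
        latest = slots[idx]              # S425: remember the newer number
        idx += 1
    return latest                        # S424: YES, write this back to table 530
```

When the value in the shared memory is already the latest, the scan stops immediately and returns the same value, which corresponds to the case in which no update is necessary.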

The values stored in the sequence number management tables 330 and 530 are examples; any values may be stored in these tables as long as they can indicate the head sequence numbers, or the difference between the head sequence numbers, in the tables 330 and 530.

Fourth Embodiment

Hereinafter, an example in which the present invention is applied to asynchronous local copy will be described. In the following, differences from the other embodiments will be mainly described. FIG. 40 shows control information stored in the local memory 122 of this embodiment. The local memory 122 stores an LM local copy difference management table 340 and an LM local copy difference area thinning operation management table 350.

FIG. 41 shows control information in the shared memory 132 of the present embodiment. The SM local copy difference management table 560, the SM local copy difference area thinning operation management table 570, and the local copy management table 580 are included in the control information in the shared memory 132. A plurality of MPPKs 120 can refer to the tables 560, 570, and 580 in the shared memory 132. In particular, the SM local copy difference management table 560 and the SM local copy difference area thinning operation management table 570 are referred to by other MPPKs 120 when the MPPK 120 fails.

The local copy management table 580 includes management information for managing each pair of a primary volume and a secondary volume. For example, it includes identification information of a primary volume and a secondary volume that constitute a pair, address information thereof, and copy policy information. The microprocessor 121 refers to the local copy management table 580 and controls execution of local copy.

The SM local copy difference management table 560 and the SM local copy difference area thinning operation management table 570 in the shared memory 132 are backups of the LM local copy difference management table 340 and the LM local copy difference area thinning operation management table 350 in the local memory 122, respectively. The microprocessor 121 reflects the update of the tables 340 and 350 in the local memory 122 in the tables 560 and 570 of the shared memory 132 according to a predetermined rule.

FIG. 42 shows an example of the LM local copy difference management table 340. The LM local copy difference management table 340 includes a volume number column 341, a logical address column 342, and a differential bit string column 343. The volume number is an identifier of the primary volume in the storage system. Each entry indicates a storage area (address range) having a predetermined size in the volume. The logical address indicates the start logical address of the storage area of each entry. In this example, the storage areas of the entries have a common size.

The differential bit string indicates whether or not there is a data difference between the primary volume and the secondary volume in the storage area of the entry, that is, whether or not the update in the primary volume has been reflected in the secondary volume.

Each bit of the differential bit string (also referred to as a differential bit) indicates whether or not the data of each partial area in the storage area of the entry differs between the primary volume and the secondary volume. In this example, the area size corresponding to each bit is common. In this example, a differential bit of 1 indicates that the data in that area differs between the primary volume and the secondary volume.

The microprocessor 121 copies the update data of the primary volume to the secondary volume at a predetermined timing (asynchronous local copy). In the asynchronous local copy, the microprocessor 121 refers to the LM local copy difference management table 340 and copies the data in the area where the differential bit in the primary volume is 1 to the secondary volume.

In response to this asynchronous local copy, the microprocessor 121 updates to 0, in the LM local copy difference management table 340, the differential bit of each area whose update has been reflected in the secondary volume. In this example, all the update data of the primary volume is copied to the secondary volume in one copy operation.
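The asynchronous local copy driven by the differential bits may be sketched as follows. This is a minimal sketch under assumed names: `async_local_copy`, the modeling of the volumes as byte arrays, and the parameter `area_size` (the size of the area covered by one differential bit) are introduced only for illustration.

```python
def async_local_copy(primary: bytearray, secondary: bytearray,
                     lm_diff: list, area_size: int) -> int:
    """Copy every area whose differential bit is 1 from the primary volume
    to the secondary volume, then clear the bit. Returns the number of
    areas copied in this single copy operation."""
    copied = 0
    for i, bit in enumerate(lm_diff):
        if bit == 1:
            lo = i * area_size
            secondary[lo:lo + area_size] = primary[lo:lo + area_size]
            lm_diff[i] = 0          # update is now reflected in the secondary
            copied += 1
    return copied
```

After the call, every differential bit is 0, which corresponds to the statement that all update data is copied in one copy operation.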

FIG. 43 shows an example of the SM local copy difference management table 560. The SM local copy difference management table 560 is a backup table of the LM local copy difference management table 340, and has the same configuration as the LM local copy difference management table 340. Specifically, it includes a column 561 for volume numbers, a column 562 for logical addresses, and a column 563 for differential bit strings.

The microprocessor 121 copies the update in the LM local copy difference management table 340 to the SM local copy difference management table 560 according to a predetermined rule. In this example, the update of the LM local copy difference management table 340 and the update of the SM local copy difference management table 560 by asynchronous local copy from the primary volume to the secondary volume are synchronized. The update of the SM local copy difference management table 560 for the update by data write to the primary volume will be described later.

FIG. 44 shows an example of the LM local copy difference area thinning operation management table 350. The LM local copy difference area thinning-out operation management table 350 includes a volume number column 351, a logical address column 352, and a thinning-out bit string column 353. Each entry indicates a storage area (address range) having a predetermined size in the volume.

The logical address indicates the start logical address of the storage area of each entry. In this example, the storage areas of the entries have a common size. Preferably, the entry storage area in the LM local copy difference area thinning-out operation management table 350 is wider than the entry storage area in the LM local copy difference management table 340.

The thinning-out bit string indicates whether or not the update of the differential bit string in the LM local copy difference management table 340 is reflected in the corresponding differential bit string in the SM local copy difference management table 560. As described above, in the LM local copy difference area thinning operation management table 350, the thinning-out bit string is associated with a storage area in the logical volume.

Each bit of the thinning-out bit string (also referred to as a thinning-out bit) is associated with a partial area of the storage area associated with the thinning-out bit string. Each thinning-out bit is associated with one or a plurality of differential bits through the partial area with which the thinning-out bit is associated.

In the preferred example, each thinning-out bit is associated with a plurality of differential bits. The entry storage area (address range) in the LM local copy difference area thinning operation management table 350 is wider than the entry storage area (address range) in the LM local copy difference management table 340. The number of bits in the thinning-out bit string may be the same as or different from the number of bits in the differential bit string (they are the same in the examples of FIGS. 43 and 44).

As described above, in the LM local copy difference management table 340, each differential bit is associated with a storage area. If at least a part of the storage area associated with a thinning-out bit coincides with the storage area of a differential bit, the thinning-out bit is associated with that differential bit.

When the thinning-out bit is 1, the update in response to the primary volume update (data write) of the differential bit associated with it in the local memory 122 is not reflected in the differential bit in the shared memory 132. Specifically, in response to the reception of the write command to the primary volume, the microprocessor 121 refers to the thinning-out bit in the area indicated by the write command in the LM local copy difference area thinning-out operation management table 350.

When the thinning-out bit is 1, the microprocessor 121 does not reflect the update of the corresponding differential bit in the LM local copy difference management table 340 in the SM local copy difference management table 560. This reduces the load on the MPPK 120 and the CMPK 130 caused by communication between them.

FIG. 45 shows an example of the SM local copy difference area thinning operation management table 570. The SM local copy difference area thinning operation management table 570 is a backup table of the LM local copy difference area thinning operation management table 350 and has the same configuration as that. Specifically, it has a column 571 for volume numbers, a column 572 for logical addresses, and a column 573 for thinning-out bits. The microprocessor 121 updates the SM local copy difference area thinning operation management table 570 in synchronization with the update of the LM local copy difference area thinning operation management table 350.

The update of the asynchronous local copy difference management information will be described with reference to the flowchart of FIG. When data is written to the primary volume, the microprocessor 121 updates the LM local copy difference management table 340 (S502). Specifically, the differential bit associated with the updated area in the primary volume is updated.

The microprocessor 121 determines whether or not the performance boost function of the volume is ON (S503). If the performance boost function is OFF (S503: NO), the microprocessor 121 proceeds to step S509 and updates the SM local copy difference management table 560 (synchronous update). When the performance boost function is ON (S503: YES), the microprocessor 121 determines whether its own overload flag is ON (S504).

If the overload flag is ON (S504: YES), the microprocessor 121 proceeds to step S506. When the overload flag is OFF (S504: NO), the microprocessor 121 determines whether or not the overload flag of the access destination CMPK is ON (S505).

If the CMPK overload flag is OFF (S505: NO), the microprocessor 121 proceeds to step S509 and updates the SM local copy difference management table 560. When the overload flag of CMPK is ON (S505: YES), the microprocessor 121 proceeds to step S506. Details of step S503 to step S505 are as already described in the second embodiment, and the control information of the shared memory 132 is appropriately updated while suppressing a decrease in system performance.

In step S506, the microprocessor 121 determines whether the area updated in the primary volume is being thinned. Specifically, the microprocessor 121 refers to the LM local copy difference area thinning operation management table 350 and checks each thinning bit in the update area. When the thinning-out bit is 1 (S506: YES), the microprocessor 121 omits the update of the differential bit corresponding to the thinning-out bit in the SM local copy difference management table 560.

When the thinning-out bit is 0 (S506: NO), the microprocessor 121 determines whether or not the difference in the area associated with the thinning-out bit is greater than or equal to the threshold (S507). Specifically, the microprocessor 121 refers to the LM local copy difference management table 340 and determines whether or not the number of bits set to 1 among the differential bits corresponding to the thinning-out bit is greater than or equal to the threshold value. This criterion will be described in the MPPK failure process described later with reference to FIG.

If the difference is less than the threshold (S507: NO), the microprocessor 121 updates the SM local copy difference management table 560 (S509). When the difference is equal to or larger than the threshold (S507: YES), the microprocessor 121 updates the LM local copy difference area thinning operation management table 350 and the SM local copy difference area thinning operation management table 570 (S508). Specifically, the microprocessor 121 changes the thinning-out bit from 0 to 1 in the two tables 350 and 570.
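Steps S502 to S509 may be sketched as follows for a single differential bit. The class `DiffState`, the function `on_primary_write`, the fixed ratio `bits_per_thin` between differential bits and thinning-out bits, and the returned status strings are all assumptions made for this sketch. The overload flags are passed in as plain booleans.

```python
from dataclasses import dataclass


@dataclass
class DiffState:
    lm_diff: list        # differential bits, table 340 (local memory)
    sm_diff: list        # differential bits, table 560 (shared memory)
    lm_thin: list        # thinning-out bits, table 350 (local memory)
    sm_thin: list        # thinning-out bits, table 570 (shared memory)
    bits_per_thin: int   # assumed number of differential bits per thinning-out bit
    threshold: int       # assumed threshold on the number of 1 bits


def on_primary_write(s: DiffState, i: int, boost_on: bool,
                     mp_overloaded: bool, cmpk_overloaded: bool) -> str:
    s.lm_diff[i] = 1                                   # S502
    # S503-S505: synchronous update unless boost is ON and something is overloaded
    if not boost_on or not (mp_overloaded or cmpk_overloaded):
        s.sm_diff[i] = 1                               # S509
        return "sm_updated"
    t = i // s.bits_per_thin
    if s.lm_thin[t] == 1:                              # S506: YES
        return "skipped"                               # SM update omitted
    lo = t * s.bits_per_thin
    ones = sum(s.lm_diff[lo:lo + s.bits_per_thin])     # S507
    if ones < s.threshold:
        s.sm_diff[i] = 1                               # S509
        return "sm_updated"
    s.lm_thin[t] = 1                                   # S508: both tables,
    s.sm_thin[t] = 1                                   # updated synchronously
    return "thinning_started"
```

Note that in this sketch the write that starts thinning is itself not reflected in the shared memory, which is consistent with copying the whole thinning-out area after a failure.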

Next, copying of the local copy difference when the MPPK 120 fails will be described with reference to the flowchart of FIG. When a failure occurs in the MPPK 120, another MPPK 120 copies the difference from the primary volume to the secondary volume in the copy pair that was handled by the failed MPPK 120. As a result, the identity of the copy pair is ensured, and the subsequent normal asynchronous local copy is realized.

The microprocessor 121 in the other MPPK 120 refers to the SM local copy difference area thinning operation management table 570 (S512), and determines whether or not a thinning-out area remains (S513). The thinning-out area is an area where the thinning-out bit is 1. If no thinning area remains (S513: NO), this flow ends. When the thinning-out area remains (S513: YES), the microprocessor 121 copies the data in that area in the primary volume to the secondary volume (S514).

As described above, the shared memory 132 does not store the latest differential bit string corresponding to the thinning-out bit of “1”. Therefore, when a failure occurs in the MPPK 120, all data in the area where the thinning-out bit is 1 (ON) is copied from the primary volume to the secondary volume. Thereby, the data of the secondary volume can be exactly matched with the data of the primary volume.
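The handling of thinning-out areas at the time of failure (S512 to S514) may be sketched as follows. The function name `copy_thinned_areas`, the modeling of the volumes as lists of blocks, and the parameter `blocks_per_thin` are assumptions of this sketch; whether the thinning-out bits are cleared afterwards is not stated in the text and is therefore not modeled.

```python
def copy_thinned_areas(sm_thin: list, primary: list, secondary: list,
                       blocks_per_thin: int) -> list:
    """Copy the whole area of every thinning-out bit that is 1 (ON) from the
    primary volume to the secondary volume. Returns the copied area indexes."""
    copied = []
    for t, bit in enumerate(sm_thin):          # S512: refer to table 570
        if bit == 1:                           # S513: thinning-out area remains
            lo = t * blocks_per_thin
            secondary[lo:lo + blocks_per_thin] = primary[lo:lo + blocks_per_thin]  # S514
            copied.append(t)
    return copied
```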

As described with reference to the flowchart of FIG. 46, in this example, when the number of differential bits of “1” corresponding to a thinning-out bit is equal to or larger than the threshold, the thinning-out bit is set to ON (1). In the event of a failure, all data in the areas whose thinning-out bit is ON is copied from the primary volume to the secondary volume. Thinning out the updates for areas that already contain a large amount of data to be copied therefore reduces the update load while keeping the processing at the time of failure efficient.

In the present embodiment, the configuration of the difference management table and the thinning operation management table is an example, and any data may be used as long as the difference area and the thinning-out area can be indicated.

FIG. 48 shows an example 4800 of a menu screen for setting the performance boost function that can be used in the second to fourth embodiments. The menu screen 4800 includes a performance boost function setting area 4801, a per-volume performance boost function setting area 4802, and a per-function performance boost function setting area 4803.

The administrator can enable or disable the performance boost function of the storage system 10 by selecting either “ENABLE” or “DISABLE” in the performance boost function setting area 4801 with the input device 28. This setting is reflected in the performance boost function enabling table 210.

The per-volume performance boost function setting area 4802 enables / disables the performance boost function of each logical volume. The administrator can select enabling / disabling of the performance boost function of each logical volume with the input device 28 in the performance boost function setting area 4802 for each volume. This setting is reflected in the performance boost function enabling table 220 for each volume.

The per-function performance boost function setting area 4803 enables / disables each performance boost function. The administrator can select enabling / disabling of each function with the input device 28 in the per-function performance boost function setting area 4803. This setting is reflected in a function-by-function performance boost function enabling table (not shown) in the storage system 10. If the boost function is enabled at all of the system, volume, and function levels, the performance boost function is used on that volume.
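The combination of the three settings may be sketched as a simple conjunction. The function `boost_in_use`, the dictionary representation of the per-volume and per-function tables, and the function key used in the usage note are hypothetical and serve only as an illustration.

```python
def boost_in_use(system_enabled: bool, volume_table: dict,
                 function_table: dict, volume: int, function: str) -> bool:
    """The performance boost function is used only when it is enabled for the
    system, for the volume, and for the function (missing entries count as
    disabled in this sketch)."""
    return bool(system_enabled
                and volume_table.get(volume, False)
                and function_table.get(function, False))
```

For example, `boost_in_use(True, {0: True}, {"async_remote_copy": True}, 0, "async_remote_copy")` evaluates to `True`, while disabling any one of the three levels yields `False`.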

Fifth Embodiment

In this embodiment, an example in which the present invention is applied to a storage system including a plurality of storage modules coupled by switches will be described. This embodiment will mainly describe differences from the above-described other embodiments. FIG. 49 schematically shows the configuration of the computer system of this embodiment. The storage module 10C and the storage module 10D are communicably connected by an inter-module path 195 (also referred to as an X path) including the switch 198.

The configuration of the storage modules 10C and 10D in FIG. 49 is the same as the configuration of the storage system 10 described with reference to FIG. In this example, two combined modules constitute one storage system, but three or more modules may constitute one storage system.

The X path 195 (switch 198) that couples the storage module 10C and the storage module 10D functions as a path similar to the paths of the internal network 150, and any package of one module can communicate with any package and medium of the other module via the X path 195. The host computer 180 can access any storage module.

The X path has a narrower bandwidth than the internal network 150 and a lower data transfer capability. Therefore, the X path tends to be a bottleneck in data transfer between packages. Determining whether the performance boost function is ON or OFF based on the load of the X path can therefore reduce the degradation of the performance of the storage system.

The microprocessor 121 of the present embodiment refers to the operation rate of the X path 195 in the enable / disable control of the performance boost function. As a result, system performance can be appropriately improved in a storage system composed of a plurality of modules.

FIG. 50 shows control information stored in the local memory 122 of the present embodiment. In FIG. 50, an X path operating rate table 360 and an X path operating rate threshold table 370 are stored in the local memory 122. FIG. 51 shows an example of the X path operating rate table 360. FIG. 52 shows an example of the X path operating rate threshold table 370.

The X path operating rate table 360 manages the X path operating rate. In this example, the X path operating rate table 360 includes an X path number column 361, an operating rate column 362, and an overload determination flag column 363. The X path number is an identifier that uniquely identifies the X path in the system. In the example of FIG. 51, the X path operating rate table 360 manages a plurality of X paths. That is, a plurality of X paths couple two or more storage modules. The plurality of X paths pass through the same or different switches.

The operating rate is the data transfer time per unit time. The operation rate of the X path is calculated by the controller of the switch through which the X path passes and stored in the register. The microprocessor 121 acquires the operation rate of each X path from the register of the switch and stores it in the X path operation rate table 360.

The microprocessor 121 compares the operating rate of each entry in the X path operating rate table 360 with a predetermined X path operating rate threshold value, and determines the value of the overload determination flag. When the X path operating rate is equal to or greater than the threshold, the microprocessor 121 sets the overload determination flag to 1. The X path operating rate threshold value is stored in the X path operating rate threshold value column of the X path operating rate threshold table 370. For example, the X path operating rate threshold table 370 is loaded from a nonvolatile storage area in the storage system, and the value is set by the administrator.

Referring to the flowchart of FIG. 53, a description will be given of the determination about the update in the shared memory 132 of the control information related to the data caching in consideration of the operation rate of the X path. The basic part is the same as in the first embodiment. In the flowchart in FIG. 53, steps other than step S607 are the same as those in the flowchart shown in FIG. 16 in the first embodiment, and a description thereof will be omitted.

In step S607, the microprocessor 121 refers to the X path operating rate table 360 and determines whether the overload flag of the X path used for accessing the shared memory 132 is 1 (ON). Control information indicating the relationship between the CMPK 130 to be accessed and the X path to be used is stored in the local memory 122, whereby the microprocessor 121 can specify the X path to be used.

If the overload flag is ON (S607: YES), the microprocessor 121 determines not to update the control information of the shared memory 132 (S608). When the overload flag is OFF (0) (S607: NO), the microprocessor 121 determines to update the control information of the shared memory 132 (S608). Although the present example refers to the X path operating rate in the update determination of the data caching control information, the other determination processes described in other embodiments can also refer to the X path operating rate.

Next, the update of the X path operating rate in the X path operating rate table 360 will be described with reference to the flowchart in FIG. Typically, this process is performed periodically, for example, every second. The microprocessor 121 selects one X path, for example, the X path 195, and acquires the operating rate of the X path 195 from the switch 198 (S612).

The microprocessor 121 updates the operation rate value of the corresponding entry in the X path operation rate table 360 with the acquired operation rate value (S613). The microprocessor 121 determines whether the acquired operating rate value is equal to or greater than the X path operating rate threshold value in the X path operating rate threshold value table 370 (S614). When the operating rate is equal to or greater than the threshold (S614: YES), the microprocessor 121 sets the overload flag of the entry in the X path operating rate table 360 to 1 (ON) (S615).

On the other hand, when the operating rate is less than the threshold (S614: NO), the microprocessor 121 sets the overload flag of the entry in the X path operating rate table 360 to 0 (OFF) (S616). The microprocessor 121 determines whether or not the operating rates of all X paths have been updated (S617). If all the X paths have been updated (S617: YES), this flow ends. If an X path that has not been updated remains (S617: NO), one X path is selected from the remaining X paths, and this flow is repeated.
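The periodic update of steps S612 to S617 may be sketched as follows. The function `update_x_path_table`, the dictionary layout of the table, and the callback `read_rate` (standing in for reading the operating rate from the register of the switch) are assumptions introduced for this sketch only.

```python
from typing import Callable, Dict


def update_x_path_table(table: Dict[int, dict],
                        read_rate: Callable[[int], float],
                        threshold: float) -> None:
    """Refresh the operating rate and overload flag of every X path entry."""
    for path_no, entry in table.items():       # repeat until all paths are done (S617)
        rate = read_rate(path_no)              # S612: acquire from the switch
        entry["rate"] = rate                   # S613: store in the table
        # S614-S616: the flag is ON when the rate is at or above the threshold
        entry["overloaded"] = 1 if rate >= threshold else 0
```

A caller would typically invoke this function on a timer, for example once per second as in the text, and consult `entry["overloaded"]` in the determination of step S607.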

Sixth Embodiment

This embodiment describes a configuration in which the MPPK 120 can access a plurality of shared memory areas distributed in a plurality of different types of devices. In the present embodiment, differences from the other embodiments will be mainly described.

FIG. 55 schematically shows the configuration of the computer system of this embodiment. In the storage system 10, a shared memory (storage area) exists in a plurality of different devices. Specifically, in addition to the shared memory 132 on the CMPK 130, the shared memory 124 exists on the MPPK 120, and the shared memory 178 exists on the storage drive 170. The area of the shared memory 124 on the MPPK 120 is a storage area in the local memory 122. The area of the shared memory 178 on the storage drive 170 is a storage area of a nonvolatile storage medium in the storage drive.

FIG. 56 shows control information stored in the local memory 122 of the present embodiment. In FIG. 56, an MP operating rate table 380, an MP operating rate threshold table 390, and an SM area management table 400 are stored in the local memory 122.

FIG. 57 shows an example of the MP operating rate table 380. The MP operating rate table 380 includes an MP number column 381, an operating rate column 382, an overload determination flag 1 column 383, an overload determination flag 2 column 384, and an operation time column 385. The columns other than the column 384 of the overload determination flag 2 are the same as those in the MP availability table 270 shown in FIG. The overload determination flag 1 column 383 corresponds to the overload determination flag column 273.

FIG. 58 shows an example of the MP operating rate threshold table 390. The MP operating rate threshold table 390 includes a column 391 for the MP operating rate threshold 1 and a column 392 for the MP operating rate threshold 2. The value of the MP operating rate threshold 1 is higher than the value of the MP operating rate threshold 2. The MP operating rate threshold 1 corresponds to the MP operating rate threshold shown in FIG.

FIG. 59 shows an example of the SM area management table 400. The SM area management table 400 manages the shared memory areas distributed among a plurality of devices. The SM area management table 400 includes a type column 401, a number column 402, a head address column 403, and a free capacity column 404. “Type” indicates the type of the device in which the shared memory area exists. “Number” is an identifier among devices of the same type. “Head address” indicates the head address of the shared memory area in each device. “Free capacity” is the free capacity of the shared memory area.

Values are set in advance in the type column 401, the number column 402, and the head address column 403. The microprocessor 121 acquires the value of the free capacity of the shared memory area from the controller of each device (the microprocessor 121 in the case of an MPPK), and stores it in the free capacity column 404.

Referring to FIGS. 60A and 60B, the determination regarding the update of the control information stored in the shared memory area related to data caching will be described. Steps S702 to S707 in the flowchart of FIG. 60A are the same as steps S122 to S127 in the flowchart of FIG. However, if the overload flag of the CMPK 130 is ON in step S706 (S706: YES), the microprocessor proceeds to step S709 in FIG. 60B.

If the overload flag of the CMPK 130 is OFF in step S706 (S706: NO), or if the performance boost function of the logical volume is OFF in step S702 (S702: NO), the microprocessor 121 decides to update the control information in the shared memory of the CMPK 130.

In step S709 in FIG. 60B, the microprocessor 121 refers to the SM area management table 400 and determines whether there is an MPPK 120 having the necessary free shared memory area. When any MPPK 120 has the necessary free shared memory area (S709: YES), the microprocessor 121 identifies the number of that MPPK 120 and decides to store the caching control information in the shared memory 124 of that MPPK 120 and to update it there (S710). This MPPK 120 is an MPPK different from the MPPK 120 on which the microprocessor 121 is mounted.

When the MPPK 120 having the necessary free shared memory area does not exist (S709: NO), the microprocessor 121 determines whether its own overload flag 2 is 1 (ON) (S711). If the overload flag 2 is ON (S711: YES), the microprocessor 121 determines not to update the control information in the shared memory area (S716).

When the overload flag 2 is OFF (S711: NO), the microprocessor 121 refers to the SM area management table 400 and determines whether there is an SSD RAID group having a necessary free shared memory area (S712).

When any SSD RAID group has the necessary free shared memory area (S712: YES), the microprocessor 121 identifies the number of that SSD RAID group and decides to store the cache control information in the shared memory area of that SSD RAID group and to update it there (S713).

When no SSD RAID group has the necessary free shared memory area (S712: NO), the microprocessor 121 refers to the SM area management table 400 and determines whether an HDD RAID group having the necessary free shared memory area exists (S714). When no HDD RAID group has the necessary free shared memory area (S714: NO), the microprocessor 121 determines not to update the control information in the shared memory 132 (S716).

If an HDD RAID group having the necessary free shared memory area exists (S714: YES), the microprocessor 121 identifies the number of that HDD RAID group and decides to store the cache control information in the shared memory area of that HDD RAID group and to update it there (S715).

When the microprocessor 121 decides to store and update the control information in a shared memory other than the shared memory 132, it copies the data caching control information in the local memory 122 to the selected shared memory. The data caching control information in the shared memory 132 may then be deleted.

As described above, by moving the control information from the current area of the shared memory 132 to another shared memory area, the update of the control information in the shared memory can be kept synchronized with the update in the local memory, so that the cache hit rate can be improved. The above flow searches for a free shared memory area in descending order of device access performance. As a result, the control information can be stored in a shared memory with higher access performance, and a decrease in system performance can be suppressed.
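The selection order of steps S709 to S716 can be sketched as follows. This is only an illustrative sketch: the class `SmArea`, the function `choose_sm_area`, and the type strings are hypothetical names chosen here, and the first-fit search within each device type is an assumption, since the text does not say how one device is picked when several qualify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SmArea:
    """One row of the SM area management table 400 (field names are assumed)."""
    kind: str            # type column 401: "MPPK", "SSD RAID group", "HDD RAID group"
    number: int          # number column 402
    start_address: int   # start address column 403
    free_capacity: int   # free capacity column 404


def choose_sm_area(table: List[SmArea], needed: int, own_mppk: int,
                   overload_flag_2: bool) -> Optional[SmArea]:
    """Return the area that should hold the control information, or None (S716)."""

    def first_fit(kind: str, exclude: Optional[int] = None) -> Optional[SmArea]:
        for area in table:
            if (area.kind == kind and area.number != exclude
                    and area.free_capacity >= needed):
                return area
        return None

    # S709/S710: prefer the shared memory of another MPPK.
    area = first_fit("MPPK", exclude=own_mppk)
    if area is not None:
        return area
    # S711: an MP whose overload flag 2 is ON gives up updating (S716).
    if overload_flag_2:
        return None
    # S712 to S715: fall back to an SSD RAID group, then an HDD RAID group.
    return first_fit("SSD RAID group") or first_fit("HDD RAID group")
```

Returning `None` corresponds to the decision not to update the control information in any shared memory area.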

The shared memory area management of the present embodiment can be applied to the storage and update management of other control information described in the other embodiments in addition to the storage and update management of data caching control information. In the event of an MPPK failure, other MPPKs 120 can search for corresponding control information in the distributed shared memory area by referring to the shared memory area management table 400.

Referring to the flowchart of FIG. 61, the update of the MP operation rate will be described. This flow is called at a cycle such as 1 second. The microprocessor 121 acquires its own MP operating time (S722), and updates the value of the operating rate in the MP operating rate table 380 (S723). Steps S722 and S723 are the same as steps S232 and S233 in FIG.

Next, in step S724, the microprocessor 121 determines whether the updated operating rate value is equal to or greater than the MP operating rate threshold value 1. When the value of the operating rate is equal to or higher than the MP operating rate threshold 1 (S724: YES), the microprocessor 121 sets the overload flag 1 of the MP operating rate table 380 to 1 (ON) (S725). When the operating rate value is less than the MP operating rate threshold 1 (S724: NO), the microprocessor 121 sets the overload flag 1 of the MP operating rate table 380 to 0 (OFF) (S726).

Next, in step S727, the microprocessor 121 determines whether the updated operation rate value is equal to or greater than the MP operation rate threshold 2. When the value of the operation rate is equal to or greater than the MP operation rate threshold 2 (S727: YES), the microprocessor 121 sets the overload flag 2 of the MP operation rate table 380 to 1 (ON) (S728). When the operation rate value is less than the MP operation rate threshold 2 (S727: NO), the microprocessor 121 sets the overload flag 2 of the MP operation rate table 380 to 0 (OFF) (S729).
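The periodic update of FIG. 61 can be sketched as below. The names `MpRateEntry` and `update_mp_operation_rate` are hypothetical, and computing the operation rate as the operating time divided by the cycle length (in percent) is an assumption made for illustration; the text only states that the rate is derived from the acquired MP operating time.

```python
from dataclasses import dataclass


@dataclass
class MpRateEntry:
    """One row of the MP operation rate table 380 (field names are assumed)."""
    operation_rate: float = 0.0     # column 382, in percent
    overload_flag_1: bool = False   # column 383
    overload_flag_2: bool = False   # column 384


def update_mp_operation_rate(entry: MpRateEntry, operating_ms: int, cycle_ms: int,
                             threshold_1: float, threshold_2: float) -> MpRateEntry:
    # S722/S723: assumed definition of the rate as busy time over the cycle.
    entry.operation_rate = 100.0 * operating_ms / cycle_ms
    # S724 to S726: flag 1 is compared against the higher threshold 1.
    entry.overload_flag_1 = entry.operation_rate >= threshold_1
    # S727 to S729: flag 2 is compared against the lower threshold 2.
    entry.overload_flag_2 = entry.operation_rate >= threshold_2
    return entry
```

Because threshold 1 is higher than threshold 2, flag 2 turns ON first as the load rises, and flag 1 is ON only when flag 2 is also ON.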

Seventh Embodiment

The storage system of this embodiment determines ON / OFF of the low hit rate flag based on the improvement of access performance by caching host data. The low hit rate flag is as described in the first embodiment. The access performance is represented by, for example, response time or throughput. The configuration described below uses response time.

The low hit rate flag (see the first embodiment) is set to OFF when the response time improvement obtained by data caching is large, and is set to ON when that improvement is small. This improves the average response time.

Hereinafter, the present embodiment will be specifically described. Differences from the other embodiments will be mainly described. FIG. 62 shows control information stored in the local memory 122 of the present embodiment. A response table 410 and a CM use threshold table 420 are stored in the local memory 122. FIG. 63 shows an example of the response table 410, and FIG. 64 shows an example of the CM use threshold table 420.

The response table 410 is a table for managing the response time of media. In FIG. 63, the response table 410 includes a media type column 411 and a response time column 412. The response table 410 in this example manages the response time according to the media type, but may manage the response time according to a RAID group or a logical volume.

In this example, the response time is the time required to read data from the media. A value is stored in the response time column 412 in advance, or the microprocessor 121 may update the value in the response time column 412. The microprocessor 121 measures the response time in data reading, and stores, for example, the average value of the measured values in the response time column 412.

The response time may be determined using the data write response time. The data write response time and the data read response time may be managed separately, and the data write and data read hit ratios may be managed separately in accordance with the data write response time and the data read response time. Data caching control can be performed separately for write data caching and read data caching.

In FIG. 64, the CM use threshold table 420 stores, in the response improvement column 421, a threshold for the value indicating response improvement. The threshold is set in advance. For example, a value set by the administrator is stored in a nonvolatile storage area in the storage system. As will be described later, the microprocessor 121 uses the difference between the response time of the medium and the response time of the CMPK 130 (cache memory 131) to calculate a value indicating response improvement. If this value is larger than the above threshold, the response improvement is large enough to justify data caching.

A hit rate update process including a low hit rate flag update based on the response improvement of the present embodiment will be described with reference to the flowchart of FIG. The MPPK 120 executes this process periodically, for example, every second. Steps S802, S803, and S805 to S807 in the flowchart of FIG. 65 are the same as steps S222, S223, and S225 to S227 in the flowchart of FIG.

In step S804, the microprocessor 121 calculates a value indicating response improvement according to the following equation.
Hit rate × (response time of the media − response time of the CMPK) / 100

The microprocessor 121 can identify the type of the media by referring to the media type table 230 from the RAID group of the volume. The response time value is stored in the response table 410 as described above. The microprocessor 121 compares the calculated value with the CM use threshold value in the CM use threshold value table 420.

When the calculated value is equal to or less than the CM use threshold (S804: YES), the microprocessor 121 sets the low hit rate flag of the volume to 1 (ON) (S805). When the calculated value is larger than the CM use threshold (S804: NO), the microprocessor 121 sets the low hit rate flag of the volume to 0 (OFF) (S806).
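The calculation in step S804 and the flag update in S805 and S806 can be written as a short sketch. The function names are hypothetical, the hit rate is taken to be a percentage (hence the division by 100), and the time unit is whatever unit the response table 410 uses.

```python
def response_improvement(hit_rate_percent: float, media_response_time: float,
                         cmpk_response_time: float) -> float:
    """Value compared with the CM use threshold in step S804."""
    return hit_rate_percent * (media_response_time - cmpk_response_time) / 100


def low_hit_rate_flag(hit_rate_percent: float, media_response_time: float,
                      cmpk_response_time: float, cm_use_threshold: float) -> bool:
    """True (ON, S805) when the improvement does not exceed the threshold."""
    value = response_improvement(hit_rate_percent, media_response_time,
                                 cmpk_response_time)
    return value <= cm_use_threshold
```

With the same hit rate, a slow medium yields a large improvement value and keeps the flag OFF, while a medium whose response time is close to that of the CMPK yields a small value and turns the flag ON.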

Eighth Embodiment

In cache control, a cache slot is generally secured when a cache miss occurs. The present embodiment describes an efficient cache control method that takes the characteristics of the SSD into account while maintaining this premise of cache control in a storage system in which HDDs and SSDs coexist.

The storage system according to the present embodiment determines, according to a predetermined condition, whether to perform caching using a normal cache area associated with the logical volume space and transfer the read data to the host, or to transfer the read data to the host using a cache area (job buffer) associated with an I/O processing control process (job).

In the present embodiment, a job is a process that controls I/O processing, and each job has a job number (job #) that is unique within each MP. The management area for each job # stores, for example, the access destination address and transfer length of the I/O being processed and information on the cache area in use.

When searching for user data in the cache, the search uses the address in the logical volume space. To search whether the above-mentioned job buffer has already been secured, an address that is not assigned to the logical volume space is assigned to each job number of each MP, and the search uses that address.

Hereinafter, the present embodiment will be specifically described. Differences from the other embodiments will be mainly described. FIG. 72 shows control information stored in the local memory 122 of this embodiment. A job management table 450, a job buffer address table 460, a buffer transfer ratio calculation table 470, and a buffer transfer ratio table 480 are stored in the local memory 122. FIG. 73 shows an example of the job management table 450, FIG. 74 shows an example of the job buffer address table 460, FIG. 75 shows an example of the buffer transfer ratio calculation table 470, and FIG. 76 shows an example of the buffer transfer ratio table 480.

FIG. 73 shows a configuration example of the job management table 450. The job management table 450 is a table for managing whether each job number is in use. The job management table 450 has a microprocessor number column 451, a job number column 452, and a use / non-use column 453. In this example, free job numbers are managed with the use / non-use column, but queue management using an in-use queue and an unused queue may be used instead.

FIG. 74 shows a configuration example of the job buffer address table 460. The job buffer address table 460 manages a buffer search address to which each job number of each MP is assigned. The job buffer address table 460 includes a microprocessor number column 461, a job number column 462, and a job buffer address number column 463. The job buffer address number is a unique value in the storage system and is a value that does not overlap with the logical volume address.

FIG. 75 shows a configuration example of the buffer transfer ratio calculation table 470. The buffer transfer ratio calculation table 470 is a table for calculating the ratio of transfers that use the job buffer from the cache hit rate and the MP operating rate for each logical volume. The buffer transfer ratio calculation table 470 includes a hit rate column 471, a microprocessor operating rate column 472, and a buffer transfer ratio column 473.

To reduce the microprocessor overhead (OVH) of read processing for data that does not hit the cache, a high buffer transfer ratio is set when the hit rate is low, and a high buffer transfer ratio is set when the microprocessor operating rate is high.

The lower limit of the buffer transfer ratio is 0, and the upper limit is 99 or less. The upper limit is 99 or less because the hit rate cannot be calculated if 100% of the transfers use the job buffer. The hit rate used in this example is the hit rate for transfers that do not use the job buffer.

FIG. 76 shows a configuration example of the buffer transfer ratio table 480. The buffer transfer ratio table 480 is a table for managing the ratio of using the job buffer in the read processing for each logical volume. The buffer transfer ratio table 480 has a logical volume number column 481 and a buffer transfer ratio column 482.

Processing for the read command received from the host computer 180 in this example will be described with reference to the flowcharts shown in FIGS. 77A and 77B. On receiving the read command from the host computer 180 (S901), the microprocessor 121 refers to the job management table 450, searches for an unused job number, and secures it (S902).

The microprocessor 121 determines whether it has the access right to the logical volume (LDEV) indicated by the read command (S903). When the access right is not possessed (S903: NO), the microprocessor 121 transfers the read command to the MPPK 120 having the access right (S904).

When the microprocessor 121 has the access right (S903: YES), the microprocessor 121 searches the cache directory 310 in the local memory 122 on the same MPPK 120 using the logical volume address (S905). When the address (data) specified by the read command is found (cache hit) (S906: YES), the microprocessor 121 reads the read data from the cache memory 131 according to the information in the cache directory 310 and transmits it to the host computer 180 (S907).

If the address (data) specified by the read command is not found (cache miss) (S906: NO), the microprocessor 121 checks the uncached flag in the local memory 122 (S908). The uncached flag indicates whether all the data of the cache directory 510 of the shared memory 132 has been cached in the local memory 122, and is stored in the local memory 122. When some of the data has not been read, the value is ON. For example, if the control information has not yet been read from the shared memory 132 to the local memory 122 immediately after a failover, the uncached flag is ON.

If the uncached flag is ON (S908: YES), some data of the cache directory 510 of the shared memory 132 has not been cached. The microprocessor 121 transfers the cache directory (control information) from the shared memory 132 to the local memory 122 via the controller of the CMPK 130 (S909).

The microprocessor 121 searches the cache directory 310 in the local memory 122 (S910). When the data specified by the read command is found (cache hit) (S911: YES), the microprocessor 121 reads the read data from the cache memory 131 according to the information in the cache directory 310 and transmits it to the host computer 180 (S912).

In the case of a cache miss (S911: NO), or when the uncached flag is OFF (S908: NO), the microprocessor 121 determines whether to perform caching using a normal cache area associated with the logical volume space and transfer the read data to the host, or to transfer the read data to the host using the cache area (job buffer) associated with the I/O processing control process (job) (buffer transfer) (S913). A specific method of this determination will be described in detail later.

When it is determined that buffer transfer is not to be used (S914: NO), the microprocessor 121 secures a slot for the read data in the cache memory 131 and updates the cache directory 310 of the local memory 122 and the cache directory 510 of the shared memory 132 (S915).

The microprocessor 121 reads the read data from the storage drive 170 (permanent medium) using the BEPK 140 and the CMPK 130, and stores the read data in the reserved slot on the cache memory 131. Thereafter, the microprocessor 121 transmits the cache data to the host computer 180 by the CMPK 130 and the FEPK 100 (S916).

If it is determined that buffer transfer is to be used (S914: YES), the microprocessor 121 searches the cache directory 310 using the job buffer address number in the job buffer address table 460 (S917).

When the job buffer address number (job buffer) is not found (S918: NO), the microprocessor 121 secures a slot for the job buffer in the cache memory 131, updates the cache directory 310 in the local memory 122 and the cache directory 510 of the shared memory 132 (S919), and proceeds to the next step S920.

If the job buffer address number (job buffer) is found (S918: YES), the microprocessor 121 proceeds to step S920 without updating the control information of the local memory 122 and the shared memory 132.

In step S920, the microprocessor 121 reads the read data from the storage drive 170 (permanent medium) using the BEPK 140 and the CMPK 130, and stores the read data in the job buffer slot on the cache memory 131. Thereafter, the microprocessor 121 transmits the cache data to the host computer 180 by the CMPK 130 and the FEPK 100.

Referring to FIG. 78, the determination (S914) regarding the necessity of buffer transfer in the flowchart of FIG. 77A will be described. On starting step S914, the microprocessor 121 determines whether the performance boost function of the logical volume designated by the read command is ON by referring to the performance boost function enable table 210 and the per-volume performance boost function enable table 220 (S932). If either table indicates that the performance boost function is OFF, the performance boost function for the volume is OFF.

When the performance boost function of the logical volume is not ON (S932: NO), the microprocessor 121 determines not to use buffer transfer (S937). When the performance boost function of the logical volume is ON (S932: YES), the microprocessor 121 next determines whether the media type of the RAID group in which the designated data is stored is SSD by referring to the media type table 230, using the RAID group as a key (S933).

If the media type is not SSD (S933: NO), the microprocessor 121 determines not to use buffer transfer (S937). When the media type is SSD (S933: YES), the microprocessor 121 next determines whether the current I/O is a buffer transfer target by referring to the buffer transfer ratio table 480, using the number of the logical volume in which the designated data is stored as a key (S934).

To determine whether the current I/O is a buffer transfer target using a buffer transfer ratio with a value from 0 to 99, the microprocessor 121 may use a random number from 0 to 100 and determine that the current I/O is a buffer transfer target when the random number falls below the buffer transfer ratio. Alternatively, the microprocessor 121 may use a hash value from 0 to 100 computed with the read data address as a key, and determine that the current I/O is a buffer transfer target when the hash value falls below the buffer transfer ratio. The microprocessor 121 may also use a counter that increases by 1 from 0 to 100 (returning to 0 after 100), and determine that the current I/O is a buffer transfer target when the counter value falls below the buffer transfer ratio.

When the determination in S934 finds that the current I/O is a buffer transfer target (S935: YES), the microprocessor 121 determines to use buffer transfer (S936). When it finds that the current I/O is not a buffer transfer target (S935: NO), the microprocessor 121 determines not to use buffer transfer (S937).
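The determination of FIG. 78 and the three selection methods mentioned above can be sketched as follows. All function and class names are hypothetical. As a simplifying assumption, each selector draws from 0 to 99 (a half-open reading of the "0 to 100" range in the text) so that a ratio of r selects roughly r percent of the I/Os, and CRC-32 stands in for the unspecified hash function.

```python
import random
import zlib
from typing import Callable


def random_selector(ratio: int) -> bool:
    """Random-number method: target when the draw falls below the ratio."""
    return random.randrange(100) < ratio


def hash_selector(read_address: int, ratio: int) -> bool:
    """Hash method: the same address always gives the same decision."""
    digest = zlib.crc32(read_address.to_bytes(8, "little"))
    return digest % 100 < ratio


class CounterSelector:
    """Counter method: a counter that wraps around after 99."""

    def __init__(self) -> None:
        self.counter = 0

    def is_target(self, ratio: int) -> bool:
        hit = self.counter < ratio
        self.counter = (self.counter + 1) % 100
        return hit


def use_buffer_transfer(boost_on: bool, media_type: str, ratio: int,
                        is_target: Callable[[int], bool]) -> bool:
    if not boost_on:            # S932: NO -> S937
        return False
    if media_type != "SSD":     # S933: NO -> S937
        return False
    return is_target(ratio)     # S934/S935 -> S936 or S937
```

The counter method is deterministic, so over any 100 consecutive calls with a fixed ratio it selects exactly that many I/Os, whereas the random and hash methods only approach the ratio on average.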

As described above, when the specified condition is satisfied, the microprocessor 121 determines to use buffer transfer, which rarely requires updating the cache directory 310 in the local memory 122 and the cache directory 510 in the shared memory 132. As a result, the load on the microprocessor 121 and the CMPK 130 can be reduced and the throughput of the system can be improved.

The calculation of the buffer transfer ratio will be described with reference to the flowchart of FIG. This flow is called at a cycle such as 1 second for each LDEV (logical volume). The microprocessor 121 may perform the calculation for all LDEVs at a fixed cycle, or may perform it during I/O processing when the ratio of the I/O target LDEV has not been updated for a certain time, such as 1 second.

The microprocessor 121 refers to the per-volume hit rate table 250 using the target LDEV number (logical volume number) as a key and obtains the hit rate from the number of I/Os and the number of hits. It refers to the MP operation rate table 380 using its own MP number as a key and obtains the MP operation rate. It then refers to the buffer transfer ratio calculation table 470 using the hit rate and the MP operation rate as keys and obtains the buffer transfer ratio (S942).

The microprocessor 121 updates the buffer transfer ratio column of the LDEV number (logical volume number) in the buffer transfer ratio table 480 with the buffer transfer ratio obtained in S942 (S943), and ends this process (S944).
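Steps S942 and S943 amount to a two-key table lookup followed by a per-volume update, which the sketch below illustrates. The band boundaries and ratio values are invented for illustration only (the contents of the buffer transfer ratio calculation table 470 are not given in the text); they merely respect the stated rules that a lower hit rate and a higher MP operation rate give a higher ratio, capped at 99. Leaving the ratio unchanged when no I/O was counted is also an assumption.

```python
from typing import Dict, List

# Hypothetical stand-in for the buffer transfer ratio calculation table 470.
HIT_RATE_BANDS: List[float] = [10, 50, 100]   # upper bounds of hit rate bands (%)
MP_RATE_BANDS: List[float] = [50, 80, 100]    # upper bounds of MP operation rate bands (%)
RATIO: List[List[int]] = [
    [50, 80, 99],   # hit rate <= 10 %
    [20, 40, 60],   # hit rate <= 50 %
    [0, 0, 10],     # hit rate <= 100 %
]


def band(value: float, upper_bounds: List[float]) -> int:
    """Index of the first band whose upper bound is not exceeded."""
    for index, upper in enumerate(upper_bounds):
        if value <= upper:
            return index
    return len(upper_bounds) - 1


def update_buffer_transfer_ratio(ratio_table: Dict[int, int], ldev: int,
                                 io_count: int, hit_count: int,
                                 mp_rate: float) -> int:
    if io_count == 0:
        # Assumption: keep the previous ratio when no hit rate can be computed.
        return ratio_table.get(ldev, 0)
    hit_rate = 100.0 * hit_count / io_count                               # S942
    ratio = RATIO[band(hit_rate, HIT_RATE_BANDS)][band(mp_rate, MP_RATE_BANDS)]
    ratio_table[ldev] = ratio                                             # S943
    return ratio
```

Here `ratio_table` plays the role of the buffer transfer ratio table 480, keyed by logical volume number.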

The above is the description of the eighth embodiment of the present invention. Also in this embodiment, the write process may perform buffer staging determination, or may adopt a method of caching all data in a normal cache area.

For example, when all data is cached in the normal cache area, the flow is almost the same as the read flow with S901, S904, S907, S912, and S916 replaced by S141, S143, S146, S151, and S156 in FIG. However, in the case of write processing, the steps (S913, S914) relating to the buffer staging determination are omitted, and when the determination in step S908 or S911 is NO, the process proceeds to step S915.

When read data is transferred using a job buffer that has already been reserved, neither the SM control information of the CMPK nor the LM control information needs to be updated. Therefore, according to the eighth embodiment of the invention, the processor uses the cache area efficiently by selecting, according to a predetermined condition, whether to perform caching or to transfer using a job buffer. This reduces the overhead of the processor that performs data caching and improves the performance of the storage system.

Also, by using a cache hit rate that changes with time, it is possible to cope with short-term changes in I/O patterns that cannot be handled by settings made from outside. Even if the persistent media is SSD, performing many normal transfers in periods where a hit is likely and many job buffer transfers in periods where a hit is unlikely contributes to improving the performance of the storage system. As described above, according to the present embodiment, it is possible to improve the use efficiency of the cache and reduce the OVH of the cache memory and the processor with respect to an I/O pattern that changes with time.

Furthermore, by placing the buffer under the same management system as the cache slots, memory can be used efficiently, with lower OVH and better ability to follow changes over time than a method that manages the buffer and the cache slots separately.

Hereinafter, the reason why the update of the cache control is reduced and the overhead of the processor is reduced will be described using cache LRU (Least Recently Used) replacement management and job number replacement management in the present embodiment shown in FIG.

In the embodiment of the present invention, as shown in FIG. 80, the cache slot is replaced and managed by the LRU algorithm (710). The replacement of the LRU algorithm is an algorithm in which the oldest accessed entry is replaced when a new entry is secured. In the case of data caching, when uncached data is accessed, the cache slot having the oldest access time is replaced in order to secure a new cache slot.

SLOT number 720 indicates a unique number of each cache slot. The LRU pointer 730 points to the cache slot used immediately before. The SLOT number of the slot used immediately before the SLOT number s1 is s2, and the SLOT number of the slot used immediately before the SLOT number s2 is s3. An MRU (Most Recently Used) pointer 740 points to the cache slot used immediately after. The SLOT number of the slot used immediately after the SLOT number s3 is s2, and the SLOT number of the slot used immediately after the SLOT number s2 is s1. That is, the SLOT numbers s3, s2, and s1 are used in this order.

In the embodiment of the present invention, unused job numbers are replaced by the MRU algorithm (720). Replacement by the MRU algorithm means that the most recently accessed entry is replaced when a new entry is secured. The job number 820 corresponds to the job number in the job number column 452.

The LRU pointer 830 indicates the job number used immediately before, and the MRU pointer 840 indicates the job number used immediately after. That is, the job numbers j3, j2, and j1 are used in this order. When a job number is assigned to process an I/O, the MRU job number j0 (850) is assigned. When the I/O ends and the job number is returned, it is returned to the MRU pointer (840) of the MRU job number j0 (850).

In the transfer using normal data caching, the slot 751 having the LDEV number and the slot number 720 associated with the LBA 750 in the LDEV is used like the slots of the SLOT numbers s1 and s3.

Therefore, if the capacity of the volume being accessed is larger than the cache capacity assumed in the present embodiment, that is, if the host I/O access pattern is one in which cached user data is not reused, then to secure a new slot for each host I/O, the LRU slot s0 must be deleted from the cache directory 310 and the newly secured slot must be connected to the cache directory. Each of the LM and SM cache directories must be updated twice per I/O.

On the other hand, in the transfer using the job buffer, the slot 821 having the SLOT number 720 associated with the job number 820 is used, like the slot of the SLOT number s2 associated with the job number j2. By replacing unused job numbers under MRU management, the same job number is reused, and therefore the slot associated with that job number is also reused. This eliminates the need to update the cache directory even for a host I/O access pattern in which cached user data is not reused. Thus, processor overhead is reduced.
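The difference in directory update counts can be illustrated with a small simulation. This is a simplified model, not the control structure of the embodiment: a single directory stands in for both the LM and SM cache directories, one I/O is processed at a time, and the names `CacheDirectory`, `JobPool`, and `simulate` are hypothetical.

```python
from collections import OrderedDict


class CacheDirectory:
    """Directory with LRU slot replacement that counts its own updates."""

    def __init__(self, slots: int) -> None:
        self.capacity = slots
        self.entries: "OrderedDict[tuple, bool]" = OrderedDict()  # oldest first
        self.updates = 0

    def lookup_or_secure(self, key: tuple) -> bool:
        if key in self.entries:               # hit: no directory update needed
            self.entries.move_to_end(key)
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # delete the LRU slot
            self.updates += 1
        self.entries[key] = True              # connect the newly secured slot
        self.updates += 1
        return False


class JobPool:
    """Free job numbers handed out most-recently-returned first (MRU)."""

    def __init__(self, jobs: int) -> None:
        self.free = list(range(jobs))

    def allocate(self) -> int:
        return self.free.pop()

    def release(self, job: int) -> None:
        self.free.append(job)


def simulate(n_ios: int, slots: int, use_job_buffer: bool) -> int:
    """Count directory updates for reads of n_ios distinct, never reused blocks."""
    directory = CacheDirectory(slots)
    jobs = JobPool(4)
    for lba in range(n_ios):
        job = jobs.allocate()
        key = ("job", job) if use_job_buffer else ("ldev", lba)
        directory.lookup_or_secure(key)
        jobs.release(job)
    return directory.updates
```

In this model, `simulate(1000, 8, False)` performs two directory updates for every I/O once the eight slots are full, while `simulate(1000, 8, True)` updates the directory only when the job buffer slot is first secured, because the same job number, and hence the same directory key, is reused for every I/O.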

In this example, the MRU algorithm is used for managing free jobs in order to increase the probability of using the same job number and to improve the usage efficiency of the cache memory. However, a method of searching the use / non-use column 453 of the job management table 450 from the top may also be used, because it also gives a high reuse probability and achieves the above-described effect.

Although embodiments of the present invention have been described above, the present invention is not limited to these embodiments. A person skilled in the art can easily change, add, and convert each element of the above-described embodiments within the scope of the present invention. A part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. Other configurations can be added to, deleted from, or substituted for a part of the configuration of each embodiment.

The above-described configurations, functions, processing units, processing means, and the like may be partially or wholly realized in hardware, for example by designing them as an integrated circuit. Information such as programs, tables, and files that realize each function can be stored in a storage device such as a non-volatile semiconductor memory, a hard disk drive, or an SSD (Solid State Drive), or on a computer-readable non-transitory data storage medium such as an IC card, SD card, or DVD.

In the above embodiment, the control information is represented by a plurality of tables, but the control information used by the present invention does not depend on the data structure. In addition to the table, the control information can be expressed by a data structure such as a database, a list, or a queue. In the above embodiment, expressions such as an identifier, a name, and an ID can be replaced with each other.

A CPU, a microprocessor, or a group of a plurality of microprocessors that are processors executes a predetermined process by operating according to a program. Accordingly, in the present embodiment, the description with the processor as the subject may be an explanation with the program as the subject, and the processing executed by the processor is processing performed by the apparatus and system in which the processor is mounted.

Claims

A storage system,
A processor on which a control program runs;
A plurality of first or second type physical storage volumes that provide storage resources to a plurality of logical volumes;
A cache memory connected to the processor and storing a part of data stored in the plurality of physical storage volumes;
A memory connected to the processor, the memory storing cache control information indicating whether the target data of a write or read request from a host is stored in the cache memory, and process management information for managing the usage status of a plurality of processes used for processing the write or read request;
The processor is
When receiving the read request designating any area of the logical volume from the host, an unused process is allocated to the read request among the plurality of processes managed by the process management information,
Based on the first identifier specifying the area of the logical volume specified by the read request and the cache control information, it is determined whether the target data of the read request is in the cache memory, and when it is determined that the target data is not in the cache memory,
When a part of the plurality of physical storage volumes constituting the logical volume specified by the read request is the first type of physical storage volume, the first identifier is associated with an identifier for specifying the area secured on the cache memory and stored in the memory as the cache control information,
When a part of the plurality of physical storage volumes constituting the logical volume specified by the read request is the second type of physical storage volume, a second identifier for specifying the process assigned to the read request is associated with an identifier for specifying an area secured on the cache memory and stored in the memory as the cache control information;
The area secured on the cache memory is configured to store data read from a part of the plurality of physical storage volumes by the read request.
Storage system.
The storage system according to claim 1,
wherein the processor is configured to:
when it is determined that the target data is not in the cache memory and a part of the plurality of physical storage volumes constituting the logical volume designated by the read request is a physical storage volume of the second type, determine whether the second identifier is present in the cache control information in the memory;
when the second identifier is present in the cache control information, store the read data in the area secured on the cache memory that is associated with the second identifier in the cache control information; and
when the second identifier is not present in the cache control information, secure a new area on the cache memory, associate the first identifier with the newly secured area on the cache memory, and store the association in the memory as the cache control information.
The storage system according to claim 2,
wherein the processor is configured to, when the process management information manages a plurality of unused processes, allocate to the read request the most recently used process among the unused processes.
The storage system according to claim 3,
wherein the physical storage volume of the first type is a hard disk drive, and the physical storage volume of the second type is a solid state drive.
The storage system according to claim 3,
wherein the processor is configured to:
when the plurality of physical storage volumes are of the second type, acquire information on an operating rate of the processor and on a cache hit rate, the cache hit rate being the probability that the target data exists in the cache memory; and
when the operating rate of the processor and the cache hit rate satisfy a predetermined condition, store the second identifier in association with an identifier specifying an area secured on the cache memory.
The storage system according to claim 5,
further comprising a local memory connected to the processor and storing a copy of the cache control information and the process management information stored in the memory.
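
For illustration only, and not as part of the claims, the read path recited in claims 1 to 3 can be sketched as follows. This is a minimal sketch under stated assumptions: the class `StorageController`, its fields, the `("area", ...)` and `("process", ...)` key tuples, and the dictionary standing in for the physical storage volumes are all hypothetical names introduced here, not taken from the specification. When the second identifier is absent, the sketch registers the newly secured area under the second identifier as in claim 1; this is a simplification.

```python
from dataclasses import dataclass, field

HDD = "first_type"    # claim 4: the first type is a hard disk drive
SSD = "second_type"   # claim 4: the second type is a solid state drive


@dataclass
class StorageController:
    """Hypothetical model of the claimed read path (illustrative names only)."""
    num_processes: int = 4
    # Cache control information: key -> cache area number.
    # Keys are ("area", first_identifier) or ("process", second_identifier).
    cache_control: dict = field(default_factory=dict)
    cache_areas: dict = field(default_factory=dict)      # area number -> data
    free_processes: list = field(default_factory=list)   # stack of unused process ids
    next_area: int = 0

    def __post_init__(self):
        self.free_processes = list(range(self.num_processes))

    def allocate_process(self):
        # Claim 3: among the unused processes, take the most recently used one.
        # A LIFO stack gives this behaviour because release() pushes on top.
        return self.free_processes.pop()

    def release_process(self, pid):
        self.free_processes.append(pid)

    def _secure_area(self):
        area = self.next_area
        self.next_area += 1
        return area

    def read(self, area_id, volume_type, volumes):
        pid = self.allocate_process()
        try:
            key = ("area", area_id)
            if key in self.cache_control:                 # target data is cached
                return self.cache_areas[self.cache_control[key]]
            if volume_type == HDD:
                # First type: register the area under the first identifier,
                # so later reads of the same logical area can hit the cache.
                area = self._secure_area()
                self.cache_control[key] = area
            else:
                # Second type: register the area under the process identifier,
                # and reuse it if that process already owns one (claim 2).
                pkey = ("process", pid)
                if pkey in self.cache_control:
                    area = self.cache_control[pkey]
                else:
                    area = self._secure_area()
                    self.cache_control[pkey] = area
            data = volumes[area_id]          # read from the physical volume
            self.cache_areas[area] = data    # stage it in the secured area
            return data
        finally:
            self.release_process(pid)


ctl = StorageController()
volumes = {10: b"a", 11: b"b", 20: b"c"}
ctl.read(10, SSD, volumes)
ctl.read(11, SSD, volumes)   # same process is reallocated, same area reused
ctl.read(20, HDD, volumes)   # registered under the logical area identifier
```

In this sketch, back-to-back reads from the second-type volume are served by the same process and therefore share one cache area, while reads from the first-type volume are registered by logical area and can hit the cache on a repeat request.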