WO2012081156A1 - Array management device, array management method, and integrated circuit - Google Patents
Array management device, array management method, and integrated circuit
- Publication number
- WO2012081156A1 PCT/JP2011/005805
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- storage
- unit
- access
- redundancy
- storages
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2089—Redundant storage control functionality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1658—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit
- G06F11/1662—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit the resynchronized component or unit being a persistent storage device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2094—Redundant storage or storage space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3034—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3055—Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1076—Parity data used in redundant arrays of independent storages, e.g. in RAID systems
- G06F11/1088—Reconstruction on already foreseen single or plurality of spare disks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/845—Systems in which the redundancy can be transformed in increased performance
Definitions
- the present invention relates to an array management apparatus for managing an array configured by making a plurality of storages redundant.
- RAID Redundant Arrays of Inexpensive Disks
- It is well known that RAID configurations, for example each of the forms from level 1 to level 6, improve reliability through a redundant configuration.
- For example, in the form called RAID 5, the failure of up to one of the storages managed by the array management apparatus is tolerated. In that case, the storage array system temporarily shifts to a degenerate state called “degrade”. When two or more storages in an array have failed at the same time, the array is logically damaged, and some or all of the stored data can no longer be retrieved. The allowable number of storage failures depends on the RAID configuration.
- In the degraded state, after the failed storage is replaced with a normal storage, a recovery command is sent to the storage array system. The storage array system can then be restored from the degraded state to the normal state by copying the recovered data to the replacement storage.
- spare storage is in a standby state until the storage array system transitions to a degraded state, and when the storage array system transitions to a degraded state, the failed storage and spare storage are logically replaced.
- the storage array system is implemented in accordance with various storage architectures, such as a NAS (Network Attached Storage) environment, a SAN (Storage Area Network) environment, or an environment directly attached to a client or host computer by a storage interface.
- NAS Network Attached Storage
- SAN Storage Area Network
- Each storage is connected to a network for data transfer or management in the storage array system.
- the term “network” here includes, of course, an IP (Internet Protocol) network, but is not limited thereto.
- the present invention has an object to provide an array management device, an array management method, and an integrated circuit that can change the judgment criterion for performing the redundancy again according to the configuration type of the communication path.
- the present invention provides an array management apparatus that makes a plurality of storages redundant and controls access to each storage. The apparatus comprises: a determination unit that determines whether access to each of the plurality of storages has succeeded or failed; a storage unit that stores the configuration type of the communication path to each of the plurality of storages; a derivation unit that derives a waiting time from when a failure occurs until redundancy is performed; and a redundancy processing unit. When the determination unit confirms that access to any one of the plurality of storages has failed, and successful access to that storage is not confirmed before the waiting time derived by the derivation unit according to the configuration type of that storage's communication path elapses, the redundancy processing unit performs redundancy using the remaining storages excluding that storage.
- the array management device derives the standby time until redundancy based on the type of storage communication path that failed to be accessed. Therefore, the array management apparatus can change the waiting time until it is determined that re-redundancy is performed, that is, the determination criterion for determining that re-redundancy is performed based on the type of communication path. As a result, if the access is successful within the standby time, it is not necessary to perform the redundancy again, so that the life of the storage apparatus becomes longer compared to the case where the redundancy is performed immediately after the failure occurs.
- FIG. 2 is a block diagram showing a configuration of an array management device 100.
- Other figures show examples of the data structures of the network state management table T100, the storage state management table T200, the network failure management table T300, the free area information table T400, and the data temporary storage area information table T500.
- Another figure is a block diagram showing a configuration of a storage device 11.
- Another figure is a flowchart showing the network status monitoring process.
- FIG. 20A is a diagram illustrating a specific example of data writing in the normal state
- FIG. 20B is a diagram illustrating a specific example of data writing in the temporary storage state. It is a figure which shows the specific example of re-redundancy. It is a figure which shows an example of the data structure of policy determination table T600. It is a block diagram which shows an example of the array management apparatus 100A which consists of a structure by which operation
- FIG. 1 is a diagram showing a configuration of an array management system 1 including an array management apparatus according to the present invention.
- the storage device 12 is a spare storage device.
- the digital recorder 10 manages and stores digital data such as image data captured by a digital camera.
- digital data is stored in the storage devices 11 to 15. Redundant storage is possible.
- the storage devices 11 and 12 are locally connected to the digital recorder 10 via USB (Universal Serial Bus) or SCSI (Small Computer System Interface).
- the storage device 13 is connected to the digital recorder 10 via the Internet 2, and the storage device 14 is connected to the digital recorder 10 via a local area network (LAN). Further, the storage device 15 is connected to the digital recorder 10 through a wireless local network.
- Ethernet registered trademark
- for connection to each storage device, Ethernet (registered trademark), Fibre Channel, USB, IEEE 1394 (Institute of Electrical and Electronics Engineers 1394), IDE (Integrated Drive Electronics), Serial ATA, and the like may be used.
- the array management apparatus 100 monitors whether or not a failure such as a breakage has occurred in each storage device, and performs reconfiguration for redundancy when a failure is detected, as in the past.
- the reconfiguration of redundancy is called re-redundancy.
- the array management apparatus 100 also monitors whether or not a failure has occurred in the network connected to each storage apparatus.
- the network failure in the present embodiment means that a non-response time continues for a predetermined time or more when a response request for confirming the existence is transmitted.
- when the array management device 100 detects a network failure, it waits for recovery from the network failure for a time determined according to the network connection form.
- the free capacity is a capacity in which data is not yet written in an area not used for the array configuration.
- the array management device 100 is a device that manages the storage devices 11 to 15. As shown in FIG. 2, it comprises the network status monitoring unit 101, the storage status monitoring unit 102, the management information storage unit 103, the array state monitoring unit 104, the redundancy policy determining unit 105, the processing unit 106, the request receiving unit 107, and the communication unit 108.
- The network state monitoring unit 101 monitors whether or not a network failure has occurred for each of the storage apparatuses 11 to 15.
- When the network state monitoring unit 101 receives, from the array state monitoring unit 104, a request instruction to issue a response request to one of the storage devices 11 to 15, it sends a response request to the target storage device and confirms the existence of the storage device on the network based on whether or not a response to the response request is received within a predetermined time T0 (for example, 1 second).
- T0 for example, 1 second
- If the network status monitoring unit 101 receives a response from the target storage device within the predetermined time T0 after transmitting the response request, it notifies the array status monitoring unit 104 of response confirmation information indicating that the response has been received.
- If the network status monitoring unit 101 does not receive a response from the target storage device within the predetermined time T0 after sending the response request, it notifies the array status monitoring unit 104 of no-response information indicating that no response has been received.
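The existence check against T0 described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `probe` callable is a hypothetical transport hook (it sends the response request and returns the reply, or `None` on timeout), and the returned strings are illustrative names for the two kinds of notification.

```python
from typing import Callable, Optional

T0_SECONDS = 1.0  # example value given in the text ("for example, 1 second")


def check_existence(probe: Callable[[float], Optional[bytes]],
                    t0: float = T0_SECONDS) -> str:
    """Issue one response request and classify the outcome.

    `probe(timeout)` is an assumed hook, not part of the source: it sends
    the response request and returns the reply, or None if nothing
    arrived within `timeout` seconds.
    """
    reply = probe(t0)
    if reply is not None:
        # Corresponds to notifying "response confirmation information".
        return "response_confirmation"
    # Corresponds to notifying "no-response information".
    return "no_response"
```

Keeping the transport behind a callable lets the same check serve IP, USB, or other connection forms, which the text treats as different network types.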
- Storage status monitoring unit 102 monitors the presence or absence of a storage failure for each of the storage devices 11 to 15 as in the conventional case.
- the storage state monitoring unit 102 periodically checks the storage devices 11 to 15 for storage failures (for example, disk damage state).
- When the storage state monitoring unit 102 determines that a storage failure has occurred, it notifies the array state monitoring unit 104 of storage failure information indicating that a storage failure has occurred.
- Management information storage unit 103 is a storage area for storing a plurality of tables managed by the array management apparatus 100.
- the management information storage unit 103 stores a network state management table T100, a storage state management table T200, a network failure management table T300, a free space information table T400, and a data temporary storage region information table T500, each shown in the figures.
- the management information storage unit 103 also stores information indicating the configuration of the array, for example, an array configuration information table (not shown), as in the conventional case. Since the information indicating the configuration of the array is known, a detailed description thereof is omitted here.
- the array configuration information table has an area for storing a plurality of sets each including an array number, a redundancy method, the number of storages, a storage number, and an array capacity.
- the array number is a number for identifying a configured array, and the redundancy method indicates a method for redundancy such as RAID 1 and RAID 5, for example.
- the number of storages is the number constituting the array, and the storage number is a number indicating these storage devices.
- the array capacity indicates the total capacity of the configured array. Here, spare storage devices are also managed.
- the network state management table T100 is a table for managing the state of the network (for example, whether or not there is a response to the response request). As shown in FIG. 3, it has an area for storing a plurality of sets each consisting of a storage number, network type, network information, final response confirmation time, and no-response flag.
- the storage number is for uniquely identifying the storage apparatus connected to the array management apparatus 100.
- the network type indicates the connection form with the storage device identified by the storage number. For example, the network type records whether the connection is wired or wireless, whether it is within a LAN or via the Internet, and whether it is a predetermined network to which IP addresses are assigned. These contents are described in advance by the system operator.
- the network information is information necessary for transmitting a response request, and indicates information for identifying a storage on the network. This network information is different for each network type. For example, if the network type is an IP network, an IP address or a MAC address corresponds to the network information, and if the network type is a USB network, a vendor ID, a product ID, a serial number, or the like corresponds to the network information.
- the final response confirmation time indicates the time when the array status monitoring unit 104 last received the response confirmation information for the corresponding storage device, and this time is updated each time the array status monitoring unit 104 receives the response confirmation information.
- the no-response flag indicates whether the array state monitoring unit 104 has received no-response information. A value “0” indicates that no-response information has not been received, that is, that response confirmation information has been received. A value “1” indicates that no-response information has been received.
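One row of the network state management table T100 and its update can be sketched as below. The field names are illustrative translations of the columns listed above, and clearing the flag immediately on a response is a simplifying assumption (the text also describes a stabilization time Td before the tables are updated on recovery).

```python
from dataclasses import dataclass


@dataclass
class NetworkStateEntry:
    """One set in T100; field names are illustrative, not from the source."""
    storage_number: int
    network_type: str          # e.g. wired/wireless, LAN/Internet
    network_info: str          # e.g. an IP address or a USB serial number
    last_response_time: float  # "final response confirmation time"
    no_response_flag: int = 0  # 0: response confirmed, 1: no response


def record_monitoring_result(entry: NetworkStateEntry,
                             responded: bool, now: float) -> None:
    """Update a T100 entry from one monitoring result."""
    if responded:
        # The final response confirmation time is refreshed on every
        # response confirmation.
        entry.last_response_time = now
        entry.no_response_flag = 0  # assumption: cleared immediately
    else:
        entry.no_response_flag = 1
```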
- the storage state management table T200 is a table for managing the state of the storage device (whether there is damage or the like). As shown in the corresponding figure, it has an area for storing a plurality of sets each consisting of a storage number, storage type, storage information, and damage flag.
- the storage type is information indicating a type constituting the storage, for example, a logical drive, a physical drive, or an online storage.
- online storage is a kind of highly reliable storage.
- High-reliability storage refers to storage whose data is protected by the storage itself, such as online storage or a virtualized redundant array, so that the probability of storage failure is very low.
- when the storage type is online storage, the storage is highly reliable storage.
- when a redundant array is virtualized as a single storage, the storage type states so.
- the array management device 100 can determine that a storage device is highly reliable storage when its storage type states that it is online storage or a redundant array virtualized as a single storage.
- Storage information includes, for example, information indicating the total capacity and used capacity of the corresponding storage device.
- the corruption flag indicates whether the array state monitoring unit 104 has received storage failure information. A value “0” indicates that storage failure information has not been received, and a value “1” indicates that storage failure information has been received.
- the network failure management table T300 is a table for managing the failure occurrence time and the recovery time for a storage device in which a network failure has occurred. As shown in FIG. 5, it has an area for storing a plurality of sets each consisting of a storage number, network failure occurrence time, confirmation time (Tb), recovery confirmation time, and confirmation time (Td). Such a set is referred to as failure information.
- the network failure occurrence time indicates the time when the array state monitoring unit 104 determines that a network failure has occurred.
- the confirmation time (Tb) indicates the waiting time (the time within which recovery can be expected) from the occurrence of a network failure until re-redundancy.
- the restoration confirmation time indicates a time when there is a response to the response request of the network state monitoring unit 101 within the confirmation time (Tb).
- Td Confirmation time
- the free area information table T400 is a table for managing the free capacity of each of the storage apparatuses 11 to 15. As shown in FIG. 6, it has an area for storing a plurality of sets each consisting of a storage number, offset, size, and temporary use. As described above, the free capacity is a capacity in which data is not yet written in an area not used for the array configuration.
- the offset is a value indicating the start position of the free area.
- the size is a value indicating the capacity of the free area.
- Temporary use indicates whether the area temporarily holds data that should be written to another storage device during a network failure. A value “0” indicates that the area is not in temporary use, and a value “1” indicates that it is in temporary use.
- the data temporary storage area information table T500 is a table for temporarily managing the storage destination of data to be written to the storage apparatus in which a network failure has occurred. As shown in FIG. 7, the data temporary storage area information table T500 has an area for storing a plurality of sets of non-response storage numbers, write offsets, write sizes, temporary storage numbers, and temporary storage offsets. .
- a set consisting of a non-response storage number, a write offset, a write size, a temporary storage number, and a temporary storage offset is referred to as temporary storage area information.
- the no-response storage number is the storage number of the storage device that is determined to be a network failure.
- the write offset indicates the position where data is originally written for the storage device indicated by the non-response storage number.
- the write size indicates the size of data written to the storage device indicated by the non-response storage number.
- the temporary storage number indicates the storage number of the storage device that temporarily stores the data indicated by the corresponding write offset and write size.
- Temporal save offset indicates a write position where data indicated by the corresponding write offset and write size is temporarily saved.
- the redundancy policy determination unit 105 sets a reference time (Ta) for determining that there is a network failure for the storage device that has not responded.
- the time Ta is also a waiting time until the start of temporary storage for preventing cache overflow.
- the redundancy policy determining unit 105 derives the time (Tb) until it is judged that re-redundancy is to be performed, according to the network connection form (network type) of the one storage device.
- the redundancy policy determining unit 105 derives a time longer than the time Tb set as the initial value as a new Tb.
- the redundancy policy determination unit 105 derives a new time Tb that is shorter than the time Tb set as the initial value.
- the redundancy policy determination unit 105 sets a new time Tb that is longer than the initial value Tb as the time until re-redundancy is determined.
- the redundancy policy determination unit 105 sets a new time Tb that is shorter than the initial value Tb as the time until re-redundancy is determined.
- the redundancy policy determining unit 105 derives a time (Td) for the network state to be stabilized from the recovery according to the network connection form.
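Deriving Tb from the network type could look like the sketch below. The excerpt does not say which network types lengthen or shorten Tb, nor by how much, so the initial value, the type names, and every scale factor here are purely hypothetical placeholders chosen for illustration.

```python
INITIAL_TB_SECONDS = 60.0  # hypothetical initial value; not given in the text

# Hypothetical scale factors per network type. The source only states that
# Tb may become longer or shorter than its initial value depending on the
# connection form; these particular numbers are assumptions.
TB_SCALE = {
    "usb": 0.5,
    "lan_wired": 1.0,
    "lan_wireless": 2.0,
    "internet": 4.0,
}


def derive_tb(network_type: str,
              initial_tb: float = INITIAL_TB_SECONDS) -> float:
    """Return the waiting time until re-redundancy for a network type.

    Unknown types fall back to the initial value unchanged.
    """
    return initial_tb * TB_SCALE.get(network_type, 1.0)
```

A lookup table keeps the policy data separate from the monitoring logic, so an operator could adjust it without touching the timers. Td could be derived the same way from a second table.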
- the array state monitoring unit 104 monitors the array state, that is, the state of each of the storage devices 11 to 15 based on the monitoring results of the network state monitoring unit 101 and the storage state monitoring unit 102, and the failure state of the array configuration.
- For each storage device whose network status is to be monitored, the array state monitoring unit 104 notifies the network state monitoring unit 101 of the information necessary for issuing a response request to that storage device (the network information shown in FIG. 3) together with a request instruction. Thereafter, the array state monitoring unit 104 updates the network state management table T100 according to the monitoring result, such as no-response information, received from the network state monitoring unit 101.
- the array state monitoring unit 104 measures the time from when no response is detected for a storage device that has become non-responsive (hereinafter referred to as “target device”) until response confirmation information about the target device is received.
- target device a storage device that has become non-responsive
- the array state monitoring unit 104 notifies the processing unit 106 of network failure occurrence information indicating that a network failure has occurred, assuming that a network failure has occurred.
- the time from the time of determination until the response confirmation information about the target device is received is counted. Further, the array state monitoring unit 104 updates the network failure management table T300 for the storage device in which the network failure has occurred.
- the processing unit 106 performs redundancy again.
- When the array state monitoring unit 104 receives the response confirmation information about the target device within the time Tb, it next measures time up to the time Td. In addition, the array state monitoring unit 104 updates the network failure management table T300 using the time at which the response confirmation information was received and the time Td. When the array state monitoring unit 104 receives no-response information about the target device within the time Td, it again counts time up to the time Tb. If no no-response information about the target device is received within the time Td, the array state monitoring unit 104 notifies the processing unit 106 of recovery information indicating that the failure has been recovered, and updates the network failure management table T300 and the network state management table T100.
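The timing behaviour around Ta, Tb, and Td can be sketched as a small state machine driven by monitoring results. The state names, the class, and the choice to restart the Tb count from zero when a no-response arrives during Td are assumptions of this sketch; the text only says that time counting up to Tb is performed again.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    SUSPECT = "suspect"              # no response seen, Ta not yet elapsed
    FAILED = "failed"                # network failure, waiting up to Tb
    STABILIZING = "stabilizing"      # response seen again, waiting Td
    RE_REDUNDANCY = "re_redundancy"  # Tb expired without a response


class FailureTimer:
    """Tracks one target device; times are plain seconds."""

    def __init__(self, ta: float, tb: float, td: float) -> None:
        self.ta, self.tb, self.td = ta, tb, td
        self.state = State.NORMAL
        self.since = 0.0  # time at which the current state was entered

    def _enter(self, state: State, now: float) -> None:
        self.state = state
        self.since = now

    def on_result(self, responded: bool, now: float) -> State:
        """Feed one monitoring result and return the resulting state."""
        elapsed = now - self.since
        if self.state is State.NORMAL:
            if not responded:
                self._enter(State.SUSPECT, now)
        elif self.state is State.SUSPECT:
            if responded:
                self._enter(State.NORMAL, now)
            elif elapsed >= self.ta:
                self._enter(State.FAILED, now)       # failure declared
        elif self.state is State.FAILED:
            if responded:
                self._enter(State.STABILIZING, now)  # start counting Td
            elif elapsed >= self.tb:
                self._enter(State.RE_REDUNDANCY, now)
        elif self.state is State.STABILIZING:
            if not responded:
                # Assumption: the Tb count restarts from zero here.
                self._enter(State.FAILED, now)
            elif elapsed >= self.td:
                self._enter(State.NORMAL, now)       # recovery confirmed
        return self.state
```

Entering `FAILED` corresponds to notifying network failure occurrence information, `RE_REDUNDANCY` to instructing re-redundancy, and the return to `NORMAL` from `STABILIZING` to notifying recovery information.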
- the array state monitoring unit 104 notifies the storage state monitoring unit 102 of information necessary for accessing each storage device whose storage state is to be monitored.
- the array state monitoring unit 104 causes the storage state monitoring unit 102 to check the storage state when the processing unit 106 fails to read / write data, and updates the damage flag when storage corruption has occurred.
- the array state monitoring unit 104 updates the final response confirmation time in the network state management table T100 for each storage device that has been read and written.
- The processing unit 106 reads / writes data from / to each storage device, executes re-redundancy, and performs recovery processing in the case of recovery from a network failure. As shown in the figure, it includes a redundancy execution unit 110, a data processing execution unit 111, and a recovery process execution unit 112.
- (6-1) Redundancy execution unit 110 When the redundancy execution unit 110 receives a re-redundancy instruction from the array state monitoring unit 104, the redundancy execution unit 110 performs re-redundancy.
- the redundancy execution unit 110 performs redundancy using the remaining storage devices, excluding the storage device identified as failed from the network state management table T100 and the storage state management table T200, together with the spare storage device (here, the storage device 12). In this redundancy, all data stored in the remaining storage devices of the array, excluding the storage determined to have failed and excluding temporarily saved data, is used to recover the data that should be stored in the failed storage, and all of the recovered data is written to the spare storage device.
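How the missing data can be recovered from the remaining storage devices depends on the redundancy method. As a minimal sketch, assuming a RAID 5-style layout with single XOR parity (one of the methods the text names, not necessarily the one used in the embodiment), the block of the failed storage is the XOR of the corresponding blocks on all surviving storages. The function names are illustrative.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


def rebuild_missing_block(surviving_blocks: Sequence[bytes]) -> bytes:
    """Recover the block of the failed storage from one stripe.

    `surviving_blocks` holds the data and parity blocks of the same
    stripe read from the remaining storage devices. The result is what
    would be written to the spare storage device.
    """
    return xor_blocks(surviving_blocks)
```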
- (6-2) Data processing execution unit 111 The data processing execution unit 111 reads / writes data from / to each storage device.
- Since the data processing execution unit 111 operates differently depending on whether or not a network failure has occurred, each case is described below. Note that whether or not a network failure has occurred can be identified by determining whether or not failure information exists in the network failure management table T300.
- the data processing execution unit 111 transmits a read command or a write command to each storage device when reading / writing from / to each storage device based on an external instruction received from the request reception unit 107.
- the data processing execution unit 111 notifies the array status monitoring unit 104 of response confirmation information and the storage number of the target storage device, in the same manner as the network status monitoring unit 101, and executes the data read or data write. If reading or writing fails, it notifies the array status monitoring unit 104 of the storage number of the failed storage device and failure information indicating the failure. The failure information is also notified to the outside via the request receiving unit 107. When reading or writing is successful, the data processing execution unit 111 notifies the outside via the request receiving unit 107.
- If no response is received from the target storage device within the predetermined time T0, the data processing execution unit 111 notifies the array state monitoring unit 104 of the storage number of the storage device that is not responding and the no-response information.
- the data processing execution unit 111 writes the data to be written to the storage device in which the network failure has occurred to a storage device having a capacity that can be temporarily saved among the remaining storage devices other than the storage device.
- the data processing execution unit 111 updates the free space information table T400 and the temporary data storage region information table T500 with respect to the free space of the storage device that temporarily stores the data.
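Temporary saving during a network failure can be sketched with simplified versions of tables T400 and T500. The field names, the first-fit choice of a free area, and marking a whole area as in temporary use (rather than splitting it) are assumptions of this sketch; the actual data write to the chosen storage device is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FreeArea:
    """One set in the free area information table T400 (names assumed)."""
    storage_number: int
    offset: int
    size: int
    temporary_use: int = 0  # 0: not in temporary use, 1: in temporary use


@dataclass
class TempSaveInfo:
    """One set of temporary storage area information in T500."""
    no_response_storage: int
    write_offset: int
    write_size: int
    temp_storage: int
    temp_offset: int


def save_temporarily(free_areas: List[FreeArea],
                     temp_table: List[TempSaveInfo],
                     failed_storage: int,
                     write_offset: int,
                     data: bytes) -> Optional[TempSaveInfo]:
    """Pick a free area on another storage and record the mapping.

    Returns the new T500 entry, or None if no other storage has a free
    area large enough.
    """
    for area in free_areas:
        if (area.storage_number != failed_storage
                and area.temporary_use == 0
                and area.size >= len(data)):
            area.temporary_use = 1  # update T400
            info = TempSaveInfo(failed_storage, write_offset, len(data),
                                area.storage_number, area.offset)
            temp_table.append(info)  # update T500
            return info
    return None
```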
- When the data to be read from the storage device in which the network failure has occurred is temporarily stored in another storage device, the data processing execution unit 111 reads the data from the other storage device. If the data is not temporarily stored but can be recovered using redundant data, the data to be read from the storage device in which the network failure has occurred is recovered using the redundant data. If recovery is not possible, the data processing execution unit 111 notifies the request reception unit 107 of a read error.
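The order of fallbacks for a read during a network failure can be sketched as below. The dictionary of temporarily saved data and the `recover_from_redundancy` callable are hypothetical stand-ins for the T500 lookup and the redundancy-based recovery.

```python
from typing import Callable, Dict, Optional


def read_during_failure(
        offset: int,
        temp_saved: Dict[int, bytes],
        recover_from_redundancy: Callable[[int], Optional[bytes]]) -> bytes:
    """Read data addressed to a storage device with a network failure.

    temp_saved: assumed map from original write offset to the data held
        temporarily on another storage device.
    recover_from_redundancy: assumed hook that rebuilds the data from
        redundant data, or returns None when that is not possible.
    """
    # 1. Prefer data that was temporarily saved on another storage.
    if offset in temp_saved:
        return temp_saved[offset]
    # 2. Otherwise try to recover it from redundant data.
    data = recover_from_redundancy(offset)
    if data is not None:
        return data
    # 3. Otherwise report a read error to the caller.
    raise OSError("read error: data is neither saved nor recoverable")
```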
- the recovery processing execution unit 112 writes back the temporarily written data to the storage device that should be originally written (the restored storage device) using the temporary storage area information corresponding to the restored storage device. .
- the restoration processing execution unit 112 deletes the temporary storage area information related to the written back data from the data temporary storage area information table T500.
- the recovery process execution unit 112 updates the free capacity in the free area information table T400. Specifically, the free capacity of the storage device that was the temporary storage destination and the free capacity of the storage device from which data has been restored are updated.
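The write-back performed on recovery can be sketched as follows. Entries of T500 are modelled as plain dictionaries with assumed key names, and `read_temp` and `write_target` are hypothetical I/O hooks. Updating the free capacity in T400 is left out of this sketch.

```python
from typing import Callable, Dict, List


def write_back(temp_table: List[Dict[str, int]],
               restored_storage: int,
               read_temp: Callable[[int, int, int], bytes],
               write_target: Callable[[int, int, bytes], None]) -> int:
    """Write temporarily saved data back to the restored storage device.

    Each entry of `temp_table` has the keys no_response_storage,
    write_offset, write_size, temp_storage, and temp_offset. Entries that
    have been written back are deleted from the table. Returns the number
    of entries written back.
    """
    remaining = []
    restored = 0
    for info in temp_table:
        if info["no_response_storage"] == restored_storage:
            data = read_temp(info["temp_storage"], info["temp_offset"],
                             info["write_size"])
            write_target(restored_storage, info["write_offset"], data)
            restored += 1
        else:
            remaining.append(info)
    temp_table[:] = remaining  # drop the written-back entries in place
    return restored
```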
- the request reception unit 107 receives a data read/write request from the outside and outputs the received request to the processing unit 106. When accepting a read request, the request reception unit 107 also accepts a read position and outputs it to the processing unit 106. When accepting a write request, it also accepts the data to be written and outputs that data to the processing unit 106.
- When the request reception unit 107 receives an error notification from the processing unit 106, it outputs the received error notification to the outside.
- the communication unit 108 performs data input/output with the managed storage devices 11 to 15.
- the storage device 11 includes a storage unit 201, a processing unit 202, a storage state acquisition unit 203, and a communication unit 204, as shown in FIG.
- the storage unit 201 is a large-capacity recording device that stores data written by the array management apparatus 100, and is, for example, a hard disk drive (HDD) or a solid state drive (SSD).
- Processing unit 202 writes the data received from the array management device 100 to the storage unit 201 or reads the data from the storage unit 201 and transmits it to the array management device 100 via the communication unit 204 according to an instruction from the array management device 100.
- When the processing unit 202 receives a response request from the array management apparatus 100 via the communication unit 204, it transmits a response to the request to the array management apparatus 100 via the communication unit 204.
- the processing unit 202 transmits the information to the array management apparatus 100 via the communication unit 204.
- the storage status acquisition unit 203 checks whether the storage unit 201 is damaged. If it is damaged, the storage status acquisition unit 203 notifies the processing unit 202 of the damage information when the array management apparatus 100 checks the storage status.
- the array state monitoring unit 104 selects one storage device to be monitored (step S5).
- the array state monitoring unit 104 uses the network state management table T100 to determine whether the selected storage device is in a non-response state, based on whether the value of the corresponding no-response flag is 1 (step S10).
- If the value is determined to be 1, that is, the device is determined not to be responding (“Yes” in step S10), the storage device return confirmation process is performed (step S30).
- the array state monitoring unit 104 acquires the final response confirmation time of the selected storage device from the network state management table T100 (step S15) and determines whether a predetermined time (T1, for example, 2 seconds) has elapsed since the acquired final response confirmation time (step S20).
- If it is determined that the time has elapsed (“Yes” in step S20), the storage device survival confirmation process is performed (step S25).
- When it is determined that the predetermined time T1 has not elapsed since the acquired final response confirmation time (“No” in step S20), and after executing step S20 and step S30, the array state monitoring unit 104 determines whether processing has been performed for all managed storage apparatuses, that is, whether an unselected storage apparatus remains (step S35).
- When determining that not all storage apparatuses have been processed (“No” in step S35), the array state monitoring unit 104 selects the next storage apparatus (step S40), and the process returns to step S10.
- If it is determined that all storage devices have been processed (“Yes” in step S35), the processing ends.
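The monitoring pass in steps S5 to S40 can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the row layout (`no_response`, `last_response`) and the callbacks `check_alive` and `check_returned` are hypothetical names standing in for the table T100 fields and for steps S25 and S30.

```python
from dataclasses import dataclass

T1 = 2.0  # seconds; the example value the text gives for the predetermined time T1


@dataclass
class NetworkState:
    """One row of a T100-like table (field names are assumptions)."""
    no_response: bool = False    # no-response flag (value 1 in the text)
    last_response: float = 0.0   # final response confirmation time


def monitor_once(table, now, check_alive, check_returned):
    """Run one pass over all monitored storage devices (steps S5 to S40)."""
    for storage_id, row in table.items():        # S5 / S40: select a device
        if row.no_response:                      # S10: flag is 1
            check_returned(storage_id)           # S30: return confirmation
        elif now - row.last_response >= T1:      # S15, S20: T1 has elapsed
            check_alive(storage_id)              # S25: survival confirmation
    # S35: the pass ends once every device has been processed
```

A device that responded less than T1 ago is skipped in this pass, which matches the “No” branch of step S20.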
- step S25 shown in FIG. 9 will be described with reference to the flowchart shown in FIG.
- the network state monitoring unit 101 transmits a response request to the storage device that is the object of the survival confirmation (step S100).
- the network state monitoring unit 101 determines whether a response has been received from the target storage device within the predetermined time T0 (step S105).
- When determining that the response has been received (“Yes” in step S105), the network state monitoring unit 101 notifies the array state monitoring unit 104 of response confirmation information, and the array state monitoring unit 104 updates the final response confirmation time corresponding to the target storage device in the network state management table T100 to the current time (step S110).
- If it is determined that a response has not been received (“No” in step S105), the network state monitoring unit 101 notifies the array state monitoring unit 104 of no-response information, and the array state monitoring unit 104, on receiving the no-response information from the network state monitoring unit 101, sets the value of the no-response flag corresponding to the target storage apparatus in the network state management table T100 to 1 (step S115).
- step S30 shown in FIG. 9 will be described with reference to the flowchart shown in FIG.
- the network state monitoring unit 101 transmits a response request to the storage device to be confirmed (step S150).
- the network state monitoring unit 101 determines whether a response has been received from the target storage device within the predetermined time T0 (step S155).
- When determining that the response has been received (“Yes” in step S155), the network state monitoring unit 101 notifies the array state monitoring unit 104 of response confirmation information, and the array state monitoring unit 104 updates the final response confirmation time corresponding to the target storage device in the network state management table T100 to the current time (step S160).
- the array state monitoring unit 104 sets the value of the no response flag corresponding to the target storage device in the network state management table T100 to 0 (step S165).
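The table updates made by the survival confirmation (steps S100 to S115) and the return confirmation (steps S150 to S165) can be sketched as below. The dictionary keys and the boolean `responded` (the outcome of waiting for a response within T0) are assumptions for illustration; the text does not describe what happens when the return confirmation gets no response, so the sketch leaves the row unchanged in that case.

```python
def survival_confirmation(row, responded, now):
    """Update a T100-like row after the survival check (S105 outcome)."""
    if responded:
        row["last_response"] = now   # S110: record the confirmation time
    else:
        row["no_response"] = 1       # S115: mark the device as not responding


def return_confirmation(row, responded, now):
    """Update a T100-like row after the return check (S155 outcome)."""
    if responded:
        row["last_response"] = now   # S160: record the confirmation time
        row["no_response"] = 0       # S165: clear the no-response flag
    # No response: the row stays as it is (assumption, not stated in the text).
```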
- the redundancy policy determination unit 105 uses the network type in the network state management table T100 to determine whether the network connection of the non-responding storage device is of a type that can be temporarily disconnected (step S200). Connection types that cannot be temporarily disconnected are, for example, SCSI and USB connections.
- the redundancy policy determination unit 105 determines a time Ta until the start of temporary storage, which prevents cache overflow (step S205).
- the time Ta is, for example, 5 seconds.
- the redundancy policy determination unit 105 determines a time Tb (initial value) until it is determined that re-redundancy is performed (step S210).
- the time Tb is, for example, 10 seconds.
- Whether or not the network to which the target storage device is connected is wired is determined using the network type of the network state management table T100 (step S215).
- the redundancy policy determination unit 105 resets Tb to 5 × Tb (step S220).
- After executing step S220, and when determining that the network to which the target storage device is connected is wired (“Yes” in step S215), the redundancy policy determining unit 105 uses the corresponding network type to determine whether the network to which the target storage device is connected is via the Internet (step S225).
- the redundancy policy determination unit 105 resets Tb to 2 × Tb (step S230).
- After executing step S230, and when it is determined that the network to which the target storage device is connected is via the Internet (“Yes” in step S225), the redundancy policy determination unit 105 uses the network type in the network status management table T100 to determine whether the storage is highly reliable (step S225).
- the redundancy policy determination unit 105 resets Tb to 10 × Tb (step S240).
- the array state monitoring unit 104 issues an instruction to perform re-redundancy immediately (step S245).
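The way the multipliers compound on Tb can be illustrated with a short sketch. The branch directions are assumptions, since the text does not state them unambiguously: here Tb is multiplied by 5 for a non-wired connection, by 2 for a connection via the Internet, and by 10 for highly reliable storage, and a connection that cannot be temporarily disconnected leads directly to immediate re-redundancy. The function and parameter names are hypothetical; the defaults for Ta and Tb are the example values of 5 and 10 seconds.

```python
def no_response_policy(can_disconnect_temporarily, wired, via_internet,
                       high_reliability, ta=5.0, tb=10.0):
    """Return (immediate_re_redundancy, Ta, Tb) for a non-responding device."""
    if not can_disconnect_temporarily:   # S200: e.g. SCSI or USB connection
        return True, None, None          # S245: re-redundancy at once
    if not wired:                        # S215 (assumed direction)
        tb *= 5                          # S220
    if via_internet:                     # S225 (assumed direction)
        tb *= 2                          # S230
    if high_reliability:                 # reliability check (assumed direction)
        tb *= 10                         # S240
    return False, ta, tb
```

Under these assumptions, a wired local device keeps Tb at 10 seconds, while a wireless, Internet-connected, highly reliable device waits 10 × 5 × 2 × 10 = 1000 seconds before re-redundancy.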
- the redundancy policy determination processing at recovery, performed by the redundancy policy determination unit 105, will be described with reference to the flowchart shown in FIG.
- the redundancy policy determination unit 105 determines a time Td (initial value) for the network state to be stabilized after recovery (step S300).
- Whether or not the network to which the target storage device is connected is wired is determined using the network type of the network state management table T100 (step S305).
- the redundancy policy determination unit 105 resets Td to 2 × Td (step S310).
- After executing step S310, and when determining that the network to which the target storage device is connected is wired (“Yes” in step S305), the redundancy policy determining unit 105 uses the network type of the network state management table T100 to determine whether the network to which the target storage device is connected is via the Internet (step S315).
- the redundancy policy determination unit 105 resets Td to 2 × Td (step S320).
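The determination of Td in steps S300 to S320 follows the same pattern. In this sketch the initial value of Td is a required argument because the text gives no example value, and the branch directions (doubling for a non-wired connection and for a connection via the Internet) are assumptions.

```python
def recovery_policy(td_initial, wired, via_internet):
    """Return the stabilization wait Td after a device responds again."""
    td = td_initial          # S300: initial value (not specified in the text)
    if not wired:            # S305 (assumed direction)
        td *= 2              # S310
    if via_internet:         # S315 (assumed direction)
        td *= 2              # S320
    return td
```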
- the data processing execution unit 111 transmits a read command or a write command to each storage device when reading from or writing to it based on an external instruction received from the request reception unit 107 (step S400).
- It is determined whether a response has been received from the target storage device within the predetermined time T0 (step S405).
- the array state monitoring unit 104 updates the final response confirmation time corresponding to the target storage device to the current time in the network state management table T100 (step S410).
- the data processing execution unit 111 executes reading or writing, and determines whether or not it has succeeded (step S415). If it is determined that the access has succeeded (“Yes” in step S415), the data processing execution unit 111 notifies the outside of the successful access (step S435).
- the storage state monitoring unit 102 checks the storage state to determine whether the target storage device is damaged (step S420). When it is determined that the storage is damaged (“Yes” in step S420), the array state monitoring unit 104 sets the damage flag corresponding to the target storage apparatus to 1 in the storage state management table T200 (step S425).
- the array status monitoring unit 104 sets the value of the no response flag corresponding to the target storage device in the network status management table T100 to 1 (step S430).
- After executing step S425 and step S430, and when determining that the target storage device is not damaged (“No” in step S420), the data processing execution unit 111 notifies the outside of the access failure (step S440).
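The access flow in steps S400 to S440 can be sketched as follows. The callbacks `send_command` (returning `None` for no response within T0, otherwise `True` or `False` for success or failure) and `is_damaged` are hypothetical, as are the dictionary keys. The sketch assumes that step S430 is reached when no response arrives in step S405, which the text implies but does not state.

```python
def access(storage_id, send_command, is_damaged, t100, t200, now):
    """Return True on a successful access, False on failure (S400 to S440)."""
    result = send_command(storage_id)              # S400: read or write command
    if result is None:                             # S405: no response within T0
        t100[storage_id]["no_response"] = 1        # S430 (assumed branch)
        return False                               # S440: report failure
    t100[storage_id]["last_response"] = now        # S410
    if result:                                     # S415: access succeeded
        return True                                # S435: report success
    if is_damaged(storage_id):                     # S420: storage state check
        t200[storage_id]["damaged"] = 1            # S425: set the damage flag
    return False                                   # S440: report failure
```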
- the data processing execution unit 111 determines the data to be written and the write position for each storage device (step S500). For example, the write position for each storage device is determined by an algorithm according to the RAID system.
- the data processing execution unit 111 writes the determined data to the determined write position for the storage apparatus in which no network failure has occurred, that is, the storage apparatus that has a response (step S505).
- the data processing execution unit 111 acquires the free capacity of all storage devices that are not in the network failure among the storage devices managed in the free space information table T400 (step S510).
- the data processing execution unit 111 determines whether there is a free area in which the data to be written to the storage device with the network failure can be temporarily stored, that is, whether there is a storage device capable of temporary storage (step S515).
- the data processing execution unit 111 selects one storage device capable of temporary storage, determines a temporary storage area from the free capacity of the selected storage device (step S520), and writes the data destined for the storage apparatus in which the network failure has occurred to the determined area (step S525).
- the data processing execution unit 111 updates the free area information table T400 with respect to the free capacity of the storage device that temporarily stores the data (step S530).
- the data processing execution unit 111 updates the data temporary storage area information table T500 (step S535). Specifically, it writes the storage number of the storage device in which the network failure has occurred, the data write position determined at the time of data reception, the size of the data to be written, the storage number of the temporary storage destination storage device, and the write position at the temporary storage destination to the non-response storage number, write offset, write size, temporary storage number, and temporary storage offset fields of the data temporary storage area information table T500, respectively.
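The write path with temporary storage (steps S500 to S535) can be sketched as below. The table layouts are assumptions: `t400` maps a storage number to the start offset and size of its free area, and `t500` is a list of records whose keys mirror the five fields named in the text. The `store` callback is hypothetical, and what happens when no free area exists is not described in the text, so the sketch simply reports those devices.

```python
def write_stripe(chunks, failed, t400, t500, store):
    """chunks: {storage number: (offset, data)} as decided in S500.

    Returns the storage numbers whose data could not be stored anywhere.
    """
    unsaved = []
    for sid, (offset, data) in chunks.items():
        if sid not in failed:
            store(sid, offset, data)                       # S505: normal write
            continue
        # S510, S515: look for a responding device with enough free area
        target = next((s for s, area in t400.items()
                       if s not in failed and area["size"] >= len(data)), None)
        if target is None:
            unsaved.append(sid)                            # assumption: report it
            continue
        area = t400[target]
        temp_offset = area["offset"]                       # S520: choose the area
        store(target, temp_offset, data)                   # S525: temporary write
        area["offset"] += len(data)                        # S530: update T400
        area["size"] -= len(data)
        t500.append({                                      # S535: update T500
            "no_response_storage": sid,
            "write_offset": offset,
            "write_size": len(data),
            "temp_storage": target,
            "temp_offset": temp_offset,
        })
    return unsaved
```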
- the data processing execution unit 111 determines the read position for each storage device (step S600). For example, the reading position for each storage device is determined by an algorithm according to the RAID system.
- the data processing execution unit 111 determines whether or not a network failure has occurred using the network failure management table T300 (step S605).
- If it is determined that a network failure has occurred (“Yes” in step S605), the data processing execution unit 111 determines whether the data temporary storage area information table T500 contains temporary storage area information corresponding to the read position on the storage device in which the network failure has occurred (step S610).
- the data processing execution unit 111 determines whether or not the data to be read can be recovered using redundant data (step S615). For example, it is determined by determining whether or not there is a normal number of storage devices necessary for data recovery in accordance with an algorithm according to the RAID system.
- If it is determined that recovery is not possible (“No” in step S615), the data processing execution unit 111 notifies the request reception unit 107 of a read error (step S620).
- If it is determined that no network failure has occurred (“No” in step S605), the data processing execution unit 111 reads data from the determined reading position for each storage device (step S625).
- If it is determined that there is temporary storage area information corresponding to the read position on the storage device in which the network failure has occurred (“Yes” in step S610), the data processing execution unit 111 reads the data that was to be read from that storage device from the temporary storage destination indicated by the temporary storage area information (step S630). For storage devices in which no network failure has occurred, data is read from the determined reading position.
- the data processing execution unit 111 acquires the redundant data from another storage device (step S635). Using the acquired redundant data, data to be read from the storage apparatus in which the network failure has occurred is recovered (step S640). For storage devices in which no network failure has occurred, data is read from the determined reading position.
- the data processing execution unit 111 determines whether reading of all data has been completed (step S645). For example, the determination is made based on whether data remains in the cache.
- If it is determined that reading has not been completed (“No” in step S645), the process returns to step S605.
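The read path for a single chunk (steps S605 to S640) can be sketched as follows. The `read` and `recover` callbacks are hypothetical; `recover` returns `None` when the redundant data is insufficient. Matching a T500 record by exact offset and size is a simplification for illustration.

```python
def read_chunk(sid, offset, size, failed, t500, read, recover):
    """Read one chunk, falling back to temporary storage or redundancy."""
    if sid not in failed:                                   # S605: no failure
        return read(sid, offset, size)                      # S625
    for rec in t500:                                        # S610: search T500
        if (rec["no_response_storage"] == sid
                and rec["write_offset"] == offset
                and rec["write_size"] == size):
            return read(rec["temp_storage"], rec["temp_offset"], size)  # S630
    data = recover(sid, offset, size)                       # S615, S635, S640
    if data is None:
        raise OSError("read error")                         # S620
    return data
```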
- the recovery processing execution unit 112 determines whether or not the temporary storage area information exists in the data temporary storage area information table T500 (step S700).
- the recovery process execution unit 112 selects one temporary storage area information (step S705).
- the recovery process execution unit 112 writes the data that was temporarily written, identified by the selected temporary storage area information, back to the storage apparatus to which it should originally have been written (step S710). Specifically, the recovery processing execution unit 112 identifies the temporary storage destination storage device and the start and end positions of the temporary storage destination from the temporary storage number, the temporary storage offset, and the write size included in the selected temporary storage area information. It then writes the data between the start position and the end position in the identified temporary storage destination storage device to the storage device indicated by the non-response storage number included in the selected temporary storage area information, starting from the position indicated by the write offset.
- the recovery processing execution unit 112 deletes the selected temporary storage area information from the data temporary storage area information table T500 (step S715). In addition, the recovery process execution unit 112 updates the free capacity in the free area information table T400 (step S720), and the process returns to step S700.
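The write-back loop in steps S700 to S720 can be sketched as below. Here `t400` is modeled as a plain mapping from storage number to free bytes, and the `read` and `write` callbacks are hypothetical. The direction of the free-capacity updates (the temporary destination gains space, the restored device loses it) is an assumption; the text only says that both are updated.

```python
def write_back(t500, t400, read, write):
    """Move all temporarily stored data back to its original device."""
    while t500:                                              # S700: any records?
        rec = t500[0]                                        # S705: select one
        data = read(rec["temp_storage"], rec["temp_offset"], rec["write_size"])
        write(rec["no_response_storage"], rec["write_offset"], data)  # S710
        t500.pop(0)                                          # S715: delete record
        t400[rec["temp_storage"]] += rec["write_size"]       # S720 (assumed sign)
        t400[rec["no_response_storage"]] -= rec["write_size"]
```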
- the array state monitoring unit 104 sets the value of the no response flag corresponding to the storage device that is not responding in the network status management table T100 to 1 (step S800).
- the redundancy policy determination unit 105 determines the time Ta and Tb according to the network type corresponding to the storage device that is not responding by the redundancy policy determination process at the time of no response shown in FIG. 12 (step S805).
- the array state monitoring unit 104 determines whether or not it should be immediately re-redundant based on the result of the redundancy policy determination process at the time of no response (step S810).
- the array state monitoring unit 104 determines whether the time Ta determined by the redundancy policy determination unit 105 has elapsed (step S815).
- the array state monitoring unit 104 determines whether the unresponsive storage device has returned, that is, whether response confirmation information for the target storage device has been received from the network state monitoring unit 101 (step S820).
- If it is determined that it has not returned (“No” in step S820), the process returns to step S815 and continues to measure the time Ta.
- When it is determined that the time Ta has elapsed (“Yes” in step S815), the array state monitoring unit 104 switches data writing to temporary storage (step S825). In addition, the array state monitoring unit 104 writes the storage number of the storage device determined to have a network failure, the time at which the network failure was determined, and the time Tb calculated by the redundancy policy determination unit 105 to the storage number, network failure occurrence time, and confirmation time (Tb) fields of the network failure management table T300.
- the array state monitoring unit 104 determines whether or not the time Tb determined by the redundancy policy determining unit 105 has elapsed (step S830).
- When determining that it has not elapsed (“No” in step S830), the array state monitoring unit 104 determines whether a response has been received from the non-responding storage apparatus, that is, whether response confirmation information for the target storage device has been received from the network state monitoring unit 101 (step S835).
- If it is determined that it has not been received (“No” in step S835), the process returns to step S830 and continues to measure the time Tb.
- the redundancy policy determination unit 105 determines the time Td according to the network type corresponding to the restored storage device, and the array state monitoring unit 104 determines the redundancy. It is determined whether or not the time Td determined by the policy determination unit 105 has elapsed (step S840).
- If it is determined that it has not elapsed (“No” in step S840), the array state monitoring unit 104 determines whether the storage device has stopped responding again (step S845).
- If it is determined that there is no response again (“Yes” in step S845), the process returns to step S830; the array state monitoring unit 104 resets the measurement of the time Td and determines whether the time Tb has elapsed.
- If it is determined that the storage device has not stopped responding again (“No” in step S845), the process returns to step S840 and continues to measure the time Td.
- When it is determined that the time Tb has elapsed (“Yes” in step S830), the array state monitoring unit 104 returns from the temporary storage state to the normal state (step S850). Specifically, the array state monitoring unit 104 notifies the processing unit 106 of recovery information indicating that the failure has been recovered, deletes the corresponding failure information from the network failure management table T300, and updates the network state management table T100, that is, sets the value of the corresponding no-response flag in the network state management table T100 to 0. In addition, the recovery process execution unit 112 performs the recovery process shown in FIG. 17: it writes back the data, deletes the temporary storage area data, and updates the free area information table T400.
- After executing step S850, and when it is determined that the non-responsive storage apparatus has returned (“Yes” in step S820), the array state monitoring unit 104 sets the value of the corresponding no-response flag to 0 (step S855).
- the redundancy execution unit 110 executes re-redundancy (step S860).
- FIG. 19 is a diagram showing state transitions until redundancy is executed.
- the array management system 1 is operating in a normal state (ST1).
- In the normal state, when the array management apparatus 100 detects a storage apparatus that is not responding (here, the storage apparatus 11), it waits for the time Ta to elapse as a recovery time. If the storage apparatus 11 responds before the time Ta elapses, the normal state is maintained. If the storage apparatus 11 does not respond before the time Ta elapses, that is, if the transition condition A is satisfied, a network failure is considered to have occurred in the storage apparatus 11, and the transition process to the temporary storage state is performed to prevent cache overflow (ST2).
- the transition process to the temporary storage state consists of calculating the time Tb and writing the storage number, the network failure occurrence time, and the confirmation time (Tb) for the storage apparatus in which the network failure has occurred to the network failure management table T300.
- the temporary storage state is set (ST3). In this state, if there is a data write command, the array management apparatus 100 writes the target data to another storage device that has a free area, instead of writing it to the storage apparatus 11 in which the network failure has occurred.
- the storage apparatus 11 is regarded as having recovered from a network failure, and a transition process to a normal state is performed (ST4).
- the transition process to the normal state consists of writing the temporarily stored data back to the storage apparatus to which it should originally have been written (the storage apparatus 11 in which the network failure occurred) and updating the free space information table T400 and the data temporary storage area information table T500.
- the degraded state which is a state in which redundancy is reduced
- the data loss state in which the array is damaged and data loss occurs
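The transitions among the normal, temporary storage, and re-redundancy states can be sketched as a small state function. This is an illustration under assumptions: the two elapsed-time arguments are hypothetical inputs, the time Tb is taken to be measured from when the device stopped responding, and the degraded and data loss states are omitted.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    TEMPORARY_STORAGE = "temporary storage"
    RE_REDUNDANCY = "re-redundancy"


def next_state(state, unresponsive_for, responsive_for, ta, tb, td,
               immediate=False):
    """unresponsive_for / responsive_for: seconds, or None if not applicable."""
    if state is State.NORMAL and unresponsive_for is not None:
        if immediate:                          # condition C: skip temporary storage
            return State.RE_REDUNDANCY
        if unresponsive_for >= ta:             # condition A: Ta without a response
            return State.TEMPORARY_STORAGE
    elif state is State.TEMPORARY_STORAGE:
        if unresponsive_for is not None and unresponsive_for >= tb:
            return State.RE_REDUNDANCY         # condition B: Tb without a response
        if responsive_for is not None and responsive_for >= td:
            return State.NORMAL                # stable for Td after returning
    return state
```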
- FIG. 20A shows an image diagram of data writing in a normal state.
- When the array management apparatus 100 accepts a write request in the normal state, it determines the data to be written to each storage apparatus constituting the redundancy (in this case, the storage apparatuses 11, 14, and 15) according to an algorithm such as RAID5, and writes the data to each storage device. For example, when there is a write request for certain data X1, the array management apparatus 100 generates data A1, B1, and C1 as data to be written. When there is a write request for certain data X2, the array management apparatus 100 generates data A2, B2, and C2 as data to be written.
- data B1 can be recovered from data A1 and C1, and data A1 can be recovered from data B1 and C1.
- For the data A2, B2, and C2 generated from data X2, it is likewise assumed that data B2 can be recovered from data A2 and C2, and data A2 can be recovered from data B2 and C2.
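One way to obtain data with this recovery property is single XOR parity, as used in RAID5-style schemes. The sketch below assumes that C1 is the parity of A1 and B1; the text does not say which of the three pieces is the parity, and the helper names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(x):
    """Split x into two halves A and B and compute the parity C = A xor B."""
    half = (len(x) + 1) // 2
    a = x[:half]
    b = x[half:].ljust(half, b"\0")   # pad the second half to equal length
    return a, b, xor_bytes(a, b)


a1, b1, c1 = make_stripe(b"DATA-X1!")
recovered_b1 = xor_bytes(a1, c1)   # B1 from A1 and C1
recovered_a1 = xor_bytes(b1, c1)   # A1 from B1 and C1
```

Because XOR is its own inverse, any one of the three pieces can be rebuilt from the other two.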
- FIG. 20B shows an image diagram of data writing at the time of network failure, that is, in the temporary storage state.
- the array management apparatus 100 determines the data to be written to each of the storage apparatuses 11, 14, and 15 as in the normal state, but the data to be written to the storage apparatus 15 that is not responding (data A1, A2) is temporarily held in a free area of another storage apparatus that constitutes the redundancy (here, the storage apparatus 11) or in a free area of the spare storage (here, the storage apparatus 12).
- When the storage device 15 recovers from the network failure, the temporarily saved data is written to the storage device where it should originally have been written, and the temporarily saved data is deleted.
- FIG. 21 shows an image diagram of re-redundancy.
- the array management apparatus 100 separates the storage apparatus 15 from the redundant configuration, and reconfigures the redundant configuration using another storage (here, the storage apparatus 12 that is a spare storage apparatus). Then, the data held in the storage device 15 is restored from the storage devices 11 and 14, and the restored data is written to the storage device 12, thereby restoring the array configuration.
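Re-redundancy onto a spare can be sketched as follows, assuming single XOR parity so that the lost member's block equals the XOR of the surviving members' blocks. The function names and the `data` mapping (storage number to block contents) are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR across any number of equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def re_redundancy(members, failed, spare, data):
    """Rebuild the failed member's block on the spare and return the new array."""
    survivors = [m for m in members if m != failed]
    data[spare] = xor_blocks([data[m] for m in survivors])   # restore the block
    return [spare if m == failed else m for m in members]    # swap in the spare
```

With members 11, 14, and 15, a failed device 15, and spare 12, the call returns the new member list 11, 14, 12, and the spare holds the restored block.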
- the values used when setting times longer than the initial values as the times Tb and Td are examples, and a multiple that increases the time Td may be a value larger than 1.
- the wireless communication in the above embodiment means that a wireless section exists in part of the shortest path on the network between the array management apparatus and the storage apparatus, or that the entire shortest path consists of wireless sections. Wired means that there is no wireless section on the shortest path.
- the determination of the redundancy policy is performed when the storage apparatus becomes unresponsive and when a failure such as storage damage occurs, but is not limited thereto.
- the array management device may store a redundancy policy according to the network type or storage type in advance.
- the array management apparatus stores a policy determination table T600 as shown in FIG. 22 in the management information storage unit.
- the policy determination table T600 has an area for storing a plurality of sets including triggers, network types, storage types, and redundancy policies.
- Trigger indicates the status detected by the network status monitoring unit and storage status monitoring unit.
- the network type indicates the connection form with the storage device.
- the storage type is information indicating the type constituting the storage.
- the redundancy policy indicates a condition for determining a redundancy policy (in this case, determination of times Ta, Tb, and Td and determination of immediate redundancy).
- Redundancy policy is determined by the combination of trigger, network type and storage type.
- the transition condition A for example, time Ta
- the transition condition from the temporary storage state to the re-redundancy state B for example, time Tb
- the transition condition C from the normal state to the re-redundant state
- Consider a case where a storage device is connected via a local area IP network, and a physical drive connected wirelessly or by wire stops responding without returning a storage damage response.
- Possible causes of such a state include network troubles and damage to the entire storage including the transfer control unit, but there is no way for the array management device to reliably determine the cause. Therefore, first, it is estimated that a network problem has occurred, and temporary storage is executed. If the storage device does not recover even after a certain period of time has elapsed from the temporary storage state, it is determined that storage damage has occurred and re-redundancy is performed.
- the stability of the network differs between wired and wireless. Generally, a wireless network is less stable and takes longer to recover, so the time set for a wireless connection is longer than the time set for a wired connection.
- the redundancy policy determination table can be created and held in advance according to the network state and storage state.
- Besides the network state and storage state, for example, the power state, user operation history information, and the like may be used for determining the times Ta, Tb, and Td.
- When the power supply state is used, disconnection of the network due to battery exhaustion can be taken into account.
- When the user operation history information is used, network disconnection due to an intentional power-off operation by the user can be taken into account. Accordingly, the standby time until re-redundancy can be determined more appropriately, and unnecessary re-redundancy can be prevented.
- the redundancy policy determination unit in this case specifies the network type of the storage device that is the target of the redundancy policy determination from the network state management table T100, and specifies the storage type from the storage state management table T200. Then, the redundancy policy is determined from the detected state, the identified network type and storage type, and the policy determination table T600.
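A table-driven lookup of this kind can be sketched as a mapping keyed by trigger, network type, and storage type. Every entry below is hypothetical, since the contents of the policy determination table T600 are not given in the text; the values only reuse the example times of 5 and 10 seconds and the 5× multiplier for a wireless connection.

```python
# Hypothetical T600-like table: (trigger, network type, storage type) -> policy
POLICY_TABLE = {
    ("no_response", "wired_lan", "hdd"): {"immediate": False, "ta": 5, "tb": 10},
    ("no_response", "wireless_lan", "hdd"): {"immediate": False, "ta": 5, "tb": 50},
    ("storage_damage", "wired_lan", "hdd"): {"immediate": True},
}


def determine_policy(trigger, network_type, storage_type, table=POLICY_TABLE):
    """Return the stored policy for the combination, or None if none is stored."""
    return table.get((trigger, network_type, storage_type))
```

Holding the policies in a table avoids recomputing the times each time a device stops responding, at the cost of enumerating the combinations in advance.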
- the storage apparatus may periodically send a response regardless of whether a response request is received.
- the storage apparatus may transmit a response when it automatically detects network reconnection or the like.
- the storage state acquisition unit 203 acquires the presence or absence of storage damage, but is not limited to this.
- Storage information is information that can be acquired from the storage status acquisition unit in each storage device, and it may differ for each storage device. For example, for a storage device with a built-in battery, the current power state and remaining battery capacity may be stored as storage information. If the storage device accepts user operations, user operation information and user operation time may be stored as storage information. Information about whether the storage device is a mobile device may also be stored. Further, the storage apparatus may not have storage information.
- the storage apparatus notifies the presence or absence of a storage failure through a periodic check by the storage state monitoring unit, but is not limited thereto.
- the storage device may send a notification that a failure has occurred when storage information changes, such as a storage failure.
- some or all of the storage devices 11 to 15 may be stored inside the housing of the digital recorder 10.
- a CPU (Central Processing Unit)
- a program describing the procedure of the method may be stored in a recording medium and distributed.
- FIG. 23 is a configuration diagram showing an example of the array management apparatus 100A having a configuration in which the above-described method is realized by executing a program.
- the array management apparatus 100A includes a ROM 1000 that records various processing programs, a CPU 1010 that performs overall processing, a RAM 1020 that temporarily records data, a lower-level transfer control unit 1030 that controls data transfer and management with the storage, a higher-level transfer control unit 1040 that controls data transfer and management with other devices such as a digital camera, and a management information storage unit 103 that is a recording device.
- the ROM 1000 stores programs such as, for example, a network status monitoring unit 101A, a storage status monitoring unit 102A, an array status monitoring unit 104A, a redundancy policy determination unit 105A, a redundancy execution unit 110A, a data processing execution unit 111A, and a recovery processing execution unit 112A, and the CPU 1010 executes each program, whereby the processing of each component is performed.
- the lower transfer control unit 1030 and the upper transfer control unit 1040 correspond to the communication unit 108 in the above embodiment.
- the ROM 1000 may be another recording device such as a hard disk drive (HDD). Further, the lower transfer control unit 1030 and the upper transfer control unit 1040 may share the same interface.
- HDD (hard disk drive)
- FIG. 24 is a configuration diagram illustrating an example of the storage apparatus 11A having a configuration in which the above-described method is realized by executing a program.
- the storage device 11A includes a ROM 2000, a CPU 2010, a RAM 2020, a transfer control unit 2030 for controlling data transfer and management with the array management unit, and one or more large-capacity recording devices 2040 to 2050.
- the large capacity recording devices 2040 to 2050 may be, for example, hard disk drives, solid state drives, or the like.
- the storage status acquisition unit 203 shown in the above embodiment is stored as a program, for example as the storage status acquisition unit 203A in the ROM 2000, and the CPU 2010 executes the program so that the processing of the component is performed.
- the transfer control unit 2030 corresponds to the communication unit 204 in the above embodiment, and the large capacity recording devices 2040 to 2050 correspond to the storage unit 201 in the above embodiment.
- the ROM 2000 may be another recording device such as a hard disk drive.
- the storage device may have another system configuration.
- the storage device may be a large-capacity recording device, connected directly to the array management unit via a storage interface such as SCSI, and controlled by the CPU of the array management device.
- the array management apparatus shown in the above embodiment is typically realized as an LSI which is a semiconductor integrated circuit. These may be individually made into one chip, or may be made into one chip so as to include a part or all of them.
- the name used here is LSI, but it may also be called IC, system LSI, super LSI, or ultra LSI depending on the degree of integration.
- the method of circuit integration is not limited to LSI, and implementation with a dedicated circuit or a general-purpose processor is also possible.
- An FPGA (Field Programmable Gate Array)
- a reconfigurable processor that can reconfigure the connection and setting of circuit cells inside the LSI may be used.
- a drawing device corresponding to various uses can be configured.
- the present invention can be used in cellular phones, televisions, digital video recorders, digital video cameras, car navigation systems, and the like.
- as a display, a cathode ray tube (CRT), a flat display such as a liquid crystal display, a PDP (plasma display panel), or an organic EL display, a projection display typified by a projector, or the like can be combined.
- An array management apparatus 3000 that makes a plurality of storages 3100, 3101, ..., 3102 redundant and controls access to each storage 3100, 3101, ..., 3102 may be configured from the following components:
- a storage unit 3001 that stores a configuration type of a communication path to each of the plurality of storages;
- a determination unit 3002 that confirms whether access to each of the plurality of storages has succeeded or failed;
- a deriving unit 3003 that derives, based on the configuration type of the communication path stored in the storage unit 3001, a waiting time from when access to a storage fails until redundancy is performed; and
- a redundancy processing unit 3004 that, when the determination unit 3002 confirms a failure of access to any of the storages and success of access to that storage is not confirmed by the determination unit 3002 before the time derived by the deriving unit 3003 according to the configuration type of the communication path elapses, makes the remaining storages, excluding that storage, redundant.
- the storage unit 3001 is the management information storage unit 103 shown in the above embodiment
- the determination unit 3002 is a combination of the network state monitoring unit 101 and the storage state monitoring unit 102 shown in the above embodiment
- the derivation unit 3003 is the above.
- the redundancy processing unit 3004 can be realized by a combination of the array state monitoring unit 104 and the redundancy execution unit 110 shown in the above embodiment.
- the storages 3100, 3101,..., 3102 correspond to any of the storage devices 11 to 15 shown in the above embodiment.
- an array management apparatus 3000A that makes a plurality of storages 3100, 3101, ..., 3102 redundant and controls access to the storages 3100, 3101, ..., 3102
- the temporary writing unit 3006 may be configured to write data to be written to the storage that has failed to be accessed.
- the request receiving unit 3005 can be realized by the request receiving unit 107 shown in the above embodiment
- the temporary writing unit 3006 can be realized by the data processing execution unit 111 shown in the above embodiment, particularly by a functional operation at the time of network failure.
- the deriving unit 3003 and the redundancy processing unit 3004 are described above, so their description is omitted here.
- the management integrated circuit in a management device that includes a storage unit storing the configuration type of the communication path to each of the plurality of storages, makes the plurality of storages redundant, and controls access to each storage may be composed of the components shown by a broken line in FIG. (the determination unit 3002, the derivation
- the array management method in the array management apparatus that controls access to each storage may include a confirmation step in which the determination unit repeatedly confirms whether access to each of the plurality of storages has succeeded or failed,
- a derivation step in which, when the first determination step determines that access to any one of the plurality of storages has failed, the derivation unit
- derives a waiting time from when access to that storage fails until redundancy is executed, based on the configuration type of the communication path corresponding to that storage stored in the storage unit,
- step S1020 Confirmation of failure and whether or not the access to the storage for which access failure has been confirmed by the confirmation step in the first determination step has been confirmed (step S1025).
- a redundancy execution step (step S1030) for performing redundancy using the remaining storages excluding that storage may be included.
- the confirmation step and the first determination step are the processing operations shown in FIG. 9, 101 and step S835 in FIG. 18 in the above embodiment, and the derivation step is the processing operation shown in FIG. 12 in the above embodiments.
- the second determination step can be realized by step S830 shown in FIG. 18, and the redundancy execution step can be realized by step S860 shown in FIG.
- An array management apparatus that makes a plurality of storages redundant and controls access to each storage comprises:
- a determination unit that repeatedly confirms whether access to each of the plurality of storages has succeeded or failed;
- a storage unit that stores a configuration type of a communication path to each of the plurality of storages;
- a deriving unit that derives, based on the configuration type of the communication path stored in the storage unit, a waiting time from when access to a storage fails until redundancy is executed; and
- a redundancy processing unit that, when the determination unit confirms a failure of access to any one of the plurality of storages and success of access to that storage is not confirmed by a further confirmation operation of the determination unit before the waiting time derived by the deriving unit according to the configuration type of the communication path of that storage elapses, performs redundancy using the remaining storages excluding that storage.
- the array management apparatus derives a standby time until redundancy is made based on the type of storage communication path that has failed to be accessed. Therefore, the array management apparatus can change the waiting time until it is determined that re-redundancy is performed, that is, the determination criterion for determining that re-redundancy is performed based on the type of communication path. As a result, if the access is successful within the standby time, it is not necessary to perform the redundancy again, so that the life of the storage apparatus becomes longer compared to the case where the redundancy is performed immediately after the failure occurs.
- the derivation unit may perform the derivation so that the waiting time is longer when the configuration type of the communication path indicates wireless than when it indicates wired.
- when the array management apparatus communicates wirelessly with the storage to which access has failed, it sets the standby time longer than when it communicates by wire.
- with wireless communication, an access failure is likely to be temporary (for example, an obstacle blocks the signal and communication cannot be established). Making the standby time longer than in the wired case allows automatic recovery from the temporary failure to be expected, so immediate re-redundancy is unnecessary.
- the derivation unit may perform the derivation so that the standby time is longer when the configuration type of the communication path indicates the Internet than when it indicates a LAN (Local Area Network).
- when the array management apparatus communicates via the Internet with the storage to which access has failed, it sets the standby time longer than when it communicates via a LAN.
- communication via the Internet carries more data traffic than communication via a LAN, and access may take time. An access failure is therefore likely to be temporary (for example, access takes a long time because of heavy traffic), and automatic recovery from the temporary failure can be expected.
- the redundancy processing unit may further, when the configuration type of the communication path of the storage whose access failure has been confirmed by the determination unit indicates a connection form that cannot be temporarily disconnected, immediately perform redundancy using the remaining storages excluding that storage.
- the array management apparatus immediately performs re-redundancy when access fails over a connection method that cannot be temporarily disconnected. In that case a physical failure such as storage corruption, rather than a temporary failure, is highly likely, so immediate re-redundancy allows a quick response.
- the storage unit may further store information on whether each of the plurality of storages is under data protection, and the deriving unit may further perform the derivation, for the storage whose access failure has been confirmed, so that the waiting time is shorter when the stored information for that storage indicates that it is not under data protection than when it indicates that it is, and longer when the information indicates that it is under data protection than when it indicates that it is not.
- the array management apparatus sets the waiting time longer when the storage to which access has failed is under data protection than when it is not.
- storage under data protection is less likely to be damaged than storage without it, so an access failure is highly likely to be temporary. Making the standby time longer than for unprotected storage therefore allows automatic recovery from the temporary failure to be expected.
- the array management apparatus may further include a request receiving unit that receives access requests from outside and a temporary writing unit that writes, to storages other than the storage to which access has failed, the data that should be written to that storage. The request receiving unit receives a data write request from outside and selects, from the plurality of storages, several storages to write to. When the determination unit confirms a failure of access to any one of those storages, the redundancy processing unit has the temporary writing unit write the data until the waiting time derived by the deriving unit elapses and, after the waiting time has elapsed, performs redundancy using the remaining storages excluding the storage to which access failed.
- because the array management apparatus writes the data destined for the failed storage to the other storages until the standby time elapses, the amount of data waiting to be written does not grow. For example, when data to be written is held in a buffer, buffer overflow can be prevented.
- the determination unit may comprise a communication path state monitoring unit that monitors whether a failure has occurred on a communication path and a storage state monitoring unit that monitors, for each of the plurality of storages, whether a failure of that storage has occurred. The communication path state monitoring unit transmits a response request to each of the plurality of storages and determines that access has failed when no response to the transmitted response request is received, and the storage state monitoring unit determines that access has failed when it determines that storage damage has occurred.
- the array management apparatus can thus monitor both failures on the communication path and storage failures by means of the communication path state monitoring unit and the storage state monitoring unit.
- the storage unit may store, in association with the configuration type for each of the plurality of storages, a no-response flag indicating whether the response from the corresponding storage has been received, and may further store, for each of the plurality of storages, a damage flag indicating whether a storage failure has occurred. When the communication path state monitoring unit does not receive a response, it sets the no-response flag of the corresponding storage to a value indicating that no response has been received; when storage damage has occurred, the storage state monitoring unit sets the damage flag of the corresponding storage to a value indicating that damage has occurred.
- the redundancy processing unit may comprise an array state monitoring unit that monitors the state of the array configuration of the plurality of storages and a redundancy execution unit that performs redundancy. When the value of the no-response flag indicates that no response has been received, the array state monitoring unit determines that redundancy is to be performed if no response is received from the corresponding storage before the standby time derived by the deriving unit for that storage elapses; when the value of the damage flag indicates that damage has occurred, it determines that redundancy is to be performed immediately.
- the array management apparatus can easily determine whether a failure has occurred by using the no-response flag and the damage flag.
- the array management apparatus may further include a recovery processing execution unit that, when the storage to which access failed is restored, performs recovery processing that writes the data written to the other storages by the temporary writing unit to the restored storage. When the value of the no-response flag indicates that no response has been received and a response is received from the corresponding storage before the waiting time derived by the deriving unit for that storage elapses, the array state monitoring unit controls the recovery processing execution unit to perform the recovery processing.
- after failure recovery, the recovery processing execution unit allows the array management apparatus to write the data to the storage it was originally destined for, so the data can be managed easily without re-redundancy.
- the array management device can be used as a device for managing a large amount of data.
- it has high utility value as a battery-powered portable display terminal such as a mobile phone, a portable music player, a digital camera, or a digital video camera, or as a high-resolution information display device such as a TV, a digital video recorder, or a car navigation system, that performs display using a menu display, a web browser, an editor, EPG, map display, or the like.
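The ordering rules stated in the bullets above (wireless longer than wired, Internet longer than LAN, immediate re-redundancy for a connection that cannot be temporarily disconnected, longer for data-protected storage) can be sketched as a small lookup. This is only an illustrative sketch: the type names, the concrete second values, the relative order of wireless and Internet, and the doubling factor for protected storage are all assumptions, since the source specifies only the relative ordering.

```python
from dataclasses import dataclass

# Hypothetical base waiting times in seconds. The source gives only the
# ordering (wired < wireless, LAN < Internet, non-disconnectable = immediate).
BASE_WAIT_S = {
    "direct": 0,          # connection form that cannot be temporarily disconnected
    "wired_lan": 30,
    "wireless_lan": 120,
    "internet": 300,
}


@dataclass
class StorageInfo:
    path_type: str                # configuration type of the communication path
    data_protected: bool = False  # whether the storage is under data protection


def derive_wait_seconds(info: StorageInfo) -> int:
    """Return the waiting time before re-redundancy for a failed storage."""
    base = BASE_WAIT_S[info.path_type]
    if base == 0:
        # Access failure on a non-disconnectable path: re-redundancy at once.
        return 0
    # Assumed factor: protected storage waits twice as long as unprotected.
    return base * 2 if info.data_protected else base
```

In the described apparatus the redundancy policy determination unit would consult its own policy table instead of these hard-coded values.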
Abstract
Description
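Embodiments according to the present invention are described here with reference to the drawings.
FIG. 1 is a diagram showing the configuration of an array management system 1 that includes the array management apparatus according to the present invention.
The array management apparatus 100 manages the storage devices 11 to 15 and, as shown in FIG. 2, comprises a network state monitoring unit 101, a storage state monitoring unit 102, a management information storage unit 103, an array state monitoring unit 104, a redundancy policy determination unit 105, a processing unit 106, a request receiving unit 107, and a communication unit 108.
The network state monitoring unit 101 monitors each of the storage devices 11 to 15 for the occurrence of a network failure.
The storage state monitoring unit 102 monitors each of the storage devices 11 to 15 for the occurrence of a storage failure, as in conventional systems.
The management information storage unit 103 is a storage area for storing the plurality of tables managed by the array management apparatus 100.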
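The network state management table T100 manages the state of the network (for example, whether a response to a response request has been received). As shown in FIG. 3, it has an area for storing a plurality of sets each consisting of a storage number, a network type, network information, a last response confirmation time, and a no-response flag.
The storage state management table T200 manages the state of the storage devices (such as the presence or absence of damage). As shown in FIG. 4, it has an area for storing a plurality of sets each consisting of a storage number, a storage type, storage information, and a damage flag.
The network failure management table T300 manages, for each storage device in which a network failure has occurred, the time at which the failure occurred and the time of recovery. As shown in FIG. 5, it has an area for storing a plurality of sets each consisting of a storage number, a network failure occurrence time, a confirmation time (Tb), a recovery confirmation time, and a confirmation time (Td). A set consisting of a storage number, a network failure occurrence time, a confirmation time (Tb), a recovery confirmation time, and a confirmation time (Td) is called failure information.
The free area information table T400 manages the free capacity of each of the storage devices 11 to 15. As shown in FIG. 6, it has an area for storing a plurality of sets each consisting of a storage number, an offset, a size, and an in-temporary-use indicator. As described above, free capacity is the capacity, within the area not used for the array configuration, to which no data has yet been written.
The temporary data save area information table T500 manages the temporary save destination of data that should be written to a storage device in which a network failure has occurred. As shown in FIG. 7, it has an area for storing a plurality of sets each consisting of a no-response storage number, a write offset, a write size, a temporary save storage number, and a temporary save offset. A set consisting of a no-response storage number, a write offset, a write size, a temporary save storage number, and a temporary save offset is called temporary save area information.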
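As an illustration of how one entry of the temporary data save area information table T500 might be represented, the following minimal sketch models the five fields named above and a lookup that maps an offset on the unresponsive storage to its temporary copy. The class name, the function name `locate_temp_copy`, and the lookup logic itself are assumptions for illustration; the source lists only the fields.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TempSaveRecord:
    """One set of temporary save area information (fields as listed for T500)."""
    no_response_storage: int  # storage number that is not responding
    write_offset: int         # offset the data was meant to be written at
    write_size: int           # size of the data
    temp_storage: int         # storage number holding the temporary copy
    temp_offset: int          # offset of the temporary copy


def locate_temp_copy(
    table: List[TempSaveRecord], storage: int, offset: int
) -> Optional[Tuple[int, int]]:
    """Return (temp storage number, temp offset) covering `offset`, or None."""
    for rec in table:
        in_range = rec.write_offset <= offset < rec.write_offset + rec.write_size
        if rec.no_response_storage == storage and in_range:
            return rec.temp_storage, rec.temp_offset + (offset - rec.write_offset)
    return None
```

A linear scan is used for clarity; a real table with many entries would likely be indexed by storage number.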
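The redundancy policy determination unit 105 sets, for a storage device that has stopped responding, a reference time (Ta) for judging that a network failure has occurred. This time Ta is also the waiting time until temporary saving starts, which prevents cache overflow.
The array state monitoring unit 104 monitors the array state, that is, the network failure status and the array configuration failure status of the storage devices 11 to 15, based on the monitoring results of the network state monitoring unit 101 and the storage state monitoring unit 102.
The processing unit 106 reads and writes data to each storage device, executes re-redundancy, and performs recovery processing upon recovery from a network failure. As shown in FIG. 2, it includes a redundancy execution unit 110, a data processing execution unit 111, and a recovery processing execution unit 112.
The redundancy execution unit 110 performs re-redundancy upon receiving a re-redundancy command from the array state monitoring unit 104.
The data processing execution unit 111 reads and writes data to each storage device.
First, the functional operation when no network failure has occurred is described here.
The functional operation when a network failure has occurred is described here.
The recovery processing execution unit 112, upon receiving recovery information from the array state monitoring unit 104, writes data back to the storage device that has recovered from the network failure.
The request receiving unit 107 receives data read and write requests from outside and outputs the received request to the processing unit 106. When receiving a read request, the request receiving unit 107 additionally receives the read position and outputs it to the processing unit 106 as well. When receiving a write request, the request receiving unit 107 additionally receives the data to be written and outputs that data to the processing unit 106 as well.
The communication unit 108 performs data input and output with each of the managed storage devices 11 to 15.
Since the storage devices 11 to 15 have the same components, the components of the storage device 11 are described here with reference to FIG. 8.
The storage unit 201 is a large-capacity recording device that stores data written by the array management apparatus 100, for example a hard disk drive (HDD) or a solid state drive (SSD).
The processing unit 202, on instruction from the array management apparatus 100, writes data received from the array management apparatus 100 to the storage unit 201, or reads data from the storage unit 201 and transmits it to the array management apparatus 100 via the communication unit 204.
The storage state acquisition unit 203 checks whether the storage unit 201 is damaged. If it is damaged, the storage state acquisition unit 203 notifies the processing unit 202 of damage information when the array management apparatus 100 checks the storage state.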
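1.4 Operation
The operation of the array management apparatus 100 is described here.
First, the network state monitoring process, which periodically (for example, every 2 seconds) confirms that each of the storage devices 11 to 15 is alive, is described using the flowchart shown in FIG. 9.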
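One pass of such a periodic liveness check might look like the sketch below, which updates the last response confirmation time and the no-response flag of each row in a table shaped like the network state management table T100. The `probe` callback, the row class, and the function name are hypothetical; the actual steps of FIG. 9 are not reproduced in this text, so this is only an assumed outline.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class NetworkStateRow:
    network_type: str                 # e.g. "wired" or "wireless"
    last_response_time: float = 0.0   # last response confirmation time
    no_response: bool = False         # no-response flag


def poll_once(
    table: Dict[int, NetworkStateRow],
    probe: Callable[[int], bool],
    now: float,
) -> None:
    """Send one response request per storage and record the outcome.

    `probe(storage_no)` is a hypothetical callback returning True when the
    storage answered the response request.
    """
    for storage_no, row in table.items():
        if probe(storage_no):
            row.last_response_time = now
            row.no_response = False
        else:
            row.no_response = True
```

A caller would invoke `poll_once` on a timer, for example every 2 seconds as in the text, passing the current time.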
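The processing of step S25 shown in FIG. 9 is described here using the flowchart shown in FIG. 10.
The processing of step S30 shown in FIG. 9 is described here using the flowchart shown in FIG. 11.
The redundancy policy determination process performed by the redundancy policy determination unit 105 when there is no response is described here using the flowchart shown in FIG. 12.
The redundancy policy determination process performed by the redundancy policy determination unit 105 upon return is described here using the flowchart shown in FIG. 13.
The data access (read/write) process during normal operation is described here using the flowchart shown in FIG. 14.
The data write process during a network failure is described here using the flowchart shown in FIG. 15.
The data read process during a network failure is described here using the flowchart shown in FIG. 16.
The recovery process from a network failure is described here using the flowchart shown in FIG. 17.
An overview of the overall operation of the array management apparatus 100 when a storage device is unresponsive is described here using the flowchart shown in FIG. 18.
The transitions of the redundancy state are described here.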
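(1) Temporary saving
FIG. 20(a) is a conceptual diagram of data writing during normal operation. In the normal state, when the array management apparatus 100 receives a write request, it determines, according to an algorithm such as RAID 5, the data to be written to each storage device making up the redundant configuration (here, the storage devices 11, 14, and 15) and writes the data to each storage device. For example, when there is a write request for data X1, the array management apparatus 100 generates data A1, B1, and C1 as the data to be written. When there is a write request for data X2, the array management apparatus 100 generates data A2, B2, and C2 as the data to be written. Here, data B1 is assumed to be recoverable from data A1 and C1, and data A1 from data B1 and C1. Likewise, for data A2, B2, and C2 generated from data X2, data B2 is assumed to be recoverable from data A2 and C2, and data A2 from data B2 and C2.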
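The recoverability property described here (B1 from A1 and C1, A1 from B1 and C1) is what XOR parity provides. The following is a minimal sketch assuming C1 is the bytewise XOR of A1 and B1; the source says only "an algorithm such as RAID 5", so the split into two halves, the zero padding, and the function names are illustrative assumptions, and real RAID 5 additionally rotates the parity block across devices.

```python
from typing import Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(x: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split X into data blocks A and B and compute parity C = A xor B."""
    half = (len(x) + 1) // 2
    a = x[:half]
    b = x[half:].ljust(half, b"\x00")  # zero-pad B to the length of A
    return a, b, xor_bytes(a, b)


# Usage: if the device holding B is unreachable, B is rebuilt from A and C.
a1, b1, c1 = make_stripe(b"ARRAYDATA")
recovered_b1 = xor_bytes(a1, c1)
recovered_a1 = xor_bytes(b1, c1)
```

Because XOR is its own inverse, any one of the three blocks can be rebuilt from the other two, which is why a single unreachable storage device does not make the data unreadable.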
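FIG. 21 is a conceptual diagram of re-redundancy.
The present invention has been described above based on the embodiment, but it is not limited to that embodiment. For example, the following modifications are conceivable.
(1) An array management apparatus according to one embodiment of the present invention, which makes a plurality of storages redundant and controls access to each storage, comprises: a determination unit that repeatedly confirms whether access to each of the plurality of storages has succeeded or failed; a storage unit that stores a configuration type of a communication path to each of the plurality of storages; a deriving unit that derives, based on the configuration type of the communication path stored in the storage unit, a waiting time from when access to a storage fails until redundancy is executed; and a redundancy processing unit that, when the determination unit confirms a failure of access to any one of the plurality of storages and success of access to that storage is not confirmed by a further confirmation operation of the determination unit before the waiting time derived by the deriving unit according to the configuration type of the communication path of that storage elapses, performs redundancy using the remaining storages excluding that storage.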
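The decision this paragraph describes, combined with the damage-flag and recovery behaviour stated elsewhere in the document, can be sketched as a single pure function. The `Action` enum, the function name `decide`, and its parameters are assumptions introduced for illustration; the flowchart of FIG. 18 is not reproduced in this text, so this is a hedged outline rather than the patented procedure.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    NONE = "none"        # storage is healthy, nothing to do
    WAIT = "wait"        # failure seen, waiting time not yet elapsed
    RECOVER = "recover"  # storage answered again within the waiting time
    REBUILD = "rebuild"  # re-redundancy using the remaining storages


def decide(
    damaged: bool,
    failed_since: Optional[float],
    responding: bool,
    now: float,
    wait_s: float,
) -> Action:
    """Choose the next action for one storage.

    damaged      -- damage flag: storage corruption was reported
    failed_since -- time the access failure was first confirmed, or None
    responding   -- result of the latest confirmation operation
    wait_s       -- waiting time derived from the communication path type
    """
    if damaged:
        return Action.REBUILD          # damage: re-redundancy immediately
    if failed_since is None:
        return Action.NONE
    elapsed = now - failed_since
    if responding and elapsed < wait_s:
        return Action.RECOVER          # write temporarily saved data back
    if elapsed >= wait_s:
        return Action.REBUILD          # success not confirmed in time
    return Action.WAIT
```

Keeping the decision free of I/O makes it easy to check against each case in the text: immediate re-redundancy on damage, recovery on a response within the waiting time, and re-redundancy once the waiting time elapses.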
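2 Internet
10 Digital recorder
11 to 15, 11A Storage device
100, 100A Array management apparatus
101, 101A Network state monitoring unit
102, 102A Storage state monitoring unit
103 Management information storage unit
104, 104A Array state monitoring unit
105, 105A Redundancy policy determination unit
106 Processing unit
107 Request receiving unit
108 Communication unit
110, 110A Redundancy execution unit
111, 111A Data processing execution unit
112, 112A Recovery processing execution unit
201 Storage unit
202 Processing unit
203, 203A Storage state acquisition unit
204 Communication unit
1000, 2000 ROM
1010, 2010 CPU
1020, 2020 RAM
1030 Lower-level transfer control unit
1040 Higher-level transfer control unit
2030 Transfer control unit
2040, 2050 Large-capacity storage device
3000, 3000A Array management apparatus
3001 Storage unit
3002 Determination unit
3003 Deriving unit
3004 Redundancy processing unit
3005 Request receiving unit
3006 Temporary writing unit
3100, 3101, 3102 Storage device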
Claims (11)
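- An array management apparatus that makes a plurality of storages redundant and controls access to each storage, comprising:
a determination unit that repeatedly confirms whether access to each of the plurality of storages has succeeded or failed;
a storage unit that stores a configuration type of a communication path to each of the plurality of storages;
a deriving unit that derives, based on the configuration type of the communication path stored in the storage unit, a waiting time from when access to a storage fails until redundancy is executed; and
a redundancy processing unit that, when the determination unit confirms a failure of access to any one of the plurality of storages and success of access to that storage is not confirmed by a further confirmation operation of the determination unit before the waiting time derived by the deriving unit according to the configuration type of the communication path of that storage elapses, performs redundancy using the remaining storages excluding that storage.
- The array management apparatus according to claim 1, wherein the deriving unit performs the derivation so that the waiting time is longer when the configuration type of the communication path indicates wireless than when it indicates wired.
- The array management apparatus according to claim 1, wherein the deriving unit performs the derivation so that the waiting time is longer when the configuration type of the communication path indicates the Internet than when it indicates a LAN (Local Area Network).
- The array management apparatus according to claim 1, wherein the redundancy processing unit further, when the configuration type of the communication path of the storage whose access failure has been confirmed by the determination unit indicates a connection form that cannot be temporarily disconnected, immediately performs redundancy using the remaining storages excluding that storage.
- The array management apparatus according to claim 1, wherein the storage unit further stores information on whether each of the plurality of storages is under data protection, and
the deriving unit further performs the derivation, for a storage whose access failure has been confirmed, so that the waiting time is shorter when the information stored in the storage unit for that storage indicates that it is not under data protection than when it indicates that it is, and so that the waiting time is longer when the information for that storage indicates that it is under data protection than when it indicates that it is not.
- The array management apparatus according to any one of claims 2 to 5, further comprising:
a request receiving unit that receives access requests from outside; and
a temporary writing unit that writes, to storages other than the storage to which access has failed, the data that should be written to the storage to which access has failed,
wherein the request receiving unit receives a data write request from outside and selects, from the plurality of storages, several storages to be written to, and
the redundancy processing unit further, when the determination unit confirms a failure of access to any one of the several storages, has the temporary writing unit write the data until the waiting time derived by the deriving unit elapses and, after the waiting time has elapsed, performs redundancy using the remaining storages excluding the storage to which access failed.
- The array management apparatus according to claim 6, wherein the determination unit
comprises a communication path state monitoring unit that monitors whether a failure has occurred on a communication path, and a storage state monitoring unit that monitors, for each of the plurality of storages, whether a failure of that storage has occurred,
the communication path state monitoring unit transmits a response request to each of the plurality of storages and judges that access has failed when it does not receive a response to the transmitted response request, and
the storage state monitoring unit judges, for each of the plurality of storages, that access has failed when it judges that damage to the storage has occurred.
- The array management apparatus according to claim 7, wherein the storage unit
stores, in association with the configuration type for each of the plurality of storages, a no-response flag indicating whether the response from the corresponding storage has been received,
and further stores, for each of the plurality of storages, a damage flag indicating whether a storage failure has occurred,
the communication path state monitoring unit, when it does not receive a response, sets the no-response flag of the corresponding storage to a value indicating that the response has not been received,
the storage state monitoring unit, when storage damage has occurred, sets the damage flag of the corresponding storage to a value indicating that damage has occurred,
the redundancy processing unit comprises an array state monitoring unit that monitors the state of the array configuration formed by the plurality of storages, and a redundancy execution unit that executes redundancy, and
the array state monitoring unit judges that redundancy is to be performed when the value of the no-response flag indicates that no response has been received and no response is received from the corresponding storage before the waiting time derived by the deriving unit for that storage elapses, and judges that redundancy is to be performed immediately when the value of the damage flag indicates that damage has occurred.
- The array management apparatus according to claim 8, further comprising
a recovery processing execution unit that, when the storage to which access failed has recovered, performs recovery processing that writes, to the recovered storage, the data written to other storages by the temporary writing unit,
wherein the array state monitoring unit, when the value of the no-response flag indicates that no response has been received and a response is received from the corresponding storage before the waiting time derived by the deriving unit for that storage elapses, controls the recovery processing execution unit to perform the recovery processing.
- An array management method for use in an array management apparatus that comprises a storage unit storing a configuration type of a communication path to each of a plurality of storages, a determination unit, a deriving unit, and a redundancy processing unit, and that makes the plurality of storages redundant and controls access to each storage, the method comprising:
a confirmation step in which the determination unit repeatedly confirms whether access to each of the plurality of storages has succeeded or failed;
a first determination step in which the determination unit determines whether a failure of access to any one of the plurality of storages has been confirmed by the confirmation step;
a derivation step in which the deriving unit, when the first determination step determines that a failure of access to any one of the plurality of storages has been confirmed, derives, based on the configuration type of the communication path corresponding to that storage stored in the storage unit, a waiting time from when access to that storage fails until redundancy is executed;
a second determination step in which the redundancy processing unit, when the first determination step determines that a failure of access to any one of the plurality of storages has been confirmed, determines whether the waiting time derived by the derivation step according to the configuration type of the communication path of that storage has elapsed; and
a redundancy execution step in which the redundancy processing unit, with the determination unit confirming within the waiting time, by the confirmation step, whether access to the storage whose access failure was confirmed has succeeded or failed, and with the first determination step determining whether success of access to that storage has been confirmed by the confirmation step, performs redundancy using the remaining storages excluding that storage when success of access to the storage whose access failure was confirmed is not confirmed.
- An integrated circuit for use in an array management apparatus that comprises a storage unit storing a configuration type of a communication path to each of a plurality of storages, makes the plurality of storages redundant, and controls access to each storage, the integrated circuit comprising:
a determination unit that repeatedly confirms whether access to each of the plurality of storages has succeeded or failed;
a deriving unit that derives, based on the configuration type of the communication path stored in the storage unit, a waiting time from when access to a storage fails until redundancy is executed; and
a redundancy processing unit that, when the determination step confirms a failure of access to any one of the plurality of storages and success of access to that storage is not confirmed by a further confirmation operation of the determination step before the waiting time derived by the derivation step according to the configuration type of the communication path of that storage elapses, performs redundancy using the remaining storages excluding that storage.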
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/881,501 US20130219212A1 (en) | 2010-12-15 | 2011-10-18 | Array management device, array management method and integrated circuit |
CN2011800549206A CN103250127A (zh) | 2010-12-15 | 2011-10-18 | 阵列管理装置,阵列管理方法以及集成电路 |
JP2012548617A JPWO2012081156A1 (ja) | 2010-12-15 | 2011-10-18 | アレイ管理装置、アレイ管理方法及び集積回路 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010279219 | 2010-12-15 | ||
JP2010-279219 | 2010-12-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012081156A1 true WO2012081156A1 (ja) | 2012-06-21 |
Family
ID=46244276
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/005805 WO2012081156A1 (ja) | 2010-12-15 | 2011-10-18 | アレイ管理装置、アレイ管理方法及び集積回路 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20130219212A1 (ja) |
JP (1) | JPWO2012081156A1 (ja) |
CN (1) | CN103250127A (ja) |
WO (1) | WO2012081156A1 (ja) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015118685A (ja) * | 2013-11-12 | 2015-06-25 | 株式会社リコー | 情報処理システム、情報処理方法、及びプログラム |
US9690576B2 (en) * | 2015-02-11 | 2017-06-27 | Dell Software, Inc. | Selective data collection using a management system |
JP6705266B2 (ja) * | 2016-04-07 | 2020-06-03 | オムロン株式会社 | 制御装置、制御方法およびプログラム |
KR102536518B1 (ko) * | 2016-09-13 | 2023-05-24 | 한화비전 주식회사 | 저장장치 리빌드 중 녹화가 가능한 감시 카메라 시스템 및 방법 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001273220A (ja) * | 2000-01-18 | 2001-10-05 | Canon Inc | 情報処理装置及び方法及び記憶媒体並びにコンピュータプログラム |
JP2004248270A (ja) * | 2003-01-24 | 2004-09-02 | Matsushita Electric Ind Co Ltd | 共有鍵交換方法及び通信機器 |
JP2005182657A (ja) * | 2003-12-22 | 2005-07-07 | Sony Corp | データ記録再生装置及びデータ記録再生方法 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100407166C (zh) * | 2004-07-29 | 2008-07-30 | 普安科技股份有限公司 | 改善数据读取效率的方法及其储存系统 |
US20080005257A1 (en) * | 2006-06-29 | 2008-01-03 | Kestrelink Corporation | Dual processor based digital media player architecture with network support |
-
2011
- 2011-10-18 CN CN2011800549206A patent/CN103250127A/zh active Pending
- 2011-10-18 US US13/881,501 patent/US20130219212A1/en not_active Abandoned
- 2011-10-18 JP JP2012548617A patent/JPWO2012081156A1/ja active Pending
- 2011-10-18 WO PCT/JP2011/005805 patent/WO2012081156A1/ja active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN103250127A (zh) | 2013-08-14 |
JPWO2012081156A1 (ja) | 2014-05-22 |
US20130219212A1 (en) | 2013-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5127491B2 (ja) | ストレージサブシステム及びこれの制御方法 | |
JP5714571B2 (ja) | キャッシュクラスタを構成可能モードで用いるキャッシュデータ処理 | |
US7210061B2 (en) | Data redundancy for writes using remote storage system cache memory | |
US8458398B2 (en) | Computer-readable medium storing data management program, computer-readable medium storing storage diagnosis program, and multinode storage system | |
US7587627B2 (en) | System and method for disaster recovery of data | |
US9946655B2 (en) | Storage system and storage control method | |
US8443231B2 (en) | Updating a list of quorum disks | |
JP6056453B2 (ja) | プログラム、データ管理方法および情報処理装置 | |
US8402189B2 (en) | Information processing apparatus and data transfer method | |
US8707085B2 (en) | High availability data storage systems and methods | |
US20110107358A1 (en) | Managing remote procedure calls when a server is unavailable | |
JP2006338626A (ja) | ディスクアレイ装置及びその制御方法 | |
JP2007072571A (ja) | 計算機システム及び管理計算機ならびにアクセスパス管理方法 | |
US20140281688A1 (en) | Method and system of data recovery in a raid controller | |
JP2007226400A (ja) | 計算機管理方法、計算機管理プログラム、実行サーバの構成を管理する待機サーバ及び計算機システム | |
US20100275219A1 (en) | Scsi persistent reserve management | |
WO2012081156A1 (ja) | アレイ管理装置、アレイ管理方法及び集積回路 | |
US20130198731A1 (en) | Control apparatus, system, and method | |
US20110276822A1 (en) | Node controller first failure error management for a distributed system | |
US9841923B2 (en) | Storage apparatus and storage system | |
JP2010282324A (ja) | ストレージ制御装置、ストレージシステムおよびストレージ制御方法 | |
US8381027B1 (en) | Determining alternate paths in faulted systems | |
CN108512753B (zh) | 一种集群文件系统中消息传输的方法及装置 | |
CN107888405B (zh) | 管理设备和信息处理系统 | |
CN112887367B (zh) | 实现分布式集群高可用的方法、系统及计算机可读介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 11848615; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 2012548617; Country of ref document: JP; Kind code of ref document: A |
 | WWE | Wipo information: entry into national phase | Ref document number: 13881501; Country of ref document: US |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 11848615; Country of ref document: EP; Kind code of ref document: A1 |