JP5288331B2 - I / O instruction failure recovery circuit, I / O instruction failure recovery method, and I / O instruction failure recovery program - Google Patents

I/O instruction failure recovery circuit, I/O instruction failure recovery method, and I/O instruction failure recovery program

Publication number
JP5288331B2
JP5288331B2 JP2009024857A JP2009024857A JP5288331B2 JP 5288331 B2 JP5288331 B2 JP 5288331B2 JP 2009024857 A JP2009024857 A JP 2009024857A JP 2009024857 A JP2009024857 A JP 2009024857A JP 5288331 B2 JP5288331 B2 JP 5288331B2
Authority
JP
Japan
Prior art keywords
failure
time
disk device
time monitoring
response
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2009024857A
Other languages
Japanese (ja)
Other versions
JP2010182080A (en
Inventor
晃一 野村
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社 filed Critical 日本電気株式会社
Priority to JP2009024857A priority Critical patent/JP5288331B2/en
Publication of JP2010182080A publication Critical patent/JP2010182080A/en
Application granted granted Critical
Publication of JP5288331B2 publication Critical patent/JP5288331B2/en
Application status is Active legal-status Critical
Anticipated expiration legal-status Critical

Description

  The present invention relates to time-out detection when a minor failure such as a CRC (Cyclic Redundancy Check) error occurs in data communication at a host-disk device interface.

  There is a technique for detecting a failure when a minor failure such as a CRC error occurs in data communication at the host-disk device interface.

  For example, in the technique described in Patent Document 1, when a host computer (CPU) incorporates a peripheral device, an event notification flag is turned on for each command whose execution time is longer than the input/output monitoring time set by the CPU, indicating that an extension of the monitoring time is to be notified to the CPU. When the CPU issues an I/O, the peripheral device notifies the CPU of the extension time if the command is one for which an extension of the monitoring time is to be notified, and the CPU, on receiving the notification, adds a counter value corresponding to the notified extension time to its monitoring time counter. In addition, when the peripheral device finishes executing the command within the extended time, it transmits a monitoring time shortening notification event, and the CPU subtracts a value corresponding to the notified shortening time from the time monitoring count.

JP 2001-147866 A

  However, in the technique described in Patent Document 1, when a minor failure such as a CRC error occurs in data communication at the interface between the host and the disk device, the received information is indeterminate, and it cannot be determined which instruction or response was involved. In such a case, a port that has detected a minor failure usually only counts that a failure has occurred and does not actively perform failure processing.

  The abnormality caused by the failure is detected only by the timeout monitoring performed by the upper layer, after which failure processing is started. Consequently, an I/O instruction that normally completes in less than 1 s takes from several seconds to several tens of seconds to complete, and the entire system is delayed. Moreover, when the possibility of an internal failure of the disk device is taken into account, a response time two to three orders of magnitude longer than the normal response time may be required, so the timeout value of the upper layer cannot simply be shortened.

  Therefore, an object of the present invention is to provide an I/O instruction failure recovery circuit, an I/O instruction failure recovery method, and an I/O instruction failure recovery program that can reduce the processing delay caused by timeout detection when a minor failure such as a CRC error occurs at the host-disk device interface.

According to a first aspect of the present invention, there is provided an I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, the circuit comprising: a counter that counts the number of failures when a failure is detected in the host; first time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying input/output processing means to that effect; second time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing means to that effect; and the input/output processing means, which starts failure processing when there is a notification from the first time monitoring means, and when there is a notification from the second time monitoring means and the counter has been counted up, wherein, when there is a notification from the second time monitoring means and the counter has not been counted up, the input/output processing means inquires of the disk device whether the issued I/O processing is being executed internally and, if notified in reply that it is not being executed inside the disk device, determines that a failure has occurred in the interface and performs the failure processing for the failure in the interface.
According to a second aspect of the present invention, there is provided an I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, the circuit comprising: a first counter that counts the number of failures when a failure is detected in the host; first time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying input/output processing means to that effect; second time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing means to that effect; and the input/output processing means, which starts failure processing when there is a notification from the first time monitoring means, and when there is a notification from the second time monitoring means and the first counter has been counted up, wherein, when there is a notification from the second time monitoring means and the first counter included in the I/O instruction failure recovery circuit has not been counted up, the input/output processing means inquires whether a second counter, which is provided in the disk device and counts the number of failures when a failure is detected in the disk device, has been counted up and, if notified in reply that it has been counted up, determines that the lack of response is an effect of the counted failure and performs the failure processing for that failure.

Furthermore, according to a third aspect of the present invention, there is provided an I/O instruction failure recovery method performed by an I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, the method comprising: a step of providing a counter that counts the number of failures when a failure is detected in the host; a first time monitoring step of time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying an input/output processing step to that effect; a second time monitoring step of time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing step to that effect; and the input/output processing step of starting failure processing when there is a notification in the first time monitoring step, and when there is a notification in the second time monitoring step and the counter has been counted up, wherein, in the input/output processing step, when there is a notification in the second time monitoring step and the counter has not been counted up, the disk device is asked whether the issued I/O processing is being executed internally and, if a reply is received that it is not being executed inside the disk device, it is determined that a failure has occurred in the interface and the failure processing is performed for the failure in the interface.
Further, according to a fourth aspect of the present invention, there is provided an I/O instruction failure recovery method performed by an I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, the method comprising: a step of providing a first counter that counts the number of failures when a failure is detected in the host; a first time monitoring step of time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying an input/output processing step to that effect; a second time monitoring step of time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing step to that effect; and the input/output processing step of starting failure processing when there is a notification in the first time monitoring step, and when there is a notification in the second time monitoring step and the first counter has been counted up, wherein, in the input/output processing step, when there is a notification in the second time monitoring step and the first counter included in the I/O instruction failure recovery circuit has not been counted up, an inquiry is made as to whether a second counter, which is provided in the disk device and counts the number of failures when a failure is detected in the disk device, has been counted up and, if a reply is received that it has been counted up, it is determined that the lack of response is an effect of the counted failure and the failure processing is performed for that failure.

Furthermore, according to a fifth aspect of the present invention, there is provided an I/O instruction failure recovery program installed in an I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, the program causing a computer to function as an I/O instruction failure recovery circuit comprising: a counter that counts the number of failures when a failure is detected in the host; first time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying input/output processing means to that effect; second time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing means to that effect; and the input/output processing means, which starts failure processing when there is a notification from the first time monitoring means, and when there is a notification from the second time monitoring means and the counter has been counted up, wherein, when there is a notification from the second time monitoring means and the counter has not been counted up, the input/output processing means inquires of the disk device whether the issued I/O processing is being executed internally and, if notified in reply that it is not being executed inside the disk device, determines that a failure has occurred in the interface and performs the failure processing for the failure in the interface.
Further, according to a sixth aspect of the present invention, there is provided an I/O instruction failure recovery program installed in an I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, the program causing a computer to function as an I/O instruction failure recovery circuit comprising: a first counter that counts the number of failures when a failure is detected in the host; first time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying input/output processing means to that effect; second time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing means to that effect; and the input/output processing means, which starts failure processing when there is a notification from the first time monitoring means, and when there is a notification from the second time monitoring means and the first counter has been counted up, wherein, when there is a notification from the second time monitoring means and the first counter included in the I/O instruction failure recovery circuit has not been counted up, the input/output processing means inquires whether a second counter, which is provided in the disk device and counts the number of failures when a failure is detected in the disk device, has been counted up and, if notified in reply that it has been counted up, determines that the lack of response is an effect of the counted failure and performs the failure processing for that failure.

  According to the present invention, when a minor failure such as a CRC error occurs at the interface between the host and the disk device, it is possible to detect an abnormality and start the failure processing after a certain time has elapsed. Thus, the start of failure processing of the I / O instruction can be accelerated, and delay of the entire system can be prevented.

A diagram showing the basic configuration of the embodiment of the present invention.
A diagram (1/2) showing the basic operation of the embodiment of the present invention.
A diagram (2/2) showing the basic operation of the embodiment of the present invention.
A flowchart (1/2) for explaining the effect of the embodiment of the present invention.
A flowchart (2/2) for explaining the effect of the embodiment of the present invention.

  Next, embodiments of the present invention will be described in detail with reference to the drawings. Referring to FIG. 1, this embodiment includes a host 100 and a disk device 200. The host 100 and the disk device 200 are connected via a serial interface 300.

  The serial interface 300 is a serial interface typified by a Fibre Channel interface.

  The host 100 includes a central processing unit 110, a main storage device 120, an input/output processing device 130, a first time monitoring mechanism 141, a second time monitoring mechanism 142, a CRC error counter 150, and a transmission/reception circuit 160.

  The central processing unit 110 is a CPU or the like, for example, and performs arithmetic processing in the host 100. The main storage device 120 is a storage device that can be directly accessed by the central processing unit 110.

  The input / output processing device 130 is a device that executes data read / write with the disk device 200 in accordance with an I / O command from the central processing unit 110.

  To detect an abnormality of an I/O command, the input/output processing device 130 includes a first time monitoring mechanism 141 and a second time monitoring mechanism 142, which monitor the response from the disk device 200. The input/output processing device 130 also includes a CRC error counter 150, which stores the number of occurrences of serial interface failures, such as CRC errors, detected when the transmission/reception circuits of the host 100 and the disk device 200 transmit information.

  The transmission / reception circuit 160 is a transmission / reception circuit that transmits information to and from the disk device 200 via the serial interface 300 in accordance with an instruction from the input / output processing device 130.

  On the other hand, the disk device 200 includes a transmission / reception circuit 210, a CRC error counter 220, a disk 230, and a disk control device 240.

  The transmission/reception circuit 210 is a circuit that transmits information to and from the host 100 via the serial interface 300. Like the CRC error counter 150, the CRC error counter 220 stores the number of occurrences of serial interface failures, such as CRC errors, detected when the transmission/reception circuits of the host 100 and the disk device 200 transmit information.

  The disk 230 is a disk for storing data.

  The disk control device 240 is a device that analyzes an I / O command from the host based on information from the transmission / reception circuit 210 and reads / writes data on the disk 230.

  Next, the operation of this embodiment will be described with reference to the flowchart of FIG.

  First, the operation when the central processing unit 110 in the host 100 instructs the input/output processing device 130 to perform data transfer with the disk device 200 will be described.

  The input/output processing device 130, instructed by the central processing unit 110 to execute the I/O command, receives the instruction (step S401).

  Before starting the I/O command, the input/output processing device 130 reads out and stores the value of the CRC error counter 150 (step S402).

  Thereafter, the transmission / reception circuit 160 is instructed to transmit the start of the I / O command to the disk device 200. Upon receiving the instruction, the transmission / reception circuit 160 transmits an I / O command start to the disk device 200 via the serial interface 300 (step S403). When transmitting / receiving information via the serial interface 300, the transmission / reception circuit 160 counts up the CRC error counter 150 when a serial interface failure such as a CRC error is detected.
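The counting behaviour described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `Transceiver`, the frame layout (payload followed by a 4-byte big-endian CRC-32), and the use of `zlib.crc32` are all assumptions made for illustration, and a real Fibre Channel port computes and checks its CRC in hardware. The sketch only shows the point made in the text: a frame that fails the check is discarded, and nothing happens beyond incrementing a counter.

```python
import zlib


class Transceiver:
    """Hypothetical stand-in for transmission/reception circuit 160."""

    def __init__(self) -> None:
        # Plays the role of CRC error counter 150.
        self.crc_error_counter = 0

    @staticmethod
    def encode(payload: bytes) -> bytes:
        # Assumed frame layout: payload followed by a 4-byte CRC-32 trailer.
        return payload + zlib.crc32(payload).to_bytes(4, "big")

    def receive(self, frame: bytes):
        """Return the payload, or None if the frame fails the CRC check."""
        payload, trailer = frame[:-4], frame[-4:]
        if zlib.crc32(payload).to_bytes(4, "big") != trailer:
            # Minor failure: the frame is discarded and only counted.
            self.crc_error_counter += 1
            return None
        return payload
```

Because a corrupted frame is simply dropped, the layer above sees no response at all, which is why the input/output processing device has to compare counter readings taken before and after the I/O command.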

  The transmission / reception circuit 210 in the disk device 200 receives an I / O command start from the host 100 via the serial interface 300. Then, the transmission / reception circuit 210 notifies the disk controller 240 of this (step S404).

  Upon receiving the notification, the disk controller 240 analyzes the I/O command received from the host 100 and reads or writes data on the disk 230. When execution of the I/O command is completed, the disk controller 240 instructs the transmission/reception circuit 210 to transmit an I/O command response to the host 100 (step S405). If the transmission/reception circuit 210 detects a serial interface failure such as a CRC error while transmitting information via the serial interface 300, it counts up the CRC error counter 220.

  To detect an abnormality of the disk device 200, the input/output processing device 130 monitors the time until an I/O command response is returned. It instructs the first time monitoring mechanism 141 to monitor for a time sufficient for the disk device 200 to execute the I/O command (hereinafter referred to as "time (A)") (step S406). It further instructs the second time monitoring mechanism 142 to monitor for a time that is shorter than time (A) but sufficient for a normal I/O command response to be returned (hereinafter referred to as "time (B)") (step S407).

  After that, the input/output processing device 130 continues to check whether any of the following has occurred: an "I/O command response from the disk device 200" via the transmission/reception circuit 160, a "first time monitoring over" from the first time monitoring mechanism 141, or a "second time monitoring over" from the second time monitoring mechanism 142.

  When the I/O command to the disk device 200 completes normally (Yes in step S408), the input/output processing device 130 recognizes the "I/O command response from the disk device 200" via the transmission/reception circuit 160 and instructs the first time monitoring mechanism 141 and the second time monitoring mechanism 142 to stop time monitoring. On receiving the instruction, the first time monitoring mechanism 141 and the second time monitoring mechanism 142 stop time monitoring (steps S409 and S410).

  Finally, the input/output processing device 130 reports the normal end of the I/O command to the central processing unit 110.

  On the other hand (No in step S408), if the first time monitoring mechanism 141 reports "first time monitoring over", that is, there is no I/O command response from the disk device 200 even after time (A) has elapsed (Yes in step S412), the input/output processing device 130 determines that the I/O command is abnormal and starts failure processing (step S415).

  In the case of No in step S412, if the second time monitoring mechanism 142 reports "second time monitoring over", that is, there is no I/O command response from the disk device 200 even after time (B) has elapsed (Yes in step S413), the input/output processing device 130 reads the CRC error counter 150 and compares it with the value read before the start of the I/O command (step S413).

  Based on the comparison result, the input/output processing device 130 confirms whether a CRC error or the like occurred during execution of the I/O command. If a CRC error or the like has occurred (Yes in step S414), it determines that this failure is the reason there is no I/O command response even after time (B) has elapsed, and starts failure processing (step S415).

  When the second time monitoring is not over (No in step S413), or when no CRC error or the like has occurred (No in step S414), the input/output processing device 130 judges that the response is delayed for some other reason and again continues to check for an "I/O command response from the disk device 200", a "first time monitoring over" from the first time monitoring mechanism 141, and a "second time monitoring over" from the second time monitoring mechanism 142.
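The decision made on each pass through this checking loop can be summarized in a short Python sketch. This is only an illustration under stated assumptions: the names `Action` and `decide` are hypothetical, the two monitoring mechanisms are reduced to a comparison of elapsed time against time (A) and time (B), and the default values of 20 s and 1 s are borrowed from the example timings given later in this description, not from any mandated setting.

```python
from enum import Enum


class Action(Enum):
    NORMAL_END = "report normal end"
    START_FAILURE_PROCESSING = "start failure processing"
    KEEP_WAITING = "keep waiting"


def decide(elapsed: float, responded: bool,
           crc_before: int, crc_now: int,
           time_a: float = 20.0, time_b: float = 1.0) -> Action:
    """One pass of the check loop (hypothetical helper).

    crc_before is the counter value stored before the I/O command was
    started; crc_now is the value read when the check is made.
    """
    if time_b >= time_a:
        raise ValueError("time (B) must be shorter than time (A)")
    if responded:
        # I/O command response arrived: normal completion.
        return Action.NORMAL_END
    if elapsed >= time_a:
        # First time monitoring over: always treated as abnormal.
        return Action.START_FAILURE_PROCESSING
    if elapsed >= time_b and crc_now != crc_before:
        # Second time monitoring over and the counter moved during the I/O.
        return Action.START_FAILURE_PROCESSING
    # Either time (B) has not elapsed or no CRC error was counted.
    return Action.KEEP_WAITING
```

For example, `decide(1.2, False, crc_before=3, crc_now=4)` yields `Action.START_FAILURE_PROCESSING`, whereas the same call with an unchanged counter yields `Action.KEEP_WAITING`, so a slow but healthy disk device is still given the full time (A).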

[Other Embodiments]
The embodiment described above is the simplest way of realizing the present invention. However, it has the drawback that no effect is obtained when it is the transmission/reception circuit 210 on the disk device 200 side that detects a failure on the serial interface 300. Therefore, the following two embodiments are described as modified examples.

  In the first modification, if the second time monitoring mechanism 142 reports "second time monitoring over", that is, there is no I/O command response from the disk device 200 even after time (B) has elapsed, the input/output processing device 130 reads the CRC error counter 150 and compares it with the value read before the start of the I/O command to check whether a CRC error or the like occurred during execution of the I/O command. If the check shows that no CRC error or the like has occurred, the input/output processing device 130 inquires of the disk device 200 whether the corresponding I/O command is being processed internally. If it is notified that the command is not being executed inside the disk device 200, it determines that a failure has occurred in the serial interface 300, that this is why there is no I/O command response even after time (B) has elapsed, and starts failure processing.
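A minimal Python sketch of the first modification follows. The function name, the returned strings, and the callable `disk_is_executing` are hypothetical; the patent does not specify how the inquiry to the disk device 200 is carried out. The branch taken when the disk device reports that the command is still executing is also an assumption: the text does not state it, and the sketch simply keeps waiting, by analogy with the base embodiment.

```python
from typing import Callable


def first_modification(crc_before: int, crc_now: int,
                       disk_is_executing: Callable[[], bool]) -> str:
    """Decision after 'second time monitoring over' (hypothetical helper)."""
    if crc_now != crc_before:
        # Same as the base embodiment: the host-side counter moved.
        return "failure processing: CRC error counted by host"
    if not disk_is_executing():
        # The disk device never saw (or already lost) the command, so the
        # failure is attributed to the serial interface.
        return "failure processing: serial interface failure"
    # Assumption: the command is still running, so wait for time (A).
    return "keep waiting"
```

The inquiry is made only when the host-side counter is unchanged, so the extra round trip to the disk device is avoided in the case the base embodiment already handles.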

  In the second modification, if the second time monitoring mechanism 142 reports "second time monitoring over", that is, there is no I/O command response from the disk device 200 even after time (B) has elapsed, the input/output processing device 130 reads the CRC error counter 150 and compares it with the value read before the start of the I/O command to check whether a CRC error or the like occurred during execution of the I/O command. If the check shows that no CRC error or the like has occurred, the input/output processing device 130 inquires of the disk device 200 about the CRC error counter of the port on the disk device 200 side. If that counter has been counted up, it determines that the counted failure is why there is no I/O command response even after time (B) has elapsed, and starts failure processing.
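The second modification can be sketched in the same style. Again the names are hypothetical. One further assumption is made loudly here: the text does not say how the host recognizes that the disk-side counter has been counted up, so the sketch assumes the host holds an earlier reading (`disk_crc_before`) and compares it with a fresh one obtained through the callable `read_disk_crc_counter`.

```python
from typing import Callable


def second_modification(host_crc_before: int, host_crc_now: int,
                        disk_crc_before: int,
                        read_disk_crc_counter: Callable[[], int]) -> str:
    """Decision after 'second time monitoring over' (hypothetical helper)."""
    if host_crc_now != host_crc_before:
        # CRC error counter 150 on the host side was counted up.
        return "failure processing: CRC error counted by host"
    # Assumption: a count-up is detected by comparing with an earlier reading.
    if read_disk_crc_counter() != disk_crc_before:
        # CRC error counter 220 on the disk device side was counted up.
        return "failure processing: CRC error counted by disk device"
    return "keep waiting"
```

This covers the case named as the drawback of the base embodiment, where only the transmission/reception circuit 210 on the disk device side detects the failure.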

  In each of the embodiments described above, when a minor failure such as a CRC error occurs in the interface between the host and the disk device, it is possible to detect an abnormality and start the failure processing after a certain time (B) has elapsed. As a result, the failure processing of the affected I / O instruction can be started earlier, and the delay of the entire system can be prevented.

  This point will be described with reference to the sequence diagrams of FIGS.

  FIG. 3 shows an operation example when this embodiment is not applied. First, the host 100 notifies the disk device 200 of the start of an I / O command (step A501). Then, the disk device 200 notifies the host 100 of an I / O command response (step A502). In addition, as shown in FIG. 3, this process is normally completed in less than 1 second.

  When a CRC error occurs and information is discarded, only the CRC error counter is incremented (step A503).

  Thereafter, a timeout is detected after 20 seconds, and the processing of the I / O failure is started (step A504). As a result, it takes about 20 seconds to start the failure processing.

  Next, FIG. 4 shows an operation example when this embodiment is applied. First, the host 100 notifies the disk device 200 of the start of an I / O command (step A501). Then, the disk device 200 notifies the host 100 of an I / O command response (step A502). In addition, as shown in FIG. 3, this process is normally completed in less than 1 second.

  When a CRC error occurs and information is discarded, only the CRC error counter is incremented (step A503).

  Thereafter, a timeout is detected after 1 second (fixed time B), and the CRC error counter is checked. Since the counter is incremented as a result of the check, I / O failure processing is started (step A505). As a result, it takes about 1 second to start the failure processing.

  Furthermore, since this embodiment can determine whether the cause of the failure is a host, a disk device, or a serial interface, it is possible to start more appropriate failure processing.

  Note that the host and disk device according to the embodiment of the present invention can be realized by hardware, software, or a combination thereof.

  Moreover, although the above-described embodiment is a preferred embodiment of the present invention, the scope of the present invention is not limited to it, and the invention can be implemented in variously modified forms without departing from its gist.

100 Host
110 Central processing unit
120 Main storage unit
130 Input/output processing unit
141 First time monitoring mechanism
142 Second time monitoring mechanism
150, 220 CRC error counter
160, 210 Transmission/reception circuit
200 Disk device
230 Disk
240 Disk control device
300 Serial interface

Claims (9)

  1. An I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, comprising:
    a counter that counts the number of failures when a failure is detected in the host;
    first time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying input/output processing means to that effect;
    second time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing means to that effect; and
    the input/output processing means, which starts failure processing when there is a notification from the first time monitoring means, and when there is a notification from the second time monitoring means and the counter has been counted up,
    wherein, when there is a notification from the second time monitoring means and the counter has not been counted up, the input/output processing means inquires of the disk device whether the issued I/O processing is being executed internally and, if notified in reply that it is not being executed inside the disk device, determines that a failure has occurred in the interface and performs the failure processing for the failure in the interface.
  2. An I/O instruction failure recovery circuit incorporated in a host connected to a disk device through an interface, comprising:
    a first counter that counts the number of failures when a failure is detected in the host;
    first time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a first fixed time, notifying input/output processing means to that effect;
    second time monitoring means for time-monitoring the response when the host issues an I/O command to the disk device and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing means to that effect; and
    the input/output processing means, which starts failure processing when there is a notification from the first time monitoring means, and when there is a notification from the second time monitoring means and the first counter has been counted up,
    wherein, when there is a notification from the second time monitoring means and the first counter included in the I/O instruction failure recovery circuit has not been counted up, the input/output processing means inquires whether a second counter, which is provided in the disk device and counts the number of failures when a failure is detected in the disk device, has been counted up and, if notified in reply that it has been counted up, determines that the lack of response is an effect of the counted failure and performs the failure processing for that failure.
  3. The I/O instruction failure recovery circuit according to claim 1 or 2, wherein the failure detected and counted by the counter is a CRC (Cyclic Redundancy Check) error.
  4. An I/O instruction failure recovery method performed by an I/O instruction failure recovery circuit incorporated in a host that is connected to a disk device via an interface, comprising:
    a step of providing a counter that, when a failure is detected in the host, counts the number of such failures;
    a first time monitoring step of, when the host issues an I/O command to the disk device, monitoring the time taken for the response and, if there is no response within a first fixed time, notifying an input/output processing step to that effect;
    a second time monitoring step of, when the host issues an I/O command to the disk device, monitoring the time taken for the response and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing step to that effect; and
    an input/output processing step of starting failure processing when there is a notification from the first time monitoring step, or when there is a notification from the second time monitoring step and the counter has been counted up,
    wherein, in the input/output processing step, when there is a notification from the second time monitoring step and the counter has not been counted up, an inquiry is made to the disk device as to whether the I/O processing issued to the disk device is being executed internally, and, when a reply indicating that the processing is not being executed is received, it is determined that a failure has occurred in the interface and the failure processing is performed for the failure in the interface.
  5. An I/O instruction failure recovery method performed by an I/O instruction failure recovery circuit incorporated in a host that is connected to a disk device via an interface, comprising:
    a step of providing a first counter that, when a failure is detected in the host, counts the number of such failures;
    a first time monitoring step of, when the host issues an I/O command to the disk device, monitoring the time taken for the response and, if there is no response within a first fixed time, notifying an input/output processing step to that effect;
    a second time monitoring step of, when the host issues an I/O command to the disk device, monitoring the time taken for the response and, if there is no response within a second fixed time that is shorter than the first fixed time, notifying the input/output processing step to that effect; and
    an input/output processing step of starting failure processing when there is a notification from the first time monitoring step, or when there is a notification from the second time monitoring step and the first counter has been counted up,
    wherein, in the input/output processing step, when there is a notification from the second time monitoring step and the first counter provided in the I/O instruction failure recovery circuit has not been counted up, an inquiry is made to the disk device as to whether a second counter, which counts the number of failures detected in the disk device, has been counted up, and, when a reply indicating that it has been counted up is received, it is determined that the delay is an effect of the failure detection that was counted up and the failure processing is performed for the counted-up failure.
  6. The I/O instruction failure recovery method according to claim 4 or 5, wherein the failure detected and counted by the counter is a CRC (Cyclic Redundancy Check) error.
  7. An I/O instruction failure recovery program to be installed in an I/O instruction failure recovery circuit incorporated in a host that is connected to a disk device via an interface, the program causing a computer to function as an I/O instruction failure recovery circuit comprising:
    a counter that, when a failure is detected in the host, counts the number of such failures;
    first time monitoring means that, when the host issues an I/O command to the disk device, monitors the time taken for the response and, if there is no response within a first fixed time, notifies input/output processing means to that effect;
    second time monitoring means that, when the host issues an I/O command to the disk device, monitors the time taken for the response and, if there is no response within a second fixed time that is shorter than the first fixed time, notifies the input/output processing means to that effect; and
    input/output processing means that starts failure processing when there is a notification from the first time monitoring means, or when there is a notification from the second time monitoring means and the counter has been counted up,
    wherein, when the input/output processing means receives a notification from the second time monitoring means and the counter has not been counted up, the input/output processing means inquires of the disk device whether the I/O processing issued to the disk device is being executed internally, and, when a reply indicating that the processing is not being executed is received, determines that a failure has occurred in the interface and performs the failure processing for the failure in the interface.
  8. An I/O instruction failure recovery program to be installed in an I/O instruction failure recovery circuit incorporated in a host that is connected to a disk device via an interface, the program causing a computer to function as an I/O instruction failure recovery circuit comprising:
    a first counter that, when a failure is detected in the host, counts the number of such failures;
    first time monitoring means that, when the host issues an I/O command to the disk device, monitors the time taken for the response and, if there is no response within a first fixed time, notifies input/output processing means to that effect;
    second time monitoring means that, when the host issues an I/O command to the disk device, monitors the time taken for the response and, if there is no response within a second fixed time that is shorter than the first fixed time, notifies the input/output processing means to that effect; and
    input/output processing means that starts failure processing when there is a notification from the first time monitoring means, or when there is a notification from the second time monitoring means and the first counter has been counted up,
    wherein, when the input/output processing means receives a notification from the second time monitoring means and the first counter included in the I/O instruction failure recovery circuit has not been counted up, the input/output processing means inquires of the disk device whether a second counter, which counts the number of failures detected in the disk device, has been counted up, and, when a reply indicating that it has been counted up is received, determines that the delay is an effect of the failure detection that was counted up and performs the failure processing for the counted-up failure.
  9. The I/O instruction failure recovery program according to claim 7 or 8, wherein the failure detected and counted by the counter is a CRC (Cyclic Redundancy Check) error.
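
The decision logic recited in the claims above can be illustrated with a minimal Python sketch. It is not the patented implementation. All names (`Action`, `fired_timer`, `decide`, and the parameter names) are hypothetical, the two inquiry variants (whether the disk device is executing the I/O, and whether the disk-side second counter has counted up) are folded into one function purely for illustration, and returning to waiting when neither reply indicates a failure is an assumption, since the claims do not state that case.

```python
from enum import Enum


class Action(Enum):
    WAIT = "keep waiting for the response"
    FAILURE_PROCESSING = "start failure processing"
    INTERFACE_FAILURE = "failure processing for an interface failure"
    DISK_FAILURE = "failure processing for a failure counted in the disk device"


def fired_timer(elapsed, first_time, second_time):
    """Return 'first', 'second', or None for an I/O command still unanswered after `elapsed`."""
    if not second_time < first_time:
        raise ValueError("the second fixed time must be shorter than the first")
    if elapsed >= first_time:
        return "first"
    if elapsed >= second_time:
        return "second"
    return None


def decide(timer, host_counter_up, disk_executing_io=None, disk_counter_up=None):
    """Choose what the input/output processing does for a given timer notification.

    disk_executing_io: reply to the inquiry of claims 1, 4, 7 (None if not asked).
    disk_counter_up:   reply to the inquiry of claims 2, 5, 8 (None if not asked).
    """
    if timer is None:
        return Action.WAIT
    if timer == "first":
        # No response within the first (longer) fixed time.
        return Action.FAILURE_PROCESSING
    # Second (shorter) timer fired.
    if host_counter_up:
        # The host-side counter recorded a failure such as a CRC error.
        return Action.FAILURE_PROCESSING
    if disk_executing_io is False:
        # The disk device is not executing the command, so it was lost on the interface.
        return Action.INTERFACE_FAILURE
    if disk_counter_up:
        # The disk-side counter recorded the failure.
        return Action.DISK_FAILURE
    # Assumption: otherwise keep waiting until the first timer fires.
    return Action.WAIT
```

For example, with a hypothetical first fixed time of 30 and second fixed time of 10 (units arbitrary), `decide(fired_timer(12, 30, 10), host_counter_up=False, disk_executing_io=False)` yields `Action.INTERFACE_FAILURE`, so recovery can begin without waiting for the longer timeout.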
JP2009024857A 2009-02-05 2009-02-05 I / O instruction failure recovery circuit, I / O instruction failure recovery method, and I / O instruction failure recovery program Active JP5288331B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2009024857A JP5288331B2 (en) 2009-02-05 2009-02-05 I / O instruction failure recovery circuit, I / O instruction failure recovery method, and I / O instruction failure recovery program

Publications (2)

Publication Number Publication Date
JP2010182080A (en) 2010-08-19
JP5288331B2 (en) 2013-09-11

Family

ID=42763635

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
JP2009024857A Active JP5288331B2 (en) 2009-02-05 2009-02-05 I / O instruction failure recovery circuit, I / O instruction failure recovery method, and I / O instruction failure recovery program

Country Status (1)

Country Link
JP (1) JP5288331B2 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3819166B2 (en) * 1998-11-27 2006-09-06 ヒタチグローバルストレージテクノロジーズネザーランドビーブイ Energy consumption reduction method
JP2001228981A (en) * 2000-02-16 2001-08-24 Hitachi Electronics Eng Co Ltd Storage medium library array device
JP2002023966A (en) * 2000-06-30 2002-01-25 Toshiba Corp Disk system for making transfer data as redundant data
JP4317436B2 (en) * 2003-12-16 2009-08-19 株式会社日立製作所 Disk array system and interface conversion device
JP5058582B2 (en) * 2006-12-21 2012-10-24 日本電気株式会社 Multipath system of storage device, failure location identification method and program
JP2008171139A (en) * 2007-01-10 2008-07-24 Hitachi Computer Peripherals Co Ltd Testing device and testing method for storage system

Also Published As

Publication number Publication date
JP2010182080A (en) 2010-08-19

Legal Events

Date Code Title Description
RD03 Notification of appointment of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7423

Effective date: 20100715

RD04 Notification of resignation of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7424

Effective date: 20100715

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20120112

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20121015

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20121026

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20121220

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20130513

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20130526