US20170242760A1 - Monitoring device, fault-tolerant system, and control method - Google Patents
Monitoring device, fault-tolerant system, and control method
- Publication number
- US20170242760A1 (application US15/426,243)
- Authority
- US
- United States
- Prior art keywords
- data
- fault
- read
- processor
- processor system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/1641—Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/1654—Error detection by comparing the output of redundant processing systems where the output of only one of the redundant processing components can drive the attached hardware, e.g. memory or I/O
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/04—Programme control other than numerical control, i.e. in sequence controllers or logic controllers
- G05B19/042—Programme control other than numerical control, i.e. in sequence controllers or logic controllers using digital processors
- G05B19/0428—Safety, monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1608—Error detection by comparing the output signals of redundant hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1608—Error detection by comparing the output signals of redundant hardware
- G06F11/1616—Error detection by comparing the output signals of redundant hardware where the redundant component is an I/O device or an adapter therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/1637—Error detection by comparing the output of redundant processing systems using additional compare functionality in one or some but not all of the redundant processing components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/165—Error detection by comparing the output of redundant processing systems with continued operation after detection of the error
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1658—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit
- G06F11/1662—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit the resynchronized component or unit being a persistent storage device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2035—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant without idle spare hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3037—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
Definitions
- the present invention relates to a lockstep fault-tolerant system.
- a fault-tolerant system is known as a technique for enabling continuation of service processed by a computer in operation by masking a hardware fault even when the fault occurs in the computer.
- a fault-tolerant system which uses the lockstep scheme is available as an exemplary fault-tolerant system.
- hardware components of the computer serve as multiple-system components.
- the respective systems including identical hardware components perform the same operation in synchronism at the same clock frequency. Performing the same operation in synchronism at the same clock frequency will also be referred to as a lockstep operation hereinafter.
- the status in which the same operation is performed in synchronism at the same clock frequency will also be referred to as a lockstep status hereinafter.
- the status in which the lockstep status fails to be maintained due, for example, to a fault will also be referred to as loss of lockstep hereinafter.
- In the lockstep scheme, even when one of a plurality of systems suffers a fault and causes loss of lockstep, the processing can be continued by the operations of the remaining normal systems.
- the fault-tolerant system disclosed in the reference 1 includes a plurality of systems including identical hardware components.
- Each system includes a processor system including a CPU (Central Processing Unit), an I/O system including I/O (input/output) devices such as a storage device and a network device, and a controller.
- the processor system of each system performs a lockstep operation.
- the I/O system of each system is configured to maintain sufficient redundancy between the individual I/O systems by mirroring processing which uses the CPU of the processor system.
- the controller determines whether an inconsistency has occurred in operation between the processor systems.
- The controller, for example, compares data to be transferred from the self-system processor system to the self-system I/O system with data to be transferred from the different-system processor system to the self-system I/O system.
- the controller separates a processor system determined in accordance with a predefined method from the fault-tolerant system.
- An inconsistency may occur in the data when, for example, data flowing from the CPU is partially garbled, or the data timing becomes off. Further, the inconsistency may occur in the data when an abnormality occurs within the processor system performing the lockstep operation. It may be temporarily determined that a fault has occurred upon, for example, memory garbling due to the presence of external electrical noise, cosmic rays, or other types of radiation. In this case, the processor system detected to have the fault is separated from the fault-tolerant system.
- Various methods have been proposed to separate such a processor system. For example, a method is available for calculating levels of priority based on the MTBF (Mean Time Between Failures) or the frequency of occurrence of faults of each processor system and determining the processor system to be separated based on the calculated levels of priority.
- a monitoring device of the present invention includes a processor executing instructions to:
- the memory being provided in an accessory device to be monitored, the accessory device connecting with a processor system of a fault-tolerant system including a plurality of operational systems, each operational system having an identical configuration including the processor system;
- a fault-tolerant system of the present invention includes:
- each operational system including:
- the monitoring device including a processor executing instructions to:
- a control method of the present invention includes:
- the memory being provided in an accessory device to be monitored, the accessory device connecting with a processor system of a fault-tolerant system including a plurality of operational systems, each operational system having an identical configuration including the processor system;
- FIG. 1 is a block diagram illustrating a configuration of a fault-tolerant system in a first example embodiment according to the present invention
- FIG. 2 is a block diagram illustrating exemplary hardware components constituting the fault-tolerant system in the first example embodiment
- FIG. 3 is a flowchart for explaining an exemplary operation for monitoring an external device in the first example embodiment
- FIG. 4 is a block diagram illustrating a configuration of a fault-tolerant system in a second example embodiment according to the present invention.
- FIG. 5 is a block diagram illustrating exemplary hardware components constituting the fault-tolerant system in the second example embodiment
- FIG. 6 is a flowchart for explaining an operation to update an address storage unit and a data storage unit in the fault-tolerant system of the second example embodiment
- FIG. 7 is a flowchart for explaining an operation to monitor an external device in the second example embodiment
- FIG. 8 is a block diagram illustrating a simplified configuration of a monitoring device in other example embodiments according to the present invention.
- FIG. 9 is a block diagram illustrating a simplified configuration of a fault-tolerant system in other example embodiments according to the present invention.
- FIG. 1 is a block diagram illustrating a configuration of a fault-tolerant system in a first example embodiment according to the present invention.
- a fault-tolerant system 1 includes a plurality of systems (operational systems) 100 . Although two systems 100 are illustrated in FIG. 1 , the number of systems 100 included in the fault-tolerant system 1 is not limited.
- Each system 100 includes identical hardware components.
- each system 100 includes a processor system 10 , an I/O system 20 , a controller 30 , an external device (accessory device) 40 , and a monitoring device 50 .
- a processor system 10 for processing instructions
- I/O system 20 for processing data
- controller 30 for controlling the flow of data
- external device 40 including a storage function.
- monitoring device 50 for monitoring whether the external device 40 suffers a fault.
- The number of components of each type included in each system 100 is not limited to that illustrated in FIG. 1.
- the processor system 10 performs the lockstep operation in cooperation with the processor systems 10 of the different systems 100 . More specifically, the processor system 10 includes a CPU (Central Processing Unit) 101 , a memory 102 , a device interface 103 , and a CPU state machine 104 , as illustrated in FIG. 2 , as hardware components.
- a self-system means the system 100 including itself or a component included in the system 100 including itself.
- a different-system means the system 100 which does not include itself or a component included in the system 100 not including itself.
- The CPU 101 performs the same operation, in synchronism and at the same clock frequency, as the CPU 101 of the processor system 10 of different-system.
- The memory 102 functions as a main storage device and is kept in the same storage status as that of the memory 102 of the processor system 10 of different-system by the control operation of the CPU 101.
- The processor system 10 can access the I/O system 20 of self-system via the controller 30.
- The processor system 10 can also access the I/O system 20 of different-system via the controllers 30 of self-system and different-system.
- the processor system 10 includes a function of transferring data to the I/O systems 20 of self-system and different-system.
- the processor system 10 further includes a function of accessing the storage area of the external device 40 . More specifically, the device interface 103 of the processor system 10 includes a function of writing data into the external device 40 or reading data from the external device 40 in accordance with a command from the CPU 101 . The device interface 103 further includes a function of reading data from the external device 40 in accordance with a request from the monitoring device 50 .
- the CPU state machine 104 at least stores information representing whether the processor system 10 of self-system has been mounted in the fault-tolerant system 1 (also called an online status) or separated from the fault-tolerant system 1 (also called a broken status).
- the I/O system 20 includes at least one I/O (Input/Output) device.
- The I/O system 20 is configured to maintain sufficient redundancy between itself and the I/O systems 20 of different-system by a mirroring process implemented by software executed on the processor system 10.
- the controller 30 is connected with the processor system 10 and the I/O system 20 .
- the controllers 30 of the respective systems 100 are communicably connected to each other by cross-links.
- the controller 30 includes a function of monitoring whether the processor system 10 is in the lockstep status and determining whether the processor system 10 needs to be separated from the fault-tolerant system 1 in accordance with the monitoring result.
- the controller 30 compares data flowing from the processor system 10 of self-system to the I/O system 20 of self-system with data flowing from the processor system 10 of different-system to the I/O system 20 of self-system. If a result of the comparison indicates a difference (in the case of the loss of lockstep), the controller 30 determines whether the processor system 10 of self-system needs to be separated from the fault-tolerant system 1 . More specifically, the controller 30 determines that separation is necessary when it determines that the processor system 10 of self-system is more likely to suffer a fault than the different system.
- The controller 30 may determine whether the processor system 10 of self-system is more likely to suffer a fault than the different system, based on the numbers of past separation operations and recombining operations recorded for each processor system 10.
- the controller 30 includes a function of separating the processor system 10 of self-system from the fault-tolerant system 1 when it determines that the processor system 10 of self-system is more likely to suffer a fault than the different system.
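- The comparison and separation decision made by the controller 30 can be sketched in software as follows. This is only an illustrative model, not the disclosed hardware: the names `ProcessorHistory`, `lockstep_lost`, and `should_separate_self` are hypothetical, and the ranking rule (more past separations first, then more recombining operations) is an assumption, since the text only says the decision is based on these recorded counts. Tie-breaking is not described in the text and is not handled here.

```python
from dataclasses import dataclass


@dataclass
class ProcessorHistory:
    """Hypothetical per-processor-system record kept by a controller."""
    separations: int = 0   # times this processor system was separated in the past
    recombinings: int = 0  # times it was recombined afterwards


def lockstep_lost(self_output: bytes, other_output: bytes) -> bool:
    """Treat any difference between the two data streams bound for the
    self-system I/O system as loss of lockstep."""
    return self_output != other_output


def should_separate_self(self_hist: ProcessorHistory,
                         other_hist: ProcessorHistory) -> bool:
    """Return True when the self-system looks more fault-prone.

    Assumption: a larger separation count means "more likely to suffer a
    fault", with the recombining count as a secondary key. Equal records
    return False for both sides; the text does not define a tie-break.
    """
    self_score = (self_hist.separations, self_hist.recombinings)
    other_score = (other_hist.separations, other_hist.recombinings)
    return self_score > other_score
```

In this model, each controller would call `should_separate_self` only after `lockstep_lost` reports a difference, so that at most one side decides to separate when the records differ.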
- the external device 40 includes a storage function.
- the external device 40 is implemented as, for example, a flash memory device.
- the external device 40 is connected to the processor system 10 .
- the monitoring device 50 includes a function of monitoring whether the external device 40 suffers a fault.
- the monitoring device 50 includes a read unit 51 , a comparison unit 52 , a data storage unit 53 , and a separation unit 54 as functional units, as illustrated in FIG. 1 .
- the monitoring device 50 is implemented in a hardware configuration including a timer 501 , a read generation circuit 502 , a register 503 , a comparison circuit 504 , and a control signal output circuit 505 , as illustrated in FIG. 2 .
- the timer 501 , the read generation circuit 502 , the register 503 , the comparison circuit 504 , and the control signal output circuit 505 are formed in, for example, a processor 510 .
- FIG. 2 merely illustrates an example and the hardware components included in the system 100 are not limited to these examples.
- The read unit 51 of the monitoring device 50 includes a function of reading data from a predetermined storage area in the external device 40 at each predetermined timing.
- the read unit 51 is implemented by the timer 501 and the read generation circuit 502 illustrated in FIG. 2 and controls the device interface 103 of the processor system 10 to implement its function.
- the timer 501 outputs a signal for determining a predetermined timing.
- the read generation circuit 502 outputs a read command for reading data from the predetermined storage area in the external device 40 to the device interface 103 at a timing based on the signal output from the timer 501 .
- the external device 40 is implemented as a flash memory device.
- A flash memory device stores SFDP (Serial Flash Discoverable Parameters).
- SFDP is represented by a 32-bit fixed value defined by JEDEC (Joint Electron Device Engineering Council) and is independent of the vendor.
- The read unit 51 outputs the read command for the SFDP storage area via the device interface 103.
- Data stored in the predetermined storage area of the external device 40 may be the fixed value which is not updated, as described above, or data updated by, for example, the processor system 10 .
- the device interface 103 reads data from the predetermined storage area in the external device 40 in accordance with the read command and transmits (sends back) the read data to the monitoring device 50 .
- the data storage unit 53 is implemented by the register 503 illustrated in FIG. 2 .
- the data storage unit 53 stores reference data.
- the reference data means data to be compared with data read from the external device 40 by the read unit 51 .
- When the fixed value is stored in the storage area of the external device 40 from which data is read by the read unit 51, the fixed value is stored in the data storage unit 53 in advance.
- the external device 40 is implemented as a flash memory device and the SFDP area is defined as the storage area from which data is read by the read unit 51 , as described earlier.
- the data storage unit 53 stores the value of SFDP.
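- As an illustration of a fixed reference value, the sketch below assumes that the 32-bit value in question is the signature at the start of the SFDP header, which consists of the ASCII characters "SFDP". The text does not state which SFDP field is used, so this choice, and the helper name `matches_reference`, are assumptions made for the example.

```python
import struct

# Signature at the start of the SFDP header: the ASCII characters "SFDP".
SFDP_SIGNATURE_BYTES = b"SFDP"
# The same four bytes interpreted as a little-endian 32-bit word.
SFDP_SIGNATURE_WORD = struct.unpack("<I", SFDP_SIGNATURE_BYTES)[0]


def matches_reference(read_data: bytes,
                      reference: int = SFDP_SIGNATURE_WORD) -> bool:
    """Compare a 4-byte value read from the device with the reference word."""
    if len(read_data) != 4:
        return False  # a short or long read cannot match a 32-bit reference
    (word,) = struct.unpack("<I", read_data)
    return word == reference
```

Because the value does not depend on the vendor, the same reference can be preloaded regardless of which flash memory device is fitted, which is the property the fixed-value approach relies on.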
- the comparison unit 52 includes a function of comparing data (read-data) read from the external device 40 by the read unit 51 with reference data stored in the data storage unit 53 . More specifically, the comparison unit 52 is implemented by the comparison circuit 504 illustrated in FIG. 2 .
- the data (read-data) read from the external device 40 by the device interface 103 of the processor system 10 in accordance with the read command issued by the read unit 51 is input to the comparison circuit 504 .
- the reference data in the register 503 (data storage unit 53 ) is further input to the comparison circuit 504 .
- the comparison circuit 504 compares the read-data with the reference data and outputs a result of the comparison to the separation unit 54 .
- the separation unit 54 includes a function of separating the processor system 10 determined in accordance with predetermined separation conditions from the fault-tolerant system 1 when the comparison result obtained by the comparison unit 52 indicates a difference. More specifically, the separation unit 54 is implemented by the control signal output circuit 505 illustrated in FIG. 2 and controls the CPU state machine 104 of the processor system 10 to implement its function.
- the control signal output circuit 505 outputs a control signal to make a transition to a broken status to the CPU state machine 104 of the processor system 10 , in response to a signal input from the comparison circuit 504 and indicating the difference.
- The control signal output circuit 505 outputs an OFF signal, a reset signal, or the like required in the separation process to each hardware component constituting the processor system 10.
- the processor system 10 of each system 100 starts the lockstep operation.
- During the lockstep operation, the operation for monitoring the lockstep status by the controller 30 and the operation for monitoring the external device 40 by the monitoring device 50 are performed.
- FIG. 3 is a flowchart illustrating an exemplary operation for monitoring the external device 40 by the monitoring device 50 .
- The read unit 51 first waits until a predetermined timing (step S1).
- the read unit 51 outputs the read command for reading data from the predetermined storage area in the external device 40 when the predetermined timing comes (step S 2 ).
- the comparison unit 52 determines whether the read-data read from the external device 40 in accordance with the read command issued by the read unit 51 and the reference data in the data storage unit 53 are equal to each other (step S 3 ).
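- When the read-data and the reference data are equal, the monitoring device 50 stands by to output the next read command.
- When the read-data and the reference data differ, the separation unit 54 separates the processor system 10 of self-system from the fault-tolerant system 1 (step S4).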
- the monitoring device 50 ends the operation for monitoring the external device 40 .
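- The flow of steps S1 to S4 can be modeled in software as follows. This is a minimal sketch under stated assumptions, not the circuit of FIG. 2: the class `MonitoringDevice`, the enum `CpuState`, and the helper `run` are hypothetical names, the `read_device` callable stands in for the read issued through the device interface 103, and the wait of step S1 is represented by the caller invoking `tick()` once per timer expiry.

```python
from enum import Enum
from typing import Callable


class CpuState(Enum):
    ONLINE = "online"   # mounted in the fault-tolerant system
    BROKEN = "broken"   # separated from the fault-tolerant system


class MonitoringDevice:
    """Hypothetical software model of steps S1 to S4."""

    def __init__(self, read_device: Callable[[], bytes], reference: bytes):
        self.read_device = read_device  # stands in for the device interface read
        self.reference = reference      # stands in for the data storage unit
        self.cpu_state = CpuState.ONLINE

    def tick(self) -> bool:
        """Run one monitoring cycle at a timer expiry (S2 to S4).

        Returns True if monitoring should continue, False once the
        processor system has been separated.
        """
        if self.cpu_state is CpuState.BROKEN:
            return False
        read_data = self.read_device()       # S2: issue the read command
        if read_data == self.reference:      # S3: compare with the reference data
            return True                      # equal: stand by for the next timing
        self.cpu_state = CpuState.BROKEN     # S4: separate the processor system
        return False


def run(monitor: MonitoringDevice, max_cycles: int) -> int:
    """Call tick() up to max_cycles times; return the number of cycles run."""
    cycles = 0
    for _ in range(max_cycles):
        cycles += 1
        if not monitor.tick():
            break
    return cycles
```

Setting the state to `BROKEN` corresponds to the control signal that moves the CPU state machine 104 to the broken status; the OFF and reset signals sent to the other hardware components are omitted from the model.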
- the fault-tolerant system 1 continues the processing using the processor system 10 of the unseparated system 100 .
- When only one processor system 10 continues the processing, it operates without the operation (for example, the operation of the monitoring device 50) associated with the lockstep operation.
- the fault-tolerant system 1 includes two systems 100 .
- the two systems 100 are distinguished as systems 100 a and 100 b.
- a flash memory device is connected to the processor system 10 as the external device 40 .
- the external device 40 stores the BIOS (Basic Input Output System) code.
- The external device 40 stores SFDP and the data storage unit 53 of the monitoring device 50 stores the value of SFDP.
- the external device 40 includes no function of detecting and notifying a fault of its own. The frequency of access to the external device 40 by the CPU 101 is lower than that to the memory 102 by the CPU 101 .
- the frequency of access to the external device 40 by the CPU 101 is as low as, for example, the frequency of reading the BIOS code from the external device 40 by the CPU 101 at the start or restart of the system 100 .
- the processor system 10 of one of the systems 100 a and 100 b that has been separated and recombined more times in the past is separated from the fault-tolerant system 1 by the controller 30 .
- The external device 40 (flash memory device) of the system 100a is assumed to suffer a fault while the processor system 10 of each of the systems 100a and 100b normally performs the lockstep operation.
- the read-data read from the SFDP area in the external device 40 and the value of SFDP that is the reference data in the data storage unit 53 become different from each other.
- the processor system 10 of the system 100 a is thus separated from the fault-tolerant system 1 by the operation of the separation unit 54 of the monitoring device 50 .
- the processor system 10 of the system 100 b continues the processing in the fault-tolerant system 1 .
- the processor system 10 accesses the external device 40 suffering the fault to read the BIOS code.
- the processor system 10 of the system 100 a detects the error resulting from the fault of the external device 40 and separates itself from the fault-tolerant system 1 .
- the processor systems 10 of both the systems 100 a and 100 b are separated from the fault-tolerant system 1 , resulting in the system crash.
- each of the systems 100 a and 100 b includes the monitoring device 50 .
- the processor system 10 of the system 100 a connected to the external device 40 suffering the fault is separated from the fault-tolerant system 1 by the monitoring device 50 before the loss of lockstep is detected by the controller 30 . Therefore, the fault-tolerant system 1 can avoid the system crash resulting from a fault of the external device 40 .
- the fault-tolerant system 1 in the first example embodiment can more reliably prevent the system crash or degradation in availability resulting from the fault of the external device 40 connected to the processor system 10 that performs a lockstep operation.
- the monitoring device 50 which detects the abnormality of the external device 40 by monitoring the operation of the external device 40 is provided.
- the fault-tolerant system 1 in the first example embodiment can quickly detect the fault of the external device 40 and quickly separate the system 100 with its external device 40 suffering the fault from the fault-tolerant system 1 .
- the fault-tolerant system 1 in the first example embodiment can reduce the possibility that the processor system 10 connected to the external device 40 suffering no fault will be separated from the fault-tolerant system 1 due to the fault of the external device 40 . Therefore, the fault-tolerant system 1 in the first example embodiment can prevent the system crash or degradation in availability resulting from the fault of the external device 40 .
- the second example embodiment exemplifies the case where an external device without an area which stores a fixed value, as in SFDP of a flash memory device, is employed as the external device 40 .
- FIG. 4 is a block diagram illustrating a configuration of a fault-tolerant system 2 in the second example embodiment.
- the fault-tolerant system 2 includes a plurality of systems 200 . Although two systems 200 are illustrated in FIG. 4 , the number of systems 200 included in the fault-tolerant system 2 is not limited.
- Each system 200 includes identical hardware components.
- Each system 200 includes a monitoring device 60 in place of the monitoring device 50 in the first example embodiment.
- the monitoring device 60 includes the comparison unit 52 , the separation unit 54 , a read unit 61 , a data storage unit 63 , a data update unit 65 , and an address storage unit 66 .
- the monitoring device 60 includes the timer 501 , the read generation circuit 502 , the register 503 , the comparison circuit 504 , the control signal output circuit 505 , an access monitoring circuit 606 , and a register 607 .
- the timer 501 , the read generation circuit 502 , the register 503 , the comparison circuit 504 , the control signal output circuit 505 , the access monitoring circuit 606 , and the register 607 are built into, for example, a processor 610 .
- FIG. 5 merely illustrates an example and the hardware components included in the system 200 are not limited to these examples.
- the address storage unit 66 of the monitoring device 60 is implemented by the register 607 illustrated in FIG. 5 .
- the data update unit 65 includes a function of storing in the address storage unit 66 , the address of the access destination at which the processor system 10 accesses the external device 40 at a predetermined point in time.
- the data update unit 65 is implemented by the access monitoring circuit 606 illustrated in FIG. 5 and controls the device interface 103 of the processor system 10 to implement its function.
- the predetermined point in time means herein, for example, the point in time at which the system 200 accesses the external device 40 for the first time after the start of the system 200 .
- the data update unit 65 includes a function of storing in the data storage unit 63 , data identical to that stored in the storage area of the external device 40 accessed by the processor system 10 at the predetermined point in time as described earlier.
- When the access is a read, the data update unit 65 stores the read data in the data storage unit 63.
- When the access is a write, the data update unit 65 stores the data written into the external device 40 in the data storage unit 63.
- The data update unit 65 further includes a function of, every time the data in the storage area of the external device 40 corresponding to the address stored in the address storage unit 66 is updated, updating the data in the data storage unit 63 to the updated data written to the external device 40. An update of the data in the storage area of the external device 40 can be detected by the access monitoring circuit 606. In other words, the access monitoring circuit 606 can detect the update of the data in the external device 40 by detecting the write command input to the external device 40 and the data to be written into it.
- The read unit 61 includes a function of reading data from the storage area of the external device 40 corresponding to the address stored in the address storage unit 66, at each predetermined timing.
- Configurations of the fault-tolerant system 2 in the second example embodiment other than the above-mentioned configurations are the same as those of the fault-tolerant system 1 in the first example embodiment.
- the operation of the fault-tolerant system 2 in the second example embodiment will be described below with reference to the drawings.
- the processor system 10 of each system 200 starts the lockstep operation, as in the fault-tolerant system 1 in the first example embodiment.
- During the lockstep operation, the operation for monitoring the lockstep status by the controller 30 and the operation for monitoring the external device 40 by the monitoring device 60 are performed.
- FIG. 6 is a flowchart illustrating an exemplary data updating operation by the data update unit 65 .
- the data update unit 65 determines whether the processor system 10 has accessed the external device 40 at the predetermined point in time (step S 11 ). When the data update unit 65 detects that the processor system 10 has accessed the external device 40 , it stores the address of the access destination in the address storage unit 66 (step S 12 ).
- the data update unit 65 stores in the data storage unit 63 , data stored in the storage area at the access destination at which the processor system 10 accesses the external device 40 (step S 13 ).
- When the access is a read, the data update unit 65 stores the read-data in the data storage unit 63 .
- When the access is a write, the data update unit 65 stores the data written into the external device 40 in the data storage unit 63 .
- the data update unit 65 determines whether the write command for writing data into the storage area of the external device 40 corresponding to the address stored in the address storage unit 66 has been output (step S 14 ). Upon detection of the write command, the data update unit 65 updates the data in the data storage unit 63 to data to be written into the external device 40 in accordance with the write command (step S 15 ).
- the data update unit 65 repeats the operations in step S 14 and the subsequent step.
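The flow of FIG. 6 can be sketched in software as follows. This is a minimal illustration only: the description implements the data update unit 65 with the access monitoring circuit 606, and the class name `DataUpdateUnit`, the `on_access` method, and the modelling of the two storage units as plain attributes are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataUpdateUnit:
    """Hypothetical software model of the data update unit 65 (FIG. 6)."""
    address: Optional[int] = None      # stands in for the address storage unit 66
    reference: Optional[bytes] = None  # stands in for the data storage unit 63

    def on_access(self, kind: str, address: int, data: bytes) -> None:
        """Observe one access by the processor system to the external device.

        kind is "read" or "write"; data is the value read or written.
        """
        if self.address is None:
            # Steps S11 to S13: first access after start; latch address and data.
            self.address = address
            self.reference = data
        elif kind == "write" and address == self.address:
            # Steps S14 and S15: a write to the latched address updates the reference.
            self.reference = data


unit = DataUpdateUnit()
unit.on_access("read", 0x100, b"\xaa")   # first access: address and data latched
unit.on_access("write", 0x200, b"\x01")  # different address: ignored
unit.on_access("write", 0x100, b"\xbb")  # monitored address: reference updated
```

After these three accesses the model holds address 0x100 and the most recently written value, which is what the read unit would later compare against.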
- FIG. 7 is a flowchart illustrating an exemplary operation for monitoring the external device 40 by the monitoring device 60 in the second example embodiment.
- When the read unit 61 detects that the predetermined timing has come (step S 1 ), it outputs the read command for reading data from the storage area of the external device 40 corresponding to the address stored in the address storage unit 66 (step S 22 ).
- the comparison unit 52 of the monitoring device 60 determines whether the read-data read from the external device 40 in accordance with the read command issued by the read unit 61 and the reference data in the data storage unit 63 are equal to each other (step S 3 ).
- When the read-data and the reference data are equal to each other, the monitoring device 60 stands by to output the next read command.
- When the read-data and the reference data are not equal to each other, the separation unit 54 separates the processor system 10 of self-system from the fault-tolerant system 2 (step S 4 ). The monitoring device 60 thus ends its operation for monitoring the external device 40 .
- the fault-tolerant system 2 continues the processing using the processor systems 10 of the unseparated systems 200 .
- the processor systems 10 of the unseparated systems 200 operate without the operation (for example, the operation of the monitoring device 60 ) associated with the lockstep operation.
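One pass of the monitoring flow of FIG. 7 can be sketched as below. This is a hedged illustration rather than the disclosed circuit: the function name `monitor_once` and the two callbacks (`read_device` for the read issued through the device interface, `separate` for the action of the separation unit 54) are hypothetical, and the timer that triggers each pass (step S 1 ) is left to the caller.

```python
from typing import Callable


def monitor_once(read_device: Callable[[int], bytes],
                 address: int,
                 reference: bytes,
                 separate: Callable[[], None]) -> bool:
    """One pass of steps S22, S3 and S4; returns True if monitoring continues."""
    read_data = read_device(address)   # step S22: read at the stored address
    if read_data == reference:         # step S3: compare with the reference data
        return True                    # equal: stand by for the next timing
    separate()                         # step S4: separate the self-system
    return False                       # monitoring ends after separation


# Hypothetical device contents; the second read simulates a fault.
contents = iter([b"\x53\x46", b"\x00\x00"])
events = []
results = [
    monitor_once(lambda addr: next(contents), 0x10, b"\x53\x46",
                 lambda: events.append("separated"))
    for _ in range(2)
]
```

The first pass matches the reference and monitoring continues; the second pass sees different data, triggers the separation callback once, and stops.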
- the fault-tolerant system 2 in the second example embodiment can more reliably prevent the system crash or degradation in availability resulting from the fault of the external device 40 even when a device without a storage area for a fixed value is connected as the external device 40 .
- the monitoring device 60 in the second example embodiment includes the data update unit 65 , in addition to the configuration of the monitoring device 50 in the first example embodiment.
- the data update unit 65 includes the function of storing in the address storage unit 66 , the address of the access destination at which the processor system 10 accesses the external device 40 at the predetermined point in time.
- the data update unit 65 further includes the function of storing in the data storage unit 63 , the data in the storage area of the external device 40 corresponding to the address of the access destination as reference data. Every time the data in the storage area of the external device 40 indicated by the address stored in the address storage unit 66 is updated, the data update unit 65 updates the data in the data storage unit 63 to the updated data.
- the fault-tolerant system 2 in the second example embodiment can obtain the same effect as in the first example embodiment even when the external device 40 such as a flash memory device before SFDP definition or a flash memory device without the storage area for the fixed value such as SFDP is mounted in it.
- the fault-tolerant system 2 in the second example embodiment can quickly detect the fault of the external device 40 and quickly separate the system 200 with its external device 40 suffering the fault from the fault-tolerant system.
- the fault-tolerant system 2 in the second example embodiment can prevent the normal system 200 with its external device 40 suffering no fault from being separated from the fault-tolerant system, as in the first example embodiment. This reduces the system crash or degradation in availability resulting from separation of the system 200 with its external device 40 suffering the fault from the fault-tolerant system 2 after the normal system 200 is separated.
- the external device 40 is an external device without the area storing the fixed value.
- the configuration of the second example embodiment is also applicable to the fault-tolerant system which employs the external device (for example, a flash memory device including SFDP) including the area storing the fixed value as the external device 40 .
- the present invention is not limited to the first and second example embodiments and may take various example embodiments.
- Although the use of the flash memory device as the external device 40 has been taken as an example in the first and second example embodiments, the external device 40 is not limited to the flash memory device.
- the first and second example embodiments give an example in which the controller 30 sets the system to be separated, based on the numbers of separation and remounting operations as a criterion for determining a system to be separated upon detection of the loss of lockstep.
- the criterion for determining the system (operational system) to be separated from the fault-tolerant system by the controller 30 is not limited to that described in the first and second example embodiments.
- the separation unit 54 is configured to separate the processor system 10 by causing the CPU state machine 104 to make a transition.
- the processing for separating the processor system 10 by the separation unit 54 and the configuration of the separation unit 54 for separating the processor system 10 are not limited to those described in the first and second example embodiments.
- the hardware configurations described with reference to FIGS. 2 and 5 are merely examples and the present invention is not limited to these examples.
- the monitoring devices 50 and 60 in the first and second example embodiments need not always be physically independent devices (processors).
- each of the monitoring devices 50 and 60 may be implemented as a part of an integrated circuit included in the hardware components constituting the processor system 10 .
- Each of the fault-tolerant systems 1 and 2 in the first and second example embodiments is a dual system including two systems 100 or 200 .
- the fault-tolerant system to which the present invention is applied may be a triple or higher-order multiple system including three or more systems.
- FIG. 8 is a block diagram illustrating the simplified configuration of a monitoring device in other example embodiments according to the present invention.
- a monitoring device 70 illustrated in FIG. 8 is mounted in, for example, a fault-tolerant system 3 in other example embodiments according to the present invention illustrated in FIG. 9 .
- the fault-tolerant system 3 includes a plurality of operational systems 300 .
- the plurality of operational systems 300 have the same configuration including a processor system 80 .
- an accessory device 85 is connected to the processor system 80 .
- the accessory device 85 includes a memory 86 .
- a controller 90 includes a function of detecting an abnormality of the processor system 80 of the operational system 300 of self-system, based on data output from the processor system 80 of the operational system 300 of self-system and data input from the operational system 300 of different-system.
- the controller 90 further includes a function of separating the processor system 80 detected to suffer the abnormality from the fault-tolerant system 3 when the abnormality of the processor system 80 is detected.
- the monitoring device 70 includes a processor 71 .
- the processor 71 includes a function of reading data from a predetermined storage area in the memory 86 of the accessory device 85 to be monitored, connected to the processor system 80 of the operational system 300 of self-system.
- the processor 71 further includes a function of comparing the read-data with reference data held in advance to determine whether the read-data and the reference data are different from each other.
- the processor 71 further includes a function of separating the processor system 80 connected to the accessory device 85 to be monitored from the fault-tolerant system 3 when the read-data and the reference data are different from each other.
- The monitoring device 70 illustrated in FIG. 8 and the fault-tolerant system 3 including the monitoring device 70 can prevent the system crash or degradation in availability resulting from the fault of the accessory device 85 , as in the first and second example embodiments.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Automation & Control Theory (AREA)
- Hardware Redundancy (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
- This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2016-028976, filed on Feb. 18, 2016, the disclosure of which is incorporated herein in its entirety by reference.
- The present invention relates to a lockstep fault-tolerant system.
- A fault-tolerant system is known as a technique for enabling continuation of service processed by a computer in operation by masking a hardware fault even when the fault occurs in the computer. A fault-tolerant system which uses the lockstep scheme is available as an exemplary fault-tolerant system. In the lockstep scheme, hardware components of the computer serve as multiple-system components. The respective systems including identical hardware components perform the same operation in synchronism at the same clock frequency. Performing the same operation in synchronism at the same clock frequency will also be referred to as a lockstep operation hereinafter. The status in which the same operation is performed in synchronism at the same clock frequency will also be referred to as a lockstep status hereinafter. The status in which the lockstep status fails to be maintained due, for example, to a fault will also be referred to as loss of lockstep hereinafter. In the lockstep scheme, even when one of a plurality of systems suffers a fault and causes loss of lockstep, the processing can be continued by the operations of the remaining normal systems.
- An exemplary fault-tolerant system which uses such a lockstep scheme is disclosed in reference 1 (Japanese Unexamined Patent Application Publication No. 2009-205630).
- The fault-tolerant system disclosed in reference 1 includes a plurality of systems including identical hardware components. Each system includes a processor system including a CPU (Central Processing Unit), an I/O system including I/O (input/output) devices such as a storage device and a network device, and a controller. The processor system of each system performs a lockstep operation. The I/O system of each system is configured to maintain sufficient redundancy between the individual I/O systems by mirroring processing which uses the CPU of the processor system.
- The controller determines whether an inconsistency has occurred in operation between the processor systems. The controller, for example, compares data to be transferred from the self-system processor system to the self-system I/O system with data to be transferred from the different-system processor system to the self-system I/O system. When an inconsistency occurs in these data, the controller separates a processor system determined in accordance with a predefined method from the fault-tolerant system.
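The comparison performed by the controller can be sketched as below. This is a hedged illustration in Python, not the hardware of reference 1: the function name `first_inconsistency` and the representation of each transfer as a pair of byte strings (data from the self-system processor system, data from the different-system processor system) are assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple


def first_inconsistency(transfers: Iterable[Tuple[bytes, bytes]]) -> Optional[int]:
    """Return the index of the first transfer whose two copies differ, else None.

    Each item pairs the data sent by the self-system processor system with the
    data sent by the different-system processor system to the same I/O system.
    """
    for index, (self_data, other_data) in enumerate(transfers):
        if self_data != other_data:
            return index   # inconsistency detected: lockstep has been lost
    return None            # all transfers matched


stream = [(b"\x01", b"\x01"), (b"\x02", b"\x03"), (b"\x04", b"\x04")]
lost_at = first_inconsistency(stream)
```

Which processor system is then separated is decided by a separate, predefined method that this sketch deliberately leaves out.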
- An inconsistency may occur in the data when, for example, data flowing from the CPU is partially garbled or the data timing is off. The inconsistency may also occur when an abnormality occurs within the processor system performing the lockstep operation. A fault may also be determined to have occurred temporarily when, for example, memory contents are garbled by external electrical noise, cosmic rays, or other types of radiation. In this case, the processor system detected to have the fault is separated from the fault-tolerant system. Various methods have been proposed to separate such a processor system. For example, a method is available for calculating levels of priority based on the MTBF (Mean Time Between Failures) or the frequency of occurrence of faults of each processor system and determining the processor system to be separated based on the calculated levels of priority.
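A priority calculation of the kind mentioned above might look like the following sketch. The text does not specify the formula or the policy, so everything here is an assumption for illustration: MTBF is taken as operating time divided by the number of recorded failures, and the system with the lowest MTBF is treated as the one to separate. The names `mtbf` and `system_to_separate` are hypothetical.

```python
from typing import Dict, Tuple


def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures; infinite when no failure has been recorded."""
    return float("inf") if failures == 0 else operating_hours / failures


def system_to_separate(records: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the system with the lowest MTBF (assumed policy)."""
    return min(records, key=lambda name: mtbf(*records[name]))


# Hypothetical records: (operating hours, number of recorded failures).
records = {"system_a": (1000.0, 4), "system_b": (1000.0, 1)}
candidate = system_to_separate(records)
```

With these figures `system_a` has an MTBF of 250 hours against 1000 hours for `system_b`, so it is the candidate under the assumed policy.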
- In this manner, with the lockstep fault-tolerant system, even when a processor system which may suffer the fault is separated, the processor systems of the remaining systems continue the processing. Then, when the separated processor system is determined to be normal or the like and is therefore mounted in the fault-tolerant system again, the processor system performs the lockstep operation again.
- It is the main object of the present invention to provide a technique to prevent a system crash or degradation in availability in a fault-tolerant system.
- A monitoring device of the present invention includes a processor executing instructions to:
- read data from a predetermined storage area in a memory, the memory being provided in an accessory device to be monitored, the accessory device connecting with a processor system of a fault-tolerant system including a plurality of operational systems, each operational system having an identical configuration including the processor system;
- compare read-data which is read from the storage area with reference data held in advance; and
- separate the processor system connected with the accessory device to be monitored from the fault-tolerant system when the read-data is different from the reference data.
- A fault-tolerant system of the present invention includes:
- a plurality of operational systems that have an identical configuration including a processor system and perform an identical operation,
- each operational system including:
- an accessory device connected with the processor system;
- a monitoring device that monitors the accessory device; and
- a controller that separates the processor system detected to suffer an abnormality from the fault-tolerant system when the abnormality of the processor system of the operational system is detected based on data output from the processor system of the operational system and data input from a different operational system,
- the monitoring device including a processor executing instructions to:
- read data from a predetermined storage area in a memory, the memory being provided in the accessory device to be monitored;
- compare read-data which is read from the storage area with reference data held in advance; and
- separate the processor system connected with the accessory device to be monitored from the fault-tolerant system when the read-data is different from the reference data.
- A control method of the present invention includes:
- reading data from a predetermined storage area in a memory, the memory being provided in an accessory device to be monitored, the accessory device connecting with a processor system of a fault-tolerant system including a plurality of operational systems, each operational system having an identical configuration including the processor system;
- comparing read-data which is read from the storage area with reference data held in advance; and
- separating the processor system connected with the accessory device to be monitored from the fault-tolerant system when the read-data is different from the reference data.
- Exemplary features and advantages of the present invention will become apparent from the following detailed description when taken with the accompanying drawings in which:
- FIG. 1 is a block diagram illustrating a configuration of a fault-tolerant system in a first example embodiment according to the present invention;
- FIG. 2 is a block diagram illustrating exemplary hardware components constituting the fault-tolerant system in the first example embodiment;
- FIG. 3 is a flowchart for explaining an exemplary operation for monitoring an external device in the first example embodiment;
- FIG. 4 is a block diagram illustrating a configuration of a fault-tolerant system in a second example embodiment according to the present invention;
- FIG. 5 is a block diagram illustrating exemplary hardware components constituting the fault-tolerant system in the second example embodiment;
- FIG. 6 is a flowchart for explaining an operation to update an address storage unit and a data storage unit in the fault-tolerant system of the second example embodiment;
- FIG. 7 is a flowchart for explaining an operation to monitor an external device in the second example embodiment;
- FIG. 8 is a block diagram illustrating a simplified configuration of a monitoring device in other example embodiments according to the present invention; and
- FIG. 9 is a block diagram illustrating a simplified configuration of a fault-tolerant system in other example embodiments according to the present invention.
- Example embodiments according to the present invention will be described below with reference to the drawings.
- FIG. 1 is a block diagram illustrating a configuration of a fault-tolerant system in a first example embodiment according to the present invention. Referring to FIG. 1, a fault-tolerant system 1 includes a plurality of systems (operational systems) 100. Although two systems 100 are illustrated in FIG. 1, the number of systems 100 included in the fault-tolerant system 1 is not limited.
- Each system 100 includes identical hardware components. In other words, each system 100 includes a processor system 10, an I/O system 20, a controller 30, an external device (accessory device) 40, and a monitoring device 50. Although only one module is illustrated for each type of component constituting each system 100 in FIG. 1, the number of components of each type included in each system 100 is not limited.
- The processor system 10 performs the lockstep operation in cooperation with the processor systems 10 of the different systems 100. More specifically, the processor system 10 includes, as hardware components, a CPU (Central Processing Unit) 101, a memory 102, a device interface 103, and a CPU state machine 104, as illustrated in FIG. 2. A self-system means the system 100 including itself or a component included in the system 100 including itself. A different-system means the system 100 which does not include itself or a component included in the system 100 not including itself.
- The CPU 101 performs the same operation in synchronism at the same clock frequency as that of the CPU 101 of the processor system 10 of different-system. The memory 102 functions as a main storage device and is kept in the same storage status as that of the memory 102 of the processor system 10 of different-system by the control operation of the CPU 101.
- The processor system 10 can access the I/O system 20 of self-system via the controller 30. The processor system 10 can also access the I/O system 20 of different-system via the controllers 30 of self-system and different-system. The processor system 10 includes a function of transferring data to the I/O systems 20 of self-system and different-system.
- The processor system 10 further includes a function of accessing the storage area of the external device 40. More specifically, the device interface 103 of the processor system 10 includes a function of writing data into the external device 40 or reading data from the external device 40 in accordance with a command from the CPU 101. The device interface 103 further includes a function of reading data from the external device 40 in accordance with a request from the monitoring device 50.
- The CPU state machine 104 at least stores information representing whether the processor system 10 of self-system has been mounted in the fault-tolerant system 1 (also called an online status) or separated from the fault-tolerant system 1 (also called a broken status).
- The I/O system 20 includes at least one I/O (Input/Output) device. The I/O system 20 is configured to maintain sufficient redundancy between itself and the I/O systems 20 of different-system by a mirroring process implemented by software executed on the processor system 10.
- The controller 30 is connected with the processor system 10 and the I/O system 20. The controllers 30 of the respective systems 100 are communicably connected to each other by cross-links. The controller 30 includes a function of monitoring whether the processor system 10 is in the lockstep status and determining whether the processor system 10 needs to be separated from the fault-tolerant system 1 in accordance with the monitoring result.
- In other words, the controller 30 compares data flowing from the processor system 10 of self-system to the I/O system 20 of self-system with data flowing from the processor system 10 of different-system to the I/O system 20 of self-system. If a result of the comparison indicates a difference (in the case of the loss of lockstep), the controller 30 determines whether the processor system 10 of self-system needs to be separated from the fault-tolerant system 1. More specifically, the controller 30 determines that separation is necessary when it determines that the processor system 10 of self-system is more likely to suffer a fault than the different system. For example, the controller 30 may determine whether the processor system 10 of self-system is more likely to suffer a fault than the different system, based on the numbers of past separation operations and recombining operations recorded for each processor system 10. The controller 30 includes a function of separating the processor system 10 of self-system from the fault-tolerant system 1 when it determines that the processor system 10 of self-system is more likely to suffer a fault than the different system.
- The external device 40 includes a storage function. The external device 40 is implemented as, for example, a flash memory device. The external device 40 is connected to the processor system 10.
- The monitoring device 50 includes a function of monitoring whether the external device 40 suffers a fault. The monitoring device 50 includes a read unit 51, a comparison unit 52, a data storage unit 53, and a separation unit 54 as functional units, as illustrated in FIG. 1.
- The monitoring device 50 is implemented in a hardware configuration including a timer 501, a read generation circuit 502, a register 503, a comparison circuit 504, and a control signal output circuit 505, as illustrated in FIG. 2. The timer 501, the read generation circuit 502, the register 503, the comparison circuit 504, and the control signal output circuit 505 are formed in, for example, a processor 510. FIG. 2 merely illustrates an example and the hardware components included in the system 100 are not limited to these examples.
- The read unit 51 of the monitoring device 50 includes a function of reading data from a predetermined storage area in the external device 40 at each predetermined timing. For example, the read unit 51 is implemented by the timer 501 and the read generation circuit 502 illustrated in FIG. 2 and controls the device interface 103 of the processor system 10 to implement its function. In other words, the timer 501 outputs a signal for determining a predetermined timing. The read generation circuit 502 outputs a read command for reading data from the predetermined storage area in the external device 40 to the device interface 103 at a timing based on the signal output from the timer 501. As a specific example, assume the external device 40 is implemented as a flash memory device. In general, a flash memory device stores SFDP (Serial Flash Discoverable Parameter). SFDP is represented by a 32-bit fixed value defined by JEDEC (Joint Electron Device Engineering Council) and is independent of a vendor. In this case, the read unit 51 outputs the read command for the storage area of SFDP via the device interface 103.
- Data stored in the predetermined storage area of the external device 40 may be the fixed value which is not updated, as described above, or data updated by, for example, the processor system 10. In response to the read command from the read unit 51 (monitoring device 50), the device interface 103 reads data from the predetermined storage area in the external device 40 in accordance with the read command and transmits (sends back) the read data to the monitoring device 50.
- The data storage unit 53 is implemented by the register 503 illustrated in FIG. 2. The data storage unit 53 stores reference data. The reference data means data to be compared with data read from the external device 40 by the read unit 51. For example, in the external device 40, when the fixed value is stored in the storage area from which data is read by the read unit 51, the fixed value is stored in the data storage unit 53 in advance. Assume, for example, that the external device 40 is implemented as a flash memory device and the SFDP area is defined as the storage area from which data is read by the read unit 51, as described earlier. In this case, the data storage unit 53 stores the value of SFDP.
- The comparison unit 52 includes a function of comparing data (read-data) read from the external device 40 by the read unit 51 with reference data stored in the data storage unit 53. More specifically, the comparison unit 52 is implemented by the comparison circuit 504 illustrated in FIG. 2. The data (read-data) read from the external device 40 by the device interface 103 of the processor system 10 in accordance with the read command issued by the read unit 51 is input to the comparison circuit 504. The reference data in the register 503 (data storage unit 53) is further input to the comparison circuit 504. The comparison circuit 504 compares the read-data with the reference data and outputs a result of the comparison to the separation unit 54.
- The separation unit 54 includes a function of separating the processor system 10 determined in accordance with predetermined separation conditions from the fault-tolerant system 1 when the comparison result obtained by the comparison unit 52 indicates a difference. More specifically, the separation unit 54 is implemented by the control signal output circuit 505 illustrated in FIG. 2 and controls the CPU state machine 104 of the processor system 10 to implement its function. The control signal output circuit 505 outputs a control signal to make a transition to a broken status to the CPU state machine 104 of the processor system 10, in response to a signal input from the comparison circuit 504 and indicating the difference. The control signal output circuit 505 outputs an OFF signal, a reset signal, or the like required in the separation process to each hardware component constituting the processor system 10.
- The operation of the fault-tolerant system 1 configured as described above will be described below with reference to the drawings.
- When the fault-tolerant system 1 is started, the processor system 10 of each system 100 starts the lockstep operation. During the lockstep operation, the operation for monitoring the lockstep status by the controller 30 and the operation for monitoring the external device 40 by the monitoring device 50 are performed.
- FIG. 3 is a flowchart illustrating an exemplary operation for monitoring the external device 40 by the monitoring device 50.
- Referring to FIG. 3, the read unit 51 first waits until a predetermined timing (step S1).
- The read unit 51 outputs the read command for reading data from the predetermined storage area in the external device 40 when the predetermined timing comes (step S2).
- The comparison unit 52 determines whether the read-data read from the external device 40 in accordance with the read command issued by the read unit 51 and the reference data in the data storage unit 53 are equal to each other (step S3).
- When the read-data and the reference data are equal to each other, the monitoring device 50 stands by to output the next read command. When the read-data and the reference data are not equal to each other, the separation unit 54 separates the processor system 10 of self-system from the fault-tolerant system 1 (step S4).
- With this operation, the monitoring device 50 ends the operation for monitoring the external device 40. Subsequently, the fault-tolerant system 1 continues the processing using the processor system 10 of the unseparated system 100. When only one processor system 10 continues the processing, it operates without the operation (for example, the operation of the monitoring device 50) associated with the lockstep operation.
- A specific example of the operation of the fault-tolerant system 1 will be described below.
- Assume herein that the fault-tolerant system 1 includes two systems 100. For the sake of a better understanding, the two systems 100 are distinguished as systems 100 a and 100 b. In each of the systems 100 a and 100 b, a flash memory device is connected to the processor system 10 as the external device 40. The external device 40 (flash memory device) stores the BIOS (Basic Input Output System) code. In addition, the external device 40 stores SFDP and the data storage unit 53 of the monitoring device 50 stores the value of SFDP. The external device 40 includes no function of detecting and notifying a fault of its own. The frequency of access to the external device 40 by the CPU 101 is lower than that to the memory 102 by the CPU 101. The frequency of access to the external device 40 by the CPU 101 is as low as, for example, the frequency of reading the BIOS code from the external device 40 by the CPU 101 at the start or restart of the system 100. When the lockstep status of the systems 100 a and 100 b is lost, the processor system 10 of one of the systems 100 a and 100 b that has been separated and recombined more times in the past is separated from the fault-tolerant system 1 by the controller 30.
- Under such conditions, in the fault-tolerant system 1, the external device 40 (flash memory device) of the system 100 a is assumed to suffer a fault while the processor system 10 of each of the systems 100 a and 100 b normally performs the lockstep operation.
- In the system 100 a, the read-data read from the SFDP area in the external device 40 and the value of SFDP that is the reference data in the data storage unit 53 become different from each other.
- The processor system 10 of the system 100 a is thus separated from the fault-tolerant system 1 by the operation of the separation unit 54 of the monitoring device 50.
- Subsequently, the processor system 10 of the system 100 b continues the processing in the fault-tolerant system 1.
- In this status, since the processor system 10 of the system 100 b performs no lockstep operation, no operation associated with the lockstep operation in the system 100 b is performed. In other words, even when the processor system 10 of the system 100 b causes loss of lockstep due to the fault of the external device 40 of the system 100 a, the controller 30 of the system 100 b does not detect loss of lockstep. Therefore, the processor system 10 of the system 100 b with its external device 40 suffering no fault is prevented from being separated from the fault-tolerant system 1 due to determination of loss of lockstep.
- The operation of a fault-tolerant system equipped with systems 100 a and 100 b each including no monitoring device 50 will be described herein as a comparative example with respect to the fault-tolerant system 1 in the first example embodiment.
- In this case, even when the external device 40 (flash memory device) accessed at a relatively low frequency in the system 100 a suffers a fault, an error resulting from the fault is likely to remain undetected until the point in time at which the CPU 101 reads the BIOS at the restart of the system 100 a. In the processor system 10 of each of the systems 100 a and 100 b, the loss of lockstep resulting from the fault of the external device 40 is detected by the controller 30. The processor system 10 of the system 100 b with its external device 40 suffering no fault may be separated, depending on, for example, the numbers of past separation operations and recombining operations. In this case, at the restart of the processor system 10 of the system 100 a that continues the processing, the processor system 10 accesses the external device 40 suffering the fault to read the BIOS code. The processor system 10 of the system 100 a detects the error resulting from the fault of the external device 40 and separates itself from the fault-tolerant system 1. As a result, the processor systems 10 of both the systems 100 a and 100 b are separated from the fault-tolerant system 1, resulting in the system crash.
- In the first example embodiment, each of the systems 100 a and 100 b includes the
monitoring device 50. Theprocessor system 10 of the system 100 a connected to theexternal device 40 suffering the fault is separated from the fault-tolerant system 1 by themonitoring device 50 before the loss of lockstep is detected by thecontroller 30. Therefore, the fault-tolerant system 1 can avoid the system crash resulting from a fault of theexternal device 40. - The fault-
tolerant system 1 in the first example embodiment can more reliably prevent the system crash or degradation in availability resulting from the fault of theexternal device 40 connected to theprocessor system 10 that performs a lockstep operation. - The reason will be given below. In the first example embodiment, the
monitoring device 50 which detects the abnormality of theexternal device 40 by monitoring the operation of theexternal device 40 is provided. The fault-tolerant system 1 in the first example embodiment can quickly detect the fault of theexternal device 40 and quickly separate thesystem 100 with itsexternal device 40 suffering the fault from the fault-tolerant system 1. The fault-tolerant system 1 in the first example embodiment can reduce the possibility that theprocessor system 10 connected to theexternal device 40 suffering no fault will be separated from the fault-tolerant system 1 due to the fault of theexternal device 40. Therefore, the fault-tolerant system 1 in the first example embodiment can prevent the system crash or degradation in availability resulting from the fault of theexternal device 40. - A second example embodiment according to the present invention will be described below. In the description of the second example embodiment, the same reference numerals denote the same components as in the first example embodiment, and a repetitive description thereof will not be given.
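Before turning to the second example embodiment, the periodic check performed by the monitoring device 50 of the first example embodiment (read the fixed-value area, compare the read-data with the reference data, and separate the processor system on a mismatch) can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the hardware described in this document: the class name `FixedValueMonitor`, the method `on_timer`, the callables, and the placeholder value `b"SFDP"` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FixedValueMonitor:
    """Hypothetical software model of monitoring device 50."""

    read_fixed_area: Callable[[], bytes]  # stands in for the read command to external device 40
    reference: bytes                      # stands in for the value held in data storage unit 53
    separate: Callable[[], None]          # stands in for separation unit 54
    active: bool = True                   # False once monitoring has ended

    def on_timer(self) -> bool:
        """Run one check at the predetermined timing.

        Returns True if this check separated the processor system.
        """
        if not self.active:
            return False
        read_data = self.read_fixed_area()
        if read_data == self.reference:
            return False      # equal: stand by until the next timing
        self.separate()       # different: separate the processor system
        self.active = False   # monitoring ends after separation
        return True


# Usage with a simulated flash area (placeholder contents, not real SFDP data).
flash_area = bytearray(b"SFDP")
log = []
monitor = FixedValueMonitor(
    read_fixed_area=lambda: bytes(flash_area),
    reference=b"SFDP",
    separate=lambda: log.append("separated"),
)
first = monitor.on_timer()   # area still matches the reference
flash_area[0] ^= 0xFF        # simulate a fault corrupting the area
second = monitor.on_timer()  # mismatch is detected on the next check
```

The model keeps the comparison independent of how often the CPU itself accesses the device, which is the point of the scheme: a rarely accessed device is still checked at every timer tick.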
- The second example embodiment exemplifies the case where an external device without an area that stores a fixed value, such as the SFDP of a flash memory device, is employed as the external device 40.
- FIG. 4 is a block diagram illustrating a configuration of a fault-tolerant system 2 in the second example embodiment. Referring to FIG. 4, the fault-tolerant system 2 includes a plurality of systems 200. Although two systems 200 are illustrated in FIG. 4, the number of systems 200 included in the fault-tolerant system 2 is not limited.
- Each system 200 includes identical hardware components. Each system 200 includes a monitoring device 60 in place of the monitoring device 50 in the first example embodiment. The monitoring device 60 includes the comparison unit 52, the separation unit 54, a read unit 61, a data storage unit 63, a data update unit 65, and an address storage unit 66.
- Exemplary hardware components included in the system 200 are illustrated in FIG. 5. Referring to FIG. 5, the monitoring device 60 includes the timer 501, the read generation circuit 502, the register 503, the comparison circuit 504, the control signal output circuit 505, an access monitoring circuit 606, and a register 607. These components are built into, for example, a processor 610. FIG. 5 merely illustrates an example, and the hardware components included in the system 200 are not limited to these examples.
- The address storage unit 66 of the monitoring device 60 is implemented by the register 607 illustrated in FIG. 5. The data update unit 65 includes a function of storing, in the address storage unit 66, the address of the access destination at which the processor system 10 accesses the external device 40 at a predetermined point in time. For example, the data update unit 65 is implemented by the access monitoring circuit 606 illustrated in FIG. 5 and controls the device interface 103 of the processor system 10 to implement its function. The predetermined point in time herein means, for example, the point in time at which the system 200 accesses the external device 40 for the first time after the start of the system 200.
- The data update unit 65 includes a function of storing, in the data storage unit 63, data identical to that stored in the storage area of the external device 40 accessed by the processor system 10 at the predetermined point in time described above. When, for example, the processor system 10 accesses the external device 40 at the predetermined point in time to read data, the data update unit 65 stores the read data in the data storage unit 63. When the processor system 10 accesses the external device 40 at the predetermined point in time to write data, the data update unit 65 stores the data written into the external device 40 in the data storage unit 63.
- The data update unit 65 further includes a function of, every time the data in the storage area of the external device 40 corresponding to the address stored in the address storage unit 66 is updated, updating the data in the data storage unit 63 to the updated data written to the external device 40. The access monitoring circuit 606 can detect that the data in the storage area of the external device 40 has been updated. In other words, the access monitoring circuit 606 can detect the update of the data in the external device 40 by detecting the write command input to the external device 40 and the data to be written into it.
- The read unit 61 includes a function of reading data from the storage area of the external device 40 corresponding to the address stored in the address storage unit 66, at each predetermined timing.
- Configurations of the fault-tolerant system 2 in the second example embodiment other than those described above are the same as those of the fault-tolerant system 1 in the first example embodiment. The operation of the fault-tolerant system 2 in the second example embodiment will be described below with reference to the drawings.
- When the fault-tolerant system 2 is started, the processor system 10 of each system 200 starts the lockstep operation, as in the fault-tolerant system 1 in the first example embodiment. During the lockstep operation, the operation for monitoring the lockstep status by the controller 30 and the operation for monitoring the external device 40 by the monitoring device 60 are performed.
- FIG. 6 is a flowchart illustrating an exemplary data updating operation by the data update unit 65.
- In the data updating operation illustrated in FIG. 6, first, the data update unit 65 determines whether the processor system 10 has accessed the external device 40 at the predetermined point in time (step S11). When the data update unit 65 detects that the processor system 10 has accessed the external device 40, it stores the address of the access destination in the address storage unit 66 (step S12).
- The data update unit 65 stores, in the data storage unit 63, the data stored in the storage area at the access destination at which the processor system 10 accesses the external device 40 (step S13).
- In doing this, when the processor system 10 reads data from the external device 40, the data update unit 65 stores the read-data in the data storage unit 63. When the processor system 10 writes data into the external device 40, the data update unit 65 stores the data written into the external device 40 in the data storage unit 63.
- The data update unit 65 determines whether the write command for writing data into the storage area of the external device 40 corresponding to the address stored in the address storage unit 66 has been output (step S14). Upon detection of the write command, the data update unit 65 updates the data in the data storage unit 63 to the data to be written into the external device 40 in accordance with the write command (step S15).
- The data update unit 65 repeats the operations in step S14 and the subsequent step.
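The data updating operation in steps S11 to S15 above can be illustrated with a small software model. The sketch below is hypothetical and is not the access monitoring circuit itself: the class name `ReferenceTracker`, the method `on_access`, and the representation of an access as a (kind, address, data) triple are assumptions made only for illustration, and the predetermined point in time is assumed to be the first access after start.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceTracker:
    """Hypothetical model of data update unit 65 and its two storage units."""

    address: Optional[int] = None      # stands in for address storage unit 66
    reference: Optional[bytes] = None  # stands in for data storage unit 63

    def on_access(self, kind: str, address: int, data: bytes) -> None:
        """Observe one access by the processor system to the external device.

        kind is "read" or "write"; data is the data read or the data to be written.
        """
        if self.address is None:
            # Steps S11 to S13: on the first access after start, latch the
            # access destination and the data read from or written to it.
            self.address = address
            self.reference = data
        elif kind == "write" and address == self.address:
            # Steps S14 and S15: a write to the monitored area updates the
            # reference data so that later comparisons stay valid.
            self.reference = data


# Usage: the first access fixes the monitored address; only writes to that
# address change the reference data afterwards.
tracker = ReferenceTracker()
tracker.on_access("read", 0x10, b"\xaa")   # latches address 0x10 and data 0xAA
tracker.on_access("write", 0x20, b"\x01")  # other address: ignored
tracker.on_access("write", 0x10, b"\xbb")  # monitored address: reference follows
```

Tracking writes in this way is what allows a comparison against reference data even when the device has no fixed-value area: the reference changes only when the processor system itself changes the monitored data.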
- FIG. 7 is a flowchart illustrating an exemplary operation for monitoring the external device 40 by the monitoring device 60 in the second example embodiment.
- In the second example embodiment, when the predetermined timing is detected to have come (step S1), the read unit 61 outputs the read command for reading data from the storage area of the external device 40 corresponding to the address stored in the address storage unit 66 (step S22).
- As in the operation for monitoring the external device 40 by the monitoring device 50 in the first example embodiment, the comparison unit 52 of the monitoring device 60 determines whether the read-data read from the external device 40 in accordance with the read command issued by the read unit 61 and the reference data in the data storage unit 63 are equal to each other (step S3).
- When the read-data and the reference data are equal to each other, the monitoring device 60 stands by to output the next read command. When the read-data and the reference data are not equal to each other, the separation unit 54 separates the processor system 10 of the self-system from the fault-tolerant system 2 (step S4). The monitoring device 60 thus ends its operation for monitoring the external device 40.
- Subsequently, the fault-tolerant system 2 continues the processing using the processor systems 10 of the unseparated systems 200. When only one processor system 10 continues the processing, it operates without the operations associated with the lockstep operation (for example, the operation of the monitoring device 60).
- The fault-tolerant system 2 in the second example embodiment can more reliably prevent the system crash or degradation in availability resulting from the fault of the external device 40 even when a device without a storage area for a fixed value is connected as the external device 40.
- The reason will be given below. The monitoring device 60 in the second example embodiment includes the data update unit 65, in addition to the configuration of the monitoring device 50 in the first example embodiment. The data update unit 65 includes the function of storing, in the address storage unit 66, the address of the access destination at which the processor system 10 accesses the external device 40 at the predetermined point in time. The data update unit 65 further includes the function of storing, in the data storage unit 63, the data in the storage area of the external device 40 corresponding to the address of the access destination as reference data. Every time the data in the storage area of the external device 40 indicated by the address stored in the address storage unit 66 is updated, the data update unit 65 updates the data in the data storage unit 63 to the updated data.
- In this manner, in the second example embodiment, every time the data in the storage area of the external device 40 from which the comparison unit 52 reads data at each predetermined timing is updated, the reference data in the data storage unit 63 used by the comparison unit 52 is updated accordingly. The fault-tolerant system 2 in the second example embodiment can obtain the same effect as in the first example embodiment even when an external device 40 such as a flash memory device predating the SFDP definition or a flash memory device without a storage area for a fixed value such as SFDP is mounted in it. In other words, the fault-tolerant system 2 in the second example embodiment can quickly detect the fault of the external device 40 and quickly separate the system 200 whose external device 40 suffers the fault from the fault-tolerant system 2.
- The fault-tolerant system 2 in the second example embodiment can prevent the normal system 200, whose external device 40 suffers no fault, from being separated from the fault-tolerant system 2, as in the first example embodiment. This reduces the likelihood of a system crash or degradation in availability that would result if the system 200 whose external device 40 suffers the fault were separated from the fault-tolerant system 2 after the normal system 200 had been separated.
- In the second example embodiment, the external device 40 is an external device without the area storing the fixed value. However, the configuration of the second example embodiment is also applicable to a fault-tolerant system which employs, as the external device 40, an external device including the area storing the fixed value (for example, a flash memory device including SFDP).
- The present invention is not limited to the first and second example embodiments and may take various example embodiments. For example, although the use of the flash memory device as the external device 40 has been taken as an example in the first and second example embodiments, the external device 40 is not limited to the flash memory device.
- The first and second example embodiments give an example in which the controller 30 selects the system to be separated upon detection of the loss of lockstep, based on the numbers of separation and remounting operations. However, the criterion by which the controller 30 determines the system (operational system) to be separated from the fault-tolerant system is not limited to that described in the first and second example embodiments.
- In the first and second example embodiments, the separation unit 54 is configured to separate the processor system 10 by causing the CPU state machine 104 to make a transition. However, the processing for separating the processor system 10 by the separation unit 54 and the configuration of the separation unit 54 for separating the processor system 10 are not limited to those described in the first and second example embodiments. The hardware configurations described with reference to FIGS. 2 and 5 are merely examples and the present invention is not limited to these examples.
- The
monitoring devices monitoring devices processor system 10. - Each of the fault-
tolerant systems systems -
FIG. 8 is a block diagram illustrating the simplified configuration of a monitoring device in other example embodiments according to the present invention. A monitoring device 70 illustrated in FIG. 8 is mounted in, for example, a fault-tolerant system 3 in other example embodiments according to the present invention illustrated in FIG. 9. The fault-tolerant system 3 includes a plurality of operational systems 300. The plurality of operational systems 300 have the same configuration including a processor system 80. In the operational system 300, an accessory device 85 is connected to the processor system 80. The accessory device 85 includes a memory 86. A controller 90 includes a function of detecting an abnormality of the processor system 80 of the operational system 300 of the self-system, based on data output from the processor system 80 of the operational system 300 of the self-system and data input from the operational system 300 of a different system. The controller 90 further includes a function of separating the processor system 80 detected to suffer the abnormality from the fault-tolerant system 3 when the abnormality of the processor system 80 is detected.
- The monitoring device 70 includes a processor 71. The processor 71 includes a function of reading data from a predetermined storage area in the memory 86 of the accessory device 85 to be monitored, which is connected to the processor system 80 of the operational system 300 of the self-system. The processor 71 further includes a function of comparing the read-data with reference data held in advance to determine whether the read-data and the reference data are different from each other. The processor 71 further includes a function of separating the processor system 80 connected to the accessory device 85 to be monitored from the fault-tolerant system 3 when the read-data and the reference data are different from each other.
- The monitoring device 70 illustrated in FIG. 8 and the fault-tolerant system 3 including the monitoring device 70 can prevent the system crash or degradation in availability resulting from the fault of the accessory device 85, as in the first and second example embodiments.
- The previous description of embodiments is provided to enable a person skilled in the art to make and use the present invention. Moreover, various modifications to these example embodiments will be readily apparent to those skilled in the art, and the generic principles and specific examples defined herein may be applied to other embodiments without the use of inventive faculty. Therefore, the present invention is not intended to be limited to the example embodiments described herein but is to be accorded the widest scope as defined by the limitations of the claims and equivalents.
- Further, it is noted that the inventor's intent is to retain all equivalents of the claimed invention even when the claims are amended during prosecution.
Claims (4)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016028976A JP6083480B1 (en) | 2016-02-18 | 2016-02-18 | Monitoring device, fault tolerant system and method |
JP2016-028976 | 2016-02-18 |
Publications (2)
Publication Number | Publication Date |
---|---|
US20170242760A1 (en) | 2017-08-24 |
US10360115B2 (en) | 2019-07-23 |
Family
ID=58095216
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/426,243 Active 2037-06-29 US10360115B2 (en) | 2016-02-18 | 2017-02-07 | Monitoring device, fault-tolerant system, and control method |
Country Status (2)
Country | Link |
---|---|
US (1) | US10360115B2 (en) |
JP (1) | JP6083480B1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190391888A1 (en) * | 2018-06-21 | 2019-12-26 | Arm Limited | Methods and apparatus for anomaly response |
US10970180B2 (en) * | 2019-03-29 | 2021-04-06 | Nakamoto & Turing Labs Inc | Methods and apparatus for verifying processing results and/or taking corrective actions in response to a detected invalid result |
US11232197B2 (en) * | 2018-11-15 | 2022-01-25 | Hitachi, Ltd. | Computer system and device management method |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106293620B (en) * | 2016-08-09 | 2019-05-14 | 浪潮电子信息产业股份有限公司 | Method for detecting parameters in Flash Rom by intel platform |
Family Cites Families (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4882752A (en) * | 1986-06-25 | 1989-11-21 | Lindman Richard S | Computer security system |
JPH0792766B2 (en) * | 1988-10-25 | 1995-10-09 | 三菱電機株式会社 | Duplication computer system |
JP2731656B2 (en) * | 1992-01-16 | 1998-03-25 | 財団法人鉄道総合技術研究所 | Dual computer |
US6820213B1 (en) * | 2000-04-13 | 2004-11-16 | Stratus Technologies Bermuda, Ltd. | Fault-tolerant computer system with voter delay buffer |
EP1246033A1 (en) * | 2001-08-23 | 2002-10-02 | Siemens Aktiengesellschaft | Method for monitoring consistent memory contents in a redundant system |
EP1249744A1 (en) * | 2001-08-23 | 2002-10-16 | Siemens Aktiengesellschaft | Method and apparatus for providing consistent memory contents in a redundant system |
JP3982353B2 (en) * | 2002-07-12 | 2007-09-26 | 日本電気株式会社 | Fault tolerant computer apparatus, resynchronization method and resynchronization program |
JP4161276B2 (en) * | 2004-12-17 | 2008-10-08 | 日本電気株式会社 | Fault-tolerant computer device and synchronization method thereof |
JP2006178616A (en) * | 2004-12-21 | 2006-07-06 | Nec Corp | Fault tolerant system, controller used thereform, operation method and operation program |
JP2007026010A (en) * | 2005-07-15 | 2007-02-01 | Yaskawa Electric Corp | Radio communication method of safety related signal processing system |
DE102005037246A1 (en) * | 2005-08-08 | 2007-02-15 | Robert Bosch Gmbh | Method and device for controlling a computer system having at least two execution units and a comparison unit |
US7562264B2 (en) * | 2006-09-06 | 2009-07-14 | Intel Corporation | Fault tolerant soft error detection for storage subsystems |
US8301791B2 (en) * | 2007-07-26 | 2012-10-30 | Netapp, Inc. | System and method for non-disruptive check of a mirror |
JP4822024B2 (en) | 2008-02-29 | 2011-11-24 | 日本電気株式会社 | Fault-tolerant server, full backup method, and full backup program |
JP5509637B2 (en) * | 2009-03-18 | 2014-06-04 | 日本電気株式会社 | Fault tolerant system |
US20110208948A1 (en) * | 2010-02-23 | 2011-08-25 | Infineon Technologies Ag | Reading to and writing from peripherals with temporally separated redundant processor execution |
EP2550599B1 (en) * | 2010-03-23 | 2020-05-06 | Continental Teves AG & Co. OHG | Control computer system, method for controlling a control computer system, and use of a control computer system |
WO2011117155A1 (en) * | 2010-03-23 | 2011-09-29 | Continental Teves Ag & Co. Ohg | Redundant two-processor controller and control method |
US8281188B2 (en) * | 2010-08-05 | 2012-10-02 | Miller Gary L | Data processing system with peripheral configuration information error detection |
JP5740644B2 (en) * | 2010-10-08 | 2015-06-24 | 日本電産サンキョー株式会社 | Electronic device apparatus, pairing processing method thereof and pairing monitoring method |
US8479042B1 (en) * | 2010-11-01 | 2013-07-02 | Xilinx, Inc. | Transaction-level lockstep |
US8443230B1 (en) * | 2010-12-15 | 2013-05-14 | Xilinx, Inc. | Methods and systems with transaction-level lockstep |
JP6098778B2 (en) * | 2012-03-29 | 2017-03-22 | 日本電気株式会社 | Redundant system, redundancy method, redundancy system availability improving method, and program |
US20140088338A1 (en) * | 2012-09-26 | 2014-03-27 | Alice Chang | Clothing with magnets systems |
JP6070374B2 (en) * | 2013-03-29 | 2017-02-01 | 富士通株式会社 | Information processing apparatus, memory test program, and memory test method |
US9697094B2 (en) * | 2015-02-06 | 2017-07-04 | Intel Corporation | Dynamically changing lockstep configuration |
US10761925B2 (en) * | 2015-03-24 | 2020-09-01 | Nxp Usa, Inc. | Multi-channel network-on-a-chip |
JP6697360B2 (en) * | 2016-09-20 | 2020-05-20 | キオクシア株式会社 | Memory system and processor system |
-
2016
- 2016-02-18 JP JP2016028976A patent/JP6083480B1/en active Active
-
2017
- 2017-02-07 US US15/426,243 patent/US10360115B2/en active Active
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190391888A1 (en) * | 2018-06-21 | 2019-12-26 | Arm Limited | Methods and apparatus for anomaly response |
US10810094B2 (en) * | 2018-06-21 | 2020-10-20 | Arm Limited | Methods and apparatus for anomaly response |
US11232197B2 (en) * | 2018-11-15 | 2022-01-25 | Hitachi, Ltd. | Computer system and device management method |
US10970180B2 (en) * | 2019-03-29 | 2021-04-06 | Nakamoto & Turing Labs Inc | Methods and apparatus for verifying processing results and/or taking corrective actions in response to a detected invalid result |
Also Published As
Publication number | Publication date |
---|---|
US10360115B2 (en) | 2019-07-23 |
JP2017146833A (en) | 2017-08-24 |
JP6083480B1 (en) | 2017-02-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: NEC CORPORATION, JAPAN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: TANAKA, YUKIHIRO; REEL/FRAME: 041191/0226. Effective date: 20170201 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |