EP2953028A1 - Computer device and control method for computer device - Google Patents
Computer device and control method for computer device
- Publication number
- EP2953028A1 (application EP13873790.3A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- interrupt
- initialization procedure
- cpu
- ras module
- ras
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/24—Handling requests for interconnection or transfer for access to input/output bus using interrupt
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4812—Task transfer initiation or dispatching by interrupt, e.g. masked
Definitions
- the present invention relates to a technique for adding a RAS module which is a program for realizing a malfunction handling function (i.e., a RAS (Reliability, Availability, Serviceability) function) for handling a CPU (Central Processing Unit) exception to a computer apparatus on which an OS (Operating System) is implemented, without modifying the OS module.
- RAS: Reliability, Availability, Serviceability
- CPU: Central Processing Unit
- the present invention also relates to a technique for adding a RAS module to a computer apparatus on which an OS and a hypervisor are implemented, without modifying the OS and hypervisor modules.
- the hypervisor is software that realizes virtualization of the computer apparatus.
- the hypervisor is software located between the OS and hardware of the computer apparatus and emulates the operation of the computer apparatus.
- the hypervisor allows a plurality of OSs to operate simultaneously on a single computer apparatus, and acts as an intermediary for communication and sharing of resources among the plurality of OSs.
- a CPU exception is an exception in which the CPU is unable to continue normal processing (for example, division by zero, etc.).
- an interrupt other than a CPU exception will be referred to as a “regular interrupt” to be distinguished from a CPU exception.
- a RAS function for handling a CPU exception is realized by a method such as adding a process for handling a CPU exception to each OS or the hypervisor.
- Patent Literature 1 discloses a configuration in which a VM monitor (corresponding to a hypervisor in the present Specification) is provided with means to extract failure information of a process being executed in a virtual computer which experienced a main system failure (corresponding to a CPU exception in the present Specification) from a failure information storage area.
- Patent Literature 2 discloses a technique for resolving an exception in a virtual computer on which a hypervisor is employed to operate a plurality of OSs.
- Patent Literature 2 discloses the technique for resolving an exception by copying to the hypervisor a memory image of the portion of a process being executed by an OS when the exception occurred and emulating a privileged instruction included in the process being executed by the OS when the exception occurred.
- each OS (there may be one OS or a plurality of OSs)
- a hypervisor (the computer system may be configured without a hypervisor)
- the RAS function is realized by providing a function to handle a CPU exception in the hypervisor in advance.
- the present invention mainly aims to solve the above-described problem.
- the present invention mainly aims to add a RAS module to a computer apparatus and to realize a RAS function appropriately without modifying an OS.
- a computer apparatus includes a CPU (Central Processing Unit) including an interrupt detection unit that detects an interrupt; and an OS (Operating System) including an interrupt determining part that is called by the interrupt detection unit when the interrupt detection unit has detected an interrupt, and determines whether or not the interrupt detected by the interrupt detection unit is a CPU exception, wherein when a RAS (Reliability Availability Serviceability) module which is a program for carrying out a process to handle a CPU exception is added to the computer apparatus, at start-up of the computer apparatus the CPU calls a first initialization procedure included in the RAS module, and executes the first initialization procedure to initialize a resource to be used by the RAS module; after execution of the first initialization procedure of the RAS module, the CPU calls an initialization procedure included in the OS, and executes the initialization procedure to initialize a resource to be used by the OS; and after execution of the initialization procedure of the OS, the CPU calls a second initialization procedure included in the RAS module, and executes the second initialization procedure to copy the
- a RAS module can be added to a computer apparatus without modifying an OS which is implemented on the computer apparatus, and when an interrupt detection unit of a CPU has detected an interrupt, the RAS module is called appropriately and a RAS function is realized.
- the first to third embodiments hereinafter describe a computer apparatus in which one OS or a plurality of OSs and a hypervisor may operate.
- a RAS module for handling a CPU exception can be added without modifying the OS or hypervisor module, and upon occurrence of a CPU exception the RAS module is called appropriately and the RAS module carries out a process to handle the CPU exception.
- the first to third embodiments also describe the computer apparatus in which a RAS function is executed even if a failure occurs in the OS or the hypervisor concurrently with an occurrence of a CPU exception.
- the first to third embodiments describe the computer apparatus that solves the above-described problem.
- the first to third embodiments also describe the RAS module that can obtain failure information of the pertinent OS (or hypervisor) by determining the OS (or hypervisor) that was operating when a CPU exception occurred.
- the RAS module is also described that can call an interrupt processing part properly upon occurrence of a regular interrupt, instead of a CPU exception, by referring to an interrupt determining part of the OS so as to call a corresponding interrupt processing part, even in a case where interrupt registration details have been changed in the OS (including a case where interrupt registration details have been changed dynamically after start-up of the computer apparatus and before occurrence of the interrupt).
- Fig. 1 is a block diagram illustrating an example configuration of a computer apparatus (100) according to the first embodiment.
- the computer apparatus (100) is configured with hardware and software.
- the computer apparatus (100) includes a CPU (101), a memory (103), and a secondary storage device (104) as hardware.
- the configuration may include one CPU (101) or a plurality of CPUs (101) (multiple cores, multiple CPUs, multiple processors, etc.).
- the CPU (101) includes an interrupt detection unit (102).
- the interrupt detection unit (102) detects an interrupt (a CPU exception and a regular interrupt).
- the memory (103) is a RAM (Random Access Memory).
- the secondary storage device (104) is, for example, a ROM (Read Only Memory), an HDD (Hard Disk Drive), or an SSD (Solid State Drive).
- the software to be described later is stored in the secondary storage device (104), and is loaded from the secondary storage device (104) into the memory (103) for execution.
- the software is sequentially read from the memory (103) into the CPU (101) and is then executed.
- the computer apparatus (100) is provided with various devices including an input/output device and a communication device.
- the computer apparatus (100) includes an OS (110), a RAS module (130), and a boot program (140) as software.
- the boot program (140) is executed when the computer apparatus (100) is started.
- the OS (110) module includes an initialization procedure (111), an interrupt determining part (112), a CPU exception processing part (113), and an interrupt processing part (115).
- the initialization procedure (111) is a program for initializing resources to be used by the OS (110).
- the resources to be used by the OS (110) include both hardware resources and software resources.
- the interrupt determining part (112) is a program that is called by the interrupt detection unit (102) when the interrupt detection unit (102) has detected an interrupt.
- the interrupt determining part (112) determines whether the interrupt detected by the interrupt detection unit (102) is a CPU exception or a regular interrupt.
- the interrupt determining part (112) is copied to the RAS module (130) and becomes an interrupt determining part (134), and the setting of the interrupt detection unit (102) is changed such that the interrupt detection unit (102) calls the interrupt determining part (134) when the interrupt detection unit (102) has detected an interrupt, as will be described later.
- after this setting is changed, the interrupt determining part (112) of the OS (110) is no longer called by the interrupt detection unit (102).
- the CPU exception processing part (113) is a program that handles a CPU exception, and the interrupt processing part (115) is a program that is executed upon occurrence of a regular interrupt which is not a CPU exception.
- the RAS module (130) is a program that carries out a process to handle a CPU exception.
- the RAS module (130) includes a first initialization procedure (132), a second initialization procedure (133), the interrupt determining part (134), an OS identifying part (135), a fault detecting part (136), a fault information collecting part (137), a fault specifying part (138), and a fault handling part (139).
- the RAS module (130) will also be referred to simply as the RAS.
- the first initialization procedure (132) is a program that is executed before the initialization procedure (111) of the OS (110) is executed.
- the first initialization procedure (132) initializes resources to be used by the RAS module (130).
- the resources to be used by the RAS module (130) include both hardware resources and software resources.
- the first initialization procedure (132) also carries out a process to rewrite the last portion of the program code of the initialization procedure (111) of the OS (110) such that the second initialization procedure (133) to be described later is executed.
- the second initialization procedure (133) is a program that is executed after execution of the initialization procedure (111) of the OS (110).
- the second initialization procedure (133) carries out a process to copy the interrupt determining part (112) included in the OS (110) to the RAS module (130), and to set the interrupt detection unit (102) such that upon detecting an interrupt the interrupt detection unit (102) calls the interrupt determining part (134) copied to the RAS module (130), instead of the interrupt determining part (112) of the OS (110).
- the interrupt determining part (134) is the interrupt determining part (112) that has been copied to the RAS module (130).
- the interrupt determining part (134) carries out the same process as that carried out by the interrupt determining part (112).
- the interrupt determining part (134) determines whether the interrupt detected by the interrupt detection unit (102) is a CPU exception or a regular interrupt.
- the OS identifying part (135) is a program that is executed when the interrupt determining part (134) determines that the interrupt detected by the interrupt detection unit (102) is a regular interrupt.
- the OS identifying part (135) identifies the OS (110) that was operating when the interrupt occurred.
- the fault detecting part (136), the fault information collecting part (137), the fault specifying part (138), and the fault handling part (139) are programs that are executed when the interrupt determining part (134) determines that the interrupt detected by the interrupt detection unit (102) is a CPU exception.
- the fault detecting part (136) identifies a fault that has caused the CPU exception.
- the fault information collecting part (137) identifies the OS (110) that was operating when the CPU exception occurred, and collects information about the fault from the identified OS (110).
- the fault specifying part (138) specifies a fault handling method corresponding to the fault, based on the information collected by the fault information collecting part (137).
- the fault handling part (139) carries out the fault handling method specified by the fault specifying part (138).
- the fault detecting part (136), the fault information collecting part (137), the fault specifying part (138), and the fault handling part (139) each correspond to an example of a CPU exception processing part.
- the interrupt determining part (112) and the interrupt determining part (134) hold addresses, on the memory (103), of the processes (programs) for handling a CPU exception and a regular interrupt.
- the interrupt processing part (115) holds the processes (programs) for handling an occurrence of an interrupt.
- the OS (110), the RAS module (130), and the boot program (140) are programs.
- the CPU (101) reads these programs and carries out processes according to the contents described in these programs.
- in the description, the CPU (101) may be described as carrying out a process or the like.
- expressions may also be employed where the OS (110), the RAS module (130), or the boot program (140) is described as carrying out a process or the like, or an element (for example, the first initialization procedure (132)) included in these is described as carrying out a process or the like.
- the description means that the process is carried out by the CPU (101) executing a program.
- Fig. 2 illustrates an overall flowchart of a process to initialize the RAS module (130) and the OS (110) at start-up of the computer apparatus (100) according to the first embodiment.
- the boot program (140) is executed by the CPU (101).
- the first initialization procedure (132) of the RAS module (130) is called (S201), and the first initialization procedure (132) of the RAS module (130) is executed (S202).
- the boot program (140) and the "first initialization procedure" of the RAS module (130) (S202) will be described in detail later.
- the initialization procedure (111) of the OS (110) is called, and the initialization procedure (111) of the OS (110) is executed (S204).
- the initialization procedure (111) of each OS (110) is executed sequentially in the "initialization procedure" of the OS (110) (S204).
- Fig. 3 illustrates a detailed flowchart of the "first initialization procedure" of the RAS module (130) (S202) described above.
- the CPU (101) first executes the first initialization procedure (132) to initialize the RAS module (130) (S301).
- the CPU (101) then carries out a process to add, to the end of the "initialization procedure" of the OS, a process to call the "second initialization procedure" of the RAS module after execution of the "initialization procedure" of the OS (S302).
- because the second initialization procedure (133) of the RAS module (130) is a program located on the memory (103), the last portion of the initialization procedure (111) of the OS (110) is rewritten such that the address, on the memory (103), of the program of the second initialization procedure (133) is called at the end of the initialization procedure (111) of the OS (110) (S204).
- the CPU (101) changes the last portion of the program code, stored in the memory (103), of the initialization procedure (111) of the OS (110) to a jump instruction to the program code of the second initialization procedure (133) of the RAS module (130).
- the memory address of the program of the initialization procedure (111) of the OS (110) is called.
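
The rewriting step (S302) and the resulting call order can be illustrated with a small simulation. This is only a sketch under stated assumptions: real machine code is modelled as a Python list of callables, and every name here (`os_init_code`, `run`, `trace`, and the function names) is hypothetical and does not come from the patent.

```python
# Simulation only: "program code" is a list of callables, and the
# "last portion" of the OS initialization procedure is the last list slot.
trace = []

def ras_second_init():
    trace.append("RAS second init (S205)")

def os_init_body():
    trace.append("OS init (S401)")

def os_init_return():
    # Original last instruction of the OS initialization procedure.
    trace.append("plain return")

# Hypothetical in-memory image of the OS initialization procedure (111).
os_init_code = [os_init_body, os_init_return]

def run(code):
    """Execute each 'instruction' of a code image in order."""
    for instruction in code:
        instruction()

def ras_first_init():
    # S301: initialize the resources of the RAS module.
    trace.append("RAS first init (S301)")
    # S302: overwrite the last instruction of the OS initialization
    # procedure with a jump to the second initialization procedure.
    os_init_code[-1] = ras_second_init
    # Then call the (patched) OS initialization procedure.
    run(os_init_code)

def boot_program():
    # The boot program is changed to call the first initialization procedure.
    ras_first_init()

boot_program()
```

In this model the OS image is changed only in memory at start-up, which mirrors the stated goal of leaving the OS module itself unmodified.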
- Fig. 4 illustrates a detailed flowchart of the "initialization procedure" of the OS (110) (S204) described above.
- the CPU (101) initializes the OS (110) (S401).
- the CPU (101) calls the second initialization procedure (133) of the RAS module (130) (S402).
- the CPU (101) realizes this by calling the memory address of the program of the second initialization procedure (133) of the RAS module (130).
- Fig. 5 illustrates a detailed flowchart of the "second initialization procedure" (S205) of the RAS module (130) described above.
- the CPU (101) copies the program code of the interrupt determining part (112) of the OS (110) to the RAS module (130) (S501).
- the interrupt determining part (112) that has been copied to the RAS module (130) becomes the interrupt determining part (134).
- the CPU (101) carries out a process to set the interrupt detection unit (102) such that upon occurrence of an interrupt the interrupt detection unit (102) calls the interrupt determining part (134) of the RAS module (130) (S502).
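
Steps S501 and S502 can be sketched as follows. This is an illustrative model, not the patented implementation: the classes, the `entry` attribute, the vector numbers, and the handler addresses are all assumptions introduced for the example.

```python
import copy

# Hypothetical: vector 0 stands for a CPU exception such as division by zero.
CPU_EXCEPTION_VECTORS = {0}

class InterruptDeterminingPart:
    """Holds handler addresses and classifies an interrupt vector."""

    def __init__(self, owner, handler_addresses):
        self.owner = owner
        self.handler_addresses = dict(handler_addresses)

    def is_cpu_exception(self, vector):
        return vector in CPU_EXCEPTION_VECTORS

class InterruptDetectionUnit:
    """Models the CPU setting that names the part called on an interrupt."""

    def __init__(self, entry):
        self.entry = entry

class RasModule:
    def __init__(self):
        self.interrupt_determining_part = None

    def second_init(self, os_part, detection_unit):
        # S501: copy the OS's interrupt determining part into the RAS module.
        part = copy.deepcopy(os_part)
        part.owner = "RAS"
        self.interrupt_determining_part = part
        # S502: make the interrupt detection unit call the copy instead.
        detection_unit.entry = part

os_part = InterruptDeterminingPart("OS", {0: 0x1000, 32: 0x2000})
unit = InterruptDetectionUnit(entry=os_part)
ras = RasModule()
ras.second_init(os_part, unit)
```

After `second_init` runs, the OS's own part is left untouched but is no longer the entry point for interrupts.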
- Fig. 16 illustrates an example configuration of the computer apparatus (100) before the RAS module (130) is added.
- each OS (the configuration may include one OS or a plurality of OSs) is configured as an independent module.
- the computer apparatus (100) performs normal operation without the RAS module (130), as illustrated in Fig. 16 .
- the boot program (140) on the computer apparatus (100) starts up first, and the initialization procedure (111) of the OS (110) is called from the boot program (140).
- the program of the RAS module (130) to be added is first placed in free space on the secondary storage device (104) of the computer apparatus (100).
- the boot program (140) of the computer apparatus (100) is changed such that at start-up of the computer apparatus (100) the boot program (140) calls the first initialization procedure (132) of the RAS module (130).
- the boot program (140) is changed such that the address, on the memory (103), of the program of the first initialization procedure (132) of the RAS module (130) is called from the boot program (140).
- the RAS function corresponding to the interrupt determining part provided in the OS can be easily added, without modifying the OS module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) is configured as an independent module.
- Fig. 6 illustrates a flow of processing upon occurrence of an interrupt while the OS (110) is operating.
- a solid arrow indicates the flow of processing, and a dashed arrow indicates the content of processing.
- the interrupt detection unit (102) of the CPU (101) calls the RAS module (130).
- the RAS module (130) determines whether or not the interrupt that has occurred is a CPU exception.
- the RAS module (130) determines the OS that was operating based on a program counter (a register holding the address being executed by the CPU) of the computer apparatus (100) (determines the OS that was operating based on the location of the code being executed which is held in the program counter), identifies the pertinent OS, and collects fault information from the identified OS.
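
Identifying the operating OS from the program counter amounts to an address-range lookup. A minimal sketch, assuming each OS's code occupies one contiguous, non-overlapping region; the region table, the addresses, and the names `OS-A` and `OS-B` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical memory layout: (start address, end address exclusive, name),
# sorted by start address and non-overlapping.
REGIONS = [
    (0x1000_0000, 0x2000_0000, "OS-A"),
    (0x2000_0000, 0x3000_0000, "OS-B"),
]
STARTS = [start for start, _end, _name in REGIONS]

def identify_os(program_counter):
    """Return the name of the OS whose code region contains the address,
    or None if the address belongs to no known region."""
    index = bisect_right(STARTS, program_counter) - 1
    if index >= 0:
        start, end, name = REGIONS[index]
        if start <= program_counter < end:
            return name
    return None
```

The same lookup would extend to the hypervisor of the second embodiment by adding its code region to the table.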
- Fig. 7 illustrates a flowchart for when an interrupt has occurred.
- the interrupt detection unit (102) detects an interrupt (S701).
- the interrupt detection unit (102) calls the interrupt determining part (134) of the RAS module (130) (S702).
- the interrupt determining part (134) determines whether the interrupt detected by the interrupt detection unit (102) is a CPU exception or a regular interrupt (S703).
- if the interrupt detected by the interrupt detection unit (102) is a regular interrupt (NO in S703), processing transitions to S1001 of Fig. 15.
- the fault detecting part (136) of the RAS module (130) identifies a fault that caused the CPU exception (S705).
- the fault information collecting part (137) of the RAS module (130) identifies the OS (110) that was operating when the CPU exception occurred, and collects fault information from the pertinent OS (110) (S706-1).
- the fault information collecting part (137) of the RAS module (130) identifies the OS (110) that was operating when the CPU exception occurred, based on the program counter of the computer apparatus (100).
- the fault specifying part (138) of the RAS module (130) specifies a fault handling method corresponding to the fault identified in S705 (S707).
- the fault handling part (139) of the RAS module (130) carries out a process to handle the fault in accordance with the fault handling method specified in S707 (S708).
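
The flow of Fig. 7 (S703 to S708) can be sketched as a small dispatcher. The vector numbers, fault names, handling methods, and function names below are assumptions made for illustration; the patent does not specify them.

```python
# Hypothetical tables; the patent does not list vectors or handling methods.
CPU_EXCEPTION_VECTORS = {0: "division by zero"}
FAULT_HANDLING_METHODS = {"division by zero": "log and restart the OS"}

def is_cpu_exception(vector):
    # S703: classify the interrupt.
    return vector in CPU_EXCEPTION_VECTORS

def detect_fault(vector):
    # S705: identify the fault that caused the CPU exception.
    return CPU_EXCEPTION_VECTORS[vector]

def collect_fault_info(program_counter, os_regions):
    # S706-1: identify the operating OS from the program counter
    # and collect (here: minimal) fault information.
    for name, (start, end) in os_regions.items():
        if start <= program_counter < end:
            return {"os": name, "pc": program_counter}
    return {"os": None, "pc": program_counter}

def specify_handling(fault):
    # S707: choose a handling method for the fault.
    return FAULT_HANDLING_METHODS.get(fault, "halt")

def on_interrupt(vector, program_counter, os_regions):
    if not is_cpu_exception(vector):
        # NO in S703: processing would continue at S1001 of Fig. 15.
        return ("regular", None)
    fault = detect_fault(vector)
    info = collect_fault_info(program_counter, os_regions)
    method = specify_handling(fault)
    # S708 would carry out `method`; this sketch only reports the decision.
    return ("cpu_exception", {"fault": fault, "info": info, "method": method})
```

Because every step runs inside the RAS module, none of them depends on OS code that may itself have failed.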
- the RAS function can be executed even if the CPU exception has occurred concurrently with a failure in the OS.
- This embodiment has described a RAS scheme, according to which the RAS function corresponding to the interrupt determining part provided in the OS can be added, without modifying the OS module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) is configured as an independent module.
- this embodiment has described the following (1) to (4).
- This embodiment has also described the RAS scheme, according to which by executing the interrupt determining part, the fault detecting part, and so on in the RAS module upon occurrence of a CPU exception, the RAS function is executed even if the CPU exception has occurred concurrently with a failure in the OS.
- This embodiment has also described the RAS scheme, according to which by executing the fault information collecting part in the RAS module, it is possible to determine the OS that was operating when a CPU exception occurred based on the program counter of the time when the CPU exception occurred, and to collect failure information from the pertinent OS.
- Fig. 8 is a block diagram illustrating an example configuration of the computer apparatus (100) according to the second embodiment.
- a hypervisor (120) and an initialization procedure (121) of the hypervisor (120) are added, and the rest of the configuration is the same as that of the first embodiment (Fig. 1).
- the configuration may include one OS (110) or a plurality of OSs (110) also in this embodiment.
- the configuration may also include one CPU (101) or a plurality of CPUs (101) (multiple cores, multiple CPUs, multiple processors, etc.).
- Fig. 9 illustrates an overall flow of a process to initialize the RAS module (130), the OS (110) module, and the hypervisor (120) module at start-up of the computer apparatus (100) according to the second embodiment.
- the difference from the operation at initialization according to the first embodiment (Fig. 2) is that the "initialization procedure" of the hypervisor (S203) is added between S202 and S204; the rest of the flow is the same as the flow of the first embodiment (Fig. 2).
- the CPU (101) mainly initializes resources to be used by the hypervisor (120).
- the details of S205 are as illustrated in Fig. 5 of the first embodiment.
- the initialization procedure (111) of each OS (110) is executed sequentially in the "initialization procedure" of the OS (S204) in the flowchart of Fig. 9 .
- Fig. 10 illustrates a detailed flow of the "first initialization procedure" of the RAS module (S202) described above.
- the CPU (101) realizes this by calling the address, on the memory (103), of the program of the initialization procedure (121) of the hypervisor (120).
- Fig. 17 illustrates an example configuration of the computer apparatus (100) before the RAS module (130) is added.
- each OS (the configuration may include one OS or a plurality of OSs) and the hypervisor are configured as independent modules.
- the computer apparatus (100) performs normal operation without the RAS module (130), as illustrated in Fig. 17 .
- the boot program (140) on the computer apparatus (100) starts up first.
- the initialization procedure (121) of the hypervisor (120) is called from the boot program (140), and the initialization procedure (111) of the OS (110) is called from the initialization procedure (121) of the hypervisor (120).
- the program of the RAS module (130) to be added is first placed in free space on the secondary storage device (104) of the computer apparatus (100).
- the boot program (140) of the computer apparatus (100) is changed such that at start-up of the computer apparatus (100) the boot program (140) calls the first initialization procedure (132) of the RAS module (130).
- the boot program (140) is changed such that the address, on the memory (103), of the program of the first initialization procedure (132) of the RAS module (130) is called from the boot program (140).
- the RAS function corresponding to the interrupt determining part provided in the OS can be easily added, without modifying the OS or hypervisor module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) and the hypervisor are configured as independent modules.
- Fig. 11 illustrates a flow of processing upon occurrence of an interrupt while the OS (110) is operating.
- Fig. 12 illustrates a flow of processing upon occurrence of an interrupt while the hypervisor (120) is operating.
- Fig. 11 and Fig. 12 differ only in whether the fault information collecting part (137) of the RAS module (130) collects fault information of the OS or fault information of the hypervisor.
- in Fig. 11 and Fig. 12, the memory (103), the secondary storage device (104), and the boot program (140), which are not directly relevant to the description, are not illustrated.
- a solid arrow indicates the flow of processing, and a dashed arrow indicates the content of processing.
- the interrupt detection unit (102) of the CPU (101) calls the RAS module (130).
- the RAS module (130) determines whether or not the interrupt that has occurred is a CPU exception.
- the RAS module (130) determines the OS or hypervisor that was operating when the CPU exception occurred based on the program counter (the register holding the address being executed by the CPU) of the computer apparatus (100) (determines the OS or hypervisor that was operating based on the location of the code being executed which is held in the program counter), identifies the pertinent OS or hypervisor, and collects fault information from the identified OS or hypervisor.
- Fig. 13 illustrates a flowchart for when an interrupt has occurred.
- the fault information collecting part (137) of the RAS module (130) identifies the OS (or hypervisor) that was operating when the CPU exception occurred, and collects fault information from the pertinent OS (or hypervisor) (S706-2).
- the fault information collecting part (137) of the RAS module (130) identifies the OS (or hypervisor) that was operating when the CPU exception occurred, based on the program counter of the computer apparatus (100).
- the RAS function can be executed even if the CPU exception has occurred concurrently with a failure in the OS.
- the RAS function can be executed even if the CPU exception has occurred concurrently with a failure in the hypervisor.
- This embodiment has described a RAS scheme, according to which the RAS function corresponding to the interrupt determining part provided in the OS can be added, without modifying the OS or hypervisor module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) and the hypervisor are configured as independent modules.
- this embodiment has described the following (1) to (4).
- This embodiment has also described the RAS scheme, according to which by executing the interrupt determining part, the fault detecting part, and so on in the RAS module upon occurrence of a CPU exception, the RAS function is executed even if the CPU exception has occurred concurrently with a failure in the OS.
- This embodiment has also described the RAS scheme, according to which by executing the fault information collecting part in the RAS module, it is possible to determine the OS (the configuration may include one OS or a plurality of OSs) or hypervisor that was operating when a CPU exception occurred based on the program counter of the time when the CPU exception occurred, and to collect failure information from the pertinent OS or hypervisor.
- This embodiment describes an outline of the operation when the interrupt detected by the interrupt detection unit (102) is a regular interrupt instead of a CPU exception.
- the configuration of the computer apparatus (100) may be the same as the example configuration of the first embodiment ( Fig. 1 ) or the configuration diagram of the second embodiment ( Fig. 8 ).
- Fig. 14 illustrates a flow of processing upon occurrence of an interrupt.
- in Fig. 14, the memory (103), the secondary storage device (104), and the boot program (140), which are not directly relevant to the description, are not illustrated.
- a solid arrow indicates the flow of processing, and a dashed arrow indicates the content of processing.
- the interrupt detection unit (102) calls the interrupt determining part (134) of the RAS module (130) at the timing when processing transitions from the hypervisor (120) to the OS (110).
- the interrupt generation timing is when processing returns from the hypervisor (120) to the OS (110), and the program counter at the time when the interrupt occurred indicates that processing by the OS was in progress (the program counter never indicates that processing by the hypervisor was in progress).
- the interrupt mask process of the hypervisor (120) described above does not exist because the hypervisor (120) itself is not present.
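
The timing described here can be illustrated with a toy model. It assumes, based only on the reference to an interrupt mask process of the hypervisor, that regular interrupts raised while the hypervisor runs are held pending and delivered when processing returns to the OS. The `Cpu` class and all of its members are hypothetical.

```python
from collections import deque

class Cpu:
    """Toy model: interrupts are masked while the hypervisor is running."""

    def __init__(self):
        self.running = "OS"
        self.masked = False
        self.pending = deque()
        # Each entry is (vector, context the RAS module would observe).
        self.delivered = []

    def raise_interrupt(self, vector):
        if self.masked:
            # Held until the mask is released.
            self.pending.append(vector)
        else:
            self._deliver(vector)

    def _deliver(self, vector):
        self.delivered.append((vector, self.running))

    def enter_hypervisor(self):
        self.running = "hypervisor"
        self.masked = True

    def return_to_os(self):
        # Context switches back to the OS first, then pending
        # interrupts are delivered in arrival order.
        self.running = "OS"
        self.masked = False
        while self.pending:
            self._deliver(self.pending.popleft())

cpu = Cpu()
cpu.enter_hypervisor()
cpu.raise_interrupt(32)
delivered_while_masked = list(cpu.delivered)
cpu.return_to_os()
```

In this model the observed context is always the OS, which matches the statement that the program counter never indicates that processing by the hypervisor was in progress.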
- Fig. 15 illustrates the operation after S703 is determined as NO in Fig. 7 or Fig. 13 .
- the OS identifying part (135) of the RAS module (130) first identifies the OS (110) that was operating when the regular interrupt occurred (S1001).
- the OS identifying part (135) identifies the OS (110) that was operating when the regular interrupt occurred, based on the program counter of the computer apparatus (100).
- the OS identifying part (135) refers to the interrupt determining part (112) of the pertinent OS (110) to identify and call the address of the program of the interrupt processing part (115) of the pertinent OS (110) (S1002).
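
Step S1002 can be sketched as a lookup in the OS's current registration at the moment of the interrupt, rather than in a snapshot taken at start-up. The `Os` class, the `interrupt_table` attribute, and the handler names are assumptions made for this example.

```python
class Os:
    """Toy OS whose interrupt registrations may change at run time."""

    def __init__(self, name):
        self.name = name
        # Stands in for the handler addresses held by the
        # interrupt determining part (112) of this OS.
        self.interrupt_table = {}

    def register(self, vector, handler):
        self.interrupt_table[vector] = handler

def forward_regular_interrupt(vector, running_os):
    # S1002: read the handler from the OS's current registration,
    # then call it.
    handler = running_os.interrupt_table[vector]
    return handler()

guest = Os("OS-A")
guest.register(32, lambda: "timer handler v1")
first = forward_regular_interrupt(32, guest)

# The OS changes its registration dynamically after start-up.
guest.register(32, lambda: "timer handler v2")
second = forward_regular_interrupt(32, guest)
```

Reading the live table at each interrupt is what lets the forwarding follow registrations that change after start-up.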
- the interrupt processing part of the OS can be called properly if the interrupt detected by the interrupt detection unit is a regular interrupt instead of a CPU exception.
- the OS can process the interrupt properly with the operation described in the third embodiment.
- This embodiment has described a RAS scheme, according to which by referring to the interrupt determining part of the OS so as to call the interrupt processing part of the corresponding OS upon occurrence of a regular interrupt instead of a CPU exception, the interrupt processing part of the OS can be called properly.
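- The forwarding of a regular interrupt (S1001 and S1002) can be sketched as follows. This is only a minimal model under stated assumptions: the class `GuestOs`, its `interrupt_table` dictionary (standing in for the interrupt determining part (112)), and the function `forward_regular_interrupt` are hypothetical names that do not appear in the Specification. The point illustrated is that the handler is looked up in the OS at the time of the interrupt, so a registration changed after start-up is still honoured.

```python
class GuestOs:
    """Hypothetical stand-in for an OS (110) occupying one address range."""

    def __init__(self, name, start, end):
        self.name = name
        self.start = start          # first code address of this OS (assumed)
        self.end = end              # one past the last code address (assumed)
        self.interrupt_table = {}   # models the interrupt determining part (112)


def forward_regular_interrupt(number, program_counter, oses):
    """Identify the interrupted OS (S1001) and call its handler (S1002)."""
    for guest in oses:
        if guest.start <= program_counter < guest.end:
            # Read the OS's current registration instead of a start-up snapshot.
            handler = guest.interrupt_table[number]
            return handler(number)
    raise LookupError("no OS owns the interrupted address")


os_a = GuestOs("OS-A", 0x1000, 0x2000)
os_a.interrupt_table[40] = lambda n: "old handler"
before = forward_regular_interrupt(40, 0x1800, [os_a])

# The OS changes its registration dynamically after start-up.
os_a.interrupt_table[40] = lambda n: "new handler"
after = forward_regular_interrupt(40, 0x1800, [os_a])
```

In this sketch the second call reaches the newly registered handler because nothing about the handler address is cached on the RAS side.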
- one of these embodiments may be implemented partially.
- two or more of these embodiments may be implemented partially in combination.
- 100: computer apparatus, 101: CPU, 102: interrupt detection unit, 103: memory, 104: secondary storage device, 110: OS, 111: initialization procedure, 112: interrupt determining part, 113: CPU exception processing part, 115: interrupt processing part, 120: hypervisor, 121: initialization procedure, 130: RAS module, 132: first initialization procedure, 133: second initialization procedure, 134: interrupt determining part, 135: OS identifying part, 136: fault detecting part, 137: fault information collecting part, 138: fault specifying part, 139: fault handling part, 140: boot program
Abstract
Description
- The present invention relates to a technique for adding a RAS module which is a program for realizing a malfunction handling function (i.e., a RAS (Reliability, Availability, Serviceability) function) for handling a CPU (Central Processing Unit) exception to a computer apparatus on which an OS (Operating System) is implemented, without modifying the OS module.
- The present invention also relates to a technique for adding a RAS module to a computer apparatus on which an OS and a hypervisor are implemented, without modifying the OS and hypervisor modules.
- The hypervisor is software that realizes virtualization of the computer apparatus.
- The hypervisor is software located between the OS and hardware of the computer apparatus and emulates the operation of the computer apparatus. The hypervisor allows a plurality of OSs to operate simultaneously on a single computer apparatus, and acts as an intermediary for communication and sharing of resources among the plurality of OSs.
- A CPU exception is an exception in which the CPU is unable to continue normal processing (for example, division by zero, etc.).
- It is arranged such that another program which is set in advance can be called upon occurrence of a CPU exception.
- In the present Specification, both a CPU exception and an interrupt other than a CPU exception will be referred to using the term "interrupt".
- Among the "interrupts", an interrupt other than a CPU exception will be referred to as a "regular interrupt" to be distinguished from a CPU exception.
- In conventional art, in a computer apparatus in which each OS (there may be one OS or a plurality of OSs) and a hypervisor (the computer apparatus may be configured without a hypervisor) are operating as independent modules, a RAS function for handling a CPU exception is realized by a method such as adding a process for handling a CPU exception to each OS or the hypervisor.
- For example, Patent Literature 1 discloses a configuration in which a VM monitor (corresponding to a hypervisor in the present Specification) is provided with means to extract failure information of a process being executed in a virtual computer which experienced a main system failure (corresponding to a CPU exception in the present Specification) from a failure information storage area.
- For example, Patent Literature 2 discloses a technique for resolving an exception in a virtual computer on which a hypervisor is employed to operate a plurality of OSs.
- Specifically, Patent Literature 2 discloses the technique for resolving an exception by copying to the hypervisor a memory image of the portion of a process being executed by an OS when the exception occurred and emulating a privileged instruction included in the process being executed by the OS when the exception occurred.
- Patent Literature 1: JP 01-053238 A
- Patent Literature 2: JP 2006-155272 A
- Conventionally, in a computer system in which each OS (there may be one OS or a plurality of OSs) and a hypervisor (the computer system may be configured without a hypervisor) are configured as independent modules, it has been necessary to modify or change an interrupt detection unit provided in the OS, the hypervisor, or a CPU in order to add a RAS function corresponding to an exception process of the OS to the computer system.
- For example, in the techniques disclosed in Patent Literature 1 and Patent Literature 2, the RAS function is realized by providing a function to handle a CPU exception in the hypervisor in advance.
- For this reason, adding the RAS function is difficult in a case where it is difficult to modify the OS or hypervisor module (including a case where a high level of technical difficulty leads to increased costs, a case where modification cannot be made for licensing reasons, and a case where modification is not desirable in terms of quality preservation).
- The present invention mainly aims to solve the above-described problem.
- That is, the present invention mainly aims to add a RAS module to a computer apparatus and to realize a RAS function appropriately without modifying an OS.
- A computer apparatus according to the present invention includes
a CPU (Central Processing Unit) including an interrupt detection unit that detects an interrupt; and
an OS (Operating System) including an interrupt determining part that is called by the interrupt detection unit when the interrupt detection unit has detected an interrupt, and determines whether or not the interrupt detected by the interrupt detection unit is a CPU exception,
wherein when a RAS (Reliability Availability Serviceability) module which is a program for carrying out a process to handle a CPU exception is added to the computer apparatus,
at start-up of the computer apparatus the CPU calls a first initialization procedure included in the RAS module, and executes the first initialization procedure to initialize a resource to be used by the RAS module;
after execution of the first initialization procedure of the RAS module, the CPU calls an initialization procedure included in the OS, and executes the initialization procedure to initialize a resource to be used by the OS; and
after execution of the initialization procedure of the OS, the CPU calls a second initialization procedure included in the RAS module, and executes the second initialization procedure to copy the interrupt determining part included in the OS to the RAS module, and to set the interrupt detection unit such that upon detecting an interrupt the interrupt detection unit calls an interrupt determining part copied to the RAS module, instead of the interrupt determining part in the OS.
- According to the present invention, a RAS module can be added to a computer apparatus without modifying an OS which is implemented on the computer apparatus, and when an interrupt detection unit of a CPU has detected an interrupt, the RAS module is called appropriately and a RAS function is realized.
- Fig. 1 is a diagram illustrating an example configuration of a computer apparatus according to a first embodiment;
- Fig. 2 is a flowchart diagram illustrating an outline of an initialization process in the computer apparatus according to the first embodiment;
- Fig. 3 is a flowchart diagram illustrating in detail a first initialization procedure of a RAS module according to the first embodiment;
- Fig. 4 is a flowchart diagram illustrating in detail an initialization procedure of an OS according to the first embodiment;
- Fig. 5 is a flowchart diagram illustrating in detail a second initialization procedure of the RAS module according to the first embodiment;
- Fig. 6 is a diagram illustrating an example of operation upon occurrence of a CPU exception in the computer apparatus according to the first embodiment;
- Fig. 7 is a flowchart diagram illustrating an example of operation upon occurrence of a CPU exception in the computer apparatus according to the first embodiment;
- Fig. 8 is a diagram illustrating an example configuration of the computer apparatus according to a second embodiment;
- Fig. 9 is a flowchart diagram illustrating an outline of an initialization process in the computer apparatus according to the second embodiment;
- Fig. 10 is a flowchart diagram illustrating in detail a first initialization procedure of the RAS module according to the second embodiment;
- Fig. 11 is a diagram illustrating an example of operation upon occurrence of a CPU exception in the computer apparatus according to the second embodiment;
- Fig. 12 is a diagram illustrating an example of operation upon occurrence of a CPU exception in the computer apparatus according to the second embodiment;
- Fig. 13 is a flowchart diagram illustrating an example of operation upon occurrence of a CPU exception in the computer apparatus according to the second embodiment;
- Fig. 14 is a diagram illustrating an example of operation upon occurrence of an interrupt in the computer apparatus according to a third embodiment;
- Fig. 15 is a flowchart diagram illustrating an example of operation upon occurrence of an interrupt in the computer apparatus according to the third embodiment;
- Fig. 16 is a diagram illustrating an example configuration of the computer apparatus according to the first embodiment before the RAS module is added; and
- Fig. 17 is a diagram illustrating an example configuration of the computer apparatus according to the second embodiment before the RAS module is added.
- The first to third embodiments hereinafter describe a computer apparatus in which one OS or a plurality of OSs and a hypervisor may operate.
- More specifically, the computer apparatus and a control method of the computer apparatus are described in which a RAS module for handling a CPU exception can be added without modifying the OS or hypervisor module, and upon occurrence of a CPU exception the RAS module is called appropriately and the RAS module carries out a process to handle the CPU exception.
- The first to third embodiments also describe the computer apparatus in which a RAS function is executed even if a failure occurs in the OS or the hypervisor concurrently with an occurrence of a CPU exception.
- When it is configured that the OS or the hypervisor itself implements the RAS function, there is a problem, which is that the RAS function cannot be executed if a failure has also occurred in the OS or the hypervisor itself that was operating when a CPU exception occurred.
- The first to third embodiments describe the computer apparatus that solves the above-described problem.
- The first to third embodiments also describe the RAS module that can obtain failure information of the pertinent OS (or hypervisor) by determining the OS (or hypervisor) that was operating when a CPU exception occurred.
- The RAS module is also described that can call an interrupt processing part properly upon occurrence of a regular interrupt, instead of a CPU exception, by referring to an interrupt determining part of the OS so as to call a corresponding interrupt processing part, even in a case where interrupt registration details have been changed in the OS (including a case where interrupt registration details have been changed dynamically after start-up of the computer apparatus and before occurrence of the interrupt).
- Fig. 1 is a block diagram illustrating an example configuration of a computer apparatus (100) according to the first embodiment.
- The computer apparatus (100) is configured with hardware and software.
- The computer apparatus (100) includes a CPU (101), a memory (103), and a secondary storage device (104) as hardware.
- The configuration may include one CPU (101) or a plurality of CPUs (101) (multiple cores, multiple CPUs, multiple processors, etc.).
- The CPU (101) includes an interrupt detection unit (102).
- The interrupt detection unit (102) detects an interrupt (a CPU exception and a regular interrupt).
- The memory (103) is a RAM (Random Access Memory).
- The secondary storage device (104) is, for example, a ROM (Read Only Memory), an HDD (Hard Disk Drive), or an SSD (Solid State Drive).
- The software to be described later is stored in the secondary storage device (104), and is loaded from the secondary storage device (104) into the memory (103) for execution. The software is sequentially read from the memory (103) into the CPU (101) and is then executed.
- Information, data, a variable value, and the like that are obtained as a result of executing the software to be described later are stored in the memory (103) and a register in the CPU (101).
- Although not illustrated in Fig. 1, the computer apparatus (100) is provided with various devices including an input/output device and a communication device.
- As the software of the computer apparatus (100), an OS (110), a RAS module (130), and a boot program (140) are included as separate modules.
- The boot program (140) is executed when the computer apparatus (100) is started.
- There may be one OS (110) or a plurality of OSs (110).
- The OS (110) module includes an initialization procedure (111), an interrupt determining part (112), a CPU exception processing part (113), and an interrupt processing part (115).
- The initialization procedure (111) is a program for initializing resources to be used by the OS (110).
- The resources to be used by the OS (110) include both hardware resources and software resources.
- The interrupt determining part (112) is a program that is called by the interrupt detection unit (102) when the interrupt detection unit (102) has detected an interrupt.
- The interrupt determining part (112) determines whether the interrupt detected by the interrupt detection unit (102) is a CPU exception or a regular interrupt.
- At start-up of the computer apparatus (100), the interrupt determining part (112) is copied to the RAS module (130) and becomes an interrupt determining part (134), and the setting of the interrupt detection unit (102) is changed such that the interrupt detection unit (102) calls the interrupt determining part (134) when the interrupt detection unit (102) has detected an interrupt, as will be described later.
- For this reason, after the RAS module (130) has been added to the computer apparatus (100), the interrupt determining part (112) will not be called by the interrupt detection unit (102).
- The CPU exception processing part (113) and the interrupt processing part (115) are programs that are executed upon occurrence of a regular interrupt which is not a CPU exception.
- The operation of the CPU exception processing part (113) and the interrupt processing part (115) will be described in detail in the third embodiment.
- The RAS module (130) is a program that carries out a process to handle a CPU exception.
- The RAS module (130) includes a first initialization procedure (132), a second initialization procedure (133), the interrupt determining part (134), an OS identifying part (135), a fault detecting part (136), a fault information collecting part (137), a fault specifying part (138), and a fault handling part (139).
- The RAS module (130) will also be referred to simply as the RAS.
- The first initialization procedure (132) is a program that is executed before the initialization procedure (111) of the OS (110) is executed.
- The first initialization procedure (132) initializes resources to be used by the RAS module (130).
- The resources to be used by the RAS module (130) include both hardware resources and software resources.
- The first initialization procedure (132) also carries out a process to rewrite the last portion of the program code of the initialization procedure (111) of the OS (110) such that the second initialization procedure (133) to be described later is executed.
- The second initialization procedure (133) is a program that is executed after execution of the initialization procedure (111) of the OS (110).
- The second initialization procedure (133) carries out a process to copy the interrupt determining part (112) included in the OS (110) to the RAS module (130), and to set the interrupt detection unit (102) such that upon detecting an interrupt the interrupt detection unit (102) calls the interrupt determining part (134) copied to the RAS module (130), instead of the interrupt determining part (112) of the OS (110).
- As described above, the interrupt determining part (134) is the interrupt determining part (112) that has been copied to the RAS module (130).
- For this reason, the interrupt determining part (134) carries out the same process as that carried out by the interrupt determining part (112).
- That is, the interrupt determining part (134) determines whether the interrupt detected by the interrupt detection unit (102) is a CPU exception or a regular interrupt.
- The OS identifying part (135) is a program that is executed when the interrupt determining part (134) determines that the interrupt detected by the interrupt detection unit (102) is a regular interrupt.
- The OS identifying part (135) identifies the OS (110) that was operating when the interrupt occurred.
- The operation of the OS identifying part (135) will be described in detail in the third embodiment.
- The fault detecting part (136), the fault information collecting part (137), the fault specifying part (138), and the fault handling part (139) are programs that are executed when the interrupt determining part (134) determines that the interrupt detected by the interrupt detection unit (102) is a CPU exception.
- The fault detecting part (136) identifies a fault that has caused the CPU exception.
- The fault information collecting part (137) identifies the OS (110) that was operating when the CPU exception occurred, and collects information about the fault from the identified OS (110).
- The fault specifying part (138) specifies a fault handling method corresponding to the fault, based on the information collected by the fault information collecting part (137).
- The fault handling part (139) carries out the fault handling method specified by the fault specifying part (138).
- The fault detecting part (136), the fault information collecting part (137), the fault specifying part (138), and the fault handling part (139) each correspond to an example of a CPU exception processing part.
- The interrupt determining part (112) and the interrupt determining part (134) hold addresses, on the memory (103), of the processes (programs) for handling a CPU exception and a regular interrupt.
- The interrupt processing part (115) holds the processes (programs) for handling an occurrence of an interrupt.
- As described above, the OS (110), the RAS module (130), and the boot program (140) are programs. In the computer apparatus (100), the CPU (101) reads these programs and carries out processes according to the contents described in these programs.
- In the following, it will be described that the CPU (101) carries out a process or the like. For ease of understanding and depending on the context, expressions may also be employed where the OS (110), the RAS module (130), or the boot program (140) is described as carrying out a process or the like, or an element (for example, the first initialization procedure (132)) included in these is described as carrying out a process or the like.
- In the present Specification, even when the software is described as the subject that carries out a process or the like, the description means that the process is carried out by the CPU (101) executing a program.
- First, an outline of the overall operation at initialization of the computer apparatus (100) according to this embodiment will be described.
- Fig. 2 illustrates an overall flowchart of a process to initialize the RAS module (130) and the OS (110) at start-up of the computer apparatus (100) according to the first embodiment.
- First, when the computer apparatus (100) is started, the boot program (140) is executed by the CPU (101). The first initialization procedure (132) of the RAS module (130) is called (S201), and the first initialization procedure (132) of the RAS module (130) is executed (S202).
- The boot program (140) and the "first initialization procedure" of the RAS module (130) (S202) will be described in detail later.
- Next, the initialization procedure (111) of the OS (110) is called, and the initialization procedure (111) of the OS (110) is executed (S204).
- The "initialization procedure" of the OS (S204) will be described in detail later.
- Lastly, the second initialization procedure (133) of the RAS module (130) is called, and the second initialization procedure (133) of the RAS module (130) is executed (S205).
- The "second initialization procedure" of the RAS module (130) (S205) will be described in detail later.
- When the computer apparatus (100) includes a plurality of OSs (110), the initialization procedure (111) of each OS (110) is executed sequentially in the "initialization procedure" of the OS (110) (S204).
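- The ordering of the start-up steps in Fig. 2 can be sketched as follows. This is an illustrative model only: the function names `boot` and `make_step` and the trace labels are hypothetical, and the direct call to the second initialization procedure is a simplification, since in this embodiment the hand-over is made through the rewritten tail of the initialization procedure (111) of the OS (110).

```python
def make_step(label):
    """Build a stand-in procedure that records its own execution."""
    def step(trace):
        trace.append(label)
    return step


def boot(ras_first_init, os_inits, ras_second_init):
    """Run the start-up steps in the order of Fig. 2 and return the trace."""
    trace = []
    ras_first_init(trace)        # S201, S202: first initialization procedure (132)
    for os_init in os_inits:     # S204: each OS is initialized sequentially
        os_init(trace)
    ras_second_init(trace)       # S205: second initialization procedure (133)
    return trace


order = boot(
    make_step("RAS first init"),
    [make_step("OS-A init"), make_step("OS-B init")],
    make_step("RAS second init"),
)
```

The resulting `order` lists the RAS first initialization, then every OS initialization, then the RAS second initialization.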
- [First Embodiment: Description of Outline of Operation (Operation of the "First Initialization Procedure" of the RAS Module)]
- Next, the operation at initialization according to the first embodiment will be described in detail.
- Fig. 3 illustrates a detailed flowchart of the "first initialization procedure" of the RAS module (130) (S202) described above.
- In the "first initialization procedure" (S202), the CPU (101) first executes the first initialization procedure (132) to initialize the RAS module (130) (S301).
- In S301, mainly the resources to be used by the RAS module (130) are initialized.
- Then, the CPU (101) adds, to the end of the "initialization procedure" of the OS, a process to call the "second initialization procedure" of the RAS module after execution of the "initialization procedure" of the OS (S302).
- Specifically, the second initialization procedure (133) of the RAS module (130) is a program located on the memory (103), and the last portion of the initialization procedure (111) of the OS (110) is rewritten such that the address, on the memory (103), of the program of the second initialization procedure (133) of the RAS module (130) is called at the end of the initialization procedure (111) of the OS (110) (S204).
- That is, the CPU (101) changes the last portion of the program code, stored in the memory (103), of the initialization procedure (111) of the OS (110) to a jump instruction to the program code of the second initialization procedure (133) of the RAS module (130).
- A more specific example of a method for realizing this process is indicated in the following (1) to (4).
- (1) It is arranged that the RAS module (130) can obtain the position of the code (i.e., the memory address of the program) and the size of the code of the initialization procedure (111) of the OS (110) from symbol information of a compiled executable file of the OS (110). For example, it is arranged that the RAS module (130) can obtain the position and size of the code of the initialization procedure (111) of the OS (110) by employing a method in which the symbol information of the OS (110) is loaded into the memory (103) to allow the RAS module (130) to refer to the symbol information, or a method in which the symbol information of the OS (110) is captured into the RAS module (130) in advance (by hard coding, etc.).
- (2) The first initialization procedure (132) of the RAS module (130) obtains the last code position of the initialization procedure (111) of the OS (110) from the code position and size information of the initialization procedure (111) of the OS (110) that have been made available in the above (1).
- (3) In the portion of the last code position, a code (jump instruction) is written to return to the position from which the initialization procedure (111) of the OS (110) is called. The first initialization procedure (132) of the RAS module (130) saves this position from which the call is made, and modifies the portion of the last code to a jump instruction to the code position of the second initialization procedure (133) of the RAS module (130).
- (4) The first initialization procedure (132) of the RAS module (130) modifies the last portion of the code of the second initialization procedure (133) to a jump instruction to the position, saved in the above (3), from which the initialization procedure (111) of the OS (110) is called.
- Then, the initialization procedure (111) of the OS (110) is called (S303).
- Specifically, the memory address of the program of the initialization procedure (111) of the OS (110) is called.
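- Steps (1) to (4) above can be sketched with a toy model of the memory (103). Everything in the sketch is an assumption made for illustration: the memory is a dictionary from address to instruction tuple, the instruction set (`"do"`, `"jmp"`, `"halt"`), the addresses, and the names `patch_os_init_tail` and `run` are hypothetical, and real code would patch machine instructions rather than tuples.

```python
def patch_os_init_tail(memory, symbols, ras2_start, ras2_size):
    """Redirect the tail of the OS initialization to the RAS second init."""
    os_start, os_size = symbols["os_init"]   # (1) position and size from symbol info
    os_last = os_start + os_size - 1         # (2) last code position of the OS init
    saved_return = memory[os_last]           # (3) save the original return jump ...
    memory[os_last] = ("jmp", ras2_start)    #     ... and jump to the second init instead
    # (4) the second init ends by jumping back to the original caller
    memory[ras2_start + ras2_size - 1] = saved_return
    return saved_return


def run(memory, pc):
    """Tiny interpreter: follow jumps, record "do" labels, stop at "halt"."""
    trace = []
    while True:
        op = memory[pc]
        if op[0] == "jmp":
            pc = op[1]
        elif op[0] == "halt":
            return trace
        else:
            trace.append(op[1])
            pc += 1


memory = {
    2: ("halt",),                   # position in the boot program after the call
    10: ("do", "os init"),          # initialization procedure (111)
    11: ("jmp", 2),                 # original tail: return to the caller
    20: ("do", "ras second init"),  # second initialization procedure (133)
    21: ("halt",),                  # placeholder tail, overwritten in step (4)
}
patch_os_init_tail(memory, {"os_init": (10, 2)}, ras2_start=20, ras2_size=2)
trace = run(memory, 10)
```

After the patch, running the OS initialization from address 10 passes through the second initialization procedure before control returns to the position the OS initialization was originally called from.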
- Next, Fig. 4 illustrates a detailed flowchart of the "initialization procedure" of the OS (110) (S204) described above.
- Here, the CPU (101) initializes the OS (110) (S401).
- In S401, mainly a process to initialize the resources to be used by the OS (110) itself is carried out.
- Then, in accordance with the code rewritten in the "first initialization procedure" of the RAS module (130) (S202) as described above, the CPU (101) calls the second initialization procedure (133) of the RAS module (130) (S402).
- Specifically, the CPU (101) realizes this by calling the memory address of the program of the second initialization procedure (133) of the RAS module (130).
- Lastly, Fig. 5 illustrates a detailed flowchart of the "second initialization procedure" (S205) of the RAS module (130) described above.
- Here, the CPU (101) copies the program code of the interrupt determining part (112) of the OS (110) to the RAS module (130) (S501).
- The interrupt determining part (112) that has been copied to the RAS module (130) becomes the interrupt determining part (134).
- Then, the CPU (101) carries out a process to set the interrupt detection unit (102) such that upon occurrence of an interrupt the interrupt detection unit (102) calls the interrupt determining part (134) of the RAS module (130) (S502).
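- Steps S501 and S502 can be sketched as follows. This is a minimal model, not the actual mechanism: the classes `InterruptDetectionUnit`, `InterruptDeterminingPart`, and `RasModule`, and the use of a set of interrupt numbers to decide what counts as a CPU exception, are all hypothetical. A real implementation would copy program code and rewrite a hardware-defined handler setting.

```python
import copy


class InterruptDetectionUnit:
    """Models the interrupt detection unit (102) as one settable handler."""

    def __init__(self, handler):
        self.handler = handler

    def raise_interrupt(self, number):
        return self.handler(number)


class InterruptDeterminingPart:
    """Models an interrupt determining part (112 or 134)."""

    def __init__(self, exception_numbers):
        self.exception_numbers = set(exception_numbers)  # assumed encoding

    def is_cpu_exception(self, number):
        return number in self.exception_numbers


class RasModule:
    def __init__(self):
        self.determining_part = None
        self.calls = 0

    def second_init(self, os_part, unit):
        self.determining_part = copy.deepcopy(os_part)  # S501: copy into the RAS module
        unit.handler = self.on_interrupt                # S502: redirect the detection unit

    def on_interrupt(self, number):
        self.calls += 1
        return self.determining_part.is_cpu_exception(number)


os_part = InterruptDeterminingPart({0, 6, 13})
unit = InterruptDetectionUnit(os_part.is_cpu_exception)  # state before the RAS module
ras = RasModule()
ras.second_init(os_part, unit)
is_exception = unit.raise_interrupt(0)
```

After `second_init`, every interrupt raised on `unit` reaches the copy held by `ras`, and the original object in the OS is no longer called by the detection unit.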
- The above has described the initialization operation at start-up in the configuration to which the RAS module (130) has been added. In the following, a method for adding the RAS module (130) to the computer apparatus (100) will be described.
- Fig. 16 illustrates an example configuration of the computer apparatus (100) before the RAS module (130) is added.
- In the computer apparatus (100) before the RAS module (130) is added, each OS (the configuration may include one OS or a plurality of OSs) is configured as an independent module.
- Before the RAS module (130) is added, the computer apparatus (100) performs normal operation without the RAS module (130), as illustrated in Fig. 16.
- At start-up of the computer apparatus (100) of Fig. 16, the boot program (140) on the computer apparatus (100) starts up first, and the initialization procedure (111) of the OS (110) is called from the boot program (140).
- To add the RAS module (130) according to this embodiment to the computer apparatus (100) of Fig. 16, the program of the RAS module (130) to be added is first placed in free space on the secondary storage device (104) of the computer apparatus (100). Also, the boot program (140) of the computer apparatus (100) is changed such that at start-up of the computer apparatus (100) the boot program (140) calls the first initialization procedure (132) of the RAS module (130).
- Specifically, the boot program (140) is changed such that the address, on the memory (103), of the program of the first initialization procedure (132) of the RAS module (130) is called from the boot program (140).
- The above is the outline of the method for adding the RAS module (130) to the computer apparatus (100).
- When the computer apparatus (100) is started after the RAS module (130) has been added to the computer apparatus (100) as described above (as illustrated in Fig. 1), the operation as illustrated in the flowchart of Fig. 2 is performed.
- With the above-described configuration and operation, the RAS function corresponding to the interrupt determining part provided in the OS can be easily added, without modifying the OS module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) is configured as an independent module.
- Next, an outline of the operation of the computer apparatus (100) according to the first embodiment upon occurrence of an interrupt will be described.
- Fig. 6 illustrates a flow of processing upon occurrence of an interrupt while the OS (110) is operating.
- Here, there may be a plurality of OSs.
- In Fig. 6, the memory (103), the secondary storage device (104), and the boot program (140), which are not directly relevant to the description, are not illustrated.
- In Fig. 6, a solid arrow indicates the flow of processing, and a dashed arrow indicates the content of processing.
- Upon occurrence of an interrupt, the interrupt detection unit (102) of the CPU (101) calls the RAS module (130).
- The RAS module (130) determines whether or not the interrupt that has occurred is a CPU exception.
- Then, if the interrupt that has occurred is a CPU exception, the RAS module (130) determines the OS that was operating based on a program counter (a register holding the address being executed by the CPU) of the computer apparatus (100) (determines the OS that was operating based on the location of the code being executed which is held in the program counter), identifies the pertinent OS, and collects fault information from the identified OS.
- With this operation, it is possible in the configuration including a plurality of OSs to determine the OS that was operating when the CPU exception occurred and to collect fault information from the pertinent OS.
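- Identifying the OS from the program counter can be sketched as an address-range lookup. The sketch assumes, purely for illustration, that the code of each OS occupies one contiguous, non-overlapping address range known to the RAS module; the function name `identify_os`, the region list, and the addresses are hypothetical.

```python
from bisect import bisect_right


def identify_os(program_counter, regions):
    """Return the name of the OS whose code range contains the program counter.

    regions is a list of (start, end_exclusive, name) tuples sorted by start
    and assumed not to overlap. Returns None when no range matches.
    """
    starts = [region[0] for region in regions]
    index = bisect_right(starts, program_counter) - 1
    if index >= 0:
        start, end, name = regions[index]
        if program_counter < end:
            return name
    return None


regions = [
    (0x1000, 0x2000, "OS-A"),
    (0x2000, 0x3000, "OS-B"),
]
running = identify_os(0x1800, regions)
```

A binary search is used here only because the regions are sorted; with a handful of OSs a linear scan would serve equally well.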
- Fig. 7 illustrates a flowchart for when an interrupt has occurred.
- First, the interrupt detection unit (102) detects an interrupt (S701).
- Then, the interrupt detection unit (102) calls the interrupt determining part (134) of the RAS module (130) (S702).
- The interrupt determining part (134) determines whether the interrupt detected by the interrupt detection unit (102) is a CPU exception or a regular interrupt (S703).
- If the interrupt detected by the interrupt detection unit (102) is a regular interrupt (NO in S703), processing transitions to S1001 of Fig. 15.
- The flowchart of Fig. 15 will be described in detail in the third embodiment.
- On the other hand, if the interrupt detected by the interrupt detection unit (102) is a CPU exception (YES in S703), the fault detecting part (136) of the RAS module (130) identifies a fault that caused the CPU exception (S705).
- Then, the fault information collecting part (137) of the RAS module (130) identifies the OS (110) that was operating when the CPU exception occurred, and collects fault information from the pertinent OS (110) (S706-1).
- As described above, the fault information collecting part (137) of the RAS module (130) identifies the OS (110) that was operating when the CPU exception occurred, based on the program counter of the computer apparatus (100).
- Then, the fault specifying part (138) of the RAS module (130) specifies a fault handling method corresponding to the fault identified in S705 (S707).
- Lastly, the fault handling part (139) of the RAS module (130) carries out a process to handle the fault in accordance with the fault handling method specified in S707 (S708).
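- The branch at S703 and the subsequent steps S705 to S708 can be sketched as one dispatch function. The sketch is a model under stated assumptions: the function `handle_interrupt`, the dictionary form of an interrupt, the rule that numbers below 32 are CPU exceptions, and the stand-in callables in `parts` are all hypothetical and are not taken from the Specification.

```python
def handle_interrupt(interrupt, is_cpu_exception, detect_fault,
                     collect_info, choose_handling, forward_to_os):
    """Follow the flow of Fig. 7 for one detected interrupt."""
    if not is_cpu_exception(interrupt):       # S703: NO
        return forward_to_os(interrupt)       # continues at S1001 of Fig. 15
    fault = detect_fault(interrupt)           # S705: fault detecting part (136)
    info = collect_info(interrupt)            # S706-1: fault information collecting part (137)
    handling = choose_handling(fault, info)   # S707: fault specifying part (138)
    return handling(fault, info)              # S708: fault handling part (139)


# Stand-in parts; the numbering rule and the messages are arbitrary assumptions.
parts = dict(
    is_cpu_exception=lambda irq: irq["number"] < 32,
    detect_fault=lambda irq: "divide_error" if irq["number"] == 0 else "unknown",
    collect_info=lambda irq: {"pc": irq["pc"]},
    choose_handling=lambda fault, info: (
        lambda f, i: f"handled {f} at {i['pc']:#x}"
    ),
    forward_to_os=lambda irq: "forwarded to OS",
)

exception_result = handle_interrupt({"number": 0, "pc": 0x1800}, **parts)
regular_result = handle_interrupt({"number": 40, "pc": 0x1800}, **parts)
```

Passing the parts in as callables keeps the flow itself independent of how each part is realized, which mirrors the separation of the parts (136) to (139) in the RAS module (130).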
- According to the first embodiment, by executing the interrupt determining part (134), the fault detecting part (136), and so on in the RAS module (130) upon occurrence of a CPU exception, the RAS function can be executed even if the CPU exception has occurred concurrently with a failure in the OS.
- This embodiment has described a RAS scheme, according to which the RAS function corresponding to the interrupt determining part provided in the OS can be added, without modifying the OS module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) is configured as an independent module.
- More specifically, this embodiment has described the following (1) to (4).
- (1) The "initialization procedure" of the RAS module which is called at start-up of the computer apparatus is divided into the "first initialization procedure" which is executed before the "initialization procedure" of the OS and the "second initialization procedure" which is executed after completion of the "initialization procedure" of the OS.
- (2) In the "first initialization procedure" of the RAS module, the last portion of the "initialization procedure" of the OS is rewritten to call the "second initialization procedure" of the RAS module after the "initialization procedure" of the OS.
- (3) In the "second initialization procedure" of the RAS module, the "interrupt determining part" of the OS is copied to the RAS module.
- (4) In the "second initialization procedure" of the RAS module, the "interrupt detection unit" is set to call the "interrupt determining part" of the RAS module, instead of the "interrupt determining part" of the OS, upon occurrence of an interrupt.
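- Items (1) to (4) can be modeled with the following sketch. It is an assumption-laden simulation, not the implementation of the embodiment: the class `Machine`, its attribute names, and the rule that vectors below 32 are CPU exceptions are all hypothetical, and a Python attribute assignment stands in for rewriting a jump instruction in memory.

```python
class Machine:
    """Toy model of the two-stage RAS initialization (all names hypothetical)."""

    def __init__(self):
        self.log = []
        self.interrupt_entry = None                 # what the interrupt detection unit calls
        self.os_init_tail = self.return_to_caller   # last portion of the OS init procedure

    def return_to_caller(self):
        self.log.append("return_to_boot")

    def os_interrupt_determiner(self, vector):
        # Hypothetical rule: vectors below 32 are treated as CPU exceptions.
        return "cpu_exception" if vector < 32 else "regular"

    def os_init(self):
        self.log.append("os_init")
        self.interrupt_entry = self.os_interrupt_determiner
        self.os_init_tail()                         # jump at the end of the OS init

    def ras_first_init(self):
        self.log.append("ras_first_init")
        self.os_init_tail = self.ras_second_init    # (2) rewrite the tail of the OS init
        self.os_init()                              # (1) OS init runs after the first init

    def ras_second_init(self):
        self.log.append("ras_second_init")
        # (3) stands in for copying the OS interrupt determining part into the RAS module
        self.ras_determiner = self.os_interrupt_determiner
        self.interrupt_entry = self.ras_entry       # (4) redirect the interrupt detection unit

    def ras_entry(self, vector):
        return ("ras", self.ras_determiner(vector))
```

- Calling `Machine().ras_first_init()` in this model runs the three procedures in the order first initialization, OS initialization, second initialization, after which `interrupt_entry` refers to the RAS-side entry rather than the OS-side one.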
- This embodiment has also described the RAS scheme, according to which by executing the interrupt determining part, the fault detecting part, and so on in the RAS module upon occurrence of a CPU exception, the RAS function is executed even if the CPU exception has occurred concurrently with a failure in the OS.
- This embodiment has also described the RAS scheme, according to which by executing the fault information collecting part in the RAS module, it is possible to determine the OS that was operating when a CPU exception occurred, based on the value of the program counter at the time the CPU exception occurred, and to collect fault information from the pertinent OS.
- Fig. 8 is a block diagram illustrating an example configuration of the computer apparatus (100) according to the second embodiment.
- The only difference from the configuration of the first embodiment (Fig. 1) is that a hypervisor (120) and an initialization procedure (121) of the hypervisor (120) are included; the rest is the same as the configuration of the first embodiment (Fig. 1).
- The configuration may include one OS (110) or a plurality of OSs (110) also in this embodiment.
- The configuration may also include one CPU (101) or a plurality of CPUs (101) (multiple cores, multiple CPUs, multiple processors, etc.).
- To describe the operation according to the second embodiment, an outline of the overall operation at initialization will be given first.
- Fig. 9 illustrates an overall flow of a process to initialize the RAS module (130), the OS (110) module, and the hypervisor (120) module at start-up of the computer apparatus (100) according to the second embodiment.
- Here, the difference from the operation at initialization according to the first embodiment (Fig. 2) is that the "initialization procedure" of the hypervisor (S203) is added between the process in S202 and S204, and the rest of the flow is the same as the flow of the first embodiment (Fig. 2).
- For this reason, the process in S203 and before and after S203 will be mainly described here.
- First, the flow of Fig. 9 up to the end of the "first initialization procedure" of the RAS module (S202) is the same as that of the first embodiment.
- Then, the initialization procedure (121) of the hypervisor (120) is called and executed (S203).
- In S203, the CPU (101) mainly initializes resources to be used by the hypervisor (120).
- Then, the flow of Fig. 9 at and after the "initialization procedure" of the OS (S204) is the same as that of the first embodiment.
- The details of S205 are as illustrated in Fig. 5 of the first embodiment.
- When the configuration includes a plurality of OSs, the initialization procedure (111) of each OS (110) is executed sequentially in the "initialization procedure" of the OS (S204) in the flowchart of Fig. 9.
- Next, the operation at initialization according to the second embodiment will be described in detail.
- Fig. 10 illustrates a detailed flow of the "first initialization procedure" of the RAS module (S202) described above.
- Here, the only difference from the "first initialization procedure" of the RAS module of the first embodiment (Fig. 3) is that in the last step the "initialization procedure" of the hypervisor is called (S304 of Fig. 10), instead of the "initialization procedure" of the OS being called (S303 of Fig. 3). Thus, only this portion will be described.
- In S304, the CPU (101) calls the initialization procedure (121) of the hypervisor (120).
- Specifically, the CPU (101) realizes this by calling the address, on the memory (103), of the program of the initialization procedure (121) of the hypervisor (120).
- The above has described the initialization operation at start-up in the configuration to which the RAS module (130) has been added. In the following, a method for adding the RAS module (130) to the computer apparatus (100) will be described.
- Fig. 17 illustrates an example configuration of the computer apparatus (100) before the RAS module (130) is added.
- In the computer apparatus (100) before the RAS module (130) is added, each OS (the configuration may include one OS or a plurality of OSs) and the hypervisor are configured as independent modules.
- Before the RAS module (130) is added, the computer apparatus (100) performs normal operation without the RAS module (130), as illustrated in Fig. 17.
- At start-up of the computer apparatus (100) of Fig. 17, the boot program (140) on the computer apparatus (100) starts up first. The initialization procedure (121) of the hypervisor (120) is called from the boot program (140), and the initialization procedure (111) of the OS (110) is called from the initialization procedure (121) of the hypervisor (120).
- To add the RAS module (130) according to this embodiment to the computer apparatus (100) of Fig. 17, the program of the RAS module (130) to be added is first placed in free space on the secondary storage device (104) of the computer apparatus (100).
- Also, the boot program (140) of the computer apparatus (100) is changed such that at start-up of the computer apparatus (100) the boot program (140) calls the first initialization procedure (132) of the RAS module (130).
- Specifically, the boot program (140) is changed such that the address, on the memory (103), of the program of the first initialization procedure (132) of the RAS module (130) is called from the boot program (140).
- The above is the outline of the method for adding the RAS module (130) to the computer apparatus (100).
- When the computer apparatus (100) is started after the RAS module (130) has been added to the computer apparatus (100) as described above (as illustrated in Fig. 8), the operation as illustrated in the flowchart of Fig. 9 described above is performed.
- With the above-described configuration and operation, the RAS function corresponding to the interrupt determining part provided in the OS can be easily added, without modifying the OS or hypervisor module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) and the hypervisor are configured as independent modules.
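- The change to the boot program described above can be illustrated by comparing the start-up call order before and after the RAS module is added. The sketch below is a simulation under assumptions: `build_boot_chain` and the step names are hypothetical, and swapping a Python function reference stands in for changing the address that the boot program calls and for the rewrite of the last portion of the OS initialization procedure.

```python
def build_boot_chain(with_ras):
    """Return the order in which start-up procedures run (toy model)."""
    trace = []
    # Stands in for the last portion of the OS initialization procedure.
    os_tail = [lambda: None]

    def ras_second_init():
        trace.append("ras_second_init")

    def os_init():
        trace.append("os_init")
        os_tail[0]()                  # jump at the end of the OS init

    def hypervisor_init():
        trace.append("hypervisor_init")
        os_init()                     # the hypervisor init calls the OS init

    def ras_first_init():
        trace.append("ras_first_init")
        os_tail[0] = ras_second_init  # rewrite the tail of the OS init
        hypervisor_init()             # S304: call the hypervisor init

    # The only change to the boot program is which address it calls first.
    boot_entry = ras_first_init if with_ras else hypervisor_init
    trace.append("boot")
    boot_entry()
    return trace
```

- In this model, the hypervisor and OS procedures are identical in both configurations; only the boot entry and the patched tail differ, which mirrors the point that the OS and hypervisor modules themselves are not modified.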
- Next, an outline of the operation of the computer apparatus (100) according to the second embodiment upon occurrence of an interrupt will be described.
- Fig. 11 illustrates a flow of processing upon occurrence of an interrupt while the OS (110) is operating.
- Here, there may be a plurality of OSs.
- Fig. 12 illustrates a flow of processing upon occurrence of an interrupt while the hypervisor (120) is operating.
- Fig. 11 and Fig. 12 differ only in whether the fault information collecting part (137) of the RAS module (130) collects fault information of the OS or fault information of the hypervisor.
- In Fig. 11 and Fig. 12, the memory (103), the secondary storage device (104), and the boot program (140), which are not directly relevant to the description, are not illustrated.
- In Fig. 11 and Fig. 12, a solid arrow indicates the flow of processing, and a dashed arrow indicates the content of processing.
- Upon occurrence of an interrupt, the interrupt detection unit (102) of the CPU (101) calls the RAS module (130).
- The RAS module (130) determines whether or not the interrupt that has occurred is a CPU exception.
- Then, if the interrupt that has occurred is a CPU exception, the RAS module (130) identifies the OS or hypervisor that was operating when the CPU exception occurred, based on the program counter of the computer apparatus (100) (the register holding the address of the code being executed by the CPU), and collects fault information from the identified OS or hypervisor.
- With this operation, it is possible to determine the OS (the configuration may include one OS or a plurality of OSs) or hypervisor that was operating when the CPU exception occurred, and to collect fault information from the pertinent OS or hypervisor.
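- One way to picture the identification based on the program counter is a lookup of the saved address against the memory ranges where each module's code is placed. The sketch below assumes this; the embodiment does not specify the lookup mechanism, and `CODE_RANGES`, the module names, and the addresses are hypothetical.

```python
# Hypothetical code ranges [start, end) of each module on the memory (103).
CODE_RANGES = [
    ("hypervisor", 0x0010_0000, 0x0020_0000),
    ("os_a", 0x0100_0000, 0x0200_0000),
    ("os_b", 0x0200_0000, 0x0300_0000),
]


def identify_running_module(program_counter, ranges=CODE_RANGES):
    """Return the module whose code range contains the program counter."""
    for name, start, end in ranges:
        if start <= program_counter < end:
            return name
    return None  # the address belongs to none of the known modules
```

- With these assumed ranges, a program counter of `0x0150_0000` maps to `os_a`, and an address inside the hypervisor range maps to `hypervisor`, so the same lookup covers both Fig. 11 and Fig. 12.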
- Fig. 13 illustrates a flowchart for when an interrupt has occurred.
- The only difference between this flowchart and the flowchart of Fig. 7 of the first embodiment is that the process in S706-1 is replaced with the process in S706-2.
- Thus, only the process in S706-2 will be described hereinafter.
- After the fault detecting part (136) of the RAS module (130) has identified the pertinent fault (S705), the fault information collecting part (137) of the RAS module (130) identifies the OS (or hypervisor) that was operating when the CPU exception occurred, and collects fault information from the pertinent OS (or hypervisor) (S706-2).
- As described above, the fault information collecting part (137) of the RAS module (130) identifies the OS (or hypervisor) that was operating when the CPU exception occurred, based on the program counter of the computer apparatus (100).
- The process thereafter (in and after S707) is the same as that of Fig. 7 of the first embodiment.
- According to the second embodiment, by executing the interrupt determining part (134), the fault detecting part (136), and so on in the RAS module (130) upon occurrence of a CPU exception, the RAS function can be executed even if the CPU exception has occurred concurrently with a failure in the OS.
- By executing the interrupt determining part (134), the fault detecting part (136), and so on in the RAS module (130) upon occurrence of a CPU exception, the RAS function can be executed even if the CPU exception has occurred concurrently with a failure in the hypervisor.
- This embodiment has described a RAS scheme, according to which the RAS function corresponding to the interrupt determining part provided in the OS can be added, without modifying the OS or hypervisor module, to the computer apparatus in which each OS (the configuration may include one OS or a plurality of OSs) and the hypervisor are configured as independent modules.
- More specifically, this embodiment has described the following (1) to (4).
- (1) The "initialization procedure" of the RAS module which is called at start-up of the computer apparatus is divided into the "first initialization procedure" which is executed before the "initialization procedures" of the OS and the hypervisor and the "second initialization procedure" which is executed after completion of the "initialization procedures" of the OS and the hypervisor.
- (2) In the "first initialization procedure" of the RAS module, the last portion of the "initialization procedure" of the OS is rewritten to call the "second initialization procedure" of the RAS module after the "initialization procedure" of the OS.
- (3) In the "second initialization procedure" of the RAS module, the "interrupt determining part" of the OS is copied to the RAS module.
- (4) In the "second initialization procedure" of the RAS module, the "interrupt detection unit" is set to call the "interrupt determining part" of the RAS module, instead of the "interrupt determining part" of the OS, upon occurrence of an interrupt.
- This embodiment has also described the RAS scheme, according to which by executing the interrupt determining part, the fault detecting part, and so on in the RAS module upon occurrence of a CPU exception, the RAS function is executed even if the CPU exception has occurred concurrently with a failure in the OS.
- This embodiment has also described the RAS scheme, according to which by executing the fault information collecting part in the RAS module, it is possible to determine the OS (the configuration may include one OS or a plurality of OSs) or hypervisor that was operating when a CPU exception occurred, based on the value of the program counter at the time the CPU exception occurred, and to collect fault information from the pertinent OS or hypervisor.
- The third embodiment describes an outline of the operation when the interrupt detected by the interrupt detection unit (102) is a regular interrupt instead of a CPU exception.
- The configuration of the computer apparatus (100) may be the same as the example configuration of the first embodiment (Fig. 1) or the configuration diagram of the second embodiment (Fig. 8).
- Although the description is provided herein using the configuration of the second embodiment (the configuration with the hypervisor), the following operation is also performed in the configuration of the first embodiment (the configuration without the hypervisor).
- Fig. 14 illustrates a flow of processing upon occurrence of an interrupt.
- In Fig. 14, the memory (103), the secondary storage device (104), and the boot program (140), which are not directly relevant to the description, are not illustrated.
- In Fig. 14, a solid arrow indicates the flow of processing, and a dashed arrow indicates the content of processing.
- It is assumed that in Fig. 14 the hypervisor (120) has carried out an interrupt mask process and does not accept any interrupt.
- For this reason, if an interrupt occurs while processing by the hypervisor (120) is in progress, the interrupt detection unit (102) calls the interrupt determining part (134) of the RAS module (130) at the timing when processing transitions from the hypervisor (120) to the OS (110).
- Therefore, the interrupt generation timing is when processing returns from the hypervisor (120) to the OS (110), and the program counter at the time when the interrupt occurred indicates that processing by the OS was in progress (the program counter never indicates that processing by the hypervisor was in progress).
- In the first embodiment (the configuration without the hypervisor), the interrupt mask process of the hypervisor (120) described above does not exist because the hypervisor (120) itself is not present.
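- The effect of the interrupt mask process can be sketched with a toy CPU model in which interrupts raised while the mask is set are held pending and delivered only when processing returns to the OS. The class `Cpu`, its methods, and the vector number are hypothetical and only illustrate the timing described above.

```python
class Cpu:
    """Toy model of deferred interrupt delivery under a hypervisor mask."""

    def __init__(self, handler):
        self.masked = False
        self.pending = []
        self.context = "os"     # which module is currently executing
        self.handler = handler  # stands in for the interrupt determining part (134)

    def raise_interrupt(self, vector):
        if self.masked:
            self.pending.append(vector)          # held until the mask is cleared
        else:
            self.handler(vector, self.context)

    def enter_hypervisor(self):
        self.context = "hypervisor"
        self.masked = True                       # interrupt mask process

    def return_to_os(self):
        self.context = "os"
        self.masked = False
        while self.pending:                      # deliver in arrival order
            self.handler(self.pending.pop(0), self.context)
```

- In this model, every delivered interrupt observes the context `"os"`, which corresponds to the statement that the program counter never indicates that processing by the hypervisor was in progress.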
- Fig. 15 illustrates the operation after S703 is determined as NO in Fig. 7 or Fig. 13.
- That is, if the interrupt determining part (134) determines that the interrupt is a regular interrupt instead of a CPU exception in S703 of Fig. 7 or Fig. 13, S1001 of Fig. 15 is carried out.
- In Fig. 15, the OS identifying part (135) of the RAS module (130) first identifies the OS (110) that was operating when the regular interrupt occurred (S1001).
- Like the fault information collecting part (137) described in the first and second embodiments, the OS identifying part (135) identifies the OS (110) that was operating when the regular interrupt occurred, based on the program counter of the computer apparatus (100).
- Then, the OS identifying part (135) refers to the interrupt determining part (112) of the pertinent OS (110) to identify and call the address of the program of the interrupt processing part (115) of the pertinent OS (110) (S1002).
- Then, the interrupt processing part (115) of the OS (110) is executed. Thereafter, processing returns to the OS (110) (S1003).
- By carrying out the operation described in the third embodiment in the RAS function which is added according to the first embodiment or the second embodiment, the interrupt processing part of the OS can be called properly if the interrupt detected by the interrupt detection unit is a regular interrupt instead of a CPU exception.
- For example, even if the registration of interrupt operation is supplemented or changed dynamically by a user operation or the like in the OS after start-up of the computer apparatus and before occurrence of an interrupt, the OS can process the interrupt properly with the operation described in the third embodiment.
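- The steps S1001 to S1003, including the case where the registration is changed after start-up, can be sketched as follows. The sketch assumes that referring to the interrupt determining part (112) amounts to a table lookup performed at interrupt time; the class `GuestOs`, the function `forward_regular_interrupt`, and the address ranges are hypothetical.

```python
class GuestOs:
    """Toy OS with a code range and a table of registered interrupt handlers."""

    def __init__(self, name, start, end):
        self.name, self.start, self.end = name, start, end
        self.handlers = {}

    def register(self, vector, handler):
        self.handlers[vector] = handler   # may be called again after start-up

    def interrupt_determining_part(self, vector):
        return self.handlers[vector]      # current registration for this vector


def forward_regular_interrupt(vector, program_counter, guests):
    # S1001: identify the OS that was operating, based on the program counter.
    guest = next(g for g in guests if g.start <= program_counter < g.end)
    # S1002: refer to that OS's interrupt determining part at interrupt time.
    handler = guest.interrupt_determining_part(vector)
    # S1003: execute the interrupt processing part of the OS.
    return handler(vector)
```

- Because the lookup in this model happens when the interrupt occurs rather than when the RAS module is initialized, a handler registered or replaced after start-up is the one that gets called.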
- This embodiment has described a RAS scheme, according to which by referring to the interrupt determining part of the OS so as to call the interrupt processing part of the corresponding OS upon occurrence of a regular interrupt instead of a CPU exception, the interrupt processing part of the OS can be called properly.
- The embodiments of the present invention have been described. Two or more of these embodiments may be implemented in combination.
- Alternatively, one of these embodiments may be implemented partially.
- Alternatively, two or more of these embodiments may be implemented partially in combination.
- The present invention is not limited to these embodiments, and various modifications are possible as required.
- 100: computer apparatus, 101: CPU, 102: interrupt detection unit, 103: memory, 104: secondary storage device, 110: OS, 111: initialization procedure, 112: interrupt determining part, 113: CPU exception processing part, 115: interrupt processing part, 120: hypervisor, 121: initialization procedure, 130: RAS module, 132: first initialization procedure, 133: second initialization procedure, 134: interrupt determining part, 135: OS identifying part, 136: fault detecting part, 137: fault information collecting part, 138: fault specifying part, 139: fault handling part, 140: boot program
Claims (9)
- A computer apparatus comprising:
a CPU (Central Processing Unit) including an interrupt detection unit that detects an interrupt; and
an OS (Operating System) including an interrupt determining part that is called by the interrupt detection unit when the interrupt detection unit has detected an interrupt, and determines whether or not the interrupt detected by the interrupt detection unit is a CPU exception,
wherein when a RAS (Reliability Availability Serviceability) module which is a program for carrying out a process to handle a CPU exception is added to the computer apparatus,
at start-up of the computer apparatus the CPU calls a first initialization procedure included in the RAS module, and executes the first initialization procedure to initialize a resource to be used by the RAS module;
after execution of the first initialization procedure of the RAS module, the CPU calls an initialization procedure included in the OS, and executes the initialization procedure to initialize a resource to be used by the OS; and
after execution of the initialization procedure of the OS, the CPU calls a second initialization procedure included in the RAS module, and executes the second initialization procedure to copy the interrupt determining part included in the OS to the RAS module, and to set the interrupt detection unit such that upon detecting an interrupt the interrupt detection unit calls an interrupt determining part copied to the RAS module, instead of the interrupt determining part in the OS.
- The computer apparatus according to claim 1, further comprising:
a memory that stores program code of the OS and program code of the RAS module,
wherein a last portion of program code corresponding to the initialization procedure of the OS is a jump instruction to program code which calls the initialization procedure; and
wherein the CPU executes the first initialization procedure of the RAS module and changes the last portion of the program code, stored in the memory, corresponding to the initialization procedure of the OS to a jump instruction to program code of the second initialization procedure of the RAS module.
- The computer apparatus according to claim 2,
wherein the CPU executes the first initialization procedure of the RAS module and changes a last portion of the program code, stored in the memory, corresponding to the second initialization procedure of the RAS module to a jump instruction to the program code which calls the initialization procedure of the OS.
- The computer apparatus according to any one of claims 1 to 3,
wherein the RAS module includes a CPU exception processing part that carries out a process to handle a CPU exception,
wherein when the interrupt detection unit has detected an interrupt, the interrupt determining part copied to the RAS module is called by the interrupt detection unit, and
wherein when the interrupt detected by the interrupt detection unit is determined as a CPU exception by the interrupt determining part copied to the RAS module, the process to handle the CPU exception is carried out by the CPU exception processing part.
- The computer apparatus according to claim 4,
wherein when the interrupt detection unit has detected an interrupt, the interrupt determining part copied to the RAS module is called by the interrupt detection unit, and
wherein when the interrupt detected by the interrupt detection unit is determined as a CPU exception by the interrupt determining part copied to the RAS module, an OS that was operating when the CPU exception occurred is identified by the CPU exception processing part, and information is collected from the OS identified and the process to handle the CPU exception is carried out by the CPU exception processing part.
- The computer apparatus according to any one of claims 1 to 5,
wherein the RAS module includes an OS identifying part that identifies an OS that was operating when an interrupt other than a CPU exception occurred,
wherein when the interrupt detection unit has detected an interrupt, the interrupt determining part copied to the RAS module is called by the interrupt detection unit, and
wherein when the interrupt detected by the interrupt detection unit is determined as other than a CPU exception by the interrupt determining part copied to the RAS module, an OS that was operating when the interrupt occurred is identified by the OS identifying part, and a process to handle the interrupt is carried out by the OS identified.
- The computer apparatus according to any one of claims 1 to 6, further comprising a hypervisor,
wherein after execution of the first initialization procedure of the RAS module, the CPU calls an initialization procedure included in the hypervisor, and executes the initialization procedure to initialize a resource to be used by the hypervisor;
after execution of the initialization procedure of the hypervisor, the CPU calls the initialization procedure included in the OS, and executes the initialization procedure to initialize the resource to be used by the OS; and
after execution of the initialization procedure of the OS, the CPU calls the second initialization procedure included in the RAS module, and executes the second initialization procedure.
- The computer apparatus according to claim 7,
wherein the RAS module includes a CPU exception processing part that carries out a process to handle a CPU exception,
wherein when the interrupt detection unit has detected an interrupt, the interrupt determining part copied to the RAS module is called by the interrupt detection unit, and
wherein when the interrupt detected by the interrupt detection unit is determined as a CPU exception by the interrupt determining part copied to the RAS module, a hypervisor that was operating when the CPU exception occurred is identified by the CPU exception processing part, and information is collected from the hypervisor identified and the process to handle the CPU exception is carried out by the CPU exception processing part.
- A control method of a computer apparatus including
a CPU (Central Processing Unit) including an interrupt detection unit that detects an interrupt; and
an OS (Operating System) including an interrupt determining part that is called by the interrupt detection unit when the interrupt detection unit has detected an interrupt, and determines whether or not the interrupt detected by the interrupt detection unit is a CPU exception,
wherein a RAS (Reliability Availability Serviceability) module which is a program for carrying out a process to handle a CPU exception is added to the computer apparatus, the control method of the computer apparatus comprising:
calling, at start-up of the computer apparatus, a first initialization procedure included in the RAS module, and executing the first initialization procedure to initialize a resource to be used by the RAS module, by the CPU;
calling, after execution of the first initialization procedure of the RAS module, an initialization procedure included in the OS, and executing the initialization procedure to initialize a resource to be used by the OS, by the CPU; and
calling, after execution of the initialization procedure of the OS, a second initialization procedure included in the RAS module, and executing the second initialization procedure to copy the interrupt determining part included in the OS to the RAS module, and to set the interrupt detection unit such that upon detecting an interrupt the interrupt detection unit calls an interrupt determining part copied to the RAS module, instead of the interrupt determining part in the OS, by the CPU.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2013/052205 WO2014118940A1 (en) | 2013-01-31 | 2013-01-31 | Computer device and control method for computer device |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2953028A1 (en) | 2015-12-09 |
EP2953028A4 (en) | 2016-10-12 |
Family
ID=51261685
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP13873790.3A Withdrawn EP2953028A4 (en) | 2013-01-31 | 2013-01-31 | Computer device and control method for computer device |
Country Status (5)
Country | Link |
---|---|
US (1) | US9959225B2 (en) |
EP (1) | EP2953028A4 (en) |
JP (1) | JP5877533B2 (en) |
CN (1) | CN104956337B (en) |
WO (1) | WO2014118940A1 (en) |
-
2013
- 2013-01-31 WO PCT/JP2013/052205 patent/WO2014118940A1/en active Application Filing
- 2013-01-31 CN CN201380071709.4A patent/CN104956337B/en not_active Expired - Fee Related
- 2013-01-31 JP JP2014559432A patent/JP5877533B2/en not_active Expired - Fee Related
- 2013-01-31 EP EP13873790.3A patent/EP2953028A4/en not_active Withdrawn
- 2013-01-31 US US14/650,630 patent/US9959225B2/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
WO2014118940A1 (en) | 2014-08-07 |
CN104956337B (en) | 2018-01-09 |
EP2953028A4 (en) | 2016-10-12 |
JP5877533B2 (en) | 2016-03-08 |
US20150331816A1 (en) | 2015-11-19 |
JPWO2014118940A1 (en) | 2017-01-26 |
US9959225B2 (en) | 2018-05-01 |
CN104956337A (en) | 2015-09-30 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20150610 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the European patent | Extension state: BA ME |
DAX | Request for extension of the European patent (deleted) | |
A4 | Supplementary search report drawn up and despatched | Effective date: 20160908 |
RIC1 | Information provided on IPC code assigned before grant | Ipc: G06F 13/24 20060101ALI20160902BHEP; Ipc: G06F 11/07 20060101ALI20160902BHEP; Ipc: G06F 9/445 20060101ALI20160902BHEP; Ipc: G06F 11/30 20060101AFI20160902BHEP; Ipc: G06F 9/48 20060101ALI20160902BHEP |
17Q | First examination report despatched | Effective date: 20180614 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20201114 |