WO2014132466A1 - Software safe stop system, software safe stop method, and program - Google Patents
Software safe stop system, software safe stop method, and program
- Publication number
- WO2014132466A1 (PCT/JP2013/072741)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- virtual machine
- software
- memory
- console
- unit
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0712—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a virtual computing platform, e.g. logically partitioned systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1415—Saving, restoring, recovering or retrying at system level
- G06F11/1438—Restarting or rejuvenating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/4401—Bootstrapping
- G06F9/442—Shutdown
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45575—Starting, stopping, suspending or resuming virtual machine instances
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45583—Memory management, e.g. access or allocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/805—Real-time
Definitions
- The present invention relates to a software safe stop system, a software safe stop method, and a program.
- A memory leak is one example of a system failure caused by a software defect.
- A memory leak is a phenomenon in which memory allocated for software objects and data is not properly released, so that the free memory area is exhausted and the system stops.
- Memory is managed by the operating system (OS) and shared by multiple pieces of software, so a defect in a single piece of software can stop the entire OS.
- The mobile terminal described in Patent Document 1 monitors the operating state of its operating system at regular intervals. When it predicts that the operating system has become unstable because of a destabilizing factor such as a memory leak, it executes a predetermined avoidance measure corresponding to that state to prevent a failure from occurring.
- An object of the present invention is to avoid problems when a system failure occurs due to a memory leak.
- The software safe stop system according to the present invention includes: an OS abnormal stop detection unit that detects that the operating system of a virtual machine executed on a computer system has stopped abnormally due to a memory leak; a memory resource securing unit that secures, from among the memory resources available in the computer system, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and a console acquisition unit that allocates the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminates the virtual machine normally using the console function.
- Brief description of the drawings: a block diagram showing the configuration of the software safe stop system according to Embodiment 1 of the present invention (FIG. 1), and a flowchart of its operation (FIG. 2).
- FIG. 1 is a block diagram showing the configuration of a software safe stop system 10 according to Embodiment 1 of the present invention.
- The software safe stop system 10 includes virtual machine execution units 101 and 102, an OS abnormal stop detection unit 103, a console acquisition unit 104, a memory resource securing unit 105, and a virtual machine memory allocation information storage unit 106.
- The software safe stop system 10 can be configured by a dedicated or general-purpose computer having a CPU (Central Processing Unit), memory such as ROM (Read Only Memory) and RAM (Random Access Memory), an external storage device for storing various information, an input interface, an output interface, a communication interface, and a bus connecting them.
- The software safe stop system 10 may consist of a single computer or of multiple computers connected to each other via a communication line.
- The virtual machine execution units 101 and 102, the OS abnormal stop detection unit 103, the console acquisition unit 104, and the memory resource securing unit 105 correspond to functional modules realized by the CPU executing a predetermined program stored in the ROM or the like.
- The virtual machine memory allocation information storage unit 106 is implemented by an external storage device, which may be connected to the software safe stop system 10 via a network or the like.
- The virtual machine execution units 101 and 102 provide an execution environment for the virtual machine OS.
- The OS abnormal stop detection unit 103 detects a memory leak that has occurred in the virtual machine execution unit 101.
- The console acquisition unit 104 allocates additional memory to the abnormally stopped virtual machine execution unit 101. When the added memory restores the console function of the virtual machine execution unit 101, it uses the console to save data, record logs, and remove the abnormal state of the software, and then terminates the virtual machine normally.
- The memory resource securing unit 105 secures the memory resources needed to acquire the console function of the virtual machine execution unit 101 from among the memory resources available in the software safe stop system 10.
- The virtual machine memory allocation information storage unit 106 stores the amount of memory resources allocated to each virtual machine execution unit.
- FIG. 2 is a flowchart of the operation of the software safe stop system 10.
- A defect in software executed in the virtual machine execution unit 101 causes a failure due to a memory leak in the virtual machine (step A1).
- The OS abnormal stop detection unit 103 detects the abnormal stop state of the virtual machine execution unit 101 (step A2).
- To detect the abnormal stop state of the virtual machine execution unit 101, the OS abnormal stop detection unit 103 constantly monitors whether the OS is alive.
- The memory resource securing unit 105 checks whether the software safe stop system 10 has the amount of free memory needed to restore the console function of the virtual machine execution unit 101 (step A3). If free memory is available (step A4; YES), the console acquisition unit 104 adds the necessary amount of memory to the virtual machine execution unit 101 (step A5).
- When the additional memory restores the console function of the virtual machine execution unit 101 (step A6; YES), the console acquisition unit 104 uses the console to save data, record logs, and remove the abnormal state of the software (step A7), and then terminates the virtual machine normally (step A8).
- If there is not enough free memory (step A4; NO), the memory resource securing unit 105 tries to divert memory allocated to the other virtual machine execution unit 102 (step A9).
- If memory can be obtained from the other virtual machine execution unit 102 (step A10; YES), the memory resource securing unit 105 attempts to restore the console function by adding that memory to the virtual machine execution unit 101 (step A5).
- If the memory resource securing unit 105 cannot acquire memory from the other virtual machine execution unit 102 (step A10; NO), or if the console function of the virtual machine execution unit 101 is not restored even after memory is added (step A6; NO), the console acquisition unit 104 deletes the virtual machine execution unit 101 and ends the process (step A11).
- By additionally allocating memory to a virtual machine that has stopped abnormally due to a memory leak, the software safe stop system 10 acquires a console and can terminate the virtual machine normally.
- The software safe stop system 10 terminates the abnormally stopped virtual machine normally instead of deleting it. Data that was held in non-persistent storage before the abnormal stop can therefore be saved to persistent storage before the virtual machine is stopped, which avoids losing that data through deletion of the virtual machine.
- The software safe stop system 10 also avoids losing, through deletion of the virtual machine, information on the operation and state of the system before the abnormal stop, so the cause of the abnormal stop can be investigated and the problem resolved quickly.
- The software safe stop system 10 can avoid inconsistencies in the data used by software that would result from deleting the virtual machine.
- When there is not enough memory to allocate to the abnormally stopped virtual machine, the software safe stop system 10 diverts memory resources allocated to other virtual machines. The abnormally stopped virtual machine can therefore be stopped safely even if no spare memory resources have been prepared.
- FIG. 3 is a block diagram showing the configuration of the software safe stop system 20 according to the second embodiment of the present invention.
- The same reference numerals as those in FIG. 1 denote corresponding components.
- The software safe stop system 20 according to the second embodiment of the present invention includes a software isolation unit 107 in addition to the configuration of the first embodiment.
- The software isolation unit 107 isolates the software that causes a memory leak, which is executed using the virtual machine execution unit 101, from other software.
- After step A2, the software isolation unit 107 isolates the software that caused the memory leak from other software executed by the virtual machine execution unit 101 and from linked components existing outside the software safe stop system 10.
- The details of this processing depend on the configuration of the software that caused the memory leak. For example, when the software is part of a load-balancing cluster, the software isolation unit 107 isolates it from other components by excluding it from the load-balancing cluster.
- The software isolation unit 107 can also isolate the software from the network by disabling the network interface of the virtual machine execution unit 101. Having isolated the software that caused the memory leak in this way, which reduces the risk of a failure during additional memory allocation, the process proceeds to step A3. Subsequent operations are the same as those in the first embodiment.
- The software safe stop system 20 attempts to acquire the console only after isolating the software that caused the memory leak from other software. This reduces the risk that the memory leak causes another failure after the additional allocation, so the virtual machine can be stopped more safely.
- FIG. 4 is a block diagram showing a configuration of the software safe stop system 30 according to the third embodiment of the present invention.
- the software safe stop system 30 according to the third embodiment of the present invention includes a safe stop processing execution unit 108 in addition to the configuration of the third embodiment.
- When the console acquisition unit 104 has acquired the console function, the safe stop processing execution unit 108 automatically collects the information needed to investigate the cause of the failure and saves the important data used by the software, and then stops the virtual machine safely.
- The safe stop processing execution unit 108 loads and runs a script that automatically executes the processing needed to safely stop the virtual machine execution unit 101.
- The script is executed with a higher priority than other programs, and it safely stops and terminates the software.
- The processing automatically executed by the safe stop processing execution unit 108 includes removing the abnormal state of software, resolving data inconsistencies, saving temporary data, collecting information on system behavior immediately before the abnormal stop, recording logs, and terminating processes normally.
- When the console function is acquired by additionally allocating memory to the abnormally stopped virtual machine, the software safe stop system 30 automatically runs a program that safely stops the software. This reduces the cost of manual recovery work and the risk of operator error, and allows the virtual machine to be stopped safely in a shorter time.
- FIG. 5 is a block diagram showing the configuration of the software safe stop system 40 according to a working example of the present invention.
- the software safe stop system 40 includes a server computer 500 and a client computer 600.
- the server computer 500 includes a virtual machine execution unit 501, an OS abnormal stop detection unit 503, a console acquisition unit 504, a memory resource securing unit 505, and a virtual machine memory allocation information storage unit 506.
- the virtual machine execution unit 501 includes application software 507 and a cache server 508.
- the client computer 600 includes an application client 601.
- the application client 601 accesses the application software 507 through network communication and requests processing.
- the application software 507 is, for example, a web application system that uses HTTP (Hypertext Transfer Protocol) communication.
- the application client 601 may accept processing from a plurality of different client computers.
- Application software 507 executes processing requested from application client 601 and stores data in cache server 508 in the processing.
- the cache server 508 stores data used by the application software 507 on a memory.
- the data stored in the cache server 508 has a data structure (for example, a hash table) composed of key / value pairs.
- the application client 601 can request reading or writing of data by designating a key.
- the cache server 508 functions as part of the application software 507 and is activated by the application software 507.
- the application software 507 sets an upper limit value of the amount of data that can be stored in the cache server 508 at the time of activation. This upper limit value falls within the range of the amount of memory that can be used by the virtual machine execution unit 501.
- If the cache server 508 tries to store more data in memory than the amount of memory the virtual machine execution unit 501 can use, a memory leak problem occurs once the amount of processing requested by the application client 601 exceeds a certain level.
- The OS abnormal stop detection unit 503 detects the memory leak, and the memory resource securing unit 505 checks the memory resources available on the server computer 500. If the server computer 500 has surplus memory resources, the console acquisition unit 504 allocates them to the virtual machine execution unit 501. As a result, some of the processes executed in the virtual machine execution unit 501 are restored, and the console function becomes accessible.
- The functions of the application software 507 and the cache server 508 may also be recovered.
- However, a process may have crashed because of the abnormal stop, or may have been forcibly terminated in the virtual machine execution unit 501 before the OS stopped abnormally, so recovery of all functions is not guaranteed.
- Using the console function, it is possible to save the content stored in the cache server 508, collect log information from before the abnormal stop, and so on. It is also possible to remove the inconsistent state that existed at the time of the abnormal stop and to safely stop the cache server 508, the application software 507, and the OS according to the normal stop procedure.
- A software safe stop system comprising: an OS abnormal stop detection unit that detects that the operating system of a virtual machine executed in a computer system has stopped abnormally due to a memory leak; a memory resource securing unit that secures, from among the memory resources available in the computer system, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and a console acquisition unit that allocates the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminates the virtual machine normally using the console function.
- The software safe stop system according to appendix 1, wherein the memory resource securing unit secures the memory resources necessary for recovering the console function of the abnormally stopped virtual machine from among the memory resources allocated to another virtual machine executed in the computer system.
- The software safe stop system according to appendix 1 or 2, further comprising a software isolation unit that isolates the software that caused the memory leak from other software that was running on the virtual machine and from cooperating components existing outside the computer system.
- The software safe stop system according to any one of appendices 1 to 4, wherein the safe stop processing execution unit automatically executes at least one of: removing the abnormal state of software, resolving data inconsistencies, saving temporary data, collecting information on system behavior immediately before the abnormal stop, recording logs, and terminating processes normally.
- A program for causing a computer to function as: an OS abnormal stop detection unit that detects that the operating system of a virtual machine executed on the computer has stopped abnormally due to a memory leak; a memory resource securing unit that secures, from among the memory resources available in the computer, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and a console acquisition unit that allocates the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminates the virtual machine normally using the console function.
- The present invention is suitable for avoiding problems when a system failure due to a memory leak occurs in a virtual machine executed by a computer system.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Computer Security & Cryptography (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
- Hardware Redundancy (AREA)
- Stored Programmes (AREA)
Abstract
Description
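Next, embodiments for carrying out the present invention will be described in detail with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of a software safe stop system 10 according to Embodiment 1 of the present invention. As shown in FIG. 1, the software safe stop system 10 includes virtual machine execution units 101 and 102, an OS abnormal stop detection unit 103, a console acquisition unit 104, a memory resource securing unit 105, and a virtual machine memory allocation information storage unit 106.
The OS abnormal stop detection unit 103 detects a memory leak that has occurred in the virtual machine execution unit 101.
The console acquisition unit 104 allocates additional memory to the abnormally stopped virtual machine execution unit 101. When the added memory restores the console function of the virtual machine execution unit 101, it uses the console to save data, record logs, and remove the abnormal state of the software, and then terminates the virtual machine normally.
The memory resource securing unit 105 secures the memory resources needed to acquire the console function of the virtual machine execution unit 101 from among the memory resources available in the software safe stop system 10.
The virtual machine memory allocation information storage unit 106 stores the amount of memory resources allocated to each virtual machine execution unit.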
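The interplay between the allocation information (unit 106) and memory securing (unit 105) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `AllocationTable` and `secure_memory`, the megabyte units, and the `min_keep_mb` floor that protects donor virtual machines do not appear in the patent text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AllocationTable:
    """Hypothetical stand-in for the allocation information storage unit 106."""
    host_total_mb: int
    allocated_mb: dict[str, int] = field(default_factory=dict)

    def free_mb(self) -> int:
        # Host memory not currently assigned to any virtual machine.
        return self.host_total_mb - sum(self.allocated_mb.values())


def secure_memory(table: AllocationTable, stopped_vm: str, needed_mb: int,
                  min_keep_mb: int = 512) -> bool:
    """Grow stopped_vm by needed_mb; return False if the host cannot supply it.

    Free host memory is used first. Any shortfall is diverted from other
    virtual machines, never shrinking a donor below min_keep_mb (assumed floor).
    """
    shortfall = needed_mb - table.free_mb()
    if shortfall > 0:
        donors = {vm: mb - min_keep_mb for vm, mb in table.allocated_mb.items()
                  if vm != stopped_vm and mb > min_keep_mb}
        if sum(donors.values()) < shortfall:
            return False  # not enough memory anywhere; the table is unchanged
        for vm, spare in donors.items():
            take = min(spare, shortfall)
            table.allocated_mb[vm] -= take
            shortfall -= take
            if shortfall == 0:
                break
    table.allocated_mb[stopped_vm] = table.allocated_mb.get(stopped_vm, 0) + needed_mb
    return True
```

Checking the whole request before touching any donor keeps the table consistent when the request cannot be met, which matters because a failed attempt leads directly to deletion of the virtual machine.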
A defect in software executed in the virtual machine execution unit 101 causes a failure due to a memory leak in the virtual machine (step A1).
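The decision flow of FIG. 2 that follows step A1 (steps A3 to A11, once the abnormal stop has been detected in step A2) can be sketched as a single function. This is a hedged illustration, not the patented implementation: the callback parameters and the `Outcome` enum are hypothetical names, and real hypervisor calls are deliberately left out.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    NORMAL_TERMINATION = "A8"
    VM_DELETED = "A11"


def safe_stop(free_memory_available: Callable[[], bool],
              divert_from_other_vm: Callable[[], bool],
              add_memory: Callable[[], None],
              console_restored: Callable[[], bool],
              cleanup_via_console: Callable[[], None],
              shutdown_via_console: Callable[[], None],
              delete_vm: Callable[[], None]) -> Outcome:
    """Run steps A3 to A11 for a virtual machine already detected as stopped."""
    # A3/A4: is enough free memory available? If not, A9/A10: try diversion.
    if free_memory_available() or divert_from_other_vm():
        add_memory()                     # A5: additional allocation
        if console_restored():           # A6: did the console come back?
            cleanup_via_console()        # A7: save data, record logs, clear abnormal state
            shutdown_via_console()       # A8: normal termination
            return Outcome.NORMAL_TERMINATION
    delete_vm()                          # A11: last resort
    return Outcome.VM_DELETED
```

Passing the steps in as callables keeps the control flow testable without a hypervisor; diversion is only attempted when free memory is insufficient because `or` short-circuits.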
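FIG. 3 is a block diagram showing the configuration of a software safe stop system 20 according to Embodiment 2 of the present invention. The same reference numerals as in FIG. 1 denote corresponding components. As shown in FIG. 3, the software safe stop system 20 according to Embodiment 2 of the present invention includes a software isolation unit 107 in addition to the configuration of Embodiment 1. The software isolation unit 107 isolates the software that causes a memory leak, which is executed using the virtual machine execution unit 101, from other software.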
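The two isolation methods the document mentions for the software isolation unit 107, removal from a load-balancing cluster and disabling the network interface, can be sketched as below. The classes, the `isolate` function, and the rule for choosing between the two methods are assumptions made for illustration; the patent only says the processing depends on the software configuration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LoadBalancer:
    """Hypothetical load-balancing cluster, tracked as a set of member names."""
    members: set[str] = field(default_factory=set)


@dataclass
class VirtualMachine:
    name: str
    nic_enabled: bool = True


def isolate(vm: VirtualMachine, balancer: Optional[LoadBalancer] = None) -> str:
    """Isolate the leaking VM and return the name of the method applied."""
    if balancer is not None and vm.name in balancer.members:
        balancer.members.discard(vm.name)  # stop routing new requests to the VM
        return "removed-from-cluster"
    vm.nic_enabled = False                 # otherwise cut the VM off from the network
    return "nic-disabled"
```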
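FIG. 4 is a block diagram showing the configuration of a software safe stop system 30 according to Embodiment 3 of the present invention. The same reference numerals as in FIG. 1 denote corresponding components. As shown in FIG. 4, the software safe stop system 30 according to Embodiment 3 of the present invention includes a safe stop processing execution unit 108 in addition to the configuration of Embodiment 3. When the console acquisition unit 104 has acquired the console function, the safe stop processing execution unit 108 automatically executes processing to collect the information needed to investigate the cause of the failure and to save the important data used by the software, and then stops the virtual machine safely.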
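One way to picture the automated processing of the safe stop processing execution unit 108 is an ordered list of named tasks run through the recovered console. The sketch below is an assumption-laden illustration: the task names, their order, and the choice to continue after a failed task are not specified in the patent, and the elevated execution priority it mentions is not modeled.

```python
from __future__ import annotations

from typing import Callable, Iterable


def run_safe_stop(tasks: Iterable[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run every task in order and return the names of the tasks that failed."""
    failed: list[str] = []
    for name, task in tasks:
        try:
            task()
        except Exception:
            # Keep going: a later task may still save data or logs.
            failed.append(name)
    return failed
```

A caller would pass tasks such as saving temporary data, collecting pre-failure behavior information, recording logs, and terminating processes, then decide from the returned list whether the shutdown can be treated as clean.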
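FIG. 5 is a block diagram showing the configuration of a software safe stop system 40 according to a working example of the present invention. As shown in FIG. 5, the software safe stop system 40 includes a server computer 500 and a client computer 600.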
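In this working example the virtual machine hosts a key/value cache server 508 whose upper limit on stored data is meant to stay within the memory available to the virtual machine. The toy model below shows that constraint; `CacheServer`, `limit_fits_vm`, byte-based accounting, and the `reserved_bytes` headroom for the OS are all hypothetical and chosen only to make the idea concrete.

```python
from __future__ import annotations

from typing import Optional


class CacheServer:
    """Toy in-memory key/value cache with an upper limit on stored bytes."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._data: dict[str, bytes] = {}

    def write(self, key: str, value: bytes) -> bool:
        """Store value under key; refuse the write if it would exceed the limit."""
        new_used = self.used_bytes - len(self._data.get(key, b"")) + len(value)
        if new_used > self.limit_bytes:
            return False
        self._data[key] = value
        self.used_bytes = new_used
        return True

    def read(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


def limit_fits_vm(limit_bytes: int, vm_memory_bytes: int, reserved_bytes: int) -> bool:
    """Check that the cache limit plus assumed OS headroom fits in VM memory."""
    return limit_bytes + reserved_bytes <= vm_memory_bytes
```

If `limit_fits_vm` is false for the configured limit, the cache can keep accepting writes after the virtual machine has run out of memory, which is the failure scenario the example describes.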
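(Appendix 1) A software safe stop system comprising: an OS abnormal stop detection unit that detects that the operating system of a virtual machine executed in a computer system has stopped abnormally due to a memory leak;
a memory resource securing unit that secures, from among the memory resources available in the computer system, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and
a console acquisition unit that allocates the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminates the virtual machine normally using the console function.
The software safe stop system according to appendix 1, wherein the memory resources necessary for recovering the console function of the abnormally stopped virtual machine are secured from among the memory resources allocated to other virtual machines executed in the computer system.
The software safe stop system according to any one of appendices 1 to 4, wherein at least one of the following is executed automatically: removing the abnormal state of software, resolving data inconsistencies, saving temporary data, collecting information on system behavior immediately before the abnormal stop, recording logs, and terminating processes normally.
A software safe stop method comprising: a step of securing, from among the memory resources available in the computer system, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and
a step of allocating the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminating the virtual machine normally using the console function.
A program for causing a computer to function as: an OS abnormal stop detection unit that detects that the operating system of a virtual machine executed on the computer has stopped abnormally due to a memory leak;
a memory resource securing unit that secures, from among the memory resources available in the computer, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and
a console acquisition unit that allocates the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminates the virtual machine normally using the console function.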
Claims (7)
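- A software safe stop system comprising: an OS abnormal stop detection unit that detects that the operating system of a virtual machine executed in a computer system has stopped abnormally due to a memory leak; a memory resource securing unit that secures, from among the memory resources available in the computer system, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and a console acquisition unit that allocates the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminates the virtual machine normally using the console function.
- The software safe stop system according to claim 1, wherein the memory resource securing unit secures the memory resources necessary for recovering the console function of the abnormally stopped virtual machine from among the memory resources allocated to other virtual machines executed in the computer system.
- The software safe stop system according to claim 1 or 2, further comprising a software isolation unit that isolates the software that caused the memory leak from other software that was running on the virtual machine and from cooperating components existing outside the computer system.
- The software safe stop system according to any one of claims 1 to 3, further comprising a safe stop processing execution unit that automatically executes processing for terminating the virtual machine normally after the console function is restored.
- The software safe stop system according to any one of claims 1 to 4, wherein the safe stop processing execution unit automatically executes at least one of: removing the abnormal state of software, resolving data inconsistencies, saving temporary data, collecting information on system behavior immediately before the abnormal stop, recording logs, and terminating processes normally.
- A software safe stop method comprising: a step of detecting that the operating system of a virtual machine executed in a computer system has stopped abnormally due to a memory leak; a step of securing, from among the memory resources available in the computer system, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and a step of allocating the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminating the virtual machine normally using the console function.
- A program for causing a computer to function as: an OS abnormal stop detection unit that detects that the operating system of a virtual machine executed on the computer has stopped abnormally due to a memory leak; a memory resource securing unit that secures, from among the memory resources available in the computer, the memory resources necessary for recovering the console function of the abnormally stopped virtual machine; and a console acquisition unit that allocates the secured memory resources to the abnormally stopped virtual machine and, when the console function is restored, terminates the virtual machine normally using the console function.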
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015502707A JP6164283B2 (ja) | 2013-02-28 | 2013-08-26 | ソフトウェア安全停止システム、ソフトウェア安全停止方法、およびプログラム |
US14/766,912 US9588798B2 (en) | 2013-02-28 | 2013-08-26 | Software safe shutdown system, software safe shutdown method, and program to prevent a problem caused by a system failure |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013-039061 | 2013-02-28 | ||
JP2013039061 | 2013-02-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014132466A1 true WO2014132466A1 (ja) | 2014-09-04 |
Family
ID=51427760
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/072741 WO2014132466A1 (ja) | 2013-02-28 | 2013-08-26 | ソフトウェア安全停止システム、ソフトウェア安全停止方法、およびプログラム |
Country Status (3)
Country | Link |
---|---|
US (1) | US9588798B2 (ja) |
JP (1) | JP6164283B2 (ja) |
WO (1) | WO2014132466A1 (ja) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108073441B (zh) * | 2016-11-14 | 2022-05-10 | 阿里巴巴集团控股有限公司 | 一种虚拟机内存监管方法与设备 |
CN111736514B (zh) * | 2020-06-10 | 2020-12-04 | 杭州凯尔达机器人科技股份有限公司 | 基于通用计算机的机器人控制系统 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001101034A (ja) * | 1999-09-29 | 2001-04-13 | Hitachi Ltd | 異種os間制御による障害復旧方法 |
JP2004213122A (ja) * | 2002-12-27 | 2004-07-29 | Idemitsu Kosan Co Ltd | クライアント/サーバによる制御システムの安定稼働方法及びそのプログラム |
JP2004252591A (ja) * | 2003-02-18 | 2004-09-09 | Hitachi Ltd | 計算機システム、i/oデバイス及びi/oデバイスの仮想共有方法 |
JP2007133544A (ja) * | 2005-11-09 | 2007-05-31 | Hitachi Ltd | 障害情報解析方法及びその実施装置 |
JP2011128967A (ja) * | 2009-12-18 | 2011-06-30 | Hitachi Ltd | 仮想計算機の移動方法、仮想計算機システム及びプログラム |
JP2012185865A (ja) * | 2012-07-06 | 2012-09-27 | Hitachi Ltd | 管理システム |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7478387B2 (en) * | 2002-09-25 | 2009-01-13 | International Business Machines Corporation | System and method for creating a restartable non-native language routine execution environment |
US7647589B1 (en) * | 2005-02-07 | 2010-01-12 | Parallels Software International, Inc. | Methods and systems for safe execution of guest code in virtual machine context |
JP4462238B2 (ja) | 2006-06-21 | 2010-05-12 | 株式会社デンソーウェーブ | 携帯端末 |
-
2013
- 2013-08-26 WO PCT/JP2013/072741 patent/WO2014132466A1/ja active Application Filing
- 2013-08-26 JP JP2015502707A patent/JP6164283B2/ja not_active Expired - Fee Related
- 2013-08-26 US US14/766,912 patent/US9588798B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
US9588798B2 (en) | 2017-03-07 |
JPWO2014132466A1 (ja) | 2017-02-02 |
JP6164283B2 (ja) | 2017-07-19 |
US20160011899A1 (en) | 2016-01-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5128944B2 (ja) | コンピュータアプリケーションにおけるデータ損失を最小限にする方法およびシステム | |
JP5423871B2 (ja) | 情報処理装置、情報処理方法、およびプログラム | |
Yamakita et al. | Phase-based reboot: Reusing operating system execution phases for cheap reboot-based recovery | |
US8930764B2 (en) | System and methods for self-healing from operating system faults in kernel/supervisory mode | |
JP2007133544A (ja) | 障害情報解析方法及びその実施装置 | |
US10346269B2 (en) | Selective mirroring of predictively isolated memory | |
CN111800303A (zh) | 混合云场景下保证可用集群数量的方法、装置及系统 | |
US9195528B1 (en) | Systems and methods for managing failover clusters | |
CN111181780A (zh) | 基于ha集群的主机池切换方法、系统、终端及存储介质 | |
JP5403054B2 (ja) | メモリダンプ機能を有するサーバおよびメモリダンプ取得方法 | |
JP6164283B2 (ja) | ソフトウェア安全停止システム、ソフトウェア安全停止方法、およびプログラム | |
US10324811B2 (en) | Opportunistic failover in a high availability cluster | |
CN111147615B (zh) | Ip地址的接管方法、系统、计算机可读存储介质及服务器 | |
CN104360935A (zh) | 一种服务器系统崩溃转储收集的方法 | |
US11226875B2 (en) | System halt event recovery | |
CN115421960A (zh) | 一种ue内存故障恢复方法、装置、电子设备及介质 | |
WO2014024279A1 (ja) | メモリ障害リカバリ装置、方法、及びプログラム | |
US9465710B1 (en) | Systems and methods for predictively preparing restore packages | |
US9176806B2 (en) | Computer and memory inspection method | |
Cerveira et al. | Fast local VM migration against hypervisor corruption | |
CN112231063A (zh) | 一种故障处理方法及装置 | |
CN103197992A (zh) | GlusterFS脑裂的自动化恢复方法 | |
US11436112B1 (en) | Remote direct memory access (RDMA)-based recovery of dirty data in remote memory | |
JP2018022402A (ja) | 情報処理装置、情報処理システム、情報処理装置の制御方法および情報処理装置の制御プログラム | |
WO2015176455A1 (zh) | 基于Hadoop的硬盘损坏处理方法及装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13876214; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 14766912; Country of ref document: US |
| | ENP | Entry into the national phase | Ref document number: 2015502707; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 13876214; Country of ref document: EP; Kind code of ref document: A1 |