WO2008111124A1 - Multi-CPU abnormality detection and restoration system, method, and program - Google Patents

Multi-CPU abnormality detection and restoration system, method, and program

Info

Publication number
WO2008111124A1
Authority
WO
WIPO (PCT)
Prior art keywords
program
abnormality detection
restoration system
abnormality
restoration
Prior art date
Application number
PCT/JP2007/000211
Other languages
English (en)
French (fr)
Inventor
Yoshiyuki Ohira
Original Assignee
Fujitsu Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2007-03-12
Filing date
2007-03-12
Publication date
2008-09-18
Application filed by Fujitsu Limited
Priority to JP2009503756A (patent JP5212357B2, ja)
Priority to PCT/JP2007/000211 (patent WO2008111124A1, ja)
Publication of WO2008111124A1
Priority to US12/544,618 (patent US8074123B2, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0721 Error or fault processing not based on redundancy, the processing taking place within a central processing unit [CPU]
    • G06F11/0724 Error or fault processing not based on redundancy, the processing taking place within a central processing unit [CPU] in a multiprocessor or a multi-core unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Hardware Redundancy (AREA)

Abstract

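A multi-CPU system having a plurality of CPUs, characterized by comprising: an abnormal state detection unit that detects an abnormality in a running program; and a restoration unit that, when the abnormal state detection unit detects an abnormality, determines from the content of the detected abnormality whether the data left in an abnormal state can be restored and, if it can, restores that data.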
PCT/JP2007/000211 2007-03-12 2007-03-12 Multi-CPU abnormality detection and restoration system, method, and program WO2008111124A1 (ja)
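
The abstract above describes two cooperating parts: a detection unit that notices an abnormality in a running program, and a restoration unit that decides from the kind of abnormality whether the affected data can be restored and restores it if so. A minimal Python sketch of that split follows. The record does not say which abnormalities are handled or how data is laid out, so the "stale lock" case, the `SharedData` layout, and every function name here are illustrative assumptions rather than the patented method.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set

# Hypothetical shared data: lock name -> id of the CPU holding it (None = free).
SharedData = Dict[str, Optional[int]]


@dataclass
class Abnormality:
    kind: str    # hypothetical label for what went wrong, e.g. "stale_lock"
    target: str  # key of the affected entry in the shared data


def detect(data: SharedData, alive_cpus: Set[int]) -> Optional[Abnormality]:
    """Detection unit (sketch): report a lock whose owner is not a live CPU."""
    for name, owner in data.items():
        if owner is not None and owner not in alive_cpus:
            return Abnormality(kind="stale_lock", target=name)
    return None


def release_lock(data: SharedData, ab: Abnormality) -> None:
    """Restore the affected entry by marking the lock as free."""
    data[ab.target] = None


# Assumed mapping from abnormality kind to a restore routine.
# A kind with no entry is treated as unrecoverable.
RESTORERS: Dict[str, Callable[[SharedData, Abnormality], None]] = {
    "stale_lock": release_lock,
}


def recover(data: SharedData, ab: Abnormality) -> bool:
    """Restoration unit (sketch): restore only if the kind is recoverable."""
    restorer = RESTORERS.get(ab.kind)
    if restorer is None:
        return False  # caller must fall back to some other handling
    restorer(data, ab)
    return True
```

For example, with `data = {"queue": 1, "log": None}` and only CPU 0 alive, `detect(data, {0})` reports a stale lock on `"queue"`, and `recover` frees it. Keeping the recoverability decision in a lookup keyed by abnormality kind mirrors the abstract's "determine, then restore" order without committing to any particular set of faults.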

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2009503756A JP5212357B2 (ja) 2007-03-12 2007-03-12 Multi-CPU abnormality detection and restoration system, method, and program
PCT/JP2007/000211 WO2008111124A1 (ja) 2007-03-12 2007-03-12 Multi-CPU abnormality detection and restoration system, method, and program
US12/544,618 US8074123B2 (en) 2007-03-12 2009-08-20 Multi-CPU failure detection/recovery system and method for the same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2007/000211 WO2008111124A1 (ja) 2007-03-12 2007-03-12 Multi-CPU abnormality detection and restoration system, method, and program

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/544,618 Continuation US8074123B2 (en) 2007-03-12 2009-08-20 Multi-CPU failure detection/recovery system and method for the same

Publications (1)

Publication Number Publication Date
WO2008111124A1 (ja) 2008-09-18

Family

ID=39759075

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2007/000211 WO2008111124A1 (ja) 2007-03-12 2007-03-12 Multi-CPU abnormality detection and restoration system, method, and program

Country Status (3)

Country Link
US (1) US8074123B2 (ja)
JP (1) JP5212357B2 (ja)
WO (1) WO2008111124A1 (ja)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011134044A (ja) * 2009-12-24 2011-07-07 Nec Corp Process abnormality recovery device and process abnormality recovery method

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8850262B2 (en) 2010-10-12 2014-09-30 International Business Machines Corporation Inter-processor failure detection and recovery
WO2012053110A1 (ja) * 2010-10-22 2012-04-26 富士通株式会社 Failure monitoring device, failure monitoring method, and program
CN103150224B (zh) * 2013-03-11 2015-11-11 杭州华三通信技术有限公司 Electronic device and method for improving startup reliability
US10459782B2 (en) * 2017-08-31 2019-10-29 Nxp Usa, Inc. System and method of implementing heartbeats in a multicore system
JP2019179395A (ja) * 2018-03-30 2019-10-17 オムロン株式会社 Abnormality detection system, support device, and abnormality detection method
US11693727B2 (en) * 2021-03-08 2023-07-04 Jpmorgan Chase Bank, N.A. Systems and methods to identify production incidents and provide automated preventive and corrective measures
CN114218075B (zh) * 2021-11-25 2024-04-19 中国航空综合技术研究所 Method for generating an implementation sample library for testability tests of airborne equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04160642A (ja) * 1990-10-25 1992-06-03 Nec Corp Computer system
JPH04365145A (ja) * 1991-06-13 1992-12-17 Hitachi Ltd Memory failure handling method
JPH0667901A (ja) * 1992-08-21 1994-03-11 Fuji Facom Corp Linked list history storage device
JPH06214889A (ja) * 1993-01-20 1994-08-05 Hitachi Ltd Main memory area corruption detection method
JPH0926888A (ja) * 1995-07-13 1997-01-28 Hitachi Ltd Exclusive control device
JP2002007218A (ja) * 2000-06-21 2002-01-11 Hitachi Eng Co Ltd Memory collation method
JP2004259146A (ja) * 2003-02-27 2004-09-16 Nippon Telegr & Teleph Corp <Ntt> Automatic threshold setting method and system

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0748198B2 (ja) 1988-10-25 1995-05-24 株式会社ピーエフユー Multiprocessor system
JP3025504B2 (ja) * 1989-08-29 2000-03-27 富士通株式会社 Information processing device
US5875342A (en) * 1997-06-03 1999-02-23 International Business Machines Corporation User programmable interrupt mask with timeout
DE19827430C2 (de) * 1997-07-22 2001-07-12 Siemens Ag Monitoring method for detecting endless loops and blocked processes in a computer system
JP2908430B1 (ja) 1998-05-14 1999-06-21 九州日本電気ソフトウェア株式会社 Host processor monitoring device and monitoring method for a multiprocessor system
JPH11338838A (ja) * 1998-05-22 1999-12-10 Nagano Nippon Denki Software Kk Method and scheme for collecting parallel dumps of failure information in a multiprocessor system
US6393590B1 (en) * 1998-12-22 2002-05-21 Nortel Networks Limited Method and apparatus for ensuring proper functionality of a shared memory, multiprocessor system
US6301676B1 (en) * 1999-01-22 2001-10-09 Sun Microsystems, Inc. Robust and recoverable interprocess locks
US6898696B1 (en) * 1999-06-14 2005-05-24 International Business Machines Corporation Method and system for efficiently restoring a processor's execution state following an interrupt caused by an interruptible instruction
US6658595B1 (en) * 1999-10-19 2003-12-02 Cisco Technology, Inc. Method and system for asymmetrically maintaining system operability
JP3419392B2 (ja) * 2000-10-27 2003-06-23 日本電気株式会社 Memory access monitoring device, memory access monitoring method, and recording medium storing a memory access monitoring program
US20040078681A1 (en) * 2002-01-24 2004-04-22 Nick Ramirez Architecture for high availability using system management mode driven monitoring and communications
US6961874B2 (en) * 2002-05-20 2005-11-01 Sun Microsystems, Inc. Software hardening utilizing recoverable, correctable, and unrecoverable fault protocols
US7162714B2 (en) * 2002-05-22 2007-01-09 American Power Conversion Corporation Software-based watchdog method and apparatus
JP2004164113A (ja) * 2002-11-11 2004-06-10 Nec Micro Systems Ltd Multi-CPU reset circuit and reset method
US7219264B2 (en) * 2003-05-09 2007-05-15 Tekelec Methods and systems for preserving dynamic random access memory contents responsive to hung processor condition
US7162666B2 (en) * 2004-03-26 2007-01-09 Emc Corporation Multi-processor system having a watchdog for interrupting the multiple processors and deferring preemption until release of spinlocks
JP2006338605A (ja) * 2005-06-06 2006-12-14 Denso Corp Program abnormality monitoring method and program abnormality monitoring device
US7546487B2 (en) * 2005-09-15 2009-06-09 Intel Corporation OS and firmware coordinated error handling using transparent firmware intercept and firmware services
US7191098B1 (en) * 2005-09-22 2007-03-13 International Business Machines Corporation Automatic detection of excessive interrupt-disabled operating system code
US7702889B2 (en) * 2005-10-18 2010-04-20 Qualcomm Incorporated Shared interrupt control method and system for a digital signal processor

Also Published As

Publication number Publication date
US20090307526A1 (en) 2009-12-10
US8074123B2 (en) 2011-12-06
JPWO2008111124A1 (ja) 2010-06-24
JP5212357B2 (ja) 2013-06-19

Similar Documents

Publication Publication Date Title
WO2008111124A1 (ja) Multi-CPU abnormality detection and restoration system, method, and program
WO2008011012A3 (en) Recoverable error detection for concurrent computing programs
WO2007009009A3 (en) Systems and methods for identifying sources of malware
WO2007005440A3 (en) Change event correlation
WO2004081920A3 (en) Policy-based response to system errors occuring during os runtime
WO2008092162A3 (en) Systems, methods, and media for recovering an application from a fault or attack
WO2014078585A3 (en) Methods, systems and computer readable media for detecting command injection attacks
WO2006031750A3 (en) Application of abnormal event detection technology to hydrocracking units
WO2007022364A3 (en) Change audit method, apparatus and system
WO2009006060A3 (en) System and methods for disruption detection, management, and recovery
WO2009064379A3 (en) A method of detecting and tracking multiple objects on a touchpad
WO2008121399A3 (en) Effective low-profile health monitoring or the like
WO2006029290A3 (en) Application of abnormal event detection technology to olefins recovery trains
IL190758A0 (en) A generic multiinstance method and gui detection system for tracking and monitoring computer applications
WO2007030549A3 (en) Threat detection and monitoring apparatus with integrated display system
WO2007131078A3 (en) Inflammatory condition progression, diagnosis and treatment monitoring methods, systems, apparatus, and uses
WO2007149307A3 (en) Fault detection and root cause identification in complex systems
WO2009022272A3 (en) System and method providing fault detection capability
WO2009073571A3 (en) Systems and methods for a property sentinel
WO2006076578A3 (en) System for maintaining fault-type selection during an out-of-step condition
WO2006096855A3 (en) Device, system and method of detection of input unit disconnection
WO2008069971A3 (en) Apparatus and associated methods for diagnosing configuration faults
WO2008157128A3 (en) Methods, systems, and computer program products for tokenized domain name resolution
WO2008099453A1 (ja) Degeneration method and information processing device
WO2007117734A3 (en) Method and system for detecting obfuscatory pestware in a computer memory

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 07713593; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2009503756; Country of ref document: JP)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 07713593; Country of ref document: EP; Kind code of ref document: A1)