BR9912879A - Fault-tolerant computer system, and method for fault-tolerant operation of a computer system - Google Patents

Fault-tolerant computer system, and method for fault-tolerant operation of a computer system

Info

Publication number
BR9912879A
BR9912879A BR9912879-9A BR9912879A BR9912879A BR 9912879 A BR9912879 A BR 9912879A BR 9912879 A BR9912879 A BR 9912879A BR 9912879 A BR9912879 A BR 9912879A
Authority
BR
Brazil
Prior art keywords
event
computer system
fault
tolerant
interruption
Prior art date
Application number
BR9912879-9A
Other languages
English (en)
Inventor
Ronstroem Mikael
Original Assignee
Ericsson Telefon Ab L M
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ericsson Telefon Ab L M
Publication of BR9912879A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2097 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2038 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant with a single idle spare processing component
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2041 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant with more than one idle spare processing component

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

"FAULT-TOLERANT COMPUTER SYSTEM, AND METHOD FOR FAULT-TOLERANT OPERATION OF A COMPUTER SYSTEM". A fault-tolerant computer system and method requiring reduced inter-unit communication. A primary system is arranged to execute event processes in response to received commands. Each time the execution of an event process is halted, whether by normal termination or by an interrupt, an event generator generates an event message indicating the type of event process and the reason or time of interruption of the event process. The event message is used to instruct a backup system to execute the same event process. Since the event message also specifies the reason and time of interruption of the event process, the execution of the event process can be replicated on the backup system. In this way, the primary system and at least one backup system are kept synchronized. At least one standby system may be provided to record the sequence of event messages in an event log and to store a backup copy of the memory contents of the primary system. The event log, together with the backup copy, is to be used to restore the state of the primary system.
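The replication scheme in the abstract can be illustrated with a small sketch. It is not the patented implementation. It assumes event processes are deterministic lists of steps, models the "time of interruption" as a logical step count, and does not model resumption of an interrupted process. All names (`EventMessage`, `Primary`, `Node`, `Standby`, `EVENT_PROCESSES`) are hypothetical and do not come from the patent.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

State = Dict[str, int]
Step = Callable[[State], None]


def _add(key: str, n: int) -> Step:
    """Build a deterministic step that adds n to state[key]."""
    def step(state: State) -> None:
        state[key] = state.get(key, 0) + n
    return step


# Hypothetical event processes: each is a fixed sequence of deterministic steps.
EVENT_PROCESSES: Dict[str, List[Step]] = {
    "deposit": [_add("balance", 10), _add("ops", 1)],
    "audit": [_add("audits", 1), _add("ops", 1), _add("checked", 1)],
}


@dataclass(frozen=True)
class EventMessage:
    event_type: str  # which event process was executed
    reason: str      # "completed" or "interrupt"
    stop_time: int   # assumed logical time: steps executed before halting


@dataclass
class Node:
    """A system that can replay an event process from an event message."""
    state: State = field(default_factory=dict)

    def apply(self, msg: EventMessage) -> None:
        # Replay the same process and halt at the same logical time.
        for step in EVENT_PROCESSES[msg.event_type][:msg.stop_time]:
            step(self.state)


class Primary(Node):
    def execute(self, event_type: str,
                interrupt_after: Optional[int] = None) -> EventMessage:
        """Run an event process, then emit one message describing how it halted."""
        total = len(EVENT_PROCESSES[event_type])
        limit = total if interrupt_after is None else min(interrupt_after, total)
        reason = "completed" if limit == total else "interrupt"
        msg = EventMessage(event_type, reason, limit)
        self.apply(msg)
        return msg


@dataclass
class Standby:
    """Keeps a memory snapshot plus the event log recorded since that snapshot."""
    snapshot: State = field(default_factory=dict)
    log: List[EventMessage] = field(default_factory=list)

    def checkpoint(self, state: State) -> None:
        self.snapshot = copy.deepcopy(state)
        self.log.clear()

    def record(self, msg: EventMessage) -> None:
        self.log.append(msg)

    def restore(self) -> Node:
        # Rebuild the primary's state: start from the snapshot, replay the log.
        node = Node(copy.deepcopy(self.snapshot))
        for msg in self.log:
            node.apply(msg)
        return node


primary, backup, standby = Primary(), Node(), Standby()
standby.checkpoint(primary.state)
messages = []
for event_type, interrupt_after in [("deposit", None), ("audit", 1), ("deposit", None)]:
    msg = primary.execute(event_type, interrupt_after)
    backup.apply(msg)    # the backup replicates the execution
    standby.record(msg)  # the standby only logs the message
    messages.append(msg)
restored = standby.restore()
```

In this sketch only one small message per halted event process crosses between units, which is the sense in which inter-unit communication is reduced; the backup and the restored node reach the primary's state by re-executing rather than by receiving state updates.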
BR9912879-9A 1998-08-11 1999-08-09 Fault-tolerant computer system, and method for fault-tolerant operation of a computer system BR9912879A (pt)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE19836347A DE19836347C2 (de) 1998-08-11 1998-08-11 Fehlertolerantes Computersystem
PCT/EP1999/005739 WO2000010087A1 (en) 1998-08-11 1999-08-09 Fault tolerant computer system

Publications (1)

Publication Number Publication Date
BR9912879A (pt) 2001-05-08

Family

ID=7877184

Family Applications (1)

Application Number Title Priority Date Filing Date
BR9912879-9A BR9912879A (pt) 1998-08-11 1999-08-09 Fault-tolerant computer system, and method for fault-tolerant operation of a computer system

Country Status (10)

Country Link
US (1) US6438707B1 (pt)
EP (1) EP1110148B1 (pt)
JP (1) JP2002522845A (pt)
KR (1) KR100575497B1 (pt)
CN (1) CN1137439C (pt)
AU (1) AU5731699A (pt)
BR (1) BR9912879A (pt)
CA (1) CA2339783C (pt)
DE (1) DE19836347C2 (pt)
WO (1) WO2000010087A1 (pt)

Families Citing this family (66)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19836347C2 (de) 1998-08-11 2001-11-15 Ericsson Telefon Ab L M Fehlertolerantes Computersystem
US6898189B1 (en) * 2000-08-23 2005-05-24 Cisco Technology, Inc. Restartable spanning tree for high availability network systems
US7054892B1 (en) * 1999-12-23 2006-05-30 Emc Corporation Method and apparatus for managing information related to storage activities of data storage systems
GB0002972D0 (en) * 2000-02-09 2000-03-29 Orange Personal Comm Serv Ltd Data handling system
DE10014390C2 (de) * 2000-03-23 2002-02-21 Siemens Ag Hochverfügbares Rechnersystem und Verfahren zur Umschaltung von Bearbeitungsprogrammen eines hochverfügbaren Rechnersystems
US6694450B1 (en) * 2000-05-20 2004-02-17 Equipe Communications Corporation Distributed process redundancy
JP3968207B2 (ja) * 2000-05-25 2007-08-29 株式会社日立製作所 データ多重化方法およびデータ多重化システム
GB2372673B (en) * 2001-02-27 2003-05-28 3Com Corp Apparatus and method for processing data relating to events on a network
JP4273669B2 (ja) * 2001-02-28 2009-06-03 沖電気工業株式会社 ノード情報管理システム及びノード
DE10111864A1 (de) * 2001-03-13 2002-09-26 Tenovis Gmbh & Co Kg Anordnung mit zumindest einer Telekommunikationsanlage sowie Verfahren zum Sichern von Gebührendatensätzen
US7171434B2 (en) * 2001-09-07 2007-01-30 Network Appliance, Inc. Detecting unavailability of primary central processing element, each backup central processing element associated with a group of virtual logic units and quiescing I/O operations of the primary central processing element in a storage virtualization system
US7472231B1 (en) 2001-09-07 2008-12-30 Netapp, Inc. Storage area network data cache
US20030065861A1 (en) * 2001-09-28 2003-04-03 Clark Clyde S. Dual system masters
US6880111B2 (en) * 2001-10-31 2005-04-12 Intel Corporation Bounding data transmission latency based upon a data transmission event and arrangement
US6918060B2 (en) * 2001-10-31 2005-07-12 Intel Corporation Bounding data transmission latency based upon link loading and arrangement
US7437450B1 (en) 2001-11-30 2008-10-14 Cisco Technology Inc. End-to-end performance tool and method for monitoring electronic-commerce transactions
CN100397349C (zh) * 2001-11-30 2008-06-25 甲骨文国际公司 用于在网络系统上提供资源高可用性的方法
GB0206604D0 (en) * 2002-03-20 2002-05-01 Global Continuity Plc Improvements relating to overcoming data processing failures
US7426559B2 (en) 2002-05-09 2008-09-16 International Business Machines Corporation Method for sequential coordination of external database application events with asynchronous internal database events
US20030236826A1 (en) * 2002-06-24 2003-12-25 Nayeem Islam System and method for making mobile applications fault tolerant
US7099661B1 (en) * 2002-07-10 2006-08-29 The Directv Group, Inc. Risk-time protection backup system
JP3774826B2 (ja) * 2002-07-11 2006-05-17 日本電気株式会社 情報処理装置
US7149917B2 (en) * 2002-07-30 2006-12-12 Cisco Technology, Inc. Method and apparatus for outage measurement
US20040044799A1 (en) * 2002-09-03 2004-03-04 Nokia Corporation Method, device and system for synchronizing of data providing for the handling of an interrupted synchronization process
NZ521983A (en) * 2002-10-14 2005-05-27 Maximum Availability Ltd Journaling changes to system objects such as programs in the IBM OS/400 operating system
GB0308264D0 (en) * 2003-04-10 2003-05-14 Ibm Recovery from failures within data processing systems
US7720973B2 (en) * 2003-06-30 2010-05-18 Microsoft Corporation Message-based scalable data transport protocol
CN1292346C (zh) * 2003-09-12 2006-12-27 国际商业机器公司 用于在分布式计算体系结构中执行作业的系统和方法
US7133986B2 (en) * 2003-09-29 2006-11-07 International Business Machines Corporation Method, system, and program for forming a consistency group
KR100608751B1 (ko) * 2004-02-07 2006-08-08 엘지전자 주식회사 이동통신단말기의 에러로그 관리 방법
US7133989B2 (en) * 2004-05-05 2006-11-07 International Business Machines Corporation Point in time copy between data storage systems
CN100372094C (zh) * 2004-10-29 2008-02-27 力晶半导体股份有限公司 具自动回复功能的晶片测试装置与晶片测试方法
JP4182948B2 (ja) * 2004-12-21 2008-11-19 日本電気株式会社 フォールト・トレラント・コンピュータシステムと、そのための割り込み制御方法
EP1903441B1 (en) * 2005-07-14 2016-03-23 Fujitsu Ltd. Message analyzing device, message analyzing method and message analyzing program
JP4696759B2 (ja) * 2005-07-29 2011-06-08 Kddi株式会社 光終端システム
KR100725502B1 (ko) * 2005-09-09 2007-06-08 삼성전자주식회사 전자장치, 전자장치 시스템 및 전자장치의 제어방법
CN100465911C (zh) * 2006-04-13 2009-03-04 华为技术有限公司 一种备份方法
US7424642B2 (en) * 2006-04-24 2008-09-09 Gm Global Technology Operations, Inc. Method for synchronization of a controller
KR100820772B1 (ko) * 2006-04-27 2008-04-10 텔코웨어 주식회사 분산 네트워크 환경에서의 이중화 메모리 파일시스템 복구방법 및 복구 시스템
US7725764B2 (en) * 2006-08-04 2010-05-25 Tsx Inc. Failover system and method
US7865887B2 (en) * 2006-11-30 2011-01-04 Sap Ag Context based event handling and execution with prioritization and interrupt management
CN101145946B (zh) * 2007-09-17 2010-09-01 中兴通讯股份有限公司 一种基于消息日志的容错集群系统和方法
JP4644720B2 (ja) * 2008-03-10 2011-03-02 富士通株式会社 制御方法、情報処理装置及びストレージシステム
CN101593136B (zh) * 2008-05-30 2012-05-02 国际商业机器公司 使得计算机具有高可用性的方法和计算机系统
JP5366480B2 (ja) * 2008-08-27 2013-12-11 株式会社日立製作所 計算機システム及びそのバックアップ方法
CN101431401B (zh) * 2008-09-08 2012-04-04 华为终端有限公司 一种同步故障处理方法、客户端、服务器及其系统
US9569319B2 (en) 2009-09-18 2017-02-14 Alcatel Lucent Methods for improved server redundancy in dynamic networks
CN101815009B (zh) * 2010-03-30 2011-09-28 南京恩瑞特实业有限公司 支持容错的热备同步方法
US20130318535A1 (en) * 2010-08-11 2013-11-28 Nec Corporation Primary-backup based fault tolerant method for multiprocessor systems
CN102385637A (zh) * 2011-12-22 2012-03-21 山东中创软件商用中间件股份有限公司 一种数据库信息的备份方法及系统
EP4191431A1 (en) * 2013-03-12 2023-06-07 Toshiba Solutions Corporation Database system, program, and data processing method
CN103248499B (zh) * 2013-03-27 2014-09-17 天脉聚源(北京)传媒科技有限公司 一种信息交互的方法和系统
WO2015025384A1 (ja) 2013-08-21 2015-02-26 株式会社東芝 データベースシステム、プログラムおよびデータ処理方法
WO2015029139A1 (ja) * 2013-08-27 2015-03-05 株式会社東芝 データベースシステム、プログラムおよびデータ処理方法
CN103581177A (zh) * 2013-10-24 2014-02-12 华为技术有限公司 虚拟机管理方法及装置
ES2714218T3 (es) 2014-07-01 2019-05-27 Sas Inst Inc Sistemas y métodos para comunicaciones tolerantes a fallos
US9712382B2 (en) 2014-10-27 2017-07-18 Quanta Computer Inc. Retrieving console messages after device failure
US9703789B2 (en) 2015-07-27 2017-07-11 Sas Institute Inc. Distributed data set storage and retrieval
US9990367B2 (en) 2015-07-27 2018-06-05 Sas Institute Inc. Distributed data set encryption and decryption
US10496292B2 (en) 2017-01-19 2019-12-03 International Business Machines Corporation Saving/restoring guarded storage controls in a virtualized environment
US10725685B2 (en) 2017-01-19 2020-07-28 International Business Machines Corporation Load logical and shift guarded instruction
US10452288B2 (en) 2017-01-19 2019-10-22 International Business Machines Corporation Identifying processor attributes based on detecting a guarded storage event
US10496311B2 (en) 2017-01-19 2019-12-03 International Business Machines Corporation Run-time instrumentation of guarded storage event processing
US10732858B2 (en) 2017-01-19 2020-08-04 International Business Machines Corporation Loading and storing controls regulating the operation of a guarded storage facility
US10579377B2 (en) 2017-01-19 2020-03-03 International Business Machines Corporation Guarded storage event handling during transactional execution
EP3543870B1 (en) 2018-03-22 2022-04-13 Tata Consultancy Services Limited Exactly-once transaction semantics for fault tolerant fpga based transaction systems

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4590554A (en) * 1982-11-23 1986-05-20 Parallel Computers Systems, Inc. Backup fault tolerant computer system
EP0306211A3 (en) * 1987-09-04 1990-09-26 Digital Equipment Corporation Synchronized twin computer system
CA1297593C (en) 1987-10-08 1992-03-17 Stephen C. Leuty Fault tolerant ancillary messaging and recovery system and method within adigital switch
EP0441087B1 (en) 1990-02-08 1995-08-16 International Business Machines Corporation Checkpointing mechanism for fault-tolerant systems
US5157663A (en) 1990-09-24 1992-10-20 Novell, Inc. Fault tolerant computer system
JP2773424B2 (ja) * 1990-11-20 1998-07-09 株式会社日立製作所 ネットワークシステムおよび接続コンピュータ切替え方法
WO1993009494A1 (en) 1991-10-28 1993-05-13 Digital Equipment Corporation Fault-tolerant computer processing using a shadow virtual processor
US5551047A (en) 1993-01-28 1996-08-27 The Regents Of The University Of California Method for distributed redundant execution of program modules
US5473771A (en) * 1993-09-01 1995-12-05 At&T Corp. Fault-tolerant processing system architecture
US5544304A (en) * 1994-03-25 1996-08-06 International Business Machines Corporation Fault tolerant command processing
US5619656A (en) 1994-05-05 1997-04-08 Openservice, Inc. System for uninterruptively displaying only relevant and non-redundant alert message of the highest severity for specific condition associated with group of computers being managed
US5528516A (en) 1994-05-25 1996-06-18 System Management Arts, Inc. Apparatus and method for event correlation and problem reporting
US5623532A (en) 1995-01-12 1997-04-22 Telefonaktiebolaget Lm Ericsson Hardware and data redundant architecture for nodes in a communications system
US5737514A (en) * 1995-11-29 1998-04-07 Texas Micro, Inc. Remote checkpoint memory system and protocol for fault-tolerant computer system
SE515348C2 (sv) 1995-12-08 2001-07-16 Ericsson Telefon Ab L M Processorredundans i ett distribuerat system
DE19625195A1 (de) * 1996-06-24 1998-01-02 Siemens Ag Synchronisationsverfahren
JPH10240557A (ja) * 1997-02-27 1998-09-11 Mitsubishi Electric Corp 待機冗長化システム
DE19836347C2 (de) 1998-08-11 2001-11-15 Ericsson Telefon Ab L M Fehlertolerantes Computersystem

Also Published As

Publication number Publication date
EP1110148A1 (en) 2001-06-27
CA2339783A1 (en) 2000-02-24
CN1137439C (zh) 2004-02-04
KR20010072379A (ko) 2001-07-31
CA2339783C (en) 2011-03-08
JP2002522845A (ja) 2002-07-23
WO2000010087A1 (en) 2000-02-24
AU5731699A (en) 2000-03-06
US6438707B1 (en) 2002-08-20
CN1312922A (zh) 2001-09-12
KR100575497B1 (ko) 2006-05-03
DE19836347A1 (de) 2000-02-17
EP1110148B1 (en) 2002-07-17
DE19836347C2 (de) 2001-11-15

Similar Documents

Publication Publication Date Title
BR9912879A (pt) Fault-tolerant computer system, and method for fault-tolerant operation of a computer system
Borg et al. A message system supporting fault tolerance
US8127174B1 (en) Method and apparatus for performing transparent in-memory checkpointing
JP5578720B2 (ja) 高利用率と仮想化の観点から固体ドライブの管理を向上する方法
Borr Transaction monitoring in Encompass
US7921080B2 (en) System and method for a backup parallel server data storage system
EP0827079B1 (en) Checkpoint computer system
DE69311797D1 (de) Fehlertolerantes computersystem mit vorrichtung fuer die bearbeitung von externen ereignissen
CA2323106A1 (en) File server storage arrangement
CA2273523A1 (en) Method and apparatus for providing failure detection and recovery with predetermined replication style for distributed applications in a network
Garcia-Molina et al. Issues in disaster recovery
CN108964986B (zh) 协同办公系统应用级双活灾备系统
JPH0822424A (ja) クライアント・サーバ・システムおよびその制御方法
Emmerson et al. Fault tolerance achieved in VLSI
Bhide et al. Implicit replication in a network file server
Highleyman Breaking the Availability Barrier: Survivable Systems for Enterprise Computing
Sens et al. STAR: A fault-tolerant system for distributed applications
CHETTO Fault-tolerant scheduling in distributed critical real-time systems
KR930010952B1 (ko) 메모리 장애 처리 방법
JPH0417040A (ja) 分散処理システムのプログラム管理方法
Appel et al. Implications of fault management and replica determinism on the real-time execution scheme of VOTRICS
EP0720095A1 (en) Fault tolerant computer system
Sunada et al. Fault Tolerance: Methods Of Rollback Recovery
Crane et al. Failure and its Recovery in an Object-Oriented Distributed System
Mehdi Endurance 4000TM-enabling fault tolerance for Windows NT servers

Legal Events

Date Code Title Description
FA10 Dismissal: dismissal - article 33 of industrial property law
B11Y Definitive dismissal - extension of time limit for request of examination expired [chapter 11.1.1 patent gazette]