CN103370693B - Restarting processes - Google Patents

Restarting processes

Info

Publication number
CN103370693B
Authority
CN
China
Prior art keywords
execution
computing system
execution phase
state
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201280009224.8A
Other languages
English (en)
Chinese (zh)
Other versions
CN103370693A (zh)
Inventor
B. P. Douros
J. S. Wholey III
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ab Initio Technology LLC
Original Assignee
Ab Initio Technology LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ab Initio Technology LLC
Publication of CN103370693A
Application granted
Publication of CN103370693B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/485 Task life-cycle, e.g. stopping, restarting, resuming execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Retry When Errors Occur (AREA)
CN201280009224.8A 2011-02-18 2012-02-16 Restarting processes Active CN103370693B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/030,998 US9021299B2 (en) 2011-02-18 2011-02-18 Restarting processes
US13/030,998 2011-02-18
PCT/US2012/025388 WO2012112748A1 (en) 2011-02-18 2012-02-16 Restarting processes

Publications (2)

Publication Number Publication Date
CN103370693A (zh) 2013-10-23
CN103370693B (zh) 2016-09-14

Family

ID=45809631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280009224.8A Active CN103370693B (zh) 2011-02-18 2012-02-16 Restarting processes

Country Status (8)

Country Link
US (2) US9021299B2
EP (1) EP2676198B1
JP (2) JP2014509012A
KR (1) KR20140004702A
CN (1) CN103370693B
AU (1) AU2012217621B2
CA (1) CA2826282C
WO (1) WO2012112748A1

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9495477B1 (en) * 2011-04-20 2016-11-15 Google Inc. Data storage in a graph processing system
FR2991840A1 (fr) * 2012-06-11 2013-12-13 France Telecom Method for processing data by a navigation module
SG11201603105VA (en) 2013-10-21 2016-05-30 Ab Initio Technology Llc Checkpointing a collection of data units
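CN104346233B (zh) * 2014-10-13 2017-12-26 China Foreign Exchange Trade System A fault recovery method and device for a computer system
JP6495779B2 (ja) * 2015-08-11 2019-04-03 Nippon Telegraph and Telephone Corporation Arithmetic processing management method and arithmetic device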
EP3614266B1 (en) * 2016-01-14 2022-03-09 AB Initio Technology LLC Recoverable stream processing
US10002004B2 (en) * 2016-05-26 2018-06-19 International Business Machines Corporation Stream computing application shutdown and restart without data loss
US10073746B2 (en) * 2016-07-12 2018-09-11 Advanced Micro Devices, Inc. Method and apparatus for providing distributed checkpointing
US10459811B2 (en) * 2016-08-19 2019-10-29 Bank Of America Corporation System for increasing intra-application processing efficiency by transmitting failed processing work over a processing recovery network for resolution
US10270654B2 (en) 2016-08-19 2019-04-23 Bank Of America Corporation System for increasing computing efficiency of communication between applications running on networked machines
US10180881B2 (en) * 2016-08-19 2019-01-15 Bank Of America Corporation System for increasing inter-application processing efficiency by transmitting failed processing work over a processing recovery network for resolution
US10459894B2 (en) * 2016-09-22 2019-10-29 Bank Of America Corporation Database shutdown and restart stability optimizer
US10379968B2 (en) * 2017-05-05 2019-08-13 Pivotal Software, Inc. Backup and restore framework for distributed computing systems
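CN108153620A (zh) * 2017-12-27 2018-06-12 Shenzhen Haoke Internet Co., Ltd. A process control method and device
CN110071880B (zh) * 2018-01-24 2021-06-18 Beijing Kingsoft Cloud Network Technology Co., Ltd. Message forwarding method, forwarding device, server, and storage medium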
US10628321B2 (en) * 2018-02-28 2020-04-21 Qualcomm Incorporated Progressive flush of cache memory
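CN108874549B (zh) * 2018-07-19 2021-02-02 Beijing Baidu Netcom Science and Technology Co., Ltd. Resource reuse method, device, terminal, and computer-readable storage medium
KR102700419B1 (ko) * 2018-09-04 2024-08-30 Samsung Electronics Co., Ltd. Electronic device and control method thereof
JP7372977B2 (ja) * 2018-09-25 2023-11-01 Ab Initio Technology LLC Dedicated audit port for implementing recoverability when outputting audit data
CN109274544B (zh) * 2018-12-11 2021-06-29 Inspur (Beijing) Electronic Information Industry Co., Ltd. A fault detection method and device for a distributed storage system
WO2022003911A1 (ja) * 2020-07-02 2022-01-06 Nippon Telegraph and Telephone Corporation Workflow consistency ensuring device, workflow consistency ensuring method, and workflow consistency ensuring program
CN113256909A (zh) * 2020-12-31 2021-08-13 Shenzhen Yihua Computer Co., Ltd. Method and system for device-driver self-recovery, deposit and withdrawal equipment, and storage medium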

Citations (1)

Publication number Priority date Publication date Assignee Title
CN1584847A (zh) * 2003-08-19 2005-02-23 Intel Corporation Operating state preservation in the absence of AC power

Family Cites Families (62)

Publication number Priority date Publication date Assignee Title
JP2848075B2 (ja) * 1992-01-10 1999-01-20 Mitsubishi Electric Corporation Sequence controller and sequence control method thereof
US5712971A (en) * 1995-12-11 1998-01-27 Ab Initio Software Corporation Methods and systems for reconstructing the state of a computation
US5819021A (en) 1995-12-11 1998-10-06 Ab Initio Software Corporation Overpartitioning system and method for increasing checkpoints in component-based parallel applications
KR970029114U (ko) * 1995-12-28 1997-07-24 Sealing structure of windshield glass
JP3122371B2 (ja) * 1996-01-31 2001-01-09 Toshiba Corporation Computer system
US5966072A (en) 1996-07-02 1999-10-12 Ab Initio Software Corporation Executing computations expressed as graphs
JPH113293A (ja) * 1997-06-13 1999-01-06 Nec Software Ltd Computer system
US6289474B1 (en) 1998-06-24 2001-09-11 Torrent Systems, Inc. Computer system and process for checkpointing operations on data in a computer system by partitioning the data
US7203732B2 (en) * 1999-11-11 2007-04-10 Miralink Corporation Flexible remote data mirroring
US6584581B1 (en) * 1999-12-06 2003-06-24 Ab Initio Software Corporation Continuous flow checkpointing data processing
US7213063B2 (en) * 2000-01-18 2007-05-01 Lucent Technologies Inc. Method, apparatus and system for maintaining connections between computers using connection-oriented protocols
GB0017336D0 (en) * 2000-07-15 2000-08-30 Ibm Preferable modes of software package deployment
US7164422B1 (en) 2000-07-28 2007-01-16 Ab Initio Software Corporation Parameterized graphs with conditional components
US6766471B2 (en) * 2000-12-28 2004-07-20 International Business Machines Corporation User-level checkpoint and restart for groups of processes
US7412520B2 (en) 2001-06-07 2008-08-12 Intel Corporation Systems and methods for recoverable workflow
US7058634B2 (en) * 2002-02-06 2006-06-06 United Devices, Inc. Distributed blast processing architecture and associated systems and methods
US7167850B2 (en) 2002-10-10 2007-01-23 Ab Initio Software Corporation Startup and control of graph-based computation
US7080225B1 (en) * 2002-12-10 2006-07-18 Emc Corporation Method and apparatus for managing migration of data in a computer system
US7174479B2 (en) * 2003-09-10 2007-02-06 Microsoft Corporation Method and system for rollback-free failure recovery of multi-step procedures
KR100899850B1 (ko) 2003-09-15 2009-05-27 Ab Initio Software LLC Data profiling
US7536591B2 (en) * 2003-11-17 2009-05-19 Virginia Tech Intellectual Properties, Inc. Transparent checkpointing and process migration in a distributed system
US7085788B2 (en) * 2003-12-03 2006-08-01 Hitachi, Ltd. Remote copy system configured to receive both a write request including a write time and a write request not including a write time.
US7380039B2 (en) * 2003-12-30 2008-05-27 3Tera, Inc. Apparatus, method and system for aggregrating computing resources
US20050246453A1 (en) * 2004-04-30 2005-11-03 Microsoft Corporation Providing direct access to hardware from a virtual environment
US7275183B2 (en) 2004-04-30 2007-09-25 Hewlett-Packard Development Company, L.P. Method of restoring processes within process domain
US8108429B2 (en) * 2004-05-07 2012-01-31 Quest Software, Inc. System for moving real-time data events across a plurality of devices in a network for simultaneous data protection, replication, and access services
KR101470712B1 (ko) * 2004-07-20 2014-12-08 Microsoft Corporation Method and system for minimizing data loss in computer applications
US8230426B2 (en) * 2004-10-06 2012-07-24 Digipede Technologies, Llc Multicore distributed processing system using selection of available workunits based on the comparison of concurrency attributes with the parallel processing characteristics
US7899833B2 (en) 2004-11-02 2011-03-01 Ab Initio Technology Llc Managing related data objects
US7392428B2 (en) * 2004-11-19 2008-06-24 International Business Machines Corporation Method and system for recovering from abnormal interruption of a parity update operation in a disk array system
US7673190B1 (en) * 2005-09-14 2010-03-02 Unisys Corporation System and method for detecting and recovering from errors in an instruction stream of an electronic data processing system
US20070168720A1 (en) * 2005-11-30 2007-07-19 Oracle International Corporation Method and apparatus for providing fault tolerance in a collaboration environment
US7870556B2 (en) 2006-05-16 2011-01-11 Ab Initio Technology Llc Managing computing resources in graph-based computations
US7469406B2 (en) * 2006-07-31 2008-12-23 Sap Ag Process suspension through process model design
US8572633B2 (en) * 2006-07-31 2013-10-29 Sap Ag Exception handling for collaborating process models
KR101495575B1 (ko) 2006-08-10 2015-02-25 Ab Initio Technology LLC Distributing services in graph-based computations
US7669081B2 (en) * 2006-09-27 2010-02-23 Raytheon Company Systems and methods for scheduling, processing, and monitoring tasks
US7743276B2 (en) * 2006-09-27 2010-06-22 Hewlett-Packard Development Company, L.P. Sufficient free space for redundancy recovery within a distributed data-storage system
CN101595456A (zh) * 2006-12-27 2009-12-02 More IT Resources Ltd. Method and system for transaction resource control
WO2008114415A1 (ja) * 2007-03-20 2008-09-25 Fujitsu Limited Multiprocessing system
US8069129B2 (en) 2007-04-10 2011-11-29 Ab Initio Technology Llc Editing and compiling business rules
JP4375435B2 (ja) * 2007-05-23 2009-12-02 Hitachi, Ltd. Hierarchical storage system that performs predictive data migration
AU2008302144B2 (en) 2007-09-20 2014-09-11 Ab Initio Technology Llc Managing data flows in graph-based computations
US8301593B2 (en) * 2008-06-12 2012-10-30 Gravic, Inc. Mixed mode synchronous and asynchronous replication system
JP5219660B2 (ja) * 2008-07-04 2013-06-26 Canon Inc. Printing system, print control method, program, and printing apparatus
US20100011435A1 (en) * 2008-07-08 2010-01-14 Asp Works Pte Ltd Method and System for Providing Guaranteed File Transfer in Corporate Environment Behind Firewall
WO2010022246A2 (en) 2008-08-20 2010-02-25 Wal-Mart Stores, Inc. Process auto-restart systems and methods
US8200771B2 (en) * 2008-10-10 2012-06-12 International Business Machines Corporation Workload migration using on demand remote paging
US7984332B2 (en) * 2008-11-17 2011-07-19 Microsoft Corporation Distributed system checker
AU2010260587A1 (en) * 2009-06-19 2011-12-22 Core Technology Ltd Computer process management
JP2011022959A (ja) * 2009-07-21 2011-02-03 Mitsubishi Electric Corp Process execution device, computer program, and process execution method
US8713294B2 (en) * 2009-11-13 2014-04-29 International Business Machines Corporation Heap/stack guard pages using a wakeup unit
US8108718B2 (en) * 2009-11-13 2012-01-31 Hewlett-Packard Development Company, L.P. Checkpointing in massively parallel processing
US8739164B2 (en) * 2010-02-24 2014-05-27 Advanced Micro Devices, Inc. Automatic suspend atomic hardware transactional memory in response to detecting an implicit suspend condition and resume thereof
US8627123B2 (en) * 2010-03-25 2014-01-07 Microsoft Corporation Managing power provisioning in distributed computing
EP2567518B1 (en) 2010-05-06 2014-03-26 Telefonaktiebolaget LM Ericsson (publ) Handling failure of request message during set up of label switched path
US8862937B2 (en) * 2010-05-06 2014-10-14 Verizon Patent And Licensing Inc. Method and system for migrating data from multiple sources
US9207993B2 (en) * 2010-05-13 2015-12-08 Microsoft Technology Licensing, Llc Dynamic application placement based on cost and availability of energy in datacenters
US8464104B2 (en) * 2010-09-10 2013-06-11 International Business Machines Corporation Mobility of versioned workload partitions
US20120158447A1 (en) * 2010-12-20 2012-06-21 Microsoft Corporation Pricing batch computing jobs at data centers
US9164806B2 (en) * 2011-01-28 2015-10-20 Oracle International Corporation Processing pattern framework for dispatching and executing tasks in a distributed computing grid
US9116759B2 (en) * 2011-02-18 2015-08-25 Ab Initio Technology Llc Restarting data processing systems

Non-Patent Citations (1)

Title
Characterization of Consistent Global Checkpoints in Large-Scale Distributed Systems; R. Baldoni et al.; Proceedings of the Fifth IEEE Computer Society Workshop on Future Trends of Distributed Computing Systems; 1995-09-30; full text *

Also Published As

Publication number Publication date
CN103370693A (zh) 2013-10-23
EP2676198A1 (en) 2013-12-25
US9021299B2 (en) 2015-04-28
WO2012112748A1 (en) 2012-08-23
CA2826282C (en) 2019-06-04
EP2676198B1 (en) 2025-05-21
US9268645B2 (en) 2016-02-23
JP6556110B2 (ja) 2019-08-07
US20120216073A1 (en) 2012-08-23
CA2826282A1 (en) 2012-08-23
AU2012217621B2 (en) 2016-02-18
JP2014509012A (ja) 2014-04-10
KR20140004702A (ko) 2014-01-13
JP2017041263A (ja) 2017-02-23
US20150212891A1 (en) 2015-07-30
AU2012217621A1 (en) 2013-05-02

Similar Documents

Publication Publication Date Title
CN103370693B (zh) Restarting processes
JP6377703B2 (ja) Restarting data processing systems
JP2014509012A5
Gupta et al. Just-in-time checkpointing: Low cost error recovery from deep learning training failures
Jha et al. Resiliency of hpc interconnects: A case study of interconnect failures and recovery in blue waters
US8095826B1 (en) Method and apparatus for providing in-memory checkpoint services within a distributed transaction
Tardieu et al. Reliable actors with retry orchestration
Balazinska et al. Fault-tolerance and high availability in data stream management systems
CN104516778B (zh) A system and method for saving and restoring process checkpoints in a multitasking environment
Arockiam et al. FTM-A middle layer architecture for fault tolerance in cloud computing
HK1187715A (en) Restarting data processing systems
HK1187715B (en) Restarting data processing systems
CN112650565A (zh) An application process recovery method and device
CN206224446U (zh) An e-commerce platform website data protection system
US20250199920A1 (en) Partition-based Escrow in a Distributed Computing System
Balazinska et al. Fault tolerance and high availability in data stream management systems
Santos Performability issues of fault tolerance solutions for message-passing systems: the case of RADIC
Sivapriya et al. RMS A Flexible Approach for Fault Tolerant Mechanism in the Grid Environments

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant