KR101691126B1 - Fault tolerant batch processing - Google Patents

Fault tolerant batch processing

Info

Publication number
KR101691126B1
Authority
KR
South Korea
Prior art keywords
checkpoint
processing
input data
buffer
fault tolerant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
KR1020127003290A
Other languages
English (en)
Korean (ko)
Other versions
KR20120040707A (ko)
Inventor
Bryan Phil Douros
Matthew Darcy Atterbury
Tim Wakeling
Original Assignee
Ab Initio Technology LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ab Initio Technology LLC
Publication of KR20120040707A
Application granted
Publication of KR101691126B1
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Retry When Errors Occur (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
KR1020127003290A 2009-07-14 2010-07-13 Fault tolerant batch processing Active KR101691126B1 (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US12/502,851 2009-07-14
US12/502,851 US8205113B2 (en) 2009-07-14 2009-07-14 Fault tolerant batch processing
PCT/US2010/041791 WO2011008734A1 (en) 2009-07-14 2010-07-13 Fault tolerant batch processing

Related Child Applications (2)

Application Number Priority Date Filing Date Title
KR1020167035996A Division KR101721466B1 (ko) 2009-07-14 2010-07-13 Fault tolerant batch processing
KR20157008164A Division KR20150042873A (ko) 2009-07-14 2010-07-13 Fault tolerant batch processing

Publications (2)

Publication Number Publication Date
KR20120040707A (ko) 2012-04-27
KR101691126B1 (ko) 2016-12-29

Family

ID=43449727

Family Applications (3)

Application Number Priority Date Filing Date Title
KR1020167035996A Active KR101721466B1 (ko) 2009-07-14 2010-07-13 Fault tolerant batch processing
KR20157008164A Withdrawn KR20150042873A (ko) 2009-07-14 2010-07-13 Fault tolerant batch processing
KR1020127003290A Active KR101691126B1 (ko) 2009-07-14 2010-07-13 Fault tolerant batch processing

Family Applications Before (2)

Application Number Priority Date Filing Date Title
KR1020167035996A Active KR101721466B1 (ko) 2009-07-14 2010-07-13 Fault tolerant batch processing
KR20157008164A Withdrawn KR20150042873A (ko) 2009-07-14 2010-07-13 Fault tolerant batch processing

Country Status (8)

Country Link
US (3) US8205113B2 (en)
EP (2) EP2454666B1 (en)
JP (3) JP5735961B2 (en)
KR (3) KR101721466B1 (en)
CN (2) CN102473122B (en)
AU (1) AU2010273531B2 (en)
CA (1) CA2767667C (en)
WO (1) WO2011008734A1 (en)

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7877350B2 (en) 2005-06-27 2011-01-25 Ab Initio Technology Llc Managing metadata for graph-based computations
CN101821721B (zh) 2007-07-26 2017-04-12 起元技术有限责任公司 Transactional graph-based computation with error handling
EP2396724A4 (en) 2009-02-13 2012-12-12 Ab Initio Technology Llc TASK EXECUTION MANAGEMENT
US8205113B2 (en) * 2009-07-14 2012-06-19 Ab Initio Technology Llc Fault tolerant batch processing
US20110051729A1 (en) * 2009-08-28 2011-03-03 Industrial Technology Research Institute and National Taiwan University Methods and apparatuses relating to pseudo random network coding design
US8667329B2 (en) * 2009-09-25 2014-03-04 Ab Initio Technology Llc Processing transactions in graph-based applications
CN107066241B (zh) 2010-06-15 2021-03-09 起元技术有限责任公司 System and method for dynamically loading graph-based computations
US8839252B1 (en) * 2010-09-01 2014-09-16 Misys Ireland Limited Parallel execution of batch data based on modeled batch processing workflow and contention context information
US9495477B1 (en) 2011-04-20 2016-11-15 Google Inc. Data storage in a graph processing system
US8849929B2 (en) * 2011-04-27 2014-09-30 Microsoft Corporation Applying actions to item sets within a constraint
US8924974B1 (en) * 2011-06-08 2014-12-30 Workday, Inc. System for error checking of process definitions for batch processes
US20140137121A1 (en) * 2012-10-05 2014-05-15 Hitachi, Ltd. Job management system and job control method
US10108521B2 (en) 2012-11-16 2018-10-23 Ab Initio Technology Llc Dynamic component performance monitoring
US9507682B2 (en) 2012-11-16 2016-11-29 Ab Initio Technology Llc Dynamic graph performance monitoring
US9274926B2 (en) 2013-01-03 2016-03-01 Ab Initio Technology Llc Configurable testing of computer programs
US9256460B2 (en) * 2013-03-15 2016-02-09 International Business Machines Corporation Selective checkpointing of links in a data flow based on a set of predefined criteria
US9401835B2 (en) 2013-03-15 2016-07-26 International Business Machines Corporation Data integration on retargetable engines in a networked environment
US9323619B2 (en) 2013-03-15 2016-04-26 International Business Machines Corporation Deploying parallel data integration applications to distributed computing environments
US9223806B2 (en) * 2013-03-28 2015-12-29 International Business Machines Corporation Restarting a batch process from an execution point
US9477511B2 (en) 2013-08-14 2016-10-25 International Business Machines Corporation Task-based modeling for parallel data integration
JP6626823B2 (ja) 2013-12-05 2019-12-25 アビニシオ テクノロジー エルエルシー Managing interfaces for dataflow graphs composed of sub-graphs
AU2015312006B2 (en) 2014-09-02 2020-03-19 Ab Initio Technology Llc Managing invocation of tasks
AU2015312012B2 (en) 2014-09-02 2020-02-27 Ab Initio Technology Llc Compiling graph-based program specifications
WO2016036817A1 (en) 2014-09-02 2016-03-10 Ab Initio Technology Llc Executing graph-based program specifications
CN104536893B (zh) * 2015-01-05 2018-01-30 中国农业银行股份有限公司 Error-tolerant handling method and device for batch processing programs
US10191948B2 (en) * 2015-02-27 2019-01-29 Microsoft Technology Licensing, Llc Joins and aggregations on massive graphs using large-scale graph processing
US10657134B2 (en) * 2015-08-05 2020-05-19 Ab Initio Technology Llc Selecting queries for execution on a stream of real-time data
WO2017105888A1 (en) * 2015-12-17 2017-06-22 Ab Initio Technology Llc Processing data using dynamic partitioning
CN108475189B (zh) 2015-12-21 2021-07-09 起元技术有限责任公司 Method, system, and computer-readable medium for sub-graph interface generation
WO2017123849A1 (en) * 2016-01-14 2017-07-20 Ab Initio Technology Llc Recoverable stream processing
US10073746B2 (en) * 2016-07-12 2018-09-11 Advanced Micro Devices, Inc. Method and apparatus for providing distributed checkpointing
US10802945B2 (en) * 2016-12-07 2020-10-13 Ab Initio Technology Llc Differencing of executable dataflow graphs
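CN108009037A (zh) * 2017-11-24 2018-05-08 中国银行股份有限公司 Batch job fault handling method, apparatus, storage medium, and device
CN108491159B (zh) * 2018-03-07 2020-07-17 北京航空航天大学 Checkpoint data writing method for large-scale parallel systems that mitigates the I/O bottleneck using random delay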
US11886433B2 (en) * 2022-01-10 2024-01-30 Red Hat, Inc. Dynamic data batching for graph-based structures
KR102625797B1 (ko) * 2023-03-06 2024-01-16 주식회사 모레 Method and apparatus for pipeline-parallel compiling
CN120020718A (zh) * 2023-11-17 2025-05-20 北京度友信息技术有限公司 Service scheduling method, apparatus, device, and storage medium
US20250199920A1 (en) * 2023-12-13 2025-06-19 Ab Initio Technology Llc Partition-based Escrow in a Distributed Computing System

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010042224A1 (en) * 1999-12-06 2001-11-15 Stanfill Craig W. Continuous flow compute point based data processing
US20080005227A1 (en) * 2006-07-03 2008-01-03 Srinivasan Subbian Method and system for content processing
JP2008293358A (ja) * 2007-05-25 2008-12-04 Fujitsu Ltd Distributed processing program, distributed processing method, distributed processing apparatus, and distributed processing system

Family Cites Families (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0212532A (ja) * 1988-06-30 1990-01-17 Toshiba Corp Information processing device
EP0554854A3 (en) * 1992-02-04 1996-02-28 Digital Equipment Corp System and method for executing, tracking and recovering long running computations
JPH06274401A (ja) * 1993-03-18 1994-09-30 Nec Corp Distributed database control system
US5819021A (en) 1995-12-11 1998-10-06 Ab Initio Software Corporation Overpartitioning system and method for increasing checkpoints in component-based parallel applications
US5966072A (en) 1996-07-02 1999-10-12 Ab Initio Software Corporation Executing computations expressed as graphs
US6154877A (en) 1997-07-03 2000-11-28 The University Of Iowa Research Foundation Method and apparatus for portable checkpointing using data structure metrics and conversion functions
JP2000039990A (ja) * 1998-07-24 2000-02-08 Nippon Telegr & Teleph Corp <Ntt> Information providing device and method, and recording medium storing an information providing program
US6401216B1 (en) * 1998-10-29 2002-06-04 International Business Machines Corporation System of performing checkpoint/restart of a parallel program
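JP4094752B2 (ja) * 1998-11-27 2008-06-04 株式会社日立製作所 Transaction processing method, apparatus for implementing the same, and medium storing the processing program
JP3463020B2 (ja) * 2000-06-14 2003-11-05 日本電信電話株式会社 Workflow execution method and device, and recording medium storing a workflow execution program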
US7164422B1 (en) 2000-07-28 2007-01-16 Ab Initio Software Corporation Parameterized graphs with conditional components
US7412520B2 (en) * 2001-06-07 2008-08-12 Intel Corporation Systems and methods for recoverable workflow
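JP2003085021A (ja) * 2001-09-07 2003-03-20 Nippon Soken Holdings:Kk Batch processing system with recovery and restart functions, program for a batch processing system with recovery and restart functions, and recording medium storing the program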
US6954877B2 (en) * 2001-11-29 2005-10-11 Agami Systems, Inc. Fault tolerance using logical checkpointing in computing systems
US7206964B2 (en) * 2002-08-30 2007-04-17 Availigent, Inc. Consistent asynchronous checkpointing of multithreaded application programs based on semi-active or passive replication
US7167850B2 (en) 2002-10-10 2007-01-23 Ab Initio Software Corporation Startup and control of graph-based computation
JP2004178316A (ja) * 2002-11-27 2004-06-24 Computer Consulting:Kk Program generation device and method
US7340741B2 (en) * 2003-02-28 2008-03-04 International Business Machines Corporation Auto-restart processing in an IMS batch application
CN101271471B (zh) * 2003-09-15 2011-08-17 起元科技有限公司 Data processing method, software, and data processing system
US7756873B2 (en) 2003-09-15 2010-07-13 Ab Initio Technology Llc Functional dependency data profiling
US8024733B2 (en) * 2004-05-13 2011-09-20 International Business Machines Corporation Component model for batch computing in a distributed object environment
US7644050B2 (en) * 2004-12-02 2010-01-05 International Business Machines Corporation Method and apparatus for annotation-based behavior extensions
US7543001B2 (en) * 2004-06-17 2009-06-02 International Business Machines Corporation Storing object recovery information within the object
DE102004037713A1 (de) * 2004-08-04 2006-03-16 Robert Bosch Gmbh Method, operating system, and computing device for executing a computer program
US7899833B2 (en) 2004-11-02 2011-03-01 Ab Initio Technology Llc Managing related data objects
US7665093B2 (en) * 2004-12-22 2010-02-16 Microsoft Corporation Synchronization of runtime and application state via batching of workflow transactions
US7634687B2 (en) * 2005-01-13 2009-12-15 Microsoft Corporation Checkpoint restart system and method
JP4710380B2 (ja) * 2005-03-31 2011-06-29 日本電気株式会社 Distributed processing system and distributed processing method
US7613749B2 (en) * 2006-04-12 2009-11-03 International Business Machines Corporation System and method for application fault tolerance and recovery using topologically remotely located computing devices
US7870556B2 (en) * 2006-05-16 2011-01-11 Ab Initio Technology Llc Managing computing resources in graph-based computations
US8572236B2 (en) 2006-08-10 2013-10-29 Ab Initio Technology Llc Distributing services in graph-based computations
CN100444121C (zh) * 2006-09-11 2008-12-17 中国工商银行股份有限公司 Batch task scheduling engine and scheduling method
JP5018133B2 (ja) * 2007-02-27 2012-09-05 富士通株式会社 Job management device, cluster system, and job management program
US8069129B2 (en) 2007-04-10 2011-11-29 Ab Initio Technology Llc Editing and compiling business rules
US7900015B2 (en) * 2007-04-13 2011-03-01 Isilon Systems, Inc. Systems and methods of quota accounting
US7895474B2 (en) * 2007-05-03 2011-02-22 International Business Machines Corporation Recovery and restart of a batch application
US7779298B2 (en) * 2007-06-11 2010-08-17 International Business Machines Corporation Distributed job manager recovery
JP5453273B2 (ja) 2007-09-20 2014-03-26 アビニシオ テクノロジー エルエルシー Data flow management in graph-based computations
US8949801B2 (en) * 2009-05-13 2015-02-03 International Business Machines Corporation Failure recovery for stream processing applications
US8205113B2 (en) * 2009-07-14 2012-06-19 Ab Initio Technology Llc Fault tolerant batch processing

Also Published As

Publication number Publication date
EP2851799B1 (en) 2016-02-17
EP2454666B1 (en) 2014-11-12
CN102473122A (zh) 2012-05-23
US20140053159A1 (en) 2014-02-20
KR101721466B1 (ko) 2017-03-30
US8205113B2 (en) 2012-06-19
JP2016129056A (ja) 2016-07-14
KR20150042873A (ko) 2015-04-21
KR20160150126A (ko) 2016-12-28
US20110016354A1 (en) 2011-01-20
CA2767667C (en) 2020-08-18
JP5897747B2 (ja) 2016-03-30
JP5735961B2 (ja) 2015-06-17
HK1165051A1 (en) 2012-09-28
AU2010273531A1 (en) 2012-01-19
CN105573866B (zh) 2018-11-13
AU2010273531B2 (en) 2014-09-25
CA2767667A1 (en) 2011-01-20
KR20120040707A (ko) 2012-04-27
US20120311588A1 (en) 2012-12-06
HK1202951A1 (en) 2015-10-09
JP2015143999A (ja) 2015-08-06
EP2454666A4 (en) 2013-04-24
CN105573866A (zh) 2016-05-11
EP2851799A1 (en) 2015-03-25
JP2012533796A (ja) 2012-12-27
US9304807B2 (en) 2016-04-05
EP2454666A1 (en) 2012-05-23
CN102473122B (zh) 2016-01-20
US8566641B2 (en) 2013-10-22
JP6499986B2 (ja) 2019-04-10
WO2011008734A1 (en) 2011-01-20

Similar Documents

Publication Publication Date Title
KR101691126B1 (ko) Fault tolerant batch processing
US8739171B2 (en) High-throughput-computing in a hybrid computing environment
CN105487930A (zh) Hadoop-based task optimization scheduling method
Petrov et al. Adaptive performance model for dynamic scaling Apache Spark Streaming
WO2024260034A1 (zh) Distributed training task scheduling method, device, and non-volatile readable storage medium
Liu et al. Optimizing shuffle in wide-area data analytics
Ousterhout et al. Performance clarity as a first-class design principle
JP2023183342A (ja) Job scheduler and job scheduling method
US6742086B1 (en) Affinity checking process for multiple processor, multiple bus optimization of throughput
AU2014274491B2 (en) Fault tolerant batch processing
CN117149381A (zh) Distributed task management method, system, computing device, and storage medium
HK1165051B (en) Fault tolerant batch processing
CN119248201B (zh) Method, device, and storage medium for data management and distribution on a machine learning platform
HK1202951B (en) Fault tolerant batch processing
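CN116302546A (zh) Stream processing system, state recovery method, device, and storage medium
CN120872511A (zh) Task scheduling method and device
JP2004094422A (ja) Checkpoint information collection method, apparatus for implementing the same, and processing program therefor
CN115686761A (zh) Task scheduling method and device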
Roeland Scheduling in Multi-X: a performance evaluation

Legal Events

Date Code Title Description
PA0105 International application

St.27 status event code: A-0-1-A10-A15-nap-PA0105

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

A201 Request for examination
P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000

PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

A107 Divisional application of patent
PA0104 Divisional application for international application

St.27 status event code: A-0-1-A10-A18-div-PA0104

St.27 status event code: A-0-1-A10-A16-div-PA0104

D13-X000 Search requested

St.27 status event code: A-1-2-D10-D13-srh-X000

D14-X000 Search report completed

St.27 status event code: A-1-2-D10-D14-srh-X000

E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

T11-X000 Administrative time limit extension requested

St.27 status event code: U-3-3-T10-T11-oth-X000

P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000

E701 Decision to grant or registration of patent right
PE0701 Decision of registration

St.27 status event code: A-1-2-D10-D22-exm-PE0701

A107 Divisional application of patent
PA0104 Divisional application for international application

St.27 status event code: A-0-1-A10-A18-div-PA0104

St.27 status event code: A-0-1-A10-A16-div-PA0104

GRNT Written decision to grant
PR0701 Registration of establishment

St.27 status event code: A-2-4-F10-F11-exm-PR0701

PR1002 Payment of registration fee

St.27 status event code: A-2-2-U10-U12-oth-PR1002

Fee payment year number: 1

PG1601 Publication of registration

St.27 status event code: A-4-4-Q10-Q13-nap-PG1601

R17-X000 Change to representative recorded

St.27 status event code: A-5-5-R10-R17-oth-X000

FPAY Annual fee payment

Payment date: 20191213

Year of fee payment: 4

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 4

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 5

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 6

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 7

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 8

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 9

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 10