KR101511837B1 - 벡터 분할 루프들의 성능 향상 - Google Patents

벡터 분할 루프들의 성능 향상 Download PDF

Info

Publication number
KR101511837B1
KR101511837B1 KR20130035807A KR20130035807A KR101511837B1 KR 101511837 B1 KR101511837 B1 KR 101511837B1 KR 20130035807 A KR20130035807 A KR 20130035807A KR 20130035807 A KR20130035807 A KR 20130035807A KR 101511837 B1 KR101511837 B1 KR 101511837B1
Authority
KR
South Korea
Prior art keywords
vector
instruction
prediction
conditional branch
branch instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
KR20130035807A
Other languages
English (en)
Korean (ko)
Other versions
KR20130112009A (ko
Inventor
제프리 이. 고니온
Original Assignee
애플 인크.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 애플 인크. filed Critical 애플 인크.
Publication of KR20130112009A publication Critical patent/KR20130112009A/ko
Application granted granted Critical
Publication of KR101511837B1 publication Critical patent/KR101511837B1/ko
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/322Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
    • G06F9/323Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for indirect branch instructions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30072Arrangements for executing specific machine instructions to perform conditional operations, e.g. using predicates or guards
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G06F9/30038Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3005Arrangements for executing specific machine instructions to perform operations for flow control
    • G06F9/30058Conditional branch instructions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838Dependency mechanisms, e.g. register scoreboarding
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842Speculative instruction execution
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842Speculative instruction execution
    • G06F9/3848Speculative instruction execution using hybrid branch prediction, e.g. selection between prediction techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Executing Machine-Instructions (AREA)
  • Devices For Executing Special Programs (AREA)
  • Advance Control (AREA)
  • Complex Calculations (AREA)
KR20130035807A 2012-04-02 2013-04-02 벡터 분할 루프들의 성능 향상 Active KR101511837B1 (ko)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/437,482 US9116686B2 (en) 2012-04-02 2012-04-02 Selective suppression of branch prediction in vector partitioning loops until dependency vector is available for predicate generating instruction
US13/437,482 2012-04-02

Publications (2)

Publication Number Publication Date
KR20130112009A KR20130112009A (ko) 2013-10-11
KR101511837B1 true KR101511837B1 (ko) 2015-04-13

Family

ID=48044642

Family Applications (1)

Application Number Title Priority Date Filing Date
KR20130035807A Active KR101511837B1 (ko) 2012-04-02 2013-04-02 벡터 분할 루프들의 성능 향상

Country Status (8)

Country Link
US (1) US9116686B2 (https=)
EP (1) EP2648090A3 (https=)
JP (1) JP2013254484A (https=)
KR (1) KR101511837B1 (https=)
CN (1) CN103383640B (https=)
BR (1) BR102013007865A2 (https=)
TW (1) TWI512617B (https=)
WO (1) WO2013151861A1 (https=)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10241793B2 (en) 2013-03-15 2019-03-26 Analog Devices Global Paralleizing loops in the presence of possible memory aliases
US9323531B2 (en) * 2013-03-15 2016-04-26 Intel Corporation Systems, apparatuses, and methods for determining a trailing least significant masking bit of a writemask register
GB2519108A (en) * 2013-10-09 2015-04-15 Advanced Risc Mach Ltd A data processing apparatus and method for controlling performance of speculative vector operations
GB2519107B (en) 2013-10-09 2020-05-13 Advanced Risc Mach Ltd A data processing apparatus and method for performing speculative vector access operations
KR101826707B1 (ko) 2014-03-27 2018-02-07 인텔 코포레이션 마스킹된 결과 요소들로의 전파를 이용하여 연속 소스 요소들을 마스킹되지 않은 결과 요소들에 저장하기 위한 프로세서, 방법, 시스템 및 명령어
EP3123300A1 (en) 2014-03-28 2017-02-01 Intel Corporation Processors, methods, systems, and instructions to store source elements to corresponding unmasked result elements with propagation to masked result elements
US10289417B2 (en) 2014-10-21 2019-05-14 Arm Limited Branch prediction suppression for blocks of instructions predicted to not include a branch instruction
US20160179550A1 (en) * 2014-12-23 2016-06-23 Intel Corporation Fast vector dynamic memory conflict detection
GB2540941B (en) * 2015-07-31 2017-11-15 Advanced Risc Mach Ltd Data processing
GB2545248B (en) 2015-12-10 2018-04-04 Advanced Risc Mach Ltd Data processing
GB2548602B (en) * 2016-03-23 2019-10-23 Advanced Risc Mach Ltd Program loop control
GB2549737B (en) * 2016-04-26 2019-05-08 Advanced Risc Mach Ltd An apparatus and method for managing address collisions when performing vector operations
GB2571527B (en) * 2018-02-28 2020-09-16 Advanced Risc Mach Ltd Data processing
US11860996B1 (en) 2018-04-06 2024-01-02 Apple Inc. Security concepts for web frameworks
US10915322B2 (en) * 2018-09-18 2021-02-09 Advanced Micro Devices, Inc. Using loop exit prediction to accelerate or suppress loop mode of a processor
JP7243742B2 (ja) * 2019-01-18 2023-03-22 日本電気株式会社 最適化装置、最適化方法、及びプログラム
US11403256B2 (en) * 2019-05-20 2022-08-02 Micron Technology, Inc. Conditional operations in a vector processor having true and false vector index registers
US11507374B2 (en) 2019-05-20 2022-11-22 Micron Technology, Inc. True/false vector index registers and methods of populating thereof
US11340904B2 (en) 2019-05-20 2022-05-24 Micron Technology, Inc. Vector index registers
US11327862B2 (en) 2019-05-20 2022-05-10 Micron Technology, Inc. Multi-lane solutions for addressing vector elements using vector index registers
US11989554B2 (en) * 2020-12-23 2024-05-21 Intel Corporation Processing pipeline with zero loop overhead
CN115113934B (zh) * 2022-08-31 2022-11-11 腾讯科技(深圳)有限公司 指令处理方法、装置、程序产品、计算机设备和介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005531848A (ja) 2002-06-28 2005-10-20 モトローラ・インコーポレイテッド 再構成可能なストリーミングベクトルプロセッサ
KR20060070576A (ko) * 1998-08-24 2006-06-23 어드밴스드 마이크로 디바이시즈, 인코포레이티드 저장 어드레스 생성에 대한 적재 블록을 위한 메커니즘
US20100058037A1 (en) 2008-08-15 2010-03-04 Apple Inc. Running-shift instructions for processing vectors
US20120060020A1 (en) * 2008-08-15 2012-03-08 Apple Inc. Vector index instruction for processing vectors

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5228131A (en) * 1988-02-24 1993-07-13 Mitsubishi Denki Kabushiki Kaisha Data processor with selectively enabled and disabled branch prediction operation
US5903750A (en) 1996-11-20 1999-05-11 Institute For The Development Of Emerging Architectures, L.L.P. Dynamic branch prediction for branch instructions with multiple targets
JPH1185515A (ja) 1997-09-10 1999-03-30 Ricoh Co Ltd マイクロプロセッサ
US6988183B1 (en) 1998-06-26 2006-01-17 Derek Chi-Lan Wong Methods for increasing instruction-level parallelism in microprocessors and digital system
US6353883B1 (en) 1998-08-04 2002-03-05 Intel Corporation Method and apparatus for performing predicate prediction
JP2000322257A (ja) 1999-05-10 2000-11-24 Nec Corp 条件分岐命令の投機的実行制御方法
US7571302B1 (en) 2004-02-04 2009-08-04 Lei Chen Dynamic data dependence tracking and its application to branch prediction
US20060168432A1 (en) 2005-01-24 2006-07-27 Paul Caprioli Branch prediction accuracy in a processor that supports speculative execution
US7587580B2 (en) 2005-02-03 2009-09-08 Qualcomm Corporated Power efficient instruction prefetch mechanism
US20070288732A1 (en) * 2006-06-08 2007-12-13 Luick David A Hybrid Branch Prediction Scheme
WO2008029450A1 (fr) 2006-09-05 2008-03-13 Fujitsu Limited Dispositif de traitement d'informations comprenant un mécanisme de correction d'erreur de prédiction d'embranchement
US7627742B2 (en) 2007-04-10 2009-12-01 International Business Machines Corporation Method and apparatus for conserving power by throttling instruction fetching when a processor encounters low confidence branches in an information handling system
US8006070B2 (en) 2007-12-05 2011-08-23 International Business Machines Corporation Method and apparatus for inhibiting fetch throttling when a processor encounters a low confidence branch instruction in an information handling system
US20110035568A1 (en) 2008-08-15 2011-02-10 Apple Inc. Select first and select last instructions for processing vectors
US20110283092A1 (en) 2008-08-15 2011-11-17 Apple Inc. Getfirst and assignlast instructions for processing vectors
US8447956B2 (en) 2008-08-15 2013-05-21 Apple Inc. Running subtract and running divide instructions for processing vectors
US8417921B2 (en) 2008-08-15 2013-04-09 Apple Inc. Running-min and running-max instructions for processing vectors using a base value from a key element of an input vector
US20100325399A1 (en) 2008-08-15 2010-12-23 Apple Inc. Vector test instruction for processing vectors
US8271832B2 (en) 2008-08-15 2012-09-18 Apple Inc. Non-faulting and first-faulting instructions for processing vectors
US8959316B2 (en) 2008-08-15 2015-02-17 Apple Inc. Actual instruction and actual-fault instructions for processing vectors
US8650383B2 (en) 2008-08-15 2014-02-11 Apple Inc. Vector processing with predicate vector for setting element values based on key element position by executing remaining instruction
US8984262B2 (en) 2008-08-15 2015-03-17 Apple Inc. Generate predicates instruction for processing vectors
US20100115233A1 (en) 2008-10-31 2010-05-06 Convey Computer Dynamically-selectable vector register partitioning
JP5387819B2 (ja) 2008-12-26 2014-01-15 日本電気株式会社 分岐予測の信頼度見積もり回路及びその方法
US7979675B2 (en) 2009-02-12 2011-07-12 Via Technologies, Inc. Pipelined microprocessor with fast non-selective correct conditional branch instruction resolution
US9176737B2 (en) 2011-02-07 2015-11-03 Arm Limited Controlling the execution of adjacent instructions that are dependent upon a same data condition
US9268569B2 (en) 2012-02-24 2016-02-23 Apple Inc. Branch misprediction behavior suppression on zero predicate branch mispredict

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20060070576A (ko) * 1998-08-24 2006-06-23 어드밴스드 마이크로 디바이시즈, 인코포레이티드 저장 어드레스 생성에 대한 적재 블록을 위한 메커니즘
JP2005531848A (ja) 2002-06-28 2005-10-20 モトローラ・インコーポレイテッド 再構成可能なストリーミングベクトルプロセッサ
US20100058037A1 (en) 2008-08-15 2010-03-04 Apple Inc. Running-shift instructions for processing vectors
US20120060020A1 (en) * 2008-08-15 2012-03-08 Apple Inc. Vector index instruction for processing vectors

Also Published As

Publication number Publication date
US20130262833A1 (en) 2013-10-03
WO2013151861A1 (en) 2013-10-10
TW201403470A (zh) 2014-01-16
JP2013254484A (ja) 2013-12-19
KR20130112009A (ko) 2013-10-11
BR102013007865A2 (pt) 2015-07-07
EP2648090A3 (en) 2017-11-01
EP2648090A2 (en) 2013-10-09
CN103383640A (zh) 2013-11-06
CN103383640B (zh) 2016-02-10
TWI512617B (zh) 2015-12-11
US9116686B2 (en) 2015-08-25

Similar Documents

Publication Publication Date Title
KR101511837B1 (ko) 벡터 분할 루프들의 성능 향상
KR101417597B1 (ko) 제로 프레디케이트 브랜치 예측실패에 대한 브랜치 예측실패 거동 억제
US8359460B2 (en) Running-sum instructions for processing vectors using a base value from a key element of an input vector
US8417921B2 (en) Running-min and running-max instructions for processing vectors using a base value from a key element of an input vector
US8793472B2 (en) Vector index instruction for generating a result vector with incremental values based on a start value and an increment value
US8447956B2 (en) Running subtract and running divide instructions for processing vectors
US8959316B2 (en) Actual instruction and actual-fault instructions for processing vectors
TW201419141A (zh) 用於執行推測式述詞指令之機制
US20100325399A1 (en) Vector test instruction for processing vectors
US20110283092A1 (en) Getfirst and assignlast instructions for processing vectors
US9715386B2 (en) Conditional stop instruction with accurate dependency detection
US9317284B2 (en) Vector hazard check instruction with reduced source operands
US9389860B2 (en) Prediction optimizations for Macroscalar vector partitioning loops
US9367309B2 (en) Predicate attribute tracker
US20160253179A1 (en) Concurrent execution of heterogeneous vector instructions
US20160092398A1 (en) Conditional Termination and Conditional Termination Predicate Instructions
US20130318332A1 (en) Branch misprediction behavior suppression using a branch optional instruction
US9390058B2 (en) Dynamic attribute inference
US20160092217A1 (en) Compare Break Instructions

Legal Events

Date Code Title Description
A201 Request for examination
PA0109 Patent application

St.27 status event code: A-0-1-A10-A12-nap-PA0109

PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

D13-X000 Search requested

St.27 status event code: A-1-2-D10-D13-srh-X000

D14-X000 Search report completed

St.27 status event code: A-1-2-D10-D14-srh-X000

E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

T11-X000 Administrative time limit extension requested

St.27 status event code: U-3-3-T10-T11-oth-X000

R17-X000 Change to representative recorded

St.27 status event code: A-3-3-R10-R17-oth-X000

E701 Decision to grant or registration of patent right
PE0701 Decision of registration

St.27 status event code: A-1-2-D10-D22-exm-PE0701

GRNT Written decision to grant
PR0701 Registration of establishment

St.27 status event code: A-2-4-F10-F11-exm-PR0701

PR1002 Payment of registration fee

St.27 status event code: A-2-2-U10-U11-oth-PR1002

Fee payment year number: 1

PG1601 Publication of registration

St.27 status event code: A-4-4-Q10-Q13-nap-PG1601

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

FPAY Annual fee payment

Payment date: 20180315

Year of fee payment: 4

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 4

FPAY Annual fee payment

Payment date: 20190315

Year of fee payment: 5

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 5

FPAY Annual fee payment

Payment date: 20200317

Year of fee payment: 6

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 6

FPAY Annual fee payment

Payment date: 20210324

Year of fee payment: 7

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 7

FPAY Annual fee payment

Payment date: 20220323

Year of fee payment: 8

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 8

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 9

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 10

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 11

P22-X000 Classification modified

St.27 status event code: A-4-4-P10-P22-nap-X000

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 12

U11 Full renewal or maintenance fee paid

Free format text: ST27 STATUS EVENT CODE: A-4-4-U10-U11-OTH-PR1001 (AS PROVIDED BY THE NATIONAL OFFICE)

Year of fee payment: 12