JP6659724B2 - System and method for determining concurrency factors for dispatch size of parallel processor kernels - Google Patents

System and method for determining concurrency factors for dispatch size of parallel processor kernels

Info

Publication number
JP6659724B2
Authority
JP
Japan
Prior art keywords
kernel
parallel processor
workgroups
sequence
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2017554900A
Other languages
English (en)
Japanese (ja)
Other versions
JP2018514869A5
JP2018514869A (ja)
Inventor
Rathijit Sen
Indrani Paul
Wei Huang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Micro Devices Inc
Original Assignee
Advanced Micro Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Micro Devices Inc
Publication of JP2018514869A
Publication of JP2018514869A5
Application granted
Publication of JP6659724B2
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/545 Interprogram communication where tasks reside in different layers, e.g. user- and kernel-space
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3404 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for parallel or distributed programming
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3433 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment for load management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4893 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues taking into account power or heat criteria
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)
JP2017554900A 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels Active JP6659724B2 (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/710,879 2015-05-13
US14/710,879 US9965343B2 (en) 2015-05-13 2015-05-13 System and method for determining concurrency factors for dispatch size of parallel processor kernels
PCT/US2016/023560 WO2016182636A1 (en) 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels

Publications (3)

Publication Number Publication Date
JP2018514869A (ja) 2018-06-07
JP2018514869A5 2019-05-09
JP6659724B2 (ja) 2020-03-04

Family

ID=57248294

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2017554900A Active JP6659724B2 (ja) 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels

Country Status (6)

Country Link
US (1) US9965343B2 (en)
EP (1) EP3295300B1 (en)
JP (1) JP6659724B2 (ja)
KR (1) KR102548402B1 (ko)
CN (1) CN107580698B (zh)
WO (1) WO2016182636A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10324730B2 (en) * 2016-03-24 2019-06-18 Mediatek, Inc. Memory shuffle engine for efficient work execution in a parallel computing system
US20180115496A1 (en) * 2016-10-21 2018-04-26 Advanced Micro Devices, Inc. Mechanisms to improve data locality for distributed gpus
US10558499B2 (en) * 2017-10-26 2020-02-11 Advanced Micro Devices, Inc. Wave creation control with dynamic resource allocation
US12405790B2 (en) * 2019-06-28 2025-09-02 Advanced Micro Devices, Inc. Compute unit sorting for reduced divergence
US11900123B2 (en) * 2019-12-13 2024-02-13 Advanced Micro Devices, Inc. Marker-based processor instruction grouping
US11809902B2 (en) * 2020-09-24 2023-11-07 Advanced Micro Devices, Inc. Fine-grained conditional dispatching
US20220334845A1 (en) * 2021-04-15 2022-10-20 Nvidia Corporation Launching code concurrently
US20240220315A1 (en) * 2022-12-30 2024-07-04 Advanced Micro Devices, Inc. Dynamic control of work scheduling

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6516310B2 (en) * 1999-12-07 2003-02-04 Sybase, Inc. System and methodology for join enumeration in a memory-constrained environment
US8136104B2 (en) 2006-06-20 2012-03-13 Google Inc. Systems and methods for determining compute kernels for an application in a parallel-processing computer system
US8375368B2 (en) * 2006-06-20 2013-02-12 Google Inc. Systems and methods for profiling an application running on a parallel-processing computer system
US9354944B2 (en) * 2009-07-27 2016-05-31 Advanced Micro Devices, Inc. Mapping processing logic having data-parallel threads across processors
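CN102023844B (zh) * 2009-09-18 2014-04-09 深圳中微电科技有限公司 Parallel processor and thread processing method thereof
KR101079697B1 (ko) * 2009-10-05 2011-11-03 주식회사 글로벌미디어테크 High-speed image processing method using a parallel processor of a general-purpose graphics processing unit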
US8250404B2 (en) * 2009-12-31 2012-08-21 International Business Machines Corporation Process integrity of work items in a multiple processor system
US8782645B2 (en) * 2011-05-11 2014-07-15 Advanced Micro Devices, Inc. Automatic load balancing for heterogeneous cores
US9092267B2 (en) * 2011-06-20 2015-07-28 Qualcomm Incorporated Memory sharing in graphics processing unit
US20120331278A1 (en) 2011-06-23 2012-12-27 Mauricio Breternitz Branch removal by data shuffling
US8707314B2 (en) * 2011-12-16 2014-04-22 Advanced Micro Devices, Inc. Scheduling compute kernel workgroups to heterogeneous processors based on historical processor execution times and utilizations
US9823927B2 (en) * 2012-11-30 2017-11-21 Intel Corporation Range selection for data parallel programming environments
CN105027089B (zh) * 2013-03-14 2018-05-22 Intel Corporation Kernel functionality checker
US9772864B2 (en) * 2013-04-16 2017-09-26 Arm Limited Methods of and apparatus for multidimensional indexing in microprocessor systems
US10297073B2 (en) * 2016-02-25 2019-05-21 Intel Corporation Method and apparatus for in-place construction of left-balanced and complete point K-D trees

Also Published As

Publication number Publication date
US9965343B2 (en) 2018-05-08
EP3295300A1 (en) 2018-03-21
EP3295300B1 (en) 2022-03-23
EP3295300A4 (en) 2019-01-09
KR102548402B1 (ko) 2023-06-27
WO2016182636A1 (en) 2016-11-17
CN107580698A (zh) 2018-01-12
KR20180011096A (ko) 2018-01-31
CN107580698B (zh) 2019-08-23
US20160335143A1 (en) 2016-11-17
JP2018514869A (ja) 2018-06-07

Similar Documents

Publication Publication Date Title
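JP6659724B2 (ja) System and method for determining concurrency factors for dispatch size of parallel processor kernels
KR101839544B1 (ko) Automatic load balancing for heterogeneous cores
KR20130116166A (ko) Multithreaded application-aware memory scheduling scheme for multi-core processors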
US20110161637A1 (en) Apparatus and method for parallel processing
US10073783B2 (en) Dual mode local data store
JP2022160691A (ja) Data-driven scheduler on multiple computing cores
KR20210089247A (ko) Pipelined processing of matrix multiplication in a graphics processing unit
US9830731B2 (en) Methods of a graphics-processing unit for tile-based rendering of a display area and graphics-processing apparatus
US9104490B2 (en) Methods, systems and apparatuses for processor selection in multi-processor systems
US9898333B2 (en) Method and apparatus for selecting preemption technique
US11893502B2 (en) Dynamic hardware selection for experts in mixture-of-experts model
JP2002049603A (ja) Dynamic load balancing method and dynamic load balancing apparatus
US12321733B2 (en) Apparatus and method with neural network computation scheduling
CN118796389A (zh) Reduction scheduling method and apparatus
CN114490002B (zh) Data processing system, task scheduling method, apparatus, chip, and electronic device
US20180253288A1 (en) Dynamically predict and enhance energy efficiency
US11144428B2 (en) Efficient calculation of performance data for a computer
CN113362878A (zh) Method for in-memory computing and system for computing
US20250321792A1 (en) Method and apparatus for processing workload using memories of different types
Guo et al. DaCP: Accelerating Synchronization-Free SpTRSV via GPU-Friendly Data Communication and Parallelism Strategies
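KR101721341B1 (ko) Execution device determination module for use in a heterogeneous multicore environment and execution device determination method using the same
KR102055617B1 (ko) Method and system for extending the virtual address space of a process running in an operating system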

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20171220

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20190322

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20190322

A871 Explanation of circumstances concerning accelerated examination

Free format text: JAPANESE INTERMEDIATE CODE: A871

Effective date: 20190322

A975 Report on accelerated examination

Free format text: JAPANESE INTERMEDIATE CODE: A971005

Effective date: 20190710

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20190717

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20190730

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20191030

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20200107

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20200206

R150 Certificate of patent or registration of utility model

Ref document number: 6659724

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250