JP2018514869A5 - Google Patents


Publication number
JP2018514869A5
Authority
JP
Japan
Prior art keywords
kernel
parallel processor
mini
workgroups
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2017554900A
Other languages
English (en)
Japanese (ja)
Other versions
JP6659724B2 (ja)
JP2018514869A (ja)
Filing date
Publication date
Priority claimed from US14/710,879 (US9965343B2)
Application filed
Publication of JP2018514869A
Publication of JP2018514869A5
Application granted
Publication of JP6659724B2
Legal status: Active (current)
Anticipated expiration

JP2017554900A 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels Active JP6659724B2 (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/710,879 2015-05-13
US14/710,879 US9965343B2 (en) 2015-05-13 2015-05-13 System and method for determining concurrency factors for dispatch size of parallel processor kernels
PCT/US2016/023560 WO2016182636A1 (en) 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels

Publications (3)

Publication Number Publication Date
JP2018514869A (ja) 2018-06-07
JP2018514869A5 2019-05-09
JP6659724B2 (ja) 2020-03-04

Family

ID=57248294

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2017554900A Active JP6659724B2 (ja) 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels

Country Status (6)

Country Link
US (1) US9965343B2 (en)
EP (1) EP3295300B1 (en)
JP (1) JP6659724B2 (en)
KR (1) KR102548402B1 (en)
CN (1) CN107580698B (en)
WO (1) WO2016182636A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10324730B2 (en) * 2016-03-24 2019-06-18 Mediatek, Inc. Memory shuffle engine for efficient work execution in a parallel computing system
US20180115496A1 (en) * 2016-10-21 2018-04-26 Advanced Micro Devices, Inc. Mechanisms to improve data locality for distributed gpus
US10558499B2 (en) * 2017-10-26 2020-02-11 Advanced Micro Devices, Inc. Wave creation control with dynamic resource allocation
US12405790B2 (en) * 2019-06-28 2025-09-02 Advanced Micro Devices, Inc. Compute unit sorting for reduced divergence
US11900123B2 (en) * 2019-12-13 2024-02-13 Advanced Micro Devices, Inc. Marker-based processor instruction grouping
US11809902B2 (en) * 2020-09-24 2023-11-07 Advanced Micro Devices, Inc. Fine-grained conditional dispatching
US20220334845A1 (en) * 2021-04-15 2022-10-20 Nvidia Corporation Launching code concurrently
US20240220315A1 (en) * 2022-12-30 2024-07-04 Advanced Micro Devices, Inc. Dynamic control of work scheduling

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6516310B2 (en) * 1999-12-07 2003-02-04 Sybase, Inc. System and methodology for join enumeration in a memory-constrained environment
US8136104B2 (en) 2006-06-20 2012-03-13 Google Inc. Systems and methods for determining compute kernels for an application in a parallel-processing computer system
US8375368B2 (en) * 2006-06-20 2013-02-12 Google Inc. Systems and methods for profiling an application running on a parallel-processing computer system
US9354944B2 (en) * 2009-07-27 2016-05-31 Advanced Micro Devices, Inc. Mapping processing logic having data-parallel threads across processors
CN102023844B (zh) * 2009-09-18 2014-04-09 深圳中微电科技有限公司 Parallel processor and thread processing method thereof
KR101079697B1 (ko) * 2009-10-05 2011-11-03 주식회사 글로벌미디어테크 High-speed image processing method using the parallel processors of a general-purpose graphics processing unit
US8250404B2 (en) * 2009-12-31 2012-08-21 International Business Machines Corporation Process integrity of work items in a multiple processor system
US8782645B2 (en) * 2011-05-11 2014-07-15 Advanced Micro Devices, Inc. Automatic load balancing for heterogeneous cores
US9092267B2 (en) * 2011-06-20 2015-07-28 Qualcomm Incorporated Memory sharing in graphics processing unit
US20120331278A1 (en) 2011-06-23 2012-12-27 Mauricio Breternitz Branch removal by data shuffling
US8707314B2 (en) * 2011-12-16 2014-04-22 Advanced Micro Devices, Inc. Scheduling compute kernel workgroups to heterogeneous processors based on historical processor execution times and utilizations
US9823927B2 (en) * 2012-11-30 2017-11-21 Intel Corporation Range selection for data parallel programming environments
CN105027089B (zh) * 2013-03-14 2018-05-22 Intel Corporation Kernel functionality checker
US9772864B2 (en) * 2013-04-16 2017-09-26 Arm Limited Methods of and apparatus for multidimensional indexing in microprocessor systems
US10297073B2 (en) * 2016-02-25 2019-05-21 Intel Corporation Method and apparatus for in-place construction of left-balanced and complete point K-D trees

Similar Documents

Publication Publication Date Title
JP2018514869A5 (en)
Guo et al. A performance modeling and optimization analysis tool for sparse matrix-vector multiplication on GPUs
Singh et al. Energy-efficient run-time mapping and thread partitioning of concurrent OpenCL applications on CPU-GPU MPSoCs
Bailey et al. Adaptive configuration selection for power-constrained heterogeneous systems
JP6659724B2 (ja) System and method for determining concurrency factors for dispatch size of parallel processor kernels
Varrette et al. HPC performance and energy-efficiency of Xen, KVM and VMware hypervisors
JP2014513373A5 (en)
TW201423597A (zh) Correlating energy consumption with virtual machines
JP2014513853A5 (en)
Sun et al. Concurrent average memory access time
US20150286491A1 (en) Methods for Compilation, a Compiler and a System
WO2018086467A1 (zh) Method, apparatus and system for allocating application cluster resources in a cloud environment
WO2016068999A1 (en) Integrated heterogeneous processing units
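CN104615584B (zh) Method for vectorized computation of solving large-scale triangular linear equation systems for GPDSP
CN103245829B (zh) Method for measuring virtual machine power consumption
Dev et al. Scheduling challenges and opportunities in integrated CPU+GPU processors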
Lien et al. Case studies of multi-core energy efficiency in task based programs
JP2012014591A (ja) Efficient parallel processing technique for the Monte Carlo method
Almari et al. Performance analysis of Oracle database in virtual environments
US9213585B2 (en) Controlling sprinting for thermal capacity boosted systems
Burtscher et al. Power characteristics of irregular GPGPU programs
El Zant et al. Performance evaluation of cloud service providers
CN102073798B (zh) Parallel prediction method for ribonucleic acid secondary structure based on multi-core processors
Colmant et al. Improving the energy efficiency of software systems for multi-core architectures
Branco et al. Load indices-past, present and future