JP2018514869A5 - Google Patents
- Publication number
- JP2018514869A5 (application JP2017554900A)
- Authority
- JP
- Japan
- Prior art keywords
- kernel
- parallel processor
- mini
- workgroups
- application
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 claims 15
- 230000006399 behavior Effects 0.000 claims 8
- 238000005259 measurement Methods 0.000 claims 6
- 230000035945 sensitivity Effects 0.000 claims 6
- 238000012886 linear function Methods 0.000 claims 2
- 230000000007 visual effect Effects 0.000 claims 1
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/710,879 | 2015-05-13 | | |
| US14/710,879 US9965343B2 (en) | 2015-05-13 | 2015-05-13 | System and method for determining concurrency factors for dispatch size of parallel processor kernels |
| PCT/US2016/023560 WO2016182636A1 (en) | 2015-05-13 | 2016-03-22 | System and method for determining concurrency factors for dispatch size of parallel processor kernels |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2018514869A (ja) | 2018-06-07 |
| JP2018514869A5 | 2019-05-09 |
| JP6659724B2 (ja) | 2020-03-04 |
Family
ID=57248294
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2017554900A (Active), JP6659724B2 (ja) | 2015-05-13 | 2016-03-22 | System and method for determining concurrency factors for dispatch size of parallel processor kernels |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US9965343B2 (en) |
| EP (1) | EP3295300B1 (en) |
| JP (1) | JP6659724B2 (ja) |
| KR (1) | KR102548402B1 (ko) |
| CN (1) | CN107580698B (zh) |
| WO (1) | WO2016182636A1 (en) |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10324730B2 (en) * | 2016-03-24 | 2019-06-18 | Mediatek, Inc. | Memory shuffle engine for efficient work execution in a parallel computing system |
| US20180115496A1 (en) * | 2016-10-21 | 2018-04-26 | Advanced Micro Devices, Inc. | Mechanisms to improve data locality for distributed gpus |
| US10558499B2 (en) * | 2017-10-26 | 2020-02-11 | Advanced Micro Devices, Inc. | Wave creation control with dynamic resource allocation |
| US12405790B2 (en) * | 2019-06-28 | 2025-09-02 | Advanced Micro Devices, Inc. | Compute unit sorting for reduced divergence |
| US11900123B2 (en) * | 2019-12-13 | 2024-02-13 | Advanced Micro Devices, Inc. | Marker-based processor instruction grouping |
| US11809902B2 (en) * | 2020-09-24 | 2023-11-07 | Advanced Micro Devices, Inc. | Fine-grained conditional dispatching |
| US20220334845A1 (en) * | 2021-04-15 | 2022-10-20 | Nvidia Corporation | Launching code concurrently |
| US20240220315A1 (en) * | 2022-12-30 | 2024-07-04 | Advanced Micro Devices, Inc. | Dynamic control of work scheduling |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6516310B2 (en) * | 1999-12-07 | 2003-02-04 | Sybase, Inc. | System and methodology for join enumeration in a memory-constrained environment |
| US8136104B2 (en) | 2006-06-20 | 2012-03-13 | Google Inc. | Systems and methods for determining compute kernels for an application in a parallel-processing computer system |
| US8375368B2 (en) * | 2006-06-20 | 2013-02-12 | Google Inc. | Systems and methods for profiling an application running on a parallel-processing computer system |
| US9354944B2 (en) * | 2009-07-27 | 2016-05-31 | Advanced Micro Devices, Inc. | Mapping processing logic having data-parallel threads across processors |
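| CN102023844B (zh) * | 2009-09-18 | 2014-04-09 | 深圳中微电科技有限公司 | Parallel processor and thread processing method thereof |
| KR101079697B1 (ko) * | 2009-10-05 | 2011-11-03 | 주식회사 글로벌미디어테크 | High-speed image processing method using a parallel processor of a general-purpose graphics processing unit |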
| US8250404B2 (en) * | 2009-12-31 | 2012-08-21 | International Business Machines Corporation | Process integrity of work items in a multiple processor system |
| US8782645B2 (en) * | 2011-05-11 | 2014-07-15 | Advanced Micro Devices, Inc. | Automatic load balancing for heterogeneous cores |
| US9092267B2 (en) * | 2011-06-20 | 2015-07-28 | Qualcomm Incorporated | Memory sharing in graphics processing unit |
| US20120331278A1 (en) | 2011-06-23 | 2012-12-27 | Mauricio Breternitz | Branch removal by data shuffling |
| US8707314B2 (en) * | 2011-12-16 | 2014-04-22 | Advanced Micro Devices, Inc. | Scheduling compute kernel workgroups to heterogeneous processors based on historical processor execution times and utilizations |
| US9823927B2 (en) * | 2012-11-30 | 2017-11-21 | Intel Corporation | Range selection for data parallel programming environments |
| CN105027089B (zh) * | 2013-03-14 | 2018-05-22 | 英特尔公司 | Kernel functionality checker |
| US9772864B2 (en) * | 2013-04-16 | 2017-09-26 | Arm Limited | Methods of and apparatus for multidimensional indexing in microprocessor systems |
| US10297073B2 (en) * | 2016-02-25 | 2019-05-21 | Intel Corporation | Method and apparatus for in-place construction of left-balanced and complete point K-D trees |
2015
- 2015-05-13: US, application US14/710,879, patent US9965343B2 (en), status Active
2016
- 2016-03-22: KR, application KR1020177033219A, patent KR102548402B1 (ko), status Active
- 2016-03-22: JP, application JP2017554900A, patent JP6659724B2 (ja), status Active
- 2016-03-22: CN, application CN201680026495.2A, patent CN107580698B (zh), status Active
- 2016-03-22: WO, application PCT/US2016/023560, publication WO2016182636A1 (en), status Ceased
- 2016-03-22: EP, application EP16793114.6A, patent EP3295300B1 (en), status Active
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP2018514869A5 | | |
| Guo et al. | | A performance modeling and optimization analysis tool for sparse matrix-vector multiplication on GPUs |
| Singh et al. | | Energy-efficient run-time mapping and thread partitioning of concurrent OpenCL applications on CPU-GPU MPSoCs |
| Bailey et al. | | Adaptive configuration selection for power-constrained heterogeneous systems |
| JP6659724B2 (ja) | | System and method for determining concurrency factors for dispatch size of parallel processor kernels |
| Varrette et al. | | HPC performance and energy-efficiency of Xen, KVM and VMware hypervisors |
| JP2014513373A5 | | |
| TW201423597A (zh) | | Associating energy consumption with virtual machines |
| JP2014513853A5 | | |
| Sun et al. | | Concurrent average memory access time |
| US20150286491A1 (en) | | Methods for Compilation, a Compiler and a System |
| WO2018086467A1 (zh) | | Method, apparatus and system for allocating application cluster resources in a cloud environment |
| WO2016068999A1 (en) | | Integrated heterogeneous processing units |
| CN104615584B (zh) | | Method for vectorized computation of solving large-scale triangular linear equation systems for GPDSP |
| CN103245829B (zh) | | Method for measuring virtual machine power consumption |
| Dev et al. | | Scheduling challenges and opportunities in integrated CPU+GPU processors |
| Lien et al. | | Case studies of multi-core energy efficiency in task based programs |
| JP2012014591A (ja) | | Efficient parallel processing technique for the Monte Carlo method |
| Almari et al. | | Performance analysis of Oracle database in virtual environments |
| US9213585B2 (en) | | Controlling sprinting for thermal capacity boosted systems |
| Burtscher et al. | | Power characteristics of irregular GPGPU programs |
| El Zant et al. | | Performance evaluation of cloud service providers |
| CN102073798B (zh) | | Parallel prediction method for ribonucleic acid secondary structure based on a multi-core processor |
| Colmant et al. | | Improving the energy efficiency of software systems for multi-core architectures |
| Branco et al. | | Load indices: past, present and future |