CN107580698B - System and method for determining concurrency factors for dispatch size of parallel processor kernels - Google Patents

System and method for determining concurrency factors for dispatch size of parallel processor kernels

Info

Publication number
CN107580698B
Authority
CN
China
Prior art keywords
kernel
parallel processor
sequence
miniature
application program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201680026495.2A
Other languages
English (en)
Chinese (zh)
Other versions
CN107580698A (zh)
Inventor
Rathijit Sen (瑞斯吉特·森)
Indrani Paul (因德拉尼·保罗)
Wei Huang (黄伟)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Micro Devices Inc
Original Assignee
Advanced Micro Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Micro Devices Inc
Publication of CN107580698A
Application granted
Publication of CN107580698B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/545 Interprogram communication where tasks reside in different layers, e.g. user- and kernel-space
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3404 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for parallel or distributed programming
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3433 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment for load management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4893 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues taking into account power or heat criteria
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)
CN201680026495.2A 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels Active CN107580698B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/710,879 2015-05-13
US14/710,879 US9965343B2 (en) 2015-05-13 2015-05-13 System and method for determining concurrency factors for dispatch size of parallel processor kernels
PCT/US2016/023560 WO2016182636A1 (en) 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels

Publications (2)

Publication Number Publication Date
CN107580698A (zh) 2018-01-12
CN107580698B (zh) 2019-08-23

Family

ID=57248294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680026495.2A Active CN107580698B (zh) 2015-05-13 2016-03-22 System and method for determining concurrency factors for dispatch size of parallel processor kernels

Country Status (6)

Country Link
US (1) US9965343B2 (en)
EP (1) EP3295300B1 (en)
JP (1) JP6659724B2 (ja)
KR (1) KR102548402B1 (ko)
CN (1) CN107580698B (zh)
WO (1) WO2016182636A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10324730B2 (en) * 2016-03-24 2019-06-18 Mediatek, Inc. Memory shuffle engine for efficient work execution in a parallel computing system
US20180115496A1 (en) * 2016-10-21 2018-04-26 Advanced Micro Devices, Inc. Mechanisms to improve data locality for distributed gpus
US10558499B2 (en) * 2017-10-26 2020-02-11 Advanced Micro Devices, Inc. Wave creation control with dynamic resource allocation
US12405790B2 (en) * 2019-06-28 2025-09-02 Advanced Micro Devices, Inc. Compute unit sorting for reduced divergence
US11900123B2 (en) * 2019-12-13 2024-02-13 Advanced Micro Devices, Inc. Marker-based processor instruction grouping
US11809902B2 (en) * 2020-09-24 2023-11-07 Advanced Micro Devices, Inc. Fine-grained conditional dispatching
US20220334845A1 (en) * 2021-04-15 2022-10-20 Nvidia Corporation Launching code concurrently
US20240220315A1 (en) * 2022-12-30 2024-07-04 Advanced Micro Devices, Inc. Dynamic control of work scheduling

Citations (2)

Publication number Priority date Publication date Assignee Title
WO2014084913A1 (en) * 2012-11-30 2014-06-05 Intel Corporation Range selection for data parallel programming environments
CN105027089A (zh) * 2013-03-14 2015-11-04 Intel Corporation Kernel functionality checker

Family Cites Families (13)

Publication number Priority date Publication date Assignee Title
US6516310B2 (en) * 1999-12-07 2003-02-04 Sybase, Inc. System and methodology for join enumeration in a memory-constrained environment
US8136104B2 (en) 2006-06-20 2012-03-13 Google Inc. Systems and methods for determining compute kernels for an application in a parallel-processing computer system
US8375368B2 (en) * 2006-06-20 2013-02-12 Google Inc. Systems and methods for profiling an application running on a parallel-processing computer system
US9354944B2 (en) * 2009-07-27 2016-05-31 Advanced Micro Devices, Inc. Mapping processing logic having data-parallel threads across processors
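CN102023844B (zh) * 2009-09-18 2014-04-09 深圳中微电科技有限公司 Parallel processor and thread processing method thereof
KR101079697B1 (ko) * 2009-10-05 2011-11-03 주식회사 글로벌미디어테크 High-speed image processing method using a parallel processor of a general-purpose graphics processing unit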
US8250404B2 (en) * 2009-12-31 2012-08-21 International Business Machines Corporation Process integrity of work items in a multiple processor system
US8782645B2 (en) * 2011-05-11 2014-07-15 Advanced Micro Devices, Inc. Automatic load balancing for heterogeneous cores
US9092267B2 (en) * 2011-06-20 2015-07-28 Qualcomm Incorporated Memory sharing in graphics processing unit
US20120331278A1 (en) 2011-06-23 2012-12-27 Mauricio Breternitz Branch removal by data shuffling
US8707314B2 (en) * 2011-12-16 2014-04-22 Advanced Micro Devices, Inc. Scheduling compute kernel workgroups to heterogeneous processors based on historical processor execution times and utilizations
US9772864B2 (en) * 2013-04-16 2017-09-26 Arm Limited Methods of and apparatus for multidimensional indexing in microprocessor systems
US10297073B2 (en) * 2016-02-25 2019-05-21 Intel Corporation Method and apparatus for in-place construction of left-balanced and complete point K-D trees

Also Published As

Publication number Publication date
US9965343B2 (en) 2018-05-08
EP3295300A1 (en) 2018-03-21
EP3295300B1 (en) 2022-03-23
EP3295300A4 (en) 2019-01-09
KR102548402B1 (ko) 2023-06-27
WO2016182636A1 (en) 2016-11-17
CN107580698A (zh) 2018-01-12
KR20180011096A (ko) 2018-01-31
US20160335143A1 (en) 2016-11-17
JP6659724B2 (ja) 2020-03-04
JP2018514869A (ja) 2018-06-07

Similar Documents

Publication Publication Date Title
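CN107580698B (zh) System and method for determining concurrency factors for dispatch size of parallel processor kernels
CN112204523B (zh) Multi-kernel wavefront scheduler
JP6983154B2 (ja) Processing computational graphs
JP6640243B2 (ja) Batch processing in a neural network processor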
US8707320B2 (en) Dynamic partitioning of data by occasionally doubling data chunk size for data-parallel applications
JP2018533795A (ja) Stream-based accelerator processing of computational graphs
US10073783B2 (en) Dual mode local data store
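KR20210089247A (ko) Pipelined processing of matrix multiplication in a graphics processing unit
TWI851030B (zh) Processing core for an artificial intelligence accelerator, reconfigurable processing element, and operating method thereof
KR20150101870A (ko) Method and apparatus for preventing bank conflicts in memory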
US9760969B2 (en) Graphic processing system and method thereof
CN104731562A (zh) Task execution in a SIMD processing unit
EP4174671A1 (en) Method and apparatus with process scheduling
US9898333B2 (en) Method and apparatus for selecting preemption technique
CN119597414A (zh) Operator scheduling method, apparatus, device, storage medium, and program product
US12321733B2 (en) Apparatus and method with neural network computation scheduling
US11327766B2 (en) Instruction dispatch routing
JP5556326B2 (ja) Information processing system, task control method, and task control program
CN113362878A (zh) Method for in-memory computing and system for computing
US20250208924A1 (en) Systems and Methods for Heterogeneous Model Parallelism and Adaptive Graph Partitioning
US20250068464A1 (en) Hierarchical work scheduling
US20240248764A1 (en) Efficient data processing, arbitration and prioritization
CN117873708A (zh) Method, apparatus, medium, and device for allocating functional units to machine instructions
Sudarsan ReSHAPE: A Framework for Dynamic Resizing of Parallel Applications

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant