DE102017213160B4 - Compilation for node device GPU-based parallel processing - Google Patents

Compilation for node device GPU-based parallel processing

Publication number
DE102017213160B4
Authority
DE
Germany
Prior art keywords
task
gpu
routine
node device
node
Prior art date
Legal status
Active
Application number
DE102017213160.8A
Other languages
German (de)
English (en)
Other versions
DE102017213160A1 (de)
Inventor
Henry Gabriel Victor Bequet
Huina Chen
Current Assignee
SAS Institute Inc
Original Assignee
SAS Institute Inc
Priority date
2016-08-25
Filing date
2017-07-31
Publication date
2023-05-25
Priority claimed from US15/422,285 (US9760376B1)
Application filed by SAS Institute Inc
Publication of DE102017213160A1
Application granted
Publication of DE102017213160B4
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30043 LOAD or STORE instructions; Clear instruction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/45 Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/45 Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/456 Parallelism detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5055 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/561 Adding application-functional data or data for application control, e.g. adding metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/509 Offload

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mathematical Physics (AREA)
  • Library & Information Science (AREA)
  • Debugging And Monitoring (AREA)
  • Devices For Executing Special Programs (AREA)
  • Advance Control (AREA)
  • Multi Processors (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)
DE102017213160.8A 2016-08-25 2017-07-31 Compilation for node device GPU-based parallel processing Active DE102017213160B4 (de)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201662379512P 2016-08-25 2016-08-25
US62/379,512 2016-08-25
US201662394411P 2016-09-14 2016-09-14
US62/394,411 2016-09-14
US15/422,285 US9760376B1 (en) 2016-02-01 2017-02-01 Compilation for node device GPU-based parallel processing
US15/422,285 2017-02-01

Publications (2)

Publication Number Publication Date
DE102017213160A1 DE102017213160A1 (de) 2018-03-01
DE102017213160B4 DE102017213160B4 (de) 2023-05-25

Family

ID=59778869

Family Applications (1)

Application Number Title Priority Date Filing Date
DE102017213160.8A Active DE102017213160B4 (de) 2016-08-25 2017-07-31 Compilation for node device GPU-based parallel processing

Country Status (9)

Country Link
CN (1) CN107783782B (fr)
BE (1) BE1025002B1 (fr)
CA (1) CA2974556C (fr)
DE (1) DE102017213160B4 (fr)
DK (1) DK179709B1 (fr)
FR (1) FR3055438B1 (fr)
GB (1) GB2553424B (fr)
HK (1) HK1245439B (fr)
NO (1) NO343250B1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111327921A (zh) * 2018-12-17 2020-06-23 深圳市炜博科技有限公司 Video data processing method and device
CN109743453B (zh) * 2018-12-29 2021-01-05 出门问问信息科技有限公司 Split-screen display method and device
CN110163791B (zh) * 2019-05-21 2020-04-17 中科驭数(北京)科技有限公司 GPU processing method and device for data computation flow graphs
CN111984322B (zh) * 2020-09-07 2023-03-24 北京航天数据股份有限公司 Control instruction transmission method and device
CN112783506B (zh) * 2021-01-29 2022-09-30 展讯通信(上海)有限公司 Model running method and related device

Citations (1)

Publication number Priority date Publication date Assignee Title
US20110252411A1 (en) 2010-04-08 2011-10-13 The Mathworks, Inc. Identification and translation of program code executable by a graphical processing unit (gpu)

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
US8134561B2 (en) * 2004-04-16 2012-03-13 Apple Inc. System for optimizing graphics operations
US8549500B2 (en) * 2007-02-14 2013-10-01 The Mathworks, Inc. Saving and loading graphical processing unit (GPU) arrays providing high computational capabilities in a computing environment
US8938723B1 (en) * 2009-08-03 2015-01-20 Parallels IP Holdings GmbH Use of GPU for support and acceleration of virtual machines and virtual environments
US8310492B2 (en) * 2009-09-03 2012-11-13 Ati Technologies Ulc Hardware-based scheduling of GPU work
DE102013208418A1 (de) * 2012-05-09 2013-11-14 Nvidia Corp. Method and system for separate compilation of device code embedded in host code
US9152601B2 (en) * 2013-05-09 2015-10-06 Advanced Micro Devices, Inc. Power-efficient nested map-reduce execution on a cloud of heterogeneous accelerated processing units
EP2887219A1 (fr) * 2013-12-23 2015-06-24 Deutsche Telekom AG System and method for scheduling mobile augmented reality tasks
US9632761B2 (en) * 2014-01-13 2017-04-25 Red Hat, Inc. Distribute workload of an application to a graphics processing unit
US9235871B2 (en) * 2014-02-06 2016-01-12 Oxide Interactive, LLC Method and system of a command buffer between a CPU and GPU

Non-Patent Citations (2)

Title
Hong, C. et al.: MapCG: Writing Parallel Program Portable between CPU and GPU. In: Proceedings of the PACT'10, September 11 – 15, Vienna, Austria, 2010. pp. 217 - 226
Suchard, M.A. et al.: Understanding GPU Programming for Statistical Computation: Studies in Massively Parallel Massive Mixtures. PMCID: PMC2945379 in <https://www.ncbi.nlm.nih.gov/pmc/> on 25.9.2010

Also Published As

Publication number Publication date
CA2974556C (fr) 2018-06-05
CN107783782B (zh) 2019-03-15
CA2974556A1 (fr) 2018-02-25
GB2553424A (en) 2018-03-07
FR3055438B1 (fr) 2022-07-29
DK201770596A1 (en) 2018-03-12
NO20171277A1 (en) 2018-02-26
NO343250B1 (en) 2018-12-27
BE1025002A1 (fr) 2018-09-14
DK179709B1 (en) 2019-04-09
HK1245439B (zh) 2019-12-06
FR3055438A1 (fr) 2018-03-02
GB201712171D0 (en) 2017-09-13
BE1025002B1 (fr) 2018-09-17
DE102017213160A1 (de) 2018-03-01
CN107783782A (zh) 2018-03-09
GB2553424B (en) 2018-11-21

Similar Documents

Publication Publication Date Title
DE102017213160B4 (de) Compilation for node device GPU-based parallel processing
DE112016001075B4 (de) Distributed storage and retrieval of data sets
DE102013208554B4 (de) Method and system for managing nested execution streams
DE102010044531B4 (de) Autonomous memory architecture
DE102016118210A1 (de) Granular quality of service for computing resources
DE102010044529B4 (de) Autonomous memory subsystem with hardware accelerator
DE112012002905T5 (de) Technique for compiling and executing programs in high-level programming languages on heterogeneous computers
DE102010055267A1 (de) Sharing resources between a CPU and GPU
DE112012002465T5 (de) Graphics processor with non-blocking concurrent architecture
DE102013114072A1 (de) System and method for hardware scheduling of indexed barriers
DE102017109239A1 (de) Computer-implemented method, computer-readable medium and heterogeneous computing system
DE112011101391T5 (de) GPU-enabled database systems
DE112010003750T5 (de) Hardware for parallel command list generation
DE102013100179A1 (de) Method and system for resolving thread divergences
DE102021102589A1 (de) Computation graph optimization
DE102013205886A1 (de) Dynamic bank mode addressing for memory access
DE102019103310A1 (de) Estimator for an optimal operating point for hardware operating under a shared power/thermal constraint
DE102013020968A1 (de) Technique for accessing a content-addressable memory
DE102013020966B4 (de) Power-efficient attribute handling for tessellation and geometry shading units
DE102020101814A1 (de) Efficient execution of workloads specified via task graphs
DE102013020485A1 (de) Technique for performing memory access operations via texture hardware
DE102013019333A1 (de) Register allocation for clustered multi-level register data
DE102013208421A1 (de) Sharing a graphics processing unit among many applications
DE102020110655A1 (de) Method and apparatus for improving the utilization of a heterogeneous system executing software
DE112020005789T5 (de) Hierarchical partitioning of operators

Legal Events

Date Code Title Description
R012 Request for examination validly filed
R016 Response to examination communication
R002 Refusal decision in examination/registration proceedings
R006 Appeal filed
R008 Case pending at federal patent court
R019 Grant decision by federal patent court
R020 Patent grant now final