DE102018126004A1 - Synchronisation in einer Multi-Kachel-Verarbeitungsanordnung - Google Patents

Synchronisation in einer Multi-Kachel-Verarbeitungsanordnung Download PDF

Info

Publication number
DE102018126004A1
DE102018126004A1 DE102018126004.0A DE102018126004A DE102018126004A1 DE 102018126004 A1 DE102018126004 A1 DE 102018126004A1 DE 102018126004 A DE102018126004 A DE 102018126004A DE 102018126004 A1 DE102018126004 A1 DE 102018126004A1
Authority
DE
Germany
Prior art keywords
tiles
group
tile
synchronization
exit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
DE102018126004.0A
Other languages
German (de)
English (en)
Inventor
Simon Christian Knowles
Alan Graham Alexander
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Graphcore Ltd
Original Assignee
Graphcore Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Graphcore Ltd filed Critical Graphcore Ltd
Publication of DE102018126004A1 publication Critical patent/DE102018126004A1/de
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/52Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/522Barrier synchronisation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30076Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/30087Synchronisation or serialisation instructions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306Intercommunication techniques
    • G06F15/17325Synchronisation; Hardware support therefor
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/80Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30076Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/3009Thread control instructions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/461Saving or restoring of program or task context
    • G06F9/462Saving or restoring of program or task context with multiple register sets
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4887Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues involving deadlines, e.g. rate based, periodic
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/10Interfaces, programming languages or software development kits, e.g. for simulating neural networks
    • G06N3/105Shells for specifying net layout
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Advance Control (AREA)
  • Executing Machine-Instructions (AREA)
  • Multi Processors (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
DE102018126004.0A 2017-10-20 2018-10-19 Synchronisation in einer Multi-Kachel-Verarbeitungsanordnung Pending DE102018126004A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1717291.7A GB2569269B (en) 2017-10-20 2017-10-20 Synchronization in a multi-tile processing arrangement
GB1717291.7 2017-10-20

Publications (1)

Publication Number Publication Date
DE102018126004A1 true DE102018126004A1 (de) 2019-04-25

Family

ID=60481655

Family Applications (1)

Application Number Title Priority Date Filing Date
DE102018126004.0A Pending DE102018126004A1 (de) 2017-10-20 2018-10-19 Synchronisation in einer Multi-Kachel-Verarbeitungsanordnung

Country Status (10)

Country Link
US (2) US10564970B2 (https=)
JP (1) JP6797881B2 (https=)
KR (1) KR102262483B1 (https=)
CN (1) CN110214317B (https=)
CA (1) CA3021416C (https=)
DE (1) DE102018126004A1 (https=)
FR (1) FR3072800B1 (https=)
GB (1) GB2569269B (https=)
TW (1) TWI700634B (https=)
WO (1) WO2019076714A1 (https=)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2569098B (en) * 2017-10-20 2020-01-08 Graphcore Ltd Combining states of multiple threads in a multi-threaded processor
US11995448B1 (en) 2018-02-08 2024-05-28 Marvell Asia Pte Ltd Method and apparatus for performing machine learning operations in parallel on machine learning hardware
US12112175B1 (en) 2018-02-08 2024-10-08 Marvell Asia Pte Ltd Method and apparatus for performing machine learning operations in parallel on machine learning hardware
US10970080B2 (en) 2018-02-08 2021-04-06 Marvell Asia Pte, Ltd. Systems and methods for programmable hardware architecture for machine learning
DE102018205392A1 (de) * 2018-04-10 2019-10-10 Robert Bosch Gmbh Verfahren und Vorrichtung zur Fehlerbehandlung in einer Kommunikation zwischen verteilten Software Komponenten
DE102018205390A1 (de) * 2018-04-10 2019-10-10 Robert Bosch Gmbh Verfahren und Vorrichtung zur Fehlerbehandlung in einer Kommunikation zwischen verteilten Software Komponenten
US10929778B1 (en) 2018-05-22 2021-02-23 Marvell Asia Pte, Ltd. Address interleaving for machine learning
US11016801B1 (en) 2018-05-22 2021-05-25 Marvell Asia Pte, Ltd. Architecture to support color scheme-based synchronization for machine learning
US10997510B1 (en) 2018-05-22 2021-05-04 Marvell Asia Pte, Ltd. Architecture to support tanh and sigmoid operations for inference acceleration in machine learning
US10929779B1 (en) * 2018-05-22 2021-02-23 Marvell Asia Pte, Ltd. Architecture to support synchronization between core and inference engine for machine learning
GB2575294B8 (en) * 2018-07-04 2022-07-20 Graphcore Ltd Host Proxy On Gateway
GB2580165B (en) 2018-12-21 2021-02-24 Graphcore Ltd Data exchange in a computer with predetermined delay
KR102670905B1 (ko) * 2019-08-22 2024-05-31 구글 엘엘씨 전파 지연 감소
CN112416053B (zh) * 2019-08-23 2023-11-17 北京希姆计算科技有限公司 多核架构的同步信号产生电路、芯片和同步方法及装置
GB2590658A (en) * 2019-12-23 2021-07-07 Graphcore Ltd Communication in a computer having multiple processors
GB2590661B (en) * 2019-12-23 2022-02-09 Graphcore Ltd Sync network
GB2591106B (en) * 2020-01-15 2022-02-23 Graphcore Ltd Control of data transfer between processors
GB2593756B (en) * 2020-04-02 2022-03-30 Graphcore Ltd Control of data transfer between processing nodes
GB2596872B (en) * 2020-07-10 2022-12-14 Graphcore Ltd Handling injected instructions in a processor
GB2597078B (en) 2020-07-14 2022-07-13 Graphcore Ltd Communication between host and accelerator over network
US12495042B2 (en) * 2021-08-16 2025-12-09 Capital One Services, Llc Systems and methods for resetting an authentication counter

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02114362A (ja) * 1988-10-24 1990-04-26 Nec Corp 並列演算装置
JP2001175618A (ja) * 1999-12-17 2001-06-29 Nec Eng Ltd 並列計算機システム
US8035650B2 (en) * 2006-07-25 2011-10-11 Qualcomm Incorporated Tiled cache for multiple software programs
US8046727B2 (en) * 2007-09-12 2011-10-25 Neal Solomon IP cores in reconfigurable three dimensional integrated circuits
US8866827B2 (en) 2008-06-26 2014-10-21 Microsoft Corporation Bulk-synchronous graphics processing unit programming
US8539204B2 (en) 2009-09-25 2013-09-17 Nvidia Corporation Cooperative thread array reduction and scan operations
US9081501B2 (en) * 2010-01-08 2015-07-14 International Business Machines Corporation Multi-petascale highly efficient parallel supercomputer
GB201001621D0 (en) * 2010-02-01 2010-03-17 Univ Catholique Louvain A tile-based processor architecture model for high efficiency embedded homogenous multicore platforms
US8713290B2 (en) 2010-09-20 2014-04-29 International Business Machines Corporation Scaleable status tracking of multiple assist hardware threads
US20120179896A1 (en) * 2011-01-10 2012-07-12 International Business Machines Corporation Method and apparatus for a hierarchical synchronization barrier in a multi-node system
CN104081781B (zh) 2012-01-20 2018-11-02 Ge视频压缩有限责任公司 一种编码器、解码器、编解码方法及用于传送解多工的方法
US10067768B2 (en) 2014-07-18 2018-09-04 Nvidia Corporation Execution of divergent threads using a convergence barrier
US9348658B1 (en) * 2014-12-12 2016-05-24 Intel Corporation Technologies for efficient synchronization barriers with work stealing support
US10101786B2 (en) * 2014-12-22 2018-10-16 Intel Corporation Holistic global performance and power management
US9747108B2 (en) 2015-03-27 2017-08-29 Intel Corporation User-level fork and join processors, methods, systems, and instructions
US10310861B2 (en) * 2017-04-01 2019-06-04 Intel Corporation Mechanism for scheduling threads on a multiprocessor
US10672175B2 (en) * 2017-04-17 2020-06-02 Intel Corporation Order independent asynchronous compute and streaming for graphics

Also Published As

Publication number Publication date
CA3021416A1 (en) 2019-04-20
KR20190044570A (ko) 2019-04-30
TW201923556A (zh) 2019-06-16
JP2019079528A (ja) 2019-05-23
JP6797881B2 (ja) 2020-12-09
CN110214317A (zh) 2019-09-06
US20200089499A1 (en) 2020-03-19
CN110214317B (zh) 2023-05-02
FR3072800B1 (fr) 2024-01-05
GB2569269B (en) 2020-07-15
GB201717291D0 (en) 2017-12-06
US11593185B2 (en) 2023-02-28
CA3021416C (en) 2021-03-30
GB2569269A (en) 2019-06-19
TWI700634B (zh) 2020-08-01
KR102262483B1 (ko) 2021-06-08
WO2019076714A1 (en) 2019-04-25
US10564970B2 (en) 2020-02-18
US20190121641A1 (en) 2019-04-25
FR3072800A1 (fr) 2019-04-26

Similar Documents

Publication Publication Date Title
DE102018126004A1 (de) Synchronisation in einer Multi-Kachel-Verarbeitungsanordnung
DE102018126005A1 (de) Synchronisation in einer Multi-Kachel-, Multi-Chip-Verarbeitungsanordnung
DE102018126003A1 (de) Kombinieren von Zuständen mehrerer Threads in einem Multithread-Prozessor
DE102018126001A1 (de) Synchronisation in einem Multi-Kachel-Verarbeitungsarray
DE68921906T2 (de) Verfahren für ein Multiprozessorsystem mit sich selbst zuordnenden Prozessoren.
DE68924934T2 (de) Parallelsynchronisationstechnik.
DE69106384T2 (de) Skalierbares parallel-vektorrechnersystem.
DE69419524T2 (de) Sperrsynchronisierung für verteilte speicher-massivparallelrechner
DE69032381T2 (de) Vorrichtung und Verfahren für die kollektive Verzweigung in einem Mehrbefehlsstrommultiprozessor
DE102019112301A1 (de) Befehls-Cache in einem Multithread-Prozessor
DE102018005172A1 (de) Prozessoren, verfahren und systeme mit einem konfigurierbaren räumlichen beschleuniger
DE102015112202A1 (de) Kombinieren von Pfaden
DE102013114072A1 (de) System und Verfahren zum Hardware-Scheduling von indexierten Barrieren
KR102183118B1 (ko) 복수-타일 프로세싱 구성에서의 동기화
DE102009012766A1 (de) Ein Zugriffssperrenvorgang, um atomare Aktualisierungen zu einem geteilten Speicher zu ermöglichen
DE102017213160B4 (de) Kompilierung für knotenvorrichtungs-GPU-basierte Parallelverarbeitung
DE112012004728T5 (de) Verfahren, Programm und System zur Simulationsausführung
DE102023105575A1 (de) Verteilter gemeinsamer speicher
DE112013000904T5 (de) Durchführen von Aktualisierungen von Quellcode, der auf einer Vielzahl von Rechenknoten ausgeführt wird
DE3855524T2 (de) Arithmetik-Parallelverarbeitungseinheit und zugehöriger Kompilator
DE102020134345A1 (de) Technologie zum lernen und abladen häufiger speicherzugriffs- und rechenmuster
DE102020103521A1 (de) Minimieren der Nutzung von Hardware-Zählern bei getriggerten Operationen für kollektive Kommunikation
DE102015015182A1 (de) Systeme, Vorrichtungen und Verfahren zur k-Nächste-Nachbarn-Suche
DE102010028896A1 (de) Verfahren und Vorrichtung zum Zuweisen einer Mehrzahl von Teilaufgaben einer Aufgabe zu einer Mehrzahl von Recheneinheiten einer vorgegebenen Prozessorarchitektur
JPH0695879A (ja) コンピュータシステム

Legal Events

Date Code Title Description
R012 Request for examination validly filed
R082 Change of representative

Representative=s name: WESTPHAL, MUSSGNUG & PARTNER PATENTANWAELTE MI, DE