DE102020119519A1 - Verfahren und einrichtungen zum ermöglichen einer "out-of-order"-pipeline-ausführung der statischen abbildung einer arbeitslast - Google Patents

Verfahren und einrichtungen zum ermöglichen einer "out-of-order"-pipeline-ausführung der statischen abbildung einer arbeitslast Download PDF

Info

Publication number
DE102020119519A1
DE102020119519A1 DE102020119519.2A DE102020119519A DE102020119519A1 DE 102020119519 A1 DE102020119519 A1 DE 102020119519A1 DE 102020119519 A DE102020119519 A DE 102020119519A DE 102020119519 A1 DE102020119519 A1 DE 102020119519A1
Authority
DE
Germany
Prior art keywords
credits
buffer
workload
workload node
scheduler
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
DE102020119519.2A
Other languages
German (de)
English (en)
Inventor
Michael Behar
Roni Rosner
Moshe Maor
Ronen Gabbai
Zigi Walter
Oren AGAM
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp filed Critical Intel Corp
Publication of DE102020119519A1 publication Critical patent/DE102020119519A1/de
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/22Microcontrol or microprogram arrangements
    • G06F9/28Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0613Improving I/O performance in relation to throughput
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659Command handling arrangements, e.g. command buffers, queues, command scheduling
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842Speculative instruction execution
    • G06F9/3846Speculative instruction execution using static prediction, e.g. branch taken strategy
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/509Offload
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Advance Control (AREA)
DE102020119519.2A 2019-08-15 2020-07-23 Verfahren und einrichtungen zum ermöglichen einer "out-of-order"-pipeline-ausführung der statischen abbildung einer arbeitslast Pending DE102020119519A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/542,012 2019-08-15
US16/542,012 US11231963B2 (en) 2019-08-15 2019-08-15 Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload

Publications (1)

Publication Number Publication Date
DE102020119519A1 true DE102020119519A1 (de) 2021-02-18

Family

ID=68693863

Family Applications (1)

Application Number Title Priority Date Filing Date
DE102020119519.2A Pending DE102020119519A1 (de) 2019-08-15 2020-07-23 Verfahren und einrichtungen zum ermöglichen einer "out-of-order"-pipeline-ausführung der statischen abbildung einer arbeitslast

Country Status (6)

Country Link
US (2) US11231963B2 (https=)
JP (1) JP7400169B2 (https=)
KR (1) KR102684511B1 (https=)
CN (2) CN112395010A (https=)
DE (1) DE102020119519A1 (https=)
TW (1) TWI802800B (https=)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10901657B2 (en) * 2018-11-29 2021-01-26 International Business Machines Corporation Dynamic write credit buffer management of non-volatile dual inline memory module
US11231963B2 (en) 2019-08-15 2022-01-25 Intel Corporation Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload
US11599780B2 (en) * 2020-03-02 2023-03-07 Apple Inc. Asynchronous task execution for neural processor circuit
US11875247B1 (en) * 2020-06-18 2024-01-16 Amazon Technologies, Inc. Input batching with serial dynamic memory access
US11704058B2 (en) * 2020-07-28 2023-07-18 Samsung Electronics Co., Ltd. Systems and methods for resource-based scheduling of commands
CN112003846B (zh) * 2020-08-13 2023-02-03 广州市百果园信息技术有限公司 一种信用阈值的训练、ip地址的检测方法及相关装置
EP4211566B1 (en) * 2020-10-26 2025-01-29 Google LLC Modulating credit allocations in memory subsystems
US11620159B2 (en) 2021-04-23 2023-04-04 Samsung Electronics Co., Ltd. Systems and methods for I/O command scheduling based on multiple resource parameters
US12001701B2 (en) * 2022-01-26 2024-06-04 Western Digital Technologies, Inc. Storage biasing for solid state drive accelerators

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5418953A (en) * 1993-04-12 1995-05-23 Loral/Rohm Mil-Spec Corp. Method for automated deployment of a software program onto a multi-processor architecture
JP3892829B2 (ja) 2003-06-27 2007-03-14 株式会社東芝 情報処理システムおよびメモリ管理方法
JP5349515B2 (ja) 2011-03-14 2013-11-20 株式会社東芝 バッファ管理装置、バッファ管理方法及び記憶装置
US9395990B2 (en) * 2013-06-28 2016-07-19 Intel Corporation Mode dependent partial width load to wider register processors, methods, and systems
JP6695320B2 (ja) 2014-07-30 2020-05-20 リニア アルジェブラ テクノロジーズ リミテッド 低電力コンピュータイメージング
US10002099B2 (en) * 2014-11-13 2018-06-19 Cavium, Inc. Arbitrated access to resources among multiple devices
US11153223B2 (en) * 2016-04-07 2021-10-19 International Business Machines Corporation Specifying a disaggregated compute system
US10289752B2 (en) * 2016-12-12 2019-05-14 Intel Corporation Accelerator for gather-update-scatter operations including a content-addressable memory (CAM) and CAM controller
GB2569275B (en) * 2017-10-20 2020-06-03 Graphcore Ltd Time deterministic exchange
GB2569271B (en) * 2017-10-20 2020-05-13 Graphcore Ltd Synchronization with a host processor
US10649813B2 (en) * 2018-03-29 2020-05-12 Intel Corporation Arbitration across shared memory pools of disaggregated memory devices
US11669372B2 (en) * 2018-12-13 2023-06-06 Intel Corporation Flexible allocation of compute resources
US11231963B2 (en) 2019-08-15 2022-01-25 Intel Corporation Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload

Also Published As

Publication number Publication date
CN112395010A (zh) 2021-02-23
US11231963B2 (en) 2022-01-25
KR102684511B1 (ko) 2024-07-15
TWI802800B (zh) 2023-05-21
JP7400169B2 (ja) 2023-12-19
TW202109285A (zh) 2021-03-01
US11847497B2 (en) 2023-12-19
US20220197703A1 (en) 2022-06-23
JP2021034020A (ja) 2021-03-01
CN114895965A (zh) 2022-08-12
CN114895965B (zh) 2025-09-09
US20190370073A1 (en) 2019-12-05
KR20210021263A (ko) 2021-02-25

Similar Documents

Publication Publication Date Title
DE102020119519A1 (de) Verfahren und einrichtungen zum ermöglichen einer "out-of-order"-pipeline-ausführung der statischen abbildung einer arbeitslast
CN111258744B (zh) 一种基于异构计算的任务处理方法及软硬件框架系统
DE102021102589A1 (de) Berechnungsgraph-optimierung
DE102020108374A1 (de) Verfahren und vorrichtung zur laufzeitmehrfachplanung von software, die in einem heterogenen system ausgeführt wird
DE102020110655A1 (de) Verfahren und vorrichtung zum verbessern der verwendung eines heterogenen systems, das software ausführt
DE102020114218A1 (de) Verfahren und Vorrichtungen zum Verbessern der Laufzeitleistung auf einem heterogenen System ausgeführter Software
DE102018132781A1 (de) Heterogenes Rechensystem, welches konfiguriert ist, um eine Cachekohärenz adaptiv zu steuern
DE102022105725A1 (de) Verfahren und einrichtungen zur durchführung von gewichtungs- und aktivierungskomprimierung und -dekomprimierung
DE102019122935A1 (de) Verfahren und vorrichtungen zum zuweisen einer arbeitslast an einen beschleuniger unter verwendung von maschinenlernen
DE102020132377A1 (de) Vorrichtung und Verfahren zur Drosselung einer Raytracing-Pipeline
DE102016118210A1 (de) Granulare Dienstqualität für Computer-Ressourcen
DE102020101814A1 (de) Effiziente ausführung anhand von aufgabengraphen festgelegter arbeitslasten
DE112010003750T5 (de) Hardware für parallele Befehlslistenerzeugung
DE112011101725T5 (de) Sub-Puffer-Objekte
DE112021005433T5 (de) Verfahren zur leistungsbalancierung mehrerer chips
DE112012002905T5 (de) Technik zum Kompilieren und Ausführen von Programmen in höheren Programmiersprachen auf heterogenen Computern
DE112020000865T5 (de) Speicherverwaltungssystem
DE102020201154A1 (de) Verfahren und vorrichtung zum speichern von und zugreifen auf mehrdimensionale daten
DE112021003274T5 (de) Ressourcenzuordnung zum optimieren von hyperparametern bei umfangreichen deep-learning-arbeitslasten
DE112022002258T5 (de) Tensormodifikation basierend auf der verarbeitung von ressourcen
DE102021104561A1 (de) Asynchrone datenbewegungspipeline
DE102025122674A1 (de) Verfahren und Einrichtungen zum Nutzen grosser Sprachmodelle für künstliche Intelligenz zum Umwandeln von Computercode
DE112022001917T5 (de) Synchronisieren einer graphausführung
CN106030453A (zh) 支持图形处理单元频率的动态调整的方法和装置
DE102022129219A1 (de) Verfahren und Vorrichtung für durch maschinelles Lernen gesteuerte Kompiliereroptimierungen für registerbasierte Hardwarearchitekturen

Legal Events

Date Code Title Description
R130 Divisional application to

Ref document number: 102020008218

Country of ref document: DE