JP7400169B2 - Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload - Google Patents
Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload
- Publication number
- JP7400169B2 (application JP2020104328A)
- Authority
- JP
- Japan
- Prior art keywords
- credits
- buffer
- scheduler
- workload
- credit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/22—Microcontrol or microprogram arrangements
- G06F9/28—Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
- G06F3/0613—Improving I/O performance in relation to throughput
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0659—Command handling arrangements, e.g. command buffers, queues, command scheduling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3846—Speculative instruction execution using static prediction, e.g. branch taken strategy
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/509—Offload
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Advance Control (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/542,012 US11231963B2 (en) | 2019-08-15 | 2019-08-15 | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
| US16/542,012 | 2019-08-15 | | |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2021034020A (ja) | 2021-03-01 |
| JP2021034020A5 | 2022-06-21 |
| JP7400169B2 (ja) | 2023-12-19 |
Family
ID=68693863
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2020104328A (Active), JP7400169B2 (ja) | 2019-08-15 | 2020-06-17 | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
Country Status (6)
| Country | Link |
|---|---|
| US (2) | US11231963B2 |
| JP (1) | JP7400169B2 |
| KR (1) | KR102684511B1 |
| CN (2) | CN112395010A |
| DE (1) | DE102020119519A1 |
| TW (1) | TWI802800B |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10901657B2 (en) * | 2018-11-29 | 2021-01-26 | International Business Machines Corporation | Dynamic write credit buffer management of non-volatile dual inline memory module |
| US11231963B2 (en) | 2019-08-15 | 2022-01-25 | Intel Corporation | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
| US11875247B1 (en) * | 2020-06-18 | 2024-01-16 | Amazon Technologies, Inc. | Input batching with serial dynamic memory access |
| US11704058B2 (en) * | 2020-07-28 | 2023-07-18 | Samsung Electronics Co., Ltd. | Systems and methods for resource-based scheduling of commands |
| CN112003846B (zh) * | 2020-08-13 | 2023-02-03 | 广州市百果园信息技术有限公司 | Credit threshold training method, IP address detection method, and related device |
| EP4211566B1 (en) * | 2020-10-26 | 2025-01-29 | Google LLC | Modulating credit allocations in memory subsystems |
| US11620159B2 (en) | 2021-04-23 | 2023-04-04 | Samsung Electronics Co., Ltd. | Systems and methods for I/O command scheduling based on multiple resource parameters |
| US12001701B2 (en) * | 2022-01-26 | 2024-06-04 | Western Digital Technologies, Inc. | Storage biasing for solid state drive accelerators |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040268083A1 (en) | 2003-06-27 | 2004-12-30 | Tatsunori Kanai | Information processing system including processors and memory managing method used in the same system |
| US20120239833A1 (en) | 2011-03-14 | 2012-09-20 | Kabushiki Kaisha Toshiba | Buffer management device, buffer management method, and storage device |
| US20160140071A1 (en) | 2014-11-13 | 2016-05-19 | Cavium, Inc. | Arbitrated Access To Resources Among Multiple Devices |
| JP2017525047A (ja) | 2014-07-30 | 2017-08-31 | Linear Algebra Technologies Limited | Low-power computer imaging |
| US20190050261A1 (en) | 2018-03-29 | 2019-02-14 | Intel Corporation | Arbitration across shared memory pools of disaggregated memory devices |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5418953A (en) * | 1993-04-12 | 1995-05-23 | Loral/Rohm Mil-Spec Corp. | Method for automated deployment of a software program onto a multi-processor architecture |
| US9395990B2 (en) * | 2013-06-28 | 2016-07-19 | Intel Corporation | Mode dependent partial width load to wider register processors, methods, and systems |
| US11153223B2 (en) * | 2016-04-07 | 2021-10-19 | International Business Machines Corporation | Specifying a disaggregated compute system |
| US10289752B2 (en) * | 2016-12-12 | 2019-05-14 | Intel Corporation | Accelerator for gather-update-scatter operations including a content-addressable memory (CAM) and CAM controller |
| GB2569275B (en) * | 2017-10-20 | 2020-06-03 | Graphcore Ltd | Time deterministic exchange |
| GB2569271B (en) * | 2017-10-20 | 2020-05-13 | Graphcore Ltd | Synchronization with a host processor |
| US11669372B2 (en) * | 2018-12-13 | 2023-06-06 | Intel Corporation | Flexible allocation of compute resources |
| US11231963B2 (en) | 2019-08-15 | 2022-01-25 | Intel Corporation | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
2019
- 2019-08-15: US application US16/542,012, published as US11231963B2 (Active)

2020
- 2020-06-17: JP application JP2020104328A, published as JP7400169B2 (Active)
- 2020-06-18: CN application CN202010559855.3A, published as CN112395010A (Pending)
- 2020-06-18: TW application TW109120637A, published as TWI802800B (Active)
- 2020-06-18: CN application CN202210600897.6A, published as CN114895965B (Active)
- 2020-07-15: KR application KR1020200087436A, published as KR102684511B1 (Active)
- 2020-07-23: DE application DE102020119519.2A, published as DE102020119519A1 (Pending)

2021
- 2021-12-23: US application US17/561,500, published as US11847497B2 (Active)
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040268083A1 (en) | 2003-06-27 | 2004-12-30 | Tatsunori Kanai | Information processing system including processors and memory managing method used in the same system |
| JP2005018620A (ja) | 2003-06-27 | 2005-01-20 | Toshiba Corp | Information processing system and memory management method |
| US20120239833A1 (en) | 2011-03-14 | 2012-09-20 | Kabushiki Kaisha Toshiba | Buffer management device, buffer management method, and storage device |
| JP2012190415A (ja) | 2011-03-14 | 2012-10-04 | Toshiba Corp | Buffer management device, buffer management method, and storage device |
| JP2017525047A (ja) | 2014-07-30 | 2017-08-31 | Linear Algebra Technologies Limited | Low-power computer imaging |
| US20160140071A1 (en) | 2014-11-13 | 2016-05-19 | Cavium, Inc. | Arbitrated Access To Resources Among Multiple Devices |
| US20190050261A1 (en) | 2018-03-29 | 2019-02-14 | Intel Corporation | Arbitration across shared memory pools of disaggregated memory devices |
Also Published As
| Publication number | Publication date |
|---|---|
| TW202109285A (zh) | 2021-03-01 |
| US11847497B2 (en) | 2023-12-19 |
| CN114895965B (zh) | 2025-09-09 |
| KR20210021263A (ko) | 2021-02-25 |
| DE102020119519A1 (de) | 2021-02-18 |
| CN114895965A (zh) | 2022-08-12 |
| US20220197703A1 (en) | 2022-06-23 |
| KR102684511B1 (ko) | 2024-07-15 |
| JP2021034020A (ja) | 2021-03-01 |
| US11231963B2 (en) | 2022-01-25 |
| CN112395010A (zh) | 2021-02-23 |
| US20190370073A1 (en) | 2019-12-05 |
| TWI802800B (zh) | 2023-05-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7400169B2 (ja) | | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
| Han et al. | | Microsecond-scale preemption for concurrent GPU-accelerated DNN inferences |
| US10942716B1 (en) | | Dynamic computational acceleration using a heterogeneous hardware infrastructure |
| JP5658365B2 (ja) | | Method, system, and program for high-throughput computing in a hybrid computing environment |
| US12217101B2 (en) | | Methods and apparatus to configure heterogenous components in an accelerator |
| US20070260446A1 (en) | | DMA and Graphics Interface Emulation |
| CN112148472A (zh) | | Methods and apparatus to improve utilization of a heterogeneous system executing software |
| EP3779778A1 (en) | | Methods and apparatus to enable dynamic processing of a predefined workload |
| CN117120979A (zh) | | Asynchronous distributed data flow for machine learning workloads |
| WO2020005412A2 (en) | | Method and system for opportunistic load balancing in neural networks using metadata |
| US20230244525A1 (en) | | Methods and apparatus for an xpu-aware dynamic compute scheduling framework |
| US20220222177A1 (en) | | Systems, apparatus, articles of manufacture, and methods for improved data transfer for heterogeneous programs |
| US20190318229A1 (en) | | Method and system for hardware mapping inference pipelines |
| US20250036462A1 (en) | | Methods and apparatus for multilevel balancing of computational tasks |
| US20230168898A1 (en) | | Methods and apparatus to schedule parallel instructions using hybrid cores |
| US20230325185A1 (en) | | Methods and apparatus to accelerate matrix operations using direct memory access |
| US12001382B2 (en) | | Methods, apparatus, and articles of manufacture to generate command lists to be offloaded to accelerator circuitry |
| US20230236878A1 (en) | | Efficiently launching tasks on a processor |
| US11347544B1 (en) | | Scheduling work items based on declarative constraints |
| US20240095083A1 (en) | | Parallel workload scheduling based on workload data coherence |
| WO2024065826A1 (en) | | Accelerate deep learning with inter-iteration scheduling |
| US20240330045A1 (en) | | Input locality-adaptive kernel co-scheduling |
| Han et al. | | Real-time, Work-conserving GPU Scheduling for Concurrent DNN Inference |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 2022-06-13 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2022-06-13 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
| 2023-04-24 | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007 |
| 2023-05-16 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
| 2023-08-07 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| | TRDD | Decision of grant or rejection written | |
| 2023-11-07 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
| 2023-11-10 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7400169; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |