CN112395010A - 实现工作负载的静态映射的乱序流水线执行的方法和装置 - Google Patents
实现工作负载的静态映射的乱序流水线执行的方法和装置 Download PDFInfo
- Publication number
- CN112395010A CN112395010A CN202010559855.3A CN202010559855A CN112395010A CN 112395010 A CN112395010 A CN 112395010A CN 202010559855 A CN202010559855 A CN 202010559855A CN 112395010 A CN112395010 A CN 112395010A
- Authority
- CN
- China
- Prior art keywords
- credits
- buffer
- workload
- threshold number
- scheduler
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/22—Microcontrol or microprogram arrangements
- G06F9/28—Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
- G06F3/0613—Improving I/O performance in relation to throughput
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0659—Command handling arrangements, e.g. command buffers, queues, command scheduling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3846—Speculative instruction execution using static prediction, e.g. branch taken strategy
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/509—Offload
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Advance Control (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210600897.6A CN114895965B (zh) | 2019-08-15 | 2020-06-18 | 实现工作负载的静态映射的乱序流水线执行的方法和装置 |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/542,012 US11231963B2 (en) | 2019-08-15 | 2019-08-15 | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
| US16/542,012 | 2019-08-15 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210600897.6A Division CN114895965B (zh) | 2019-08-15 | 2020-06-18 | 实现工作负载的静态映射的乱序流水线执行的方法和装置 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN112395010A true CN112395010A (zh) | 2021-02-23 |
Family
ID=68693863
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010559855.3A Pending CN112395010A (zh) | 2019-08-15 | 2020-06-18 | 实现工作负载的静态映射的乱序流水线执行的方法和装置 |
| CN202210600897.6A Active CN114895965B (zh) | 2019-08-15 | 2020-06-18 | 实现工作负载的静态映射的乱序流水线执行的方法和装置 |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210600897.6A Active CN114895965B (zh) | 2019-08-15 | 2020-06-18 | 实现工作负载的静态映射的乱序流水线执行的方法和装置 |
Country Status (6)
| Country | Link |
|---|---|
| US (2) | US11231963B2 (enExample) |
| JP (1) | JP7400169B2 (enExample) |
| KR (1) | KR102684511B1 (enExample) |
| CN (2) | CN112395010A (enExample) |
| DE (1) | DE102020119519A1 (enExample) |
| TW (1) | TWI802800B (enExample) |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10901657B2 (en) * | 2018-11-29 | 2021-01-26 | International Business Machines Corporation | Dynamic write credit buffer management of non-volatile dual inline memory module |
| US11231963B2 (en) | 2019-08-15 | 2022-01-25 | Intel Corporation | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
| US11875247B1 (en) * | 2020-06-18 | 2024-01-16 | Amazon Technologies, Inc. | Input batching with serial dynamic memory access |
| US11704058B2 (en) * | 2020-07-28 | 2023-07-18 | Samsung Electronics Co., Ltd. | Systems and methods for resource-based scheduling of commands |
| CN112003846B (zh) * | 2020-08-13 | 2023-02-03 | 广州市百果园信息技术有限公司 | 一种信用阈值的训练、ip地址的检测方法及相关装置 |
| EP4211566B1 (en) * | 2020-10-26 | 2025-01-29 | Google LLC | Modulating credit allocations in memory subsystems |
| US11620159B2 (en) | 2021-04-23 | 2023-04-04 | Samsung Electronics Co., Ltd. | Systems and methods for I/O command scheduling based on multiple resource parameters |
| US12001701B2 (en) * | 2022-01-26 | 2024-06-04 | Western Digital Technologies, Inc. | Storage biasing for solid state drive accelerators |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5418953A (en) * | 1993-04-12 | 1995-05-23 | Loral/Rohm Mil-Spec Corp. | Method for automated deployment of a software program onto a multi-processor architecture |
| JP3892829B2 (ja) | 2003-06-27 | 2007-03-14 | 株式会社東芝 | 情報処理システムおよびメモリ管理方法 |
| JP5349515B2 (ja) | 2011-03-14 | 2013-11-20 | 株式会社東芝 | バッファ管理装置、バッファ管理方法及び記憶装置 |
| US9395990B2 (en) * | 2013-06-28 | 2016-07-19 | Intel Corporation | Mode dependent partial width load to wider register processors, methods, and systems |
| EP3982234A3 (en) | 2014-07-30 | 2022-05-11 | Movidius Ltd. | Low power computational imaging |
| US10002099B2 (en) * | 2014-11-13 | 2018-06-19 | Cavium, Inc. | Arbitrated access to resources among multiple devices |
| US11153223B2 (en) * | 2016-04-07 | 2021-10-19 | International Business Machines Corporation | Specifying a disaggregated compute system |
| US10289752B2 (en) * | 2016-12-12 | 2019-05-14 | Intel Corporation | Accelerator for gather-update-scatter operations including a content-addressable memory (CAM) and CAM controller |
| GB2569275B (en) * | 2017-10-20 | 2020-06-03 | Graphcore Ltd | Time deterministic exchange |
| GB2569271B (en) * | 2017-10-20 | 2020-05-13 | Graphcore Ltd | Synchronization with a host processor |
| US10649813B2 (en) * | 2018-03-29 | 2020-05-12 | Intel Corporation | Arbitration across shared memory pools of disaggregated memory devices |
| US11669372B2 (en) * | 2018-12-13 | 2023-06-06 | Intel Corporation | Flexible allocation of compute resources |
| US11231963B2 (en) | 2019-08-15 | 2022-01-25 | Intel Corporation | Methods and apparatus to enable out-of-order pipelined execution of static mapping of a workload |
-
2019
- 2019-08-15 US US16/542,012 patent/US11231963B2/en active Active
-
2020
- 2020-06-17 JP JP2020104328A patent/JP7400169B2/ja active Active
- 2020-06-18 CN CN202010559855.3A patent/CN112395010A/zh active Pending
- 2020-06-18 TW TW109120637A patent/TWI802800B/zh active
- 2020-06-18 CN CN202210600897.6A patent/CN114895965B/zh active Active
- 2020-07-15 KR KR1020200087436A patent/KR102684511B1/ko active Active
- 2020-07-23 DE DE102020119519.2A patent/DE102020119519A1/de active Pending
-
2021
- 2021-12-23 US US17/561,500 patent/US11847497B2/en active Active
Also Published As
| Publication number | Publication date |
|---|---|
| TW202109285A (zh) | 2021-03-01 |
| US11847497B2 (en) | 2023-12-19 |
| CN114895965B (zh) | 2025-09-09 |
| KR20210021263A (ko) | 2021-02-25 |
| JP7400169B2 (ja) | 2023-12-19 |
| DE102020119519A1 (de) | 2021-02-18 |
| CN114895965A (zh) | 2022-08-12 |
| US20220197703A1 (en) | 2022-06-23 |
| KR102684511B1 (ko) | 2024-07-15 |
| JP2021034020A (ja) | 2021-03-01 |
| US11231963B2 (en) | 2022-01-25 |
| US20190370073A1 (en) | 2019-12-05 |
| TWI802800B (zh) | 2023-05-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN114895965B (zh) | 实现工作负载的静态映射的乱序流水线执行的方法和装置 | |
| US11847508B2 (en) | Convergence among concurrently executing threads | |
| JP7523964B2 (ja) | アクセラレータにおいてヘテロジニアスコンポーネントを設定する方法及び装置 | |
| JP6329274B2 (ja) | コンパイラ最適化のためのメモリ参照メタデータ | |
| US11036477B2 (en) | Methods and apparatus to improve utilization of a heterogeneous system executing software | |
| US20160371081A1 (en) | Dynamic computational acceleration using a heterogeneous hardware infrastructure | |
| CN102576314A (zh) | 具有横跨多个处理器的数据并行线程之映射处理逻辑 | |
| US11880715B2 (en) | Method and system for opportunistic load balancing in neural networks using metadata | |
| EP3779778A1 (en) | Methods and apparatus to enable dynamic processing of a predefined workload | |
| US20190318229A1 (en) | Method and system for hardware mapping inference pipelines | |
| US20220206851A1 (en) | Regenerative work-groups | |
| US12499504B2 (en) | System and method for adaptive graph-to-stream scheduling | |
| US20240193721A1 (en) | System and method for adaptive graph-to-stream scheduling | |
| US20230136365A1 (en) | Methods and apparatus to allocate accelerator usage | |
| US20240330045A1 (en) | Input locality-adaptive kernel co-scheduling |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination |