WO2019153683A1 - Configurable and flexible instruction scheduler - Google Patents

Configurable and flexible instruction scheduler

Info

Publication number
WO2019153683A1
Authority
WO
WIPO (PCT)
Prior art keywords
module
instruction
instruction queue
queue
configurable
Prior art date
Application number
PCT/CN2018/099749
Other languages
French (fr)
Chinese (zh)
Inventor
洪振洲
李庭育
陈育鸣
Original Assignee
江苏华存电子科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-02-06
Filing date
2018-08-09
Publication date
2019-08-15
Application filed by 江苏华存电子科技有限公司
Publication of WO2019153683A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3854 Instruction completion, e.g. retiring, committing or graduating
    • G06F9/3856 Reordering of instructions, e.g. using queues or age tags
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802 Instruction prefetching

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

Disclosed in the present invention is a configurable and flexible instruction scheduler, comprising a central microprocessor, a memory, a first hardware module and a second hardware module. The central microprocessor is connected to the memory, the first hardware module and the second hardware module by means of a bus. The memory is provided with an instruction queue unit and an instruction queue setting unit. The instruction queue setting unit is connected to the first hardware module and the second hardware module. In the present invention, the central microprocessor can provide each sub-module with a more flexible instruction length and instruction queue depth, so that the microprocessor can more easily issue an instruction.

Description

A configurable and flexible instruction scheduler
Technical field
The invention relates to the technical field of instruction scheduling, and in particular to a configurable and flexible instruction scheduler.
Background
Instruction scheduling is a technique for executing instructions in parallel. The compiler or the machine hardware reorders instructions to increase the number of instructions the machine executes per beat, where a beat is the clock cycle of machine instruction execution that the compiler simulates while compiling the source program. Existing compilers usually implement instruction scheduling with a list scheduling algorithm, typically using a single candidate instruction queue. Specifically, the scheduler first builds a data dependency graph for the instructions to be scheduled. The graph consists of a number of nodes, each representing one instruction, and it expresses the dependencies between instructions. The scheduler then computes a priority for each instruction and schedules the instructions in the graph beat by beat. Instruction scheduling is an effective way for a compiler to exploit a program's latent instruction-level parallelism. Without changing program semantics, and while satisfying the dependency and resource constraints of the target machine, it reorders instructions to increase the number of instructions the target machine can execute in one cycle. Instruction scheduling is a key technique in modern high-performance compilers: it determines the relative execution order of operations, their specific execution times, and which hardware resources they use. In terms of how code blocks are partitioned, instruction scheduling divides into local and global scheduling: local instruction scheduling operates within a basic block, while global scheduling operates across basic blocks.
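The list scheduling procedure described above (build a data dependency graph, compute priorities, then schedule beat by beat from a candidate queue) can be sketched as follows. This is only an illustration under stated assumptions, not the method of any particular compiler: the function name `list_schedule`, the `width` parameter (instructions issued per beat), unit instruction latency, and the choice of longest dependency chain as the priority are all assumptions introduced here, and the dependency graph is assumed to be acyclic.

```python
from collections import defaultdict


def list_schedule(instrs, deps, width=2):
    """Greedy list scheduling with unit latency (illustrative sketch).

    instrs: instruction names in program order.
    deps:   (producer, consumer) pairs from the data dependency graph.
    width:  instructions the machine may issue per beat (assumed).
    Returns a list of beats, each a list of instruction names.
    """
    succs = defaultdict(list)
    unmet = {i: 0 for i in instrs}
    for producer, consumer in deps:
        succs[producer].append(consumer)
        unmet[consumer] += 1

    # Assumed priority: length of the longest dependency chain from the node.
    priority = {}

    def chain_length(node):
        if node not in priority:
            priority[node] = 1 + max(
                (chain_length(s) for s in succs[node]), default=0
            )
        return priority[node]

    for i in instrs:
        chain_length(i)

    order = {name: pos for pos, name in enumerate(instrs)}
    # Candidate instruction queue: instructions with no unmet dependencies.
    candidates = [i for i in instrs if unmet[i] == 0]
    beats = []
    while candidates:
        # Highest priority first; ties fall back to program order.
        candidates.sort(key=lambda i: (-priority[i], order[i]))
        issued, candidates = candidates[:width], candidates[width:]
        beats.append(issued)
        # Dependents released here become eligible in the next beat.
        for i in issued:
            for s in succs[i]:
                unmet[s] -= 1
                if unmet[s] == 0:
                    candidates.append(s)
    return beats
```

For example, with instructions `a`, `b`, `c`, `d`, where `c` depends on `a` and `b`, and `d` depends on `c`, a two-wide machine issues `a` and `b` together in the first beat, then `c`, then `d`.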
The existing system-on-chip architecture consists of multiple sub-modules, including a central microprocessor, connected by an external bus. The central controller issues each instruction separately over the external bus to carry out an operation, so every instruction issue goes through the bus and performance suffers.
Summary of the invention
The object of the present invention is to provide a configurable and flexible instruction scheduler that solves the problems raised in the background above.
To achieve this object, the present invention provides the following technical solution: a configurable and flexible instruction scheduler comprising a central microprocessor, a memory, a first hardware module, and a second hardware module. The central microprocessor is connected via a bus to the memory, the first hardware module, and the second hardware module, respectively. The memory is provided with an instruction queue unit and an instruction queue setting unit, and the instruction queue setting unit is connected to the first hardware module and the second hardware module, respectively.
Preferably, the instruction queue unit includes a first instruction queue module, a second instruction queue module, a third instruction queue module, and an Nth instruction queue module, where N is an integer greater than 3.
Preferably, the instruction queue setting unit includes a first instruction queue setting module, a second instruction queue setting module, a third instruction queue setting module, and an Mth instruction queue setting module, where M is an integer greater than 3.
Preferably, the first instruction queue setting module is connected to the first instruction queue module, the second instruction queue setting module is connected to the second instruction queue module, the third instruction queue setting module is connected to the third instruction queue module, and the Mth instruction queue setting module is connected to the Nth instruction queue module.
Compared with the prior art, the beneficial effect of the present invention is that the central microprocessor can give each sub-module a more flexible instruction length and instruction queue depth, which makes it easier for the microprocessor to issue instructions independently.
Brief description of the drawings
Figure 1 is a schematic diagram of the structure of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Referring to FIG. 1, the present invention provides a technical solution: a configurable and flexible instruction scheduler comprising a central microprocessor 1, a memory 2, a first hardware module 3, and a second hardware module 4. The central microprocessor 1 is connected via a bus to the memory 2, the first hardware module 3, and the second hardware module 4, respectively. The memory 2 is provided with an instruction queue unit 5 and an instruction queue setting unit 6, and the instruction queue setting unit 6 is connected to the first hardware module 3 and the second hardware module 4, respectively.
In the present invention, the instruction queue unit 5 includes a first instruction queue module 7, a second instruction queue module 8, a third instruction queue module 9, and an Nth instruction queue module, where N is an integer greater than 3. The instruction queue setting unit 6 includes a first instruction queue setting module 10, a second instruction queue setting module 11, a third instruction queue setting module 12, and an Mth instruction queue setting module, where M is an integer greater than 3. The first instruction queue setting module 10 is connected to the first instruction queue module 7, the second instruction queue setting module 11 is connected to the second instruction queue module 8, the third instruction queue setting module 12 is connected to the third instruction queue module 9, and the Mth instruction queue setting module is connected to the Nth instruction queue module.
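To make the pairing of setting modules and queue modules concrete, the sketch below lays out one memory region per instruction queue from that queue's own setting. The patent does not specify a memory layout, so the back-to-back placement, the function name `layout_queues`, and the tuple format `(name, instruction_length, depth)` are assumptions made purely for illustration.

```python
def layout_queues(settings, memory_size, start=0):
    """Assign each queue a contiguous region of length * depth bytes.

    settings: list of (name, instruction_length, depth) tuples, one per
              instruction queue setting module (hypothetical format).
    Returns {name: (base_address, size_in_bytes)}.
    """
    regions = {}
    cursor = start
    for name, length, depth in settings:
        if length <= 0 or depth <= 0:
            raise ValueError(f"{name}: length and depth must be positive")
        size = length * depth
        if cursor + size > memory_size:
            raise MemoryError(f"{name}: queue does not fit in memory")
        regions[name] = (cursor, size)
        cursor += size  # queues are packed back to back (assumed)
    return regions
```

Under these assumptions, a queue of eight 4-byte instructions and a queue of two 16-byte instructions each occupy 32 bytes, which shows how a sub-module with long instructions can trade depth for length within the same memory budget.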
The microprocessor issues commands to this central instruction control module over the external bus. Commands for different modules are written to their respectively defined addresses in the memory. Each instruction queue corresponds to a different module in the system and carries a different meaning, and both the instruction length and the number of instructions are set in the instruction queue setting register. The microprocessor can issue multiple instructions at once and then write the register to indicate how many instructions have been written to the queue for the external hardware sub-module to execute. Each hardware sub-module learns from the instruction queue setting signal how many instructions are currently pending and fetches them over the external bus. Once fetching is complete, the sub-module rewrites the register over the bus to notify the microprocessor how many instructions have been fetched. This optimizes the use of the instruction memory space and allows the instruction queue depth to be adjusted to actual hardware requirements, which provides flexibility. Likewise, an external hardware sub-module can write commands in the same way for the microprocessor to execute.
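The handshake described above can be modeled in software as a producer that writes a batch of instructions and then updates a register once, and a consumer that reads the pending count, fetches, and reports back. The sketch below is a simplified model under assumptions: the register fields (`base`, `length`, `depth`, `written`, `fetched`), the circular reuse of slots, and the function names `post` and `fetch_all` are hypothetical and do not come from the patent, which does not define a register layout.

```python
from dataclasses import dataclass


@dataclass
class QueueSetting:
    """Hypothetical layout of one instruction queue setting register."""
    base: int          # start address of the queue in shared memory
    length: int        # bytes per instruction for this sub-module
    depth: int         # number of instruction slots in the queue
    written: int = 0   # running count posted by the producer
    fetched: int = 0   # running count posted by the consumer

    def pending(self):
        return self.written - self.fetched

    def free_slots(self):
        return self.depth - self.pending()


def post(memory, q, instructions):
    """Producer side: write a batch, then update the register once."""
    if len(instructions) > q.free_slots():
        raise BufferError("instruction queue is full")
    for n, instr in enumerate(instructions):
        if len(instr) != q.length:
            raise ValueError("instruction length does not match the setting")
        slot = (q.written + n) % q.depth  # slots are reused circularly (assumed)
        start = q.base + slot * q.length
        memory[start:start + q.length] = instr
    q.written += len(instructions)  # one register write announces the batch


def fetch_all(memory, q):
    """Consumer side: read every pending instruction, then report back."""
    batch = []
    for n in range(q.pending()):
        slot = (q.fetched + n) % q.depth
        start = q.base + slot * q.length
        batch.append(bytes(memory[start:start + q.length]))
    q.fetched += len(batch)  # register rewrite tells the producer slots are free
    return batch
```

In this model `memory` is a `bytearray` standing in for the shared memory, and the same two functions serve the reverse direction, where a hardware sub-module posts commands and the microprocessor fetches them.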
In the present invention, the central microprocessor can give each sub-module a more flexible instruction length and instruction queue depth, which makes it easier for the microprocessor to issue instructions independently.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (4)

  1. A configurable and flexible instruction scheduler, characterized by comprising a central microprocessor (1), a memory (2), a first hardware module (3), and a second hardware module (4), wherein the central microprocessor (1) is connected via a bus to the memory (2), the first hardware module (3), and the second hardware module (4), respectively; the memory (2) is provided with an instruction queue unit (5) and an instruction queue setting unit (6); and the instruction queue setting unit (6) is connected to the first hardware module (3) and the second hardware module (4), respectively.
  2. The configurable and flexible instruction scheduler according to claim 1, characterized in that the instruction queue unit (5) comprises a first instruction queue module (7), a second instruction queue module (8), a third instruction queue module (9), and an Nth instruction queue module, where N is an integer greater than 3.
  3. The configurable and flexible instruction scheduler according to claim 2, characterized in that the instruction queue setting unit (6) comprises a first instruction queue setting module (10), a second instruction queue setting module (11), a third instruction queue setting module (12), and an Mth instruction queue setting module, where M is an integer greater than 3.
  4. The configurable and flexible instruction scheduler according to claim 3, characterized in that the first instruction queue setting module (10) is connected to the first instruction queue module (7), the second instruction queue setting module (11) is connected to the second instruction queue module (8), the third instruction queue setting module (12) is connected to the third instruction queue module (9), and the Mth instruction queue setting module is connected to the Nth instruction queue module.
PCT/CN2018/099749 2018-02-06 2018-08-09 Configurable and flexible instruction scheduler WO2019153683A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810118967.8 2018-02-06
CN201810118967.8A CN108228242B (en) 2018-02-06 2018-02-06 Configurable and flexible instruction scheduler

Publications (1)

Publication Number Publication Date
WO2019153683A1 (en) 2019-08-15

Family

ID=62670696

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/099749 WO2019153683A1 (en) 2018-02-06 2018-08-09 Configurable and flexible instruction scheduler

Country Status (2)

Country Link
CN (1) CN108228242B (en)
WO (1) WO2019153683A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228242B (en) * 2018-02-06 2020-02-07 江苏华存电子科技有限公司 Configurable and flexible instruction scheduler
US20220374237A1 (en) * 2021-05-21 2022-11-24 Telefonaktiebolaget Lm Ericsson (Publ) Apparatus and method for identifying and prioritizing certain instructions in a microprocessor instruction pipeline
KR20230095507A (en) * 2021-12-22 2023-06-29 삼성전자주식회사 Scheduling method for neural network computation and apparatus thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030163671A1 (en) * 2002-02-26 2003-08-28 Gschwind Michael Karl Method and apparatus for prioritized instruction issue queue
CN101710272A (en) * 2009-10-28 2010-05-19 北京龙芯中科技术服务中心有限公司 Device and method for instruction scheduling
CN102495724A (en) * 2011-11-04 2012-06-13 杭州中天微系统有限公司 Data processor for improving storage instruction execution efficiency
CN108228242A (en) * 2018-02-06 2018-06-29 江苏华存电子科技有限公司 A kind of configurable and tool elasticity instruction scheduler

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7130990B2 (en) * 2002-12-31 2006-10-31 Intel Corporation Efficient instruction scheduling with lossy tracking of scheduling information
US7613904B2 (en) * 2005-02-04 2009-11-03 Mips Technologies, Inc. Interfacing external thread prioritizing policy enforcing logic with customer modifiable register to processor internal scheduler
CN104424026B (en) * 2013-08-21 2017-11-17 华为技术有限公司 One kind instruction dispatching method and device
US10175988B2 (en) * 2015-06-26 2019-01-08 Microsoft Technology Licensing, Llc Explicit instruction scheduler state information for a processor

Also Published As

Publication number Publication date
CN108228242A (en) 2018-06-29
CN108228242B (en) 2020-02-07

Similar Documents

Publication Publication Date Title
US10467183B2 (en) Processors and methods for pipelined runtime services in a spatial array
US10387319B2 (en) Processors, methods, and systems for a configurable spatial accelerator with memory system performance, power reduction, and atomics support features
TWI628594B (en) User-level fork and join processors, methods, systems, and instructions
Owaida et al. Synthesis of platform architectures from OpenCL programs
Govindaraju et al. Dyser: Unifying functionality and parallelism specialization for energy-efficient computing
US20190101952A1 (en) Processors and methods for configurable clock gating in a spatial array
JP6525286B2 (en) Processor core and processor system
US20190004878A1 (en) Processors, methods, and systems for a configurable spatial accelerator with security, power reduction, and performace features
CN110249302B (en) Simultaneous execution of multiple programs on a processor core
US9830156B2 (en) Temporal SIMT execution optimization through elimination of redundant operations
US9619298B2 (en) Scheduling computing tasks for multi-processor systems based on resource requirements
KR20180021812A (en) Block-based architecture that executes contiguous blocks in parallel
JP2019079525A (en) Instruction set
WO2019153683A1 (en) Configurable and flexible instruction scheduler
US20140317626A1 (en) Processor for batch thread processing, batch thread processing method using the same, and code generation apparatus for batch thread processing
US8615770B1 (en) System and method for dynamically spawning thread blocks within multi-threaded processing systems
EP2798520A1 (en) Method and apparatus for controlling a mxcsr
WO2019153681A1 (en) Smart instruction scheduler
Huthmann et al. Automatic high-level synthesis of multi-threaded hardware accelerators
EP1537476A2 (en) System and method for executing branch instructions in a vliw processor
US20060200648A1 (en) High-level language processor apparatus and method
US8959497B1 (en) System and method for dynamically spawning thread blocks within multi-threaded processing systems
WO2019136983A1 (en) Low-delay instruction scheduler
KR101420592B1 (en) Computer system
WO2019153684A1 (en) Method for automatically managing low-latency instruction scheduler

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18905633

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18905633

Country of ref document: EP

Kind code of ref document: A1