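JPWO2020150013A5 - Languages and compilers that generate synchronous digital circuits that maintain thread execution order - Google Patents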

Info

Publication number
JPWO2020150013A5
Authority
JP
Japan
Prior art keywords
pipeline
thread
stages
order
threads
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2021540807A
Other languages
Japanese (ja)
Other versions
JP7402240B2 (en)
JP2022518209A (en)
Publication date
Priority claimed from US16/247,269 (US11093682B2)
Application filed
Publication of JP2022518209A
Publication of JPWO2020150013A5
Application granted
Publication of JP7402240B2
Active
Anticipated expiration

Claims (19)

1. A computer-implemented method comprising:
receiving source code expressed in a multithreaded programming language, the source code including a branch statement that directs execution to one of a plurality of source code paths;
compiling the source code into a circuit description that includes a pipeline, the pipeline including a plurality of code paths associated with the plurality of source code paths, the compiling comprising:
determining the number of pipeline stages in the one of the plurality of code paths that has the largest number of pipeline stages; and
adding pipeline stages to at least one of the plurality of code paths until each of the plurality of code paths has that number of pipeline stages; and
generating, based on the circuit description, a synchronous digital circuit comprising a circuit implementation.
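
The two compiling steps above (find the longest code path, then pad the shorter ones) can be illustrated with a minimal Python sketch. This is not the patented compiler: the function name `balance_code_paths`, the stage names, and the `pass_through` padding stage are hypothetical, and a code path is modeled simply as a list of stage names.

```python
from typing import Dict, List


def balance_code_paths(
    paths: Dict[str, List[str]], pad_stage: str = "pass_through"
) -> Dict[str, List[str]]:
    """Pad every code path to the stage count of the longest path.

    `pad_stage` is a hypothetical placeholder for whatever stage the
    compiler would actually add; it is an assumption of this sketch.
    """
    if not paths:
        return {}
    # Step 1: number of stages in the code path with the most stages.
    target = max(len(stages) for stages in paths.values())
    # Step 2: add (target - own length) stages to each shorter path.
    return {
        name: stages + [pad_stage] * (target - len(stages))
        for name, stages in paths.items()
    }


# Hypothetical branch with a three-stage "then" path and a one-stage "else" path.
paths = {"then": ["load", "multiply", "add"], "else": ["load"]}
balanced = balance_code_paths(paths)
```

In this sketch the "else" path receives two padding stages and the "then" path is left unchanged, so both paths end up with three stages.
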
2. The computer-implemented method of claim 1, wherein a plurality of threads enter the pipeline in a first order and the plurality of threads exit the pipeline in the first order.
3. The computer-implemented method of claim 1, wherein adding pipeline stages to one or more of the plurality of code paths comprises determining the number of pipeline stages in the longest code path, and adding to each code path a number of pipeline stages equal to the number of pipeline stages in the longest code path minus the number of pipeline stages in that code path.
4. The computer-implemented method of claim 1, wherein the pipeline comprises a first pipeline, the circuit description includes a second pipeline, a thread executing the first pipeline passes execution to the second pipeline by pushing local variables onto a first-in-first-out queue, and the second pipeline maintains thread execution order across the pipelines by reading the local variables from the first-in-first-out queue in the order in which they were pushed.
5. The computer-implemented method of claim 1, wherein the source code includes a reorder block construct that wraps a programming construct that does not maintain thread execution order, the reorder block construct mapping to a circuit implementation that:
records the execution order of incoming threads;
allows the threads to execute the construct that does not maintain thread execution order; and
blocks a thread from resuming until all lower-ordered threads have resumed.
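
The reorder block behavior described above (record arrival order, let the wrapped construct finish in any order, release a thread only after every earlier thread has been released) can be modeled in software. The following Python sketch is an illustration only; the class name `ReorderBlock` and its methods `enter` and `complete` are hypothetical and do not come from the patent, which describes a circuit rather than a software object.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Tuple


class ReorderBlock:
    """Software model of a reorder block (hypothetical API)."""

    def __init__(self) -> None:
        self._order: Deque[int] = deque()  # thread ids in arrival order
        self._done: Dict[int, Any] = {}    # finished but not yet released

    def enter(self, thread_id: int) -> None:
        # Record the execution order of the incoming thread.
        self._order.append(thread_id)

    def complete(self, thread_id: int, result: Any) -> List[Tuple[int, Any]]:
        # The wrapped construct may finish threads in any order.
        self._done[thread_id] = result
        released: List[Tuple[int, Any]] = []
        # Release from the head only: a thread stays blocked until all
        # threads that entered before it have been released.
        while self._order and self._order[0] in self._done:
            tid = self._order.popleft()
            released.append((tid, self._done.pop(tid)))
        return released


block = ReorderBlock()
for tid in (0, 1, 2):
    block.enter(tid)
first = block.complete(2, "c")   # thread 2 finishes early and is held back
second = block.complete(0, "a")  # thread 0 is at the head, so it is released
third = block.complete(1, "b")   # releases thread 1, then the waiting thread 2
```
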
6. The computer-implemented method of claim 1, wherein at least one of the added pipeline stages comprises a computational unit and is configured to store a result produced by the computational unit in a register.
7. The computer-implemented method of claim 1, wherein the pipeline comprises stages that are executed in sequence, and a plurality of threads maintain execution order by passing through the stages in that sequence.
8. A computing device comprising:
one or more processors; and
at least one computer storage medium storing computer-executable instructions that, when executed by the one or more processors, cause the computing device to:
receive source code expressed in a multithreaded programming language;
compile the source code into a circuit description that includes a first pipeline, a second pipeline, and a first-in-first-out (FIFO) queue that stores sets of local thread variables passed from the first pipeline to the second pipeline, wherein the first pipeline stores the sets of local thread variables in the FIFO queue in a thread execution order, the second pipeline maintains the thread execution order by retrieving the sets of local thread variables from the FIFO queue in the thread execution order, the source code includes a branch statement that directs execution to one of a plurality of source code paths, the first pipeline includes a plurality of code paths associated with the plurality of source code paths, one or more pipeline stages are added to one or more of the plurality of code paths so that the plurality of code paths have the same number of pipeline stages, and at least one of the added pipeline stages includes a computational unit and is configured to store a result produced by the computational unit in a register; and
generate, based on the circuit description, a synchronous digital circuit comprising a circuit implementation.
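
The FIFO handoff between the first and second pipelines can be sketched as follows. This is a simplified software model, not the generated circuit: the function names, the `tid`, `x`, and `y` variables, and the arithmetic done in each pipeline are all assumptions made for illustration. A thread is modeled as a dictionary of its local variables.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

ThreadLocals = Dict[str, int]


def first_pipeline(threads: List[ThreadLocals], fifo: Deque[ThreadLocals]) -> None:
    """Process threads in execution order and push their locals onto the FIFO."""
    for local_vars in threads:
        # Hypothetical work done by the first pipeline.
        updated = {**local_vars, "y": local_vars["x"] * 2}
        fifo.append(updated)  # pushed in thread execution order


def second_pipeline(fifo: Deque[ThreadLocals]) -> List[Tuple[int, int]]:
    """Pop locals in the order they were pushed, preserving thread order."""
    results: List[Tuple[int, int]] = []
    while fifo:
        local_vars = fifo.popleft()
        # Hypothetical work done by the second pipeline.
        results.append((local_vars["tid"], local_vars["y"] + 1))
    return results


fifo: Deque[ThreadLocals] = deque()
threads = [{"tid": 0, "x": 10}, {"tid": 1, "x": 20}, {"tid": 2, "x": 30}]
first_pipeline(threads, fifo)
results = second_pipeline(fifo)
```

Because the queue is first-in-first-out, the second pipeline sees the thread identifiers in the same order in which the first pipeline pushed them.
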
9. The computing device of claim 8, wherein adding pipeline stages to one or more of the plurality of code paths comprises determining the number of pipeline stages in the longest code path, and adding to each code path a number of pipeline stages equal to the number of pipeline stages in the longest code path minus the number of pipeline stages in that code path.
10. The computing device of claim 8, wherein the source code includes a reorder block construct that wraps a programming construct that does not maintain thread execution order, the reorder block construct mapping to a circuit implementation that:
records the execution order of incoming threads;
allows the threads to execute the construct that does not maintain thread execution order; and
blocks a thread from resuming until all lower-ordered threads have resumed.
11. The computing device of claim 10, wherein threads exit the reorder block in the order in which they entered.
12. The computing device of claim 8, wherein a thread comprises a set of local thread variables provided to the first pipeline for execution.
13. The computing device of claim 8, wherein the first pipeline comprises stages that are executed in sequence, and a plurality of threads maintain execution order by passing through the stages in sequence.
14. At least one computer storage medium storing computer-executable instructions that, when executed by one or more processors, cause a computing device to:
receive source code expressed in a multithreaded programming language, the source code comprising a construct that maps to a circuit implementation, the construct comprising a reorder block and a construct that does not maintain thread execution order, the circuit implementation comprising:
a reorder buffer that registers thread identifiers in the order in which a plurality of threads are received; and
a circuit that executes for an unknown number of clock cycles for each of the plurality of threads, wherein the reorder buffer blocks a thread from resuming until all threads with a lower execution order have resumed;
compile the construct into a circuit description; and
generate, based on the circuit description, a synchronous digital circuit comprising the circuit implementation.
15. The at least one computer storage medium of claim 14, wherein the source code includes a branch statement that directs execution to one of a plurality of source code paths, the circuit description includes a pipeline that includes a plurality of code paths, and one or more pipeline stages are added to one or more of the plurality of code paths so that the plurality of code paths have the same number of pipeline stages.
16. The at least one computer storage medium of claim 15, wherein adding pipeline stages to one or more of the plurality of code paths comprises determining the number of pipeline stages in the longest code path, and adding to each code path a number of pipeline stages equal to the number of pipeline stages in the longest code path minus the number of pipeline stages in that code path.
17. The at least one computer storage medium of claim 15, wherein a thread comprises a set of local thread variables provided to a first pipeline of the pipelines for execution.
18. The at least one computer storage medium of claim 15, wherein a first pipeline of the pipelines comprises stages that are executed in sequence, and a plurality of threads maintain execution order by passing through the stages in that sequence.
19. The at least one computer storage medium of claim 15, wherein the pipeline includes a first pipeline, the circuit description includes a second pipeline, a thread executing the first pipeline passes execution to the second pipeline by pushing local variables onto a first-in-first-out queue, and the second pipeline maintains thread execution order across the pipelines by reading the local variables from the first-in-first-out queue in the order in which they were pushed.
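
Why equal stage counts keep threads in order can be seen with a small timing model. The Python sketch below assumes, purely for illustration, that one thread enters the pipeline per clock cycle and that a thread exits after as many cycles as its code path has stages; the function `exit_order` and the path names are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple


def exit_order(
    threads: List[Tuple[int, str]], path_lengths: Dict[str, int]
) -> List[int]:
    """Return thread ids in the order they leave the pipeline.

    Assumed model: thread i enters on cycle i and leaves on cycle
    i + (number of stages in the code path it takes).
    """
    exits = []
    for entry_cycle, (thread_id, path) in enumerate(threads):
        exit_cycle = entry_cycle + path_lengths[path]
        exits.append((exit_cycle, entry_cycle, thread_id))
    # Sort by exit cycle, breaking ties by entry order.
    return [thread_id for _, _, thread_id in sorted(exits)]


# Threads 0 and 2 take the "then" path; thread 1 takes the "else" path.
threads = [(0, "then"), (1, "else"), (2, "then")]

unbalanced = exit_order(threads, {"then": 3, "else": 1})
balanced = exit_order(threads, {"then": 3, "else": 3})
```

Under this model the unbalanced pipeline lets thread 1 overtake thread 0, while the balanced pipeline returns the threads in their entry order.
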
JP2021540807A 2019-01-14 2020-01-04 Languages and compilers that generate synchronous digital circuits that maintain thread execution order Active JP7402240B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/247,269 US11093682B2 (en) 2019-01-14 2019-01-14 Language and compiler that generate synchronous digital circuits that maintain thread execution order
US16/247,269 2019-01-14
PCT/US2020/012278 WO2020150013A1 (en) 2019-01-14 2020-01-04 Language and compiler that generate synchronous digital circuits that maintain thread execution order

Publications (3)

Publication Number Publication Date
JP2022518209A (en) 2022-03-14
JPWO2020150013A5 (en) 2023-01-04
JP7402240B2 (en) 2023-12-20

Family

ID=69400654

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2021540807A Active JP7402240B2 (en) 2019-01-14 2020-01-04 Languages and compilers that generate synchronous digital circuits that maintain thread execution order

Country Status (13)

Country Link
US (1) US11093682B2 (en)
EP (1) EP3912025A1 (en)
JP (1) JP7402240B2 (en)
KR (1) KR20210112330A (en)
CN (1) CN113316762A (en)
AU (1) AU2020209446A1 (en)
BR (1) BR112021010345A2 (en)
CA (1) CA3123903A1 (en)
IL (1) IL284548A (en)
MX (1) MX2021008474A (en)
SG (1) SG11202107262RA (en)
WO (1) WO2020150013A1 (en)
ZA (1) ZA202103821B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11144286B2 (en) 2019-01-14 2021-10-12 Microsoft Technology Licensing, Llc Generating synchronous digital circuits from source code constructs that map to circuit implementations
US10810343B2 (en) 2019-01-14 2020-10-20 Microsoft Technology Licensing, Llc Mapping software constructs to synchronous digital circuits that do not deadlock
US11113176B2 (en) 2019-01-14 2021-09-07 Microsoft Technology Licensing, Llc Generating a debugging network for a synchronous digital circuit during compilation of program source code
US11275568B2 (en) 2019-01-14 2022-03-15 Microsoft Technology Licensing, Llc Generating a synchronous digital circuit from a source code construct defining a function call
US11106437B2 (en) 2019-01-14 2021-08-31 Microsoft Technology Licensing, Llc Lookup table optimization for programming languages that target synchronous digital circuits
US11366647B2 (en) * 2020-04-30 2022-06-21 Intel Corporation Automatic compiler dataflow optimization to enable pipelining of loops with local storage requirements

Family Cites Families (116)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5343554A (en) 1988-05-20 1994-08-30 John R. Koza Non-linear genetic process for data encoding and for solving problems using automatically defined functions
US5642304A (en) 1991-08-16 1997-06-24 Simpson; John Richard Apparatus for high-speed solution of arbitrary mathematical expressions with logic code generator and programmable logic circuit
US5416719A (en) 1992-12-17 1995-05-16 Vlsi Technology, Inc. Computerized generation of truth tables for sequential and combinatorial cells
US8487653B2 (en) 2006-08-05 2013-07-16 Tang System SDOC with FPHA and FPXC: system design on chip with field programmable hybrid array of FPAA, FPGA, FPLA, FPMA, FPRA, FPTA and frequency programmable xtaless clockchip with trimless/trimfree self-adaptive bandgap reference xtaless clockchip
US6112019A (en) 1995-06-12 2000-08-29 Georgia Tech Research Corp. Distributed instruction queue
US5761483A (en) 1995-08-18 1998-06-02 Xilinx, Inc. Optimizing and operating a time multiplexed programmable logic device
US6212601B1 (en) 1996-08-30 2001-04-03 Texas Instruments Incorporated Microprocessor system with block move circuit disposed between cache circuits
US6061521A (en) 1996-12-02 2000-05-09 Compaq Computer Corp. Computer having multimedia operations executable as two distinct sets of operations within a single instruction cycle
US5909572A (en) 1996-12-02 1999-06-01 Compaq Computer Corp. System and method for conditionally moving an operand from a source register to a destination register
US6784903B2 (en) 1997-08-18 2004-08-31 National Instruments Corporation System and method for configuring an instrument to perform measurement functions utilizing conversion of graphical programs into hardware implementations
US6275508B1 (en) 1998-04-21 2001-08-14 Nexabit Networks, Llc Method of and system for processing datagram headers for high speed computer network interfaces at low clock speeds, utilizing scalable algorithms for performing such network header adaptation (SAPNA)
US6597664B1 (en) * 1999-08-19 2003-07-22 Massachusetts Institute Of Technology Digital circuit synthesis system
US7203718B1 (en) 1999-10-29 2007-04-10 Pentomics, Inc. Apparatus and method for angle rotation
US8095508B2 (en) 2000-04-07 2012-01-10 Washington University Intelligent data storage and processing using FPGA devices
US6988192B2 (en) 2002-02-11 2006-01-17 Hewlett-Packard Development Company, L.P. Method and apparatus for compiling source code to configure hardware
US7516446B2 (en) 2002-06-25 2009-04-07 International Business Machines Corporation Method and apparatus for efficient and precise datarace detection for multithreaded object-oriented programs
US7028281B1 (en) 2002-07-12 2006-04-11 Lattice Semiconductor Corporation FPGA with register-intensive architecture
US7305582B1 (en) 2002-08-30 2007-12-04 Availigent, Inc. Consistent asynchronous checkpointing of multithreaded application programs based on active replication
WO2004036463A1 (en) 2002-10-15 2004-04-29 Renesas Technology Corp. Compiler and logic circuit design method
AU2004290281A1 (en) 2003-05-23 2005-05-26 Washington University Intelligent data storage and processing using FPGA devices
US7805638B2 (en) 2003-06-18 2010-09-28 Nethra Imaging, Inc. Multi-frequency debug network for a multiprocessor array
US7111273B1 (en) 2003-07-03 2006-09-19 Xilinx, Inc. Softpal implementation and mapping technology for FPGAs with dedicated resources
US7711006B2 (en) 2003-08-15 2010-05-04 Napatech A/S Data merge unit, a method of producing an interleaved data stream, a network analyser and a method of analysing a network
KR100626368B1 (en) 2003-08-25 2006-09-20 삼성전자주식회사 Method of benchmarking garbage collection
US7844924B2 (en) 2003-11-19 2010-11-30 Kitakyushu Foundation For The Advancement Of Industry, Science And Technology Device for reducing the width of graph and a method to reduce the width of graph, and a device for logic synthesis and a method for logic synthesis
US7415681B2 (en) 2003-12-29 2008-08-19 Sicronic Remote Kg, Llc Optimal mapping of LUT based FPGA
DE102005005073B4 (en) 2004-02-13 2009-05-07 Siemens Ag Computer device with reconfigurable architecture for the parallel calculation of arbitrary algorithms
US7620917B2 (en) 2004-10-04 2009-11-17 Synopsys, Inc. Methods and apparatuses for automated circuit design
US7584449B2 (en) * 2004-11-22 2009-09-01 Fulcrum Microsystems, Inc. Logic synthesis of multi-level domino asynchronous pipelines
JP4390211B2 (en) 2004-11-30 2009-12-24 国立大学法人九州大学 Custom LSI development platform, instruction set architecture, logic circuit configuration information generation method, and program
US7386820B1 (en) * 2004-12-10 2008-06-10 Synopsys, Inc. Method and apparatus for formally checking equivalence using equivalence relationships
US7647567B1 (en) * 2005-01-31 2010-01-12 Bluespec, Inc. System and method for scheduling TRS rules
US7315991B1 (en) 2005-02-23 2008-01-01 Xilinx, Inc. Compiling HLL into massively pipelined systems
US7375550B1 (en) 2005-07-15 2008-05-20 Tabula, Inc. Configurable IC with packet switch configuration network
US8285972B2 (en) 2005-10-26 2012-10-09 Analog Devices, Inc. Lookup table addressing system and method
US7389479B2 (en) * 2005-12-20 2008-06-17 Synopsys, Inc. Formally proving the functional equivalence of pipelined designs containing memories
US7735050B2 (en) 2006-02-09 2010-06-08 Henry Yu Managing and controlling the use of hardware resources on integrated circuits
US8209580B1 (en) 2006-05-08 2012-06-26 Marvell International Ltd. Error correction coding for varying signal-to-noise ratio channels
US7496866B2 (en) * 2006-06-22 2009-02-24 International Business Machines Corporation Method for optimizing of pipeline structure placement
US20080005357A1 (en) 2006-06-30 2008-01-03 Microsoft Corporation Synchronizing dataflow computations, particularly in multi-processor setting
US7801299B2 (en) 2006-09-22 2010-09-21 Intel Corporation Techniques for merging tables
US7573407B2 (en) 2006-11-14 2009-08-11 Qualcomm Incorporated Memory efficient adaptive block coding
US7545293B2 (en) 2006-11-14 2009-06-09 Qualcomm Incorporated Memory efficient coding of variable length codes
TWI331278B (en) 2007-03-14 2010-10-01 Ind Tech Res Inst Debug method
JP4962564B2 (en) 2007-03-29 2012-06-27 富士通株式会社 Parallelization program generation method, parallelization program generation apparatus, and parallelization program generation program
US7908574B2 (en) 2007-05-09 2011-03-15 Synopsys, Inc. Techniques for use with automated circuit design and simulations
US7471104B1 (en) 2007-07-27 2008-12-30 Xilinx, Inc. Lookup table with relatively balanced delays
US7735047B1 (en) 2007-11-29 2010-06-08 Xilinx, Inc. Method for technology mapping considering boolean flexibility
US7823117B1 (en) 2007-12-21 2010-10-26 Xilinx, Inc. Separating a high-level programming language program into hardware and software components
US8468510B1 (en) * 2008-01-16 2013-06-18 Xilinx, Inc. Optimization of cache architecture generated from a high-level language description
JP2011511366A (en) 2008-02-01 2011-04-07 ジ・オリバー・グループ・リミテッド・ライアビリティ・カンパニー Data retrieval and indexing method and system for implementing the same
US8930926B2 (en) 2008-02-08 2015-01-06 Reservoir Labs, Inc. System, methods and apparatus for program optimization for multi-threaded processor architectures
US8347243B2 (en) 2008-05-15 2013-01-01 Universiteit Gent Parameterized configuration for a programmable logic device
US8392885B2 (en) 2008-12-19 2013-03-05 Microsoft Corporation Low privilege debugging pipeline
MX2012008075A (en) 2010-01-12 2013-12-16 Fraunhofer Ges Forschung Audio encoder, audio decoder, method for encoding and audio information, method for decoding an audio information and computer program using a modification of a number representation of a numeric previous context value.
CN104617944B (en) 2010-06-24 2018-03-16 太阳诱电株式会社 Semiconductor device
ES2937066T3 (en) 2010-07-20 2023-03-23 Fraunhofer Ges Forschung Audio decoder, method and computer program for audio decoding
US8812285B2 (en) * 2010-08-31 2014-08-19 The Regents Of The University Of California Designing digital processors using a flexibility metric
TWI420830B (en) 2010-12-31 2013-12-21 Ind Tech Res Inst Dynamic decoding lookup table generation method and electronic device applying the same
US20130054939A1 (en) * 2011-08-26 2013-02-28 Cognitive Electronics, Inc. Integrated circuit having a hard core and a soft core
US8607249B2 (en) 2011-09-22 2013-12-10 Oracle International Corporation System and method for efficient concurrent queue implementation
US8806410B2 (en) * 2011-10-28 2014-08-12 The Board Of Trustees Of The University Of Illinois Power balanced pipelines
US8752036B2 (en) * 2011-10-31 2014-06-10 Oracle International Corporation Throughput-aware software pipelining for highly multi-threaded systems
US8966457B2 (en) * 2011-11-15 2015-02-24 Global Supercomputing Corporation Method and system for converting a single-threaded software program into an application-specific supercomputer
US8631380B2 (en) 2011-11-28 2014-01-14 Maxeler Technologies, Ltd. Method of, and apparatus for, data path optimisation in parallel pipelined hardware
US8959469B2 (en) * 2012-02-09 2015-02-17 Altera Corporation Configuring a programmable device using high-level language
US9122523B2 (en) * 2012-05-03 2015-09-01 Nec Laboratories America, Inc. Automatic pipelining framework for heterogeneous parallel computing systems
US9047148B2 (en) 2012-06-15 2015-06-02 Lsi Corporation Pipelined vectoring-mode CORDIC
US9081583B2 (en) 2012-08-23 2015-07-14 National Instruments Corporation Compile time execution
US9484874B2 (en) 2012-11-16 2016-11-01 Nokia Solutions And Networks Oy Input amplitude modulated outphasing with an unmatched combiner
US8671371B1 (en) * 2012-11-21 2014-03-11 Maxeler Technologies Ltd. Systems and methods for configuration of control logic in parallel pipelined hardware
CN103023842B (en) 2012-11-26 2016-08-24 大唐移动通信设备有限公司 A kind of Multiband pre-distortion factor lookup table update method and system
US9779195B2 (en) 2012-12-04 2017-10-03 The Mathworks, Inc. Model-based retiming with functional equivalence constraints
US8924901B2 (en) 2013-02-15 2014-12-30 Synopsys, Inc. Look-up based fast logic synthesis
US8775986B1 (en) 2013-02-25 2014-07-08 Xilinx, Inc. Software debugging of synthesized hardware
US8881079B1 (en) * 2013-03-12 2014-11-04 Xilinx, Inc. Dataflow parameter estimation for a design
US9824756B2 (en) 2013-08-13 2017-11-21 Globalfoundries Inc. Mapping a lookup table to prefabricated TCAMS
US9916131B2 (en) 2013-10-02 2018-03-13 The Penn State Research Foundation Techniques and devices for performing arithmetic
US9251300B2 (en) * 2013-10-25 2016-02-02 Altera Corporation Methods and tools for designing integrated circuits with auto-pipelining capabilities
JP5842255B2 (en) * 2013-12-12 2016-01-13 国立大学法人東京工業大学 Apparatus and method for generating logic circuit from logic circuit description in programming language
US9158882B2 (en) 2013-12-19 2015-10-13 Netspeed Systems Automatic pipelining of NoC channels to meet timing and/or performance
US9471307B2 (en) 2014-01-03 2016-10-18 Nvidia Corporation System and processor that include an implementation of decoupled pipelines
US9690278B1 (en) * 2014-04-10 2017-06-27 Altera Corporation Method and apparatus for high-level programs with general control flow
US10090864B2 (en) 2014-09-22 2018-10-02 Samsung Display Co., Ltd. System and method for decoding variable length codes
EP3198556B1 (en) 2014-09-26 2018-05-16 Dolby Laboratories Licensing Corp. Encoding and decoding perceptually-quantized video content
US10254369B2 (en) 2014-10-29 2019-04-09 Heartvista, Inc. Pipeline engine for specifying, visualizing, and analyzing MRI image reconstructions
ES2966889T3 (en) 2014-11-21 2024-04-24 The Res Institute At Nationwide Childrens Hospital Parallel processing systems and methods for highly scalable analysis of biological sequence data
US9680459B2 (en) * 2014-12-11 2017-06-13 Intel Corporation Edge-aware synchronization of a data signal
US9727679B2 (en) 2014-12-20 2017-08-08 Intel Corporation System on chip configuration metadata
US9971858B1 (en) * 2015-02-20 2018-05-15 Altera Corporation Method and apparatus for performing register retiming in the presence of false path timing analysis exceptions
US9858373B2 (en) * 2015-07-15 2018-01-02 International Business Machines Corporation In-cycle resource sharing for high-level synthesis of microprocessors
US9846623B2 (en) 2015-08-20 2017-12-19 Qsigma, Inc. Simultaneous multi-processor apparatus applicable to acheiving exascale performance for algorithms and program systems
WO2017059043A1 (en) 2015-09-30 2017-04-06 Dolby Laboratories Licensing Corporation 2d lut color transforms with reduced memory footprint
US10698916B1 (en) 2015-10-16 2020-06-30 Dr Holdco 2, Inc. Data preparation context navigation
US10311558B2 (en) 2015-11-16 2019-06-04 Dolby Laboratories Licensing Corporation Efficient image processing on content-adaptive PQ signal domain
CN107112996B (en) 2015-11-20 2021-06-18 京微雅格(北京)科技有限公司 Lookup table process mapping method based on FPGA and lookup table
US9817747B2 (en) 2015-12-28 2017-11-14 Juniper Networks, Inc. Systems and methods for unit testing of functions on remote kernels
US10445271B2 (en) 2016-01-04 2019-10-15 Intel Corporation Multi-core communication acceleration using hardware queue device
US10359832B2 (en) 2016-03-22 2019-07-23 The Board Of Regents Of The University Of Texas System Method and apparatus for reducing power and cycle requirement for FFT of ECG signals
US10162918B1 (en) 2016-04-27 2018-12-25 Altera Corporation Integrated circuit retiming with selective modeling of flip-flop secondary signals
US10275219B2 (en) 2016-11-08 2019-04-30 Embry-Riddle Aeronautical University, Inc. Bit-serial multiplier for FPGA applications
US10248498B2 (en) 2016-11-21 2019-04-02 Futurewei Technologies, Inc. Cyclic redundancy check calculation for multiple blocks of a message
US10235272B2 (en) 2017-03-06 2019-03-19 Xilinx, Inc. Debugging system and method
US10579762B2 (en) 2017-05-15 2020-03-03 LegUp Computing Inc. High-level synthesis (HLS) method and apparatus to specify pipeline and spatial parallelism in computer hardware
US10521877B2 (en) 2017-05-23 2019-12-31 Samsung Electronics Co., Ltd Apparatus and method for speculative buffer reservations with cancellation mechanism
US11454188B2 (en) 2017-06-02 2022-09-27 The Mathworks, Inc. Systems and methods for rescaling executable simulation models
US10331836B1 (en) 2017-10-11 2019-06-25 Xilinx, Inc. Loop optimization for implementing circuit designs in hardware
US20190114548A1 (en) * 2017-10-17 2019-04-18 Xilinx, Inc. Static block scheduling in massively parallel software defined hardware systems
US11755382B2 (en) 2017-11-03 2023-09-12 Coherent Logix, Incorporated Programming flow for multi-processor system
US11307873B2 (en) 2018-04-03 2022-04-19 Intel Corporation Apparatus, methods, and systems for unstructured data flow in a configurable spatial accelerator with predicate propagation and merging
US10768916B2 (en) 2018-11-28 2020-09-08 Red Hat, Inc. Dynamic generation of CPU instructions and use of the CPU instructions in generated code for a softcore processor
US10810343B2 (en) 2019-01-14 2020-10-20 Microsoft Technology Licensing, Llc Mapping software constructs to synchronous digital circuits that do not deadlock
US11106437B2 (en) 2019-01-14 2021-08-31 Microsoft Technology Licensing, Llc Lookup table optimization for programming languages that target synchronous digital circuits
US11144286B2 (en) 2019-01-14 2021-10-12 Microsoft Technology Licensing, Llc Generating synchronous digital circuits from source code constructs that map to circuit implementations
US11113176B2 (en) 2019-01-14 2021-09-07 Microsoft Technology Licensing, Llc Generating a debugging network for a synchronous digital circuit during compilation of program source code
US11275568B2 (en) 2019-01-14 2022-03-15 Microsoft Technology Licensing, Llc Generating a synchronous digital circuit from a source code construct defining a function call

Similar Documents

Publication Publication Date Title
JP2928695B2 (en) Multi-thread microprocessor using static interleave and instruction thread execution method in system including the same
US7793079B2 (en) Method and system for expanding a conditional instruction into a unconditional instruction and a select instruction
US6237077B1 (en) Instruction template for efficient processing clustered branch instructions
JP2004302706A (en) Program parallelization device, program parallelization method, and program parallelization program
JP4841861B2 (en) Arithmetic processing device and execution method of data transfer processing
US20170090922A1 (en) Efficient Instruction Pair for Central Processing Unit (CPU) Instruction Design
JP2003029986A (en) Inter-processor register succeeding method and device therefor
US7200738B2 (en) Reducing data hazards in pipelined processors to provide high processor utilization
US11366669B2 (en) Apparatus for preventing rescheduling of a paused thread based on instruction classification
US9690590B2 (en) Flexible instruction execution in a processor pipeline
JP2014216021A (en) Processor for batch thread processing, code generation apparatus and batch thread processing method
US20200319893A1 (en) Booting Tiles of Processing Units
RU2375768C2 (en) Processor and method of indirect reading and recording into register
TWI613589B (en) Flexible instruction execution in a processor pipeline
JPWO2020150013A5 (en)
US11526432B2 (en) Parallel processing device
KR100837400B1 (en) Method and apparatus for processing according to multi-threading/out-of-order merged scheme
WO2016201699A1 (en) Instruction processing method and device
WO2022053152A1 (en) Method of interleaved processing on a general-purpose computing core
JP5238876B2 (en) Information processing apparatus and information processing method
JP2001051845A (en) Out-of-order execution system
RU2021124047A (en) LANGUAGE AND COMPILER WHICH FORM SYNCHRONOUS DIGITAL CIRCUITS WHICH MAINTAIN THE ORDER OF EXECUTION OF STREAMS
US10884738B2 (en) Arithmetic processing device and method of controlling arithmetic processing device
CN114116229B (en) Method and apparatus for adjusting instruction pipeline, memory and storage medium
US20230071941A1 (en) Parallel processing device