TW202331575A - Using AND/OR reduce carry chains on programmable hardware

Info

Publication number
TW202331575A
Authority
TW
Taiwan
Prior art keywords
logic
bit
carry
input
adder
Prior art date
Application number
TW111144744A
Other languages
Chinese (zh)
Inventor
史坎特 賀凱
艾倫麥克 蘭迪
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 17/740,831 (published as US20230214180A1)
Application filed by Microsoft Technology Licensing, LLC
Publication of TW202331575A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting

Abstract

The present disclosure relates to a carry chain logic system that leverages carry-in and carry-out signals from logic blocks to implement logic functions on programmable hardware (e.g., FPGA hardware). In particular, implementations of the carry chain logic system facilitate implementation of logic gates (e.g., AND/OR gates) having a high number of input signals without incurring the routing delays caused by routing output signals between logic components implemented across different logic stages. For example, implementations described herein involve feeding carry-out signals between adders of a logic chain across multiple logic components on a common logic stage, thus reducing the penalties incurred by routing signals via a routing fabric of the programmable hardware.

Description

Using AND/OR reduce carry chains on programmable hardware

This disclosure relates to using AND/OR reduce carry chains on programmable hardware.

This application claims the benefit of and priority to U.S. Provisional Application No. 63/295,323, filed December 30, 2021, which is hereby incorporated by reference in its entirety.

Recent years have seen growth in the use of programmable hardware to perform various computing tasks. Indeed, many computing applications now commonly use programmable arrays of blocks to perform various tasks. These programmable blocks of memory elements provide a useful alternative to application-specific integrated circuits, which have a more specialized or specific set of tasks. For example, field programmable gate arrays (FPGAs) provide programmable blocks that can be programmed independently and offer significant flexibility in performing various tasks.

As programmable hardware has increased in size and complexity, implementing hardware configurations that provide fast and efficient processing has become challenging. For example, as logic functions are configured to accept an ever-increasing number of input signals, conventional hardware units (e.g., logic modules) often cannot generate outputs without incurring some amount of delay. In many cases, these delays are unacceptable and potentially cause other logic modules to produce incorrect outputs. Furthermore, conventional approaches to fixing these delays are often complex and difficult to implement on a given programmable hardware unit.

These and other problems exist with regard to implementing logic functions on programmable hardware units such as FPGAs.

In a first aspect, a method implemented on a carry chain of logic modules is disclosed. The method includes: receiving an input vector including a plurality of input bits; receiving a bit vector including a primer bit and a plurality of vector bits; providing the primer bit and a first input bit of the plurality of input bits to a first adder in the carry chain; generating a first carry-out bit at the first adder based on the primer bit and the first input bit; providing each vector bit of the plurality of vector bits and an associated input bit of the plurality of input bits as inputs to additional adders in the carry chain; and providing the carry-out bit from each adder to the next adder in the carry chain to generate an output based on a last carry-out bit from a last adder in the carry chain.

In a second aspect, a carry chain logic function is disclosed. The carry chain logic function includes a first logic module including a first adder and a second adder, the first logic module being configured to: receive, at the first adder, a first input and a primer bit of a bit vector to generate a first carry-out signal to feed to the second adder; and receive, at the second adder, the first carry-out signal as a first carry-in signal, a second input, and an associated second vector bit of the bit vector to generate a second carry-out signal. The carry chain logic function further includes a second logic module including a third adder and a fourth adder, the second logic module being configured to: receive, at the third adder, the second carry-out signal as a second carry-in signal, a third input, and an associated third vector bit of the bit vector to generate a third carry-out signal; and receive, at the fourth adder, the third carry-out signal as a third carry-in signal, a fourth input, and an associated third vector bit of the bit vector to generate a fourth carry-out signal.

In a third aspect, a programmable hardware device is disclosed. The programmable hardware device includes a plurality of logic modules programmed in a logic chain that includes a plurality of adders, the plurality of logic modules being configured to: receive an input bit at a first adder of the plurality of adders, the first adder being located in a first logic module of the plurality of logic modules; receive a vector bit at the first adder; generate a carry-out bit at the first adder using the input bit and the vector bit; and provide the carry-out bit to a next adder of the plurality of adders, the next adder receiving the carry-out bit as a carry-in bit and being located in a second logic module of the plurality of logic modules.

The present disclosure relates to features and functionality of a carry chain logic system that utilizes carry-in and carry-out signals from logic blocks to implement AND/OR logic functions on programmable hardware (e.g., FPGA hardware). In particular, embodiments of the carry chain logic system described herein enable large (e.g., many-input) AND/OR logic gates to be implemented within the framework of additional logic functions without incurring the significant delays caused by routing inputs between multiple series of logic modules (e.g., multiple logic levels).

As an illustrative example, one or more embodiments of the carry chain logic system may involve a method or series of acts implemented on a carry chain of logic modules implemented on programmable hardware. According to one or more embodiments described herein, the carry chain logic system may receive an input vector including a plurality of input bits. The carry chain logic system may further receive a bit vector, separate from the input vector, that includes a plurality of vector bits. The carry chain logic system may provide an input bit as an input to a first adder in the carry chain of logic modules. In some embodiments, the bit vector includes a primer bit (e.g., the least significant bit (LSB) of the bit vector) that serves as a first, or primer, input to the first adder to start the logic chain. The carry chain logic system may then provide each vector bit and an associated bit from the plurality of input bits as inputs to additional adders in the carry chain. In one or more embodiments described herein, a carry-out signal from an adder may be provided as a carry-in signal to a subsequent adder in the carry chain, so that an output is generated based on the carry-out signal from the last adder in the carry chain.
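
The method described above can be sketched in software. The following is a minimal behavioral model, not the hardware implementation itself. It assumes that each adder is a one-bit full adder whose carry-out is 1 when at least two of its three inputs are 1, and that the first adder receives a carry-in of 0. The names `carry_out`, `carry_chain_reduce`, `primer_bit`, and `vector_bits` are hypothetical.

```python
from typing import Sequence


def carry_out(a: int, b: int, carry_in: int) -> int:
    """Carry-out of a one-bit full adder: 1 when at least two inputs are 1."""
    return (a & b) | (a & carry_in) | (b & carry_in)


def carry_chain_reduce(input_bits: Sequence[int], primer_bit: int,
                       vector_bits: Sequence[int]) -> int:
    """Ripple a carry through one adder per input bit; return the last carry-out.

    Assumption: the first adder sees the primer bit, the first input bit,
    and a carry-in of 0. Every later adder sees its vector bit, its input
    bit, and the previous adder's carry-out.
    """
    if len(vector_bits) != len(input_bits) - 1:
        raise ValueError("need one vector bit per input bit after the first")
    carry = carry_out(input_bits[0], primer_bit, 0)
    for in_bit, vec_bit in zip(input_bits[1:], vector_bits):
        carry = carry_out(in_bit, vec_bit, carry)
    return carry


inputs = [1, 1, 0, 1]
# AND reduction: primer bit 1, remaining vector bits 0.
print(carry_chain_reduce(inputs, 1, [0, 0, 0]))  # 0, because one input is 0
# OR reduction: primer bit 1, remaining vector bits 1.
print(carry_chain_reduce(inputs, 1, [1, 1, 1]))  # 1, because some input is 1
```

In this model only the last carry-out is used as the result, mirroring the description that the output is generated from the last adder in the chain.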

As another example, the carry chain logic system may include a carry chain logic function implementable on programmable hardware. According to one or more embodiments described herein, the carry chain logic system may include a first logic module having a first adder and a second adder, a second logic module having a third adder and a fourth adder, and any number of additional logic modules having additional adder components. In this example, the first logic module may be configured to receive a first input and a primer bit of a bit vector at the first adder to generate a carry-out signal that is provided to the second adder. The first logic module may be further configured to receive, at the second adder, the carry-out signal from the first adder and to combine it with a second input and an associated bit value from the bit vector to generate another carry-out signal. Carry-out signals may be fed, together with additional inputs and additional bit values of the bit vector, to additional adders on other logic modules to implement a logic function in accordance with one or more embodiments described herein.

The present disclosure includes a number of practical applications that provide benefits and/or solve problems associated with configuring logic functions on programmable hardware. Examples of some of these benefits are discussed in more detail below.

For example, features of the carry chain logic system enable large-input logic gates without the propagation delays incurred by multiple logic levels coupled together in series on programmable hardware. Conventional logic module configurations often involve logic modules that receive inputs from, and feed outputs to, different logic levels in a common hardware configuration. These additional logic levels result in longer propagation delays that require additional sequencing buffers or lower clock frequencies.

The carry chain logic system additionally provides features that enable logic modules to be combined on the same logic level to build AND/OR logic with a higher number of inputs than any individual logic module supports. For example, a logic module may include hardware that limits that logic module to receiving and processing six inputs. Despite this limitation, the carry chain logic system utilizes carry-in and carry-out signals to increase the number of inputs that a given logic gate can consider without routing input and output signals between logic levels.

As a more specific example, where a lookup table (LUT) in a conventional logic module can support only a fixed number of inputs (e.g., six inputs), any operation such as an AND/OR reduction over more than six inputs may require multiple logic levels if implemented in LUTs alone. Accordingly, implementing an operation that requires more inputs than the fixed number a LUT is preconfigured to receive may involve routing signals between logic levels, incurring delays from routing those signals through the routing fabric of the programmable logic.
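
To make the scaling concrete, the number of LUT levels needed for a wide reduction can be estimated. The sketch below assumes a simple tree in which every LUT reduces up to `k` signals to one. The function name `lut_levels` and the tree model are illustrative assumptions and are not taken from the disclosure.

```python
def lut_levels(num_inputs: int, k: int = 6) -> int:
    """Logic levels in a tree of k-input LUTs that reduces num_inputs signals."""
    if num_inputs < 1 or k < 2:
        raise ValueError("need at least one input and LUTs with two or more inputs")
    levels = 0
    signals = num_inputs
    while signals > 1:
        signals = -(-signals // k)  # ceiling division: LUTs needed at this level
        levels += 1
    return levels


for n in (6, 7, 36, 37, 200):
    print(n, lut_levels(n))
```

Under this model a seven-input reduction already needs two levels of six-input LUTs, and each extra level implies another trip through the routing fabric.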

In one or more embodiments, the carry chain logic system improves upon this limitation of conventional systems by utilizing carry-in and carry-out signals as inputs to additional logic functions (e.g., on the same logic level as the logic modules associated with the respective carry-in and carry-out functions). Indeed, as will be discussed in further detail herein, the carry chain logic system can reduce reliance on input and output signals routed through the routing fabric of the programmable logic hardware. Because routing inputs between logic levels can take hundreds of picoseconds or even nanoseconds, this delay can cause logic functions to fail or require complex sequencing-buffer configurations and/or lower clock frequencies. In contrast, by utilizing carry-in and carry-out signals as inputs (e.g., carry-in signals) to logic modules on the same logic level, this delay can be significantly reduced to a much lower value of only a few (e.g., 10) picoseconds.

The features described herein can be implemented in a number of ways to provide additional benefits and optimizations within a programmable hardware system. For example, in one or more embodiments described herein, the carry chain logic system may treat the inputs to be reduced as a bit vector. An OR reduction can then be implemented by adding the input bit vector to a bit vector of all 1s: the addition produces a carry-out of 1 whenever any input bit is 1. Similarly, an AND reduction can be implemented by adding the input bit vector to a bit vector of 0s with a primer input of 1 at the least significant bit (LSB): a 1 does not appear on the carry-out signal unless every input bit is 1. Because the carry chain generates signals faster than LUTs, this approach enables performant reduction of a large number of inputs using the carry chain rather than LUTs. This example and additional variations are discussed below in connection with FIGS. 3-5.
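
The arithmetic behind both reductions can be checked with ordinary integer addition. The sketch below treats an `n`-bit input vector as an integer and reads the carry out of bit `n - 1`. The helper names `or_reduce` and `and_reduce` are hypothetical, and the primer input is modeled here as a 1 added at the LSB.

```python
def or_reduce(x: int, n: int) -> int:
    """OR of the n low bits of x: carry-out of x + (all-ones vector)."""
    all_ones = (1 << n) - 1
    return ((x & all_ones) + all_ones) >> n


def and_reduce(x: int, n: int) -> int:
    """AND of the n low bits of x: carry-out of x + 0 + primer 1 at the LSB."""
    all_ones = (1 << n) - 1
    return ((x & all_ones) + 0 + 1) >> n


n = 8
# Exhaustively compare against the direct definitions for every 8-bit input.
for x in range(1 << n):
    assert or_reduce(x, n) == int(x != 0)
    assert and_reduce(x, n) == int(x == (1 << n) - 1)
print("carry-out matches OR/AND reduction for all", 1 << n, "inputs")
```

The check works because adding the all-ones vector overflows into bit `n` for any nonzero input, while adding 1 overflows only when every input bit is already 1.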

As illustrated in the foregoing discussion, the present disclosure utilizes a variety of terms to describe features and advantages of the carry chain logic system. Additional detail is now provided regarding the meaning of some of these terms.

As used herein, a "logic module" may refer to any discrete component of hardware capable of receiving a plurality of inputs and generating an output based on a logic function implemented thereon. In one or more embodiments, a logic module may include multiple components, such as LUTs, adders, registers, multiplexers, and routing components. In one or more embodiments described herein, a logic module refers to a configuration, available from any of a number of manufacturers, that allows N-input (e.g., 4-input, 6-input, 8-input) logic functions and that can be implemented in combination with additional logic modules on a programmable hardware device. In one or more embodiments, a programmable hardware device (e.g., an FPGA device) has hundreds, thousands, or millions of logic modules implemented thereon that are programmable to implement a wide variety of functions.

As used herein, a "logic level" may refer to a grouping of one or more logic modules that is separated from another grouping of one or more logic modules by the routing fabric between modules. For example, in one or more embodiments described herein, a logic level may refer to a carry chain of logic modules in which carry signals are fed as carry-in signals between the logic modules. The logic modules of a logic level may provide output signals to additional logic levels either directly or via registers capable of capturing data from one clock cycle to the next. In one or more embodiments described herein, a logic chain of logic modules may refer to logic modules within a common logic level. In contrast, a series of logic modules may refer to logic modules that feed signals to one another across different logic levels (e.g., via the routing fabric).

As used herein, a "bit vector" may be a predetermined vector or set of bit values that can be used as inputs to one or more logic modules. A bit vector is made up of a plurality of vector bits. The bit values of the vector bits may be based on the type of logic function that the carry chain logic system is performing. For example, and as will be discussed in further detail herein, a carry chain logic system configured to perform an OR reduction may include a bit vector whose vector bits have a bit value of one. In some examples, a carry chain logic system configured to perform an AND reduction may include a bit vector whose vector bits have a bit value of zero, except for a first vector bit having a bit value of one.

In one or more implementations described herein, the first vector bit is referred to as a primer bit. As used herein, a primer bit may be an initial bit that is provided to the first logic module of the carry chain logic system. The primer bit may have a bit value based on the configuration of the carry chain logic system. For example, for an AND or OR reduction, the primer bit may have a bit value of one to allow the first logic module to perform the AND or OR reduction. In some embodiments, the primer bit may be included in the bit vector as a vector bit.
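
The choice of bit vector for each reduction can be written down as a small helper. This is a sketch under assumptions: the name `make_bit_vector` is hypothetical, and the layout (primer bit first, followed by one vector bit per remaining input) is one reading of the definitions above.

```python
from typing import List


def make_bit_vector(operation: str, num_inputs: int) -> List[int]:
    """Return [primer_bit, vector_bit_1, ..., vector_bit_(n-1)] for a reduction.

    Assumed layout: the primer bit is 1 for both reductions; the remaining
    vector bits are 1 for an OR reduction and 0 for an AND reduction.
    """
    if num_inputs < 1:
        raise ValueError("need at least one input bit")
    fill = {"or": 1, "and": 0}[operation.lower()]
    return [1] + [fill] * (num_inputs - 1)


print(make_bit_vector("or", 4))   # [1, 1, 1, 1]
print(make_bit_vector("and", 4))  # [1, 0, 0, 0]
```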

As used herein, an "input vector" may include a plurality of input bits for the carry chain logic system. Each logic module may receive input bits from the input vector. As discussed in further detail herein, the input bits of the input vector may be received from other functions in the programmable hardware. For example, an input bit may be the output of a logic function on a different logic level than the carry chain logic system. In some examples, the input bits may be outputs from other logic functions, combinations, subtractions, or any other functions performed on the programmable hardware. In some embodiments, the input vector has a given quantity of input bits, and the number of logic modules in the carry chain logic system may equal that quantity.

Additional detail will now be discussed regarding the carry chain logic system in relation to illustrative figures depicting example embodiments. For example, FIG. 1A illustrates an example environment showing a programmable hardware device 100 on which a carry chain logic system 102 is implemented. As shown in FIG. 1A, the carry chain logic system 102 may include multiple logic chains 104, each of which may independently include any number of logic modules 106 chained together in accordance with one or more examples discussed herein. In one or more embodiments, the programmable hardware device refers to an FPGA device having any number of logic modules 106 implemented thereon.

FIG. 1B illustrates an example implementation of a logic chain 104 on the carry chain logic system 102 shown in FIG. 1A. The logic chain 104 may refer to any of the logic chains 104 on the carry chain logic system 102 implemented on the programmable hardware device 100 shown in FIG. 1A.

As shown in FIG. 1B, the example logic chain 104 may include any number of logic modules (collectively 106) configured to receive inputs (collectively 110) (e.g., multiple input signals) and generate outputs (collectively 112) (e.g., one or more outputs). As further shown, the logic modules 106 may be configured to receive and provide carry signals (collectively 108) between respective logic modules 106.

As an example, and as shown in FIG. 1B, a first logic module 106-1 may receive a first input 110-1 and a first carry signal 108-1 and generate a first output 112-1 and a second carry signal 108-2. The first input 110-1 and/or the first carry signal 108-1 may be received from any source, such as an input vector or another logic module 106 (e.g., from a different logic stage). In some embodiments, the first output 112-1 may also be provided to another logic module. As will be discussed in further detail below, the logic modules 106 may additionally receive as input a bit vector having a first bit, referred to as the primer bit, and additional bits that cause the logic modules 106 to emulate a particular logic gate.

As further shown, the logic modules 106 may receive and provide carry signals 108 between the logic modules 106 of the logic chain 104. For example, each logic module 106 may receive a carry-in signal and provide a carry-out signal. As shown in FIG. 1B, a carry signal 108 may refer to both the carry-out signal of one logic module 106 and the carry-in signal of another logic module 106. For example, the second carry signal 108-2 may be the carry-out signal of the first logic module 106-1 and the carry-in signal of the second logic module 106-2. In one or more embodiments described herein, a carry-in signal is provided as an input to a logic module 106 within the same or a different logic chain 104.

As shown in FIG. 1B, while each of the carry signals 108 may be fed as a carry-in to another logic module 106, one or more embodiments may involve a carry signal being used as an input to another logic function. For example, in one or more embodiments described herein, a carry signal 108 may be used as an AND/OR reduction function output 114 to extend the number of inputs of a logic gate implemented by the logic chain. In one or more embodiments, the carry signals 108 may be fed together as inputs to another logic function. Additional examples are discussed in connection with FIGS. 3-5.

As discussed herein, the logic chain 104 may have any length, with carry signals 108 propagating through any number of logic modules 106. For example, the logic chain 104 shown in FIG. 1B has a first logic module 106-1 that receives a first carry signal 108-1 and a first input 110-1. The first input 110-1 may include a first vector bit from the bit vector and a first input bit from the input vector. Thus, the first logic module 106-1 may receive three inputs. The first logic module 106-1 may generate a first output 112-1 and a second carry signal 108-2 (e.g., the carry-out signal of the first logic module 106-1). In some embodiments, the second carry signal 108-2 may be the LSB of the result of the first logic module 106-1, and the first output 112-1 may be the MSB of the result.

The second logic module 106-2 may receive the second carry signal 108-2 (e.g., as a carry-in signal) and a second input 110-2 as inputs. The second input 110-2 may include a second vector bit from the bit vector and a second input bit from the input vector. Thus, the second logic module 106-2 may also receive three inputs. The second logic module 106-2 may generate a second output 112-2 and a further carry signal 108. The logic chain 104 may continue indefinitely in this manner, producing n carry signals 108-n: an nth logic module 106-n may receive an nth carry signal 108-n and an nth input 110-n and generate an nth output 112-n. This may allow a long logic chain 104 to process a large number of inputs on the same logic level with reduced processing delay.

FIG. 2 illustrates a more detailed implementation of an example logic module in accordance with one or more embodiments. As shown in FIG. 2, a logic module may include a variety of components that can be configured to provide a logic gate for a predetermined number of inputs. In the example shown in FIG. 2, the logic module can be configured to process up to eight inputs.

As shown in FIG. 2, the logic module 206 may include a lookup table (LUT) 216 that is programmable to provide any of a variety of Boolean functions for N inputs, up to the limit of the logic module. As shown in FIG. 2, the logic module 206 may include two logic portions (collectively 218), including a first logic portion 218-1 and a second logic portion 218-2. Each logic portion 218 may utilize the LUT 216 to process a four-input LUT function. In this way, two four-input LUT functions may process respective groupings of inputs and produce two outputs.

As shown in FIG. 2, outputs may be fed from the LUT 216 to two separate logic components. In the illustrated embodiment, these components are adders (collectively 220) that can be used to add two bits. Each adder 220 may be a one-bit adder configured to process three one-bit inputs: a carry-in bit 222, an input bit 224, and a vector bit 226. The adder 220 may process the carry-in bit 222, the input bit 224, and the vector bit 226 to generate an output (collectively 212) and a carry-out bit (collectively 228). The output 212 may be passed to a register 230, from which it may be routed to a different logic level or to other portions of the programmable hardware device. The carry-out bit 228 may be used as the carry-in bit 222 of another adder in the logic chain.
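
The behavior of a single one-bit adder can be modeled in a few lines. This is a behavioral sketch only; the function name `full_adder` and the argument order are assumptions made for illustration. It also shows why holding the vector bit constant turns the carry path into a gate.

```python
from typing import Tuple


def full_adder(carry_in: int, input_bit: int, vector_bit: int) -> Tuple[int, int]:
    """One-bit full adder: returns (sum_bit, carry_out) for three one-bit inputs."""
    total = carry_in + input_bit + vector_bit
    return total & 1, total >> 1


# With the vector bit held at 0 the carry-out is (input AND carry-in);
# with the vector bit held at 1 it is (input OR carry-in).
for carry_in in (0, 1):
    for input_bit in (0, 1):
        assert full_adder(carry_in, input_bit, 0)[1] == (input_bit & carry_in)
        assert full_adder(carry_in, input_bit, 1)[1] == (input_bit | carry_in)
```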

In the illustrated embodiment, the logic module 206 may receive six LUT inputs 215 at the LUT 216. In the first logic portion 218-1, the LUT 216 may output a first input 224-1 to a first adder 220-1 and provide a first vector bit 226-1 to the first adder 220-1. The first adder 220-1 may also receive a first carry-in bit 222-1. The first adder 220-1 may process the first carry-in bit 222-1, the first input 224-1, and the first vector bit 226-1 to generate a first output 212-1 and a first carry-out bit 228-1.

In the second logic portion 218-2, the LUT 216 may generate a second input 224-2. The LUT 216 may provide the second input 224-2 and a second vector bit 226-2 to a second adder 220-2. The second adder 220-2 may receive the first carry-out bit 228-1 as a second carry-in bit 222-2. The second adder 220-2 may process the second carry-in bit 222-2, the second input 224-2, and the second vector bit 226-2 to generate a second output 212-2 and a second carry-out bit 228-2. The second carry-out bit 228-2 may be used as the carry-in bit 222 at another logic module 206 on the same logic level.
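
The two-adder arrangement can be sketched as a function that chains two carry computations, and two such modules can then be linked through their carry signals. The names `logic_module` and `carry_out` are hypothetical, the LUT is omitted, and the first carry-in is assumed to be 0.

```python
from typing import Tuple


def carry_out(a: int, b: int, carry_in: int) -> int:
    """Carry-out of a one-bit full adder."""
    return (a + b + carry_in) >> 1


def logic_module(carry_in: int, inputs: Tuple[int, int],
                 vector_bits: Tuple[int, int]) -> Tuple[int, int]:
    """Two chained adders; returns (first_carry_out, second_carry_out)."""
    first = carry_out(inputs[0], vector_bits[0], carry_in)
    second = carry_out(inputs[1], vector_bits[1], first)
    return first, second


# Four-input AND reduction across two modules on the same carry chain.
_, c2 = logic_module(0, (1, 1), (1, 0))   # primer bit 1, then vector bit 0
_, c4 = logic_module(c2, (1, 1), (0, 0))  # vector bits 0 for the AND reduction
print(c4)  # 1, because all four inputs are 1
```

Only the carry signal crosses the module boundary in this sketch, which is the property the description relies on to avoid the routing fabric.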

In one or more embodiments described herein, the logic module 206 may be configured as an AND gate and an OR gate and configured to add the inputs from the respective logic portions of the logic module.

As shown in FIG. 2, the logic module 206 may include a plurality of registers 230. As mentioned above, the registers 230 may capture data from one clock cycle to the next. In one or more embodiments, the logic module 206 may be configured to use the registers 230 to achieve a higher frequency than would be possible if the adders 220 and/or the registers 230 were bypassed.

As will be discussed in connection with several examples below, a logic module 206 may be combined with one or more additional logic modules 206 to create a logic gate with a greater number of inputs than is available on a single logic module 206. For example, a logic module 206 may be combined with a plurality of additional logic modules 206 to provide a large AND gate with a significantly higher number of inputs (e.g., thirty, fifty, or hundreds of inputs) than is available in conventional configurations.

In one or more embodiments described herein, the logic modules 206 may implement larger logic gates by implementing a logic chain having a plurality of logic modules 206 configured to receive a bit vector, an input vector, and carry signals. In particular, the bit vector may include a vector of 1s and/or 0s in addition to a primer bit that is fed to the first logic module 206 of the plurality of logic modules 206 in the logic chain. The bit vector may be fed as an input to the logic modules 206 of the logic chain together with the input vector of input bits to which the logic gate is to be applied.

In addition to the bit vector and the input vector provided as inputs, the logic modules 206 may use the carry inputs fed to the adders to create a ripple adder. Based on the combination of the associated bit vector value, the input value, and the carry signal provided to an adder, the adder produces a two-bit value having a most significant bit (MSB) and a least significant bit (LSB). In one or more embodiments described herein, the MSB becomes the carry-out bit of the adder or logic module, and the LSB becomes the output of the corresponding adder or logic module.
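The two-bit result described above can be illustrated with a small software model. This is only a sketch: the function name `full_adder` and its argument names are illustrative and do not come from the disclosure, and the model treats each adder as an ordinary one-bit full adder.

```python
from typing import Tuple


def full_adder(carry_in: int, input_bit: int, vector_bit: int) -> Tuple[int, int]:
    """Add three one-bit values and return (carry_out, output)."""
    total = carry_in + input_bit + vector_bit  # an integer from 0 to 3
    carry_out = total >> 1  # MSB of the two-bit result
    output = total & 1      # LSB of the two-bit result
    return carry_out, output


# Print the full truth table of the one-bit adder.
for carry_in in (0, 1):
    for input_bit in (0, 1):
        for vector_bit in (0, 1):
            print(carry_in, input_bit, vector_bit,
                  full_adder(carry_in, input_bit, vector_bit))
```

In this model the MSB is 1 whenever at least two of the three inputs are 1, which is the property the reduction chains rely on.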

It will be appreciated that the MSB and/or the LSB may be used in different ways depending on the specific logic gate that the logic modules are programmed to provide. In addition, the specific values of the bit vector, which includes a set of 0s and/or 1s together with the primer bit, may be determined based on the logic gate that the logic modules have been programmed to provide.

Additional details will now be discussed in connection with several example implementations. For example, FIG. 3 illustrates a first example OR reduction logic chain and a first example AND reduction logic chain.

FIG. 3-1 is a representation of a bit vector 332 including a plurality of vector bits 326 and an input vector 334 including a plurality of input bits 324. As an illustrative example, the bit vector 332 includes four vector bits 326, namely b1, b2, b3, and b4, where b1 is referred to herein as the primer bit (Pr). The primer bit Pr may be the initial bit used in the first logic module of the logic chain. The input vector 334 includes four input bits, namely a1, a2, a3, and a4.

FIG. 3-2 is a representation of a general logic chain 304 including four adders (collectively 320), in accordance with at least one embodiment of the present disclosure. The first adder 320-1 receives the primer bit Pr and the first input bit a1 from the input vector 334. The first adder 320-1 outputs a first carry signal C1. The second adder 320-2 receives as inputs the first carry signal C1, the second vector bit b2 from the bit vector 332, and the second input bit a2 from the input vector 334. The second adder 320-2 may generate a second carry signal C2. In one or more embodiments, the first adder 320-1 and the second adder 320-2 may be part of a first logic module.

The third adder 320-3 may receive as inputs the second carry signal C2, the third vector bit b3 of the bit vector 332, and the third input bit a3 from the input vector 334. The third adder 320-3 may generate a third carry signal C3. The fourth adder 320-4 may receive as inputs the third carry signal C3, the fourth vector bit b4 of the bit vector 332, and the fourth input bit a4 of the input vector 334. The fourth adder 320-4 may generate a fourth carry signal C4. The third adder 320-3 and the fourth adder 320-4 may be part of a second logic module. Thus, the implementation shown in FIG. 3-2 may represent two logic modules, each including two adders. Other implementations may include different groupings of adders and modules, depending on the particular hardware and specifications of the logic modules.

As shown in FIG. 3-2, a reduce output 336 may be provided as the output of the logic chain 304. The reduce output 336 may receive the fourth carry signal C4 and perform a reduce function on the fourth carry signal C4 and/or any other output of the logic chain 304. As discussed herein, because each of the adders 320 is on the same logic level, the reduce output 336 may be generated from the fourth carry signal C4 with less processing delay. It will be appreciated that the logic chain 304 may include more or fewer than four adders 320.

As shown in FIG. 3-3, the OR reduction logic chain 304 includes four adders 320, which may be implemented within logic modules according to one or more embodiments described herein. For example, the first adder 320-1 and/or the second adder 320-2 may be implemented within a logic module similar to the one discussed above in connection with FIG. 2. The third adder 320-3 and/or the fourth adder 320-4 may be implemented within another logic module having features and functionality similar to those discussed above in connection with FIG. 2.

As shown in FIG. 3-3, an input vector 334 having values a1, a2, a3, and a4 may be provided as an input to the logic modules. For example, a1 and a2 may be provided as inputs to the first logic module, and a3 and a4 may be provided as inputs to the second logic module. In conjunction with the input vector 334, a predefined bit vector 332 may be provided as an input to the logic modules. As shown in FIG. 3-3, individual bits of the bit vector 332 may be provided to each of the adders together with the corresponding bit values of the input vector 334. By way of example, and as shown in FIG. 3, the first bit of the bit vector 332 may be provided together with a1 as an input to the first adder 320-1 (e.g., on the first logic module), and the second bit of the bit vector 332 may be provided together with a2 as an input to the second adder 320-2 (e.g., on the first logic module). In addition, the third bit of the bit vector 332 may be provided together with a3 as an input to the third adder 320-3 (e.g., on the second logic module), and the fourth bit of the bit vector 332 may be provided together with a4 as an input to the fourth adder 320-4 (e.g., on the second logic module). Other configurations may include any number of input bits and corresponding bit vector bits provided to additional adders on additional logic modules.

As mentioned above, the specific values of the bit vector 332 may be determined based on the specific logic gate to be implemented by the logic modules of the logic chain 304. For example, in the illustrated configuration, the bit vector 332 may include a first primer input (e.g., a "1" bit value) provided to the first adder, together with additional "1" inputs that are provided to the additional adders of the OR reduction logic chain.

In the example shown in FIG. 3-3, each adder 320 produces a two-bit output. If exactly one of the two input bits (i.e., the input bit and the corresponding vector bit) is true, the adder 320 propagates the incoming carry. If both are true, the adder 320 generates an outgoing carry. In one or more embodiments, this configuration includes a ripple-carry adder. In one or more embodiments, the LSB may be ignored and the MSB may be provided as the carry-in signal to the next adder in the OR reduction logic chain. In one or more embodiments, the LSB is provided as an output to one or more additional logic functions on the programmable hardware device.
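The OR reduction behavior can be checked with a short software model. The sketch below is illustrative only: it assumes each adder behaves as a one-bit full adder, that the first adder has no incoming carry (carry-in of 0), and that every vector bit, including the primer bit, is 1. The names `carry_out` and `or_reduce` are not from the disclosure.

```python
from itertools import product
from typing import Sequence


def carry_out(carry_in: int, input_bit: int, vector_bit: int) -> int:
    """MSB of a one-bit full adder: 1 when at least two inputs are 1."""
    return (carry_in + input_bit + vector_bit) >> 1


def or_reduce(input_bits: Sequence[int]) -> int:
    """OR-reduce the input bits along a modeled carry chain."""
    vector_bits = [1] * len(input_bits)  # primer bit plus additional 1s
    carry = 0  # assumption: no carry enters the first adder
    for input_bit, vector_bit in zip(input_bits, vector_bits):
        carry = carry_out(carry, input_bit, vector_bit)
    return carry  # final carry-out is the OR of all input bits


# Exhaustively compare against Python's any() for four-bit inputs.
for bits in product((0, 1), repeat=4):
    assert or_reduce(bits) == int(any(bits))

print(or_reduce([0, 0, 1, 0]))  # prints 1
```

With a vector bit of 1, each adder's carry-out equals the OR of its input bit and its incoming carry, so a single 1 anywhere in the input vector ripples to the end of the chain.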

As mentioned above, FIG. 3-4 further illustrates an example AND reduction logic chain 304. As shown in FIG. 3-4, the AND reduction logic chain 304 includes four adders 320, similar to the four adders 320 described above in connection with the OR reduction logic chain. These adders 320 may be implemented within logic modules similar to those of the OR reduction logic chain. As further illustrated, the logic modules may receive the input vector 334 in conjunction with a bit vector 332 whose specific values are chosen to implement AND gate functionality (as opposed to OR gate functionality). For example, in the example shown in FIG. 3, the bit vector 332 may include a vector of all "0" values, except that the first vector bit is "1". Using logic similar to that of the adder components discussed above, the AND reduction framework may generate the output of an AND gate for any number of inputs using a plurality of logic modules on the same logic chain (e.g., on a common logic level).
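The AND reduction can be modeled in the same way. The sketch below is illustrative only: it assumes one-bit full adders, a carry-in of 0 at the first adder, and a bit vector whose first bit (the primer bit) is 1 and whose remaining bits are 0. The name `and_reduce` is not from the disclosure.

```python
from itertools import product
from typing import Sequence


def and_reduce(input_bits: Sequence[int]) -> int:
    """AND-reduce one or more input bits along a modeled carry chain."""
    vector_bits = [1] + [0] * (len(input_bits) - 1)  # primer 1, then 0s
    carry = 0  # assumption: no carry enters the first adder
    for input_bit, vector_bit in zip(input_bits, vector_bits):
        # Carry-out (MSB) of a one-bit full adder.
        carry = (carry + input_bit + vector_bit) >> 1
    return carry  # final carry-out is the AND of all input bits


# Exhaustively compare against Python's all() for four-bit inputs.
for bits in product((0, 1), repeat=4):
    assert and_reduce(bits) == int(all(bits))

print(and_reduce([1, 1, 1, 1]))  # prints 1
print(and_reduce([1, 1, 0, 1]))  # prints 0
```

The primer bit lets the first adder pass a1 through as its carry-out. Every later adder has a vector bit of 0, so its carry-out is 1 only if both its input bit and the incoming carry are 1.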

FIG. 4-1 illustrates an example implementation showing a variation of the example discussed above in connection with FIG. 3, in accordance with at least one embodiment of the present disclosure. The example function depicted in FIG. 4-1 follows principles similar to those discussed in connection with the similar components shown in FIG. 3. For example, FIG. 4-1 illustrates a general logic chain 404 having a plurality of adders (collectively 420). Each adder 420 may receive a plurality of inputs, including a carry-in signal, one of the input bits a1, a2, a3, a4, and one of the vector bits b1, b2, b3, b4 from the bit vector.

The input bits may include the outputs of a LUT reduction function 438. The LUT reduction function 438 may receive multiple bits and reduce them to a single input for an adder 420. For example, the LUT reduction function 438 may reduce six bits at a time from the bit vector to produce the respective inputs a1, a2, a3, a4. In this way, rather than each bit from the bit vector being an independent input to an adder 420, the LUT reduction function 438 may perform an initial reduction of the input bits, thereby reducing the total number of logic modules used to perform a given operation. The outputs of the adders 420 may then be used in the reduce output 436.

FIG. 4-2 illustrates an example OR reduction logic function in the logic chain 404. The OR reduction logic chain 404 may include a chain of adders 420 implemented within logic modules, which may include features similar to those of the logic module discussed above in connection with FIG. 2. In this example, each of the adders 420 may receive one of the input bits a1, a2, a3, a4, a bit of the bit vector, and one of the carry-in values c1, c2, c3, c4 from another adder component.

In the example shown in FIG. 4-2, the input bits may refer to outputs from LUTs. More specifically, in one or more embodiments, the LUTs may be used to reduce the bit vector six bits at a time, and the adder chain may be used to reduce the results. Thus, rather than having a single input bit for each bit of the bit vector, the OR reduction logic chain may use a LUT to reduce up to six bits (or another predetermined number of inputs, based on the capabilities of the logic module) into each adder input bit. This can significantly reduce the number of logic modules in a given carry chain when a logic gate is configured with a large number of inputs.
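The saving can be quantified with simple arithmetic. The helper below is a hypothetical illustration, not part of the disclosure: it counts how many adders a chain would need when each adder input summarizes a fixed number of gate inputs, assuming one LUT output per adder.

```python
def adders_needed(num_inputs: int, bits_per_adder: int = 1) -> int:
    """Number of chained adders needed to cover num_inputs gate inputs."""
    if num_inputs < 0 or bits_per_adder < 1:
        raise ValueError("expected num_inputs >= 0 and bits_per_adder >= 1")
    # Ceiling division using integers only.
    return -(-num_inputs // bits_per_adder)


print(adders_needed(24))      # prints 24 (one gate input per adder)
print(adders_needed(24, 6))   # prints 4 (six gate inputs per LUT)
print(adders_needed(100, 6))  # prints 17
```

Under these assumptions, a 24-input gate shrinks from 24 chained adders to 4 once each LUT pre-reduces six inputs.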

Referring to the specific example shown in FIG. 4-2, a first LUT OR reduction may reduce inputs A1 through A6 to produce a first input bit a1. The first adder 420-1 may receive the first input bit a1 and a vector bit having a value of 1 (e.g., the primer bit). The output of the first adder 420-1 may include a first carry bit C1.

A second LUT OR reduction may reduce inputs A7 through A12 to produce a second input bit a2. The second adder 420-2 may receive the second input bit a2, a vector bit having a value of 1, and the first carry bit C1. The output of the second adder 420-2 may include a second carry bit C2.

A third LUT OR reduction may reduce inputs A13 through A18 to produce a third input bit a3. The third adder 420-3 may receive the third input bit a3, a vector bit having a value of 1, and the second carry bit C2. The output of the third adder 420-3 may include a third carry bit C3.

A fourth LUT OR reduction may reduce inputs A19 through A24 to produce a fourth input bit a4. The fourth adder 420-4 may receive the fourth input bit a4, a vector bit having a value of 1, and the third carry bit C3. The output of the fourth adder 420-4 may include a fourth carry bit C4.

The OR reduction may receive the fourth carry bit C4 and complete the OR reduction. In this way, the carry chain 404 may perform an OR reduction on a 24-bit input vector in a single logic level, thereby reducing the total number of logic modules used to perform the OR reduction and increasing the speed of the OR reduction.
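The 24-bit OR reduction can be modeled in software as four six-input LUT reductions followed by a four-adder chain. This is a sketch under stated assumptions: each LUT is modeled as a plain OR of six bits, each adder as a one-bit full adder, and the first adder has a carry-in of 0. The names `lut_or` and `or_reduce_24` are illustrative.

```python
from typing import Sequence


def lut_or(bits: Sequence[int]) -> int:
    """Model of a LUT programmed to OR up to six inputs."""
    return int(any(bits))


def or_reduce_24(inputs: Sequence[int]) -> int:
    """OR-reduce 24 inputs (A1..A24) with four LUTs and four adders."""
    if len(inputs) != 24:
        raise ValueError("expected exactly 24 input bits")
    carry = 0  # assumption: no carry enters the first adder
    for start in range(0, 24, 6):
        input_bit = lut_or(inputs[start:start + 6])  # a1, a2, a3, a4
        vector_bit = 1                               # all 1s for OR
        carry = (carry + input_bit + vector_bit) >> 1  # C1, C2, C3, C4
    return carry


example = [0] * 24
print(or_reduce_24(example))  # prints 0
example[17] = 1               # set A18, which feeds the third LUT
print(or_reduce_24(example))  # prints 1
```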

FIG. 4-3 is a representation of an example AND reduction logic function of the carry chain 404, in accordance with one or more embodiments described herein. Features similar to those of the OR reduction logic chain 404 may apply to the AND reduction logic chain 404. For example, the LUTs may be used to reduce the bits of the bit vector, with vector bit values specific to AND gate logic. In this example, the four adders 420 may be used to process an AND reduction gate for four LUTs that each receive up to six inputs (or another number of inputs, based on the capabilities of the logic module).

Referring to the specific example shown in FIG. 4-3, a first LUT AND reduction may reduce inputs A1 through A6 to produce a first input bit a1. The first adder 420-1 may receive the first input bit a1 and a vector bit having a value of 1 (e.g., the primer bit). The output of the first adder 420-1 may include a first carry bit C1.

A second LUT AND reduction may reduce inputs A7 through A12 to produce a second input bit a2. The second adder 420-2 may receive the second input bit a2, a vector bit having a value of 0, and the first carry bit C1. The output of the second adder 420-2 may include a second carry bit C2.

A third LUT AND reduction may reduce inputs A13 through A18 to produce a third input bit a3. The third adder 420-3 may receive the third input bit a3, a vector bit having a value of 0, and the second carry bit C2. The output of the third adder 420-3 may include a third carry bit C3.

A fourth LUT AND reduction may reduce inputs A19 through A24 to produce a fourth input bit a4. The fourth adder 420-4 may receive the fourth input bit a4, a vector bit having a value of 0, and the third carry bit C3. The output of the fourth adder 420-4 may include a fourth carry bit C4.

The AND reduction may receive the fourth carry bit C4 and complete the AND reduction. In this way, the carry chain 404 may perform an AND reduction on a 24-bit input vector in a single logic level, thereby reducing the total number of logic modules used to perform the AND reduction and increasing the speed of the AND reduction.
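A matching software model of the 24-bit AND reduction is sketched below. It assumes each LUT is programmed as a six-input AND, each adder behaves as a one-bit full adder, the first adder has a carry-in of 0, and the vector bits are 1, 0, 0, 0 as in the walkthrough above. The names `lut_and` and `and_reduce_24` are illustrative.

```python
from typing import Sequence


def lut_and(bits: Sequence[int]) -> int:
    """Model of a LUT programmed to AND up to six inputs."""
    return int(all(bits))


def and_reduce_24(inputs: Sequence[int]) -> int:
    """AND-reduce 24 inputs (A1..A24) with four LUTs and four adders."""
    if len(inputs) != 24:
        raise ValueError("expected exactly 24 input bits")
    vector_bits = [1, 0, 0, 0]  # primer bit followed by 0s
    carry = 0  # assumption: no carry enters the first adder
    for index, vector_bit in enumerate(vector_bits):
        input_bit = lut_and(inputs[6 * index:6 * index + 6])  # a1..a4
        carry = (carry + input_bit + vector_bit) >> 1         # C1..C4
    return carry


example = [1] * 24
print(and_reduce_24(example))  # prints 1
example[9] = 0                 # clear A10, which feeds the second LUT
print(and_reduce_24(example))  # prints 0
```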

FIG. 5 is a representation of a carry chain 504 logic system in accordance with at least one embodiment of the present disclosure. For example, FIG. 5 illustrates an example configuration of the logic chain 504 in which the input vector includes inputs originating from arbitrary LUT combinational logic. For example, where additional combinational logic is needed, the logic chain 504 may make use of the combinational logic of the LUTs and feed the LUT outputs into the logic chain 504. This enables the logic chain 504 to include both the combinational logic and the AND/OR reduction in a single logic level rather than across multiple logic levels.

An example implementation of this configuration may include a counter configured to increment or decrement based on some detected condition. In this example, a registered function may be configured and driven as an input to an adder (e.g., on a chain of adders). Whereas this would conventionally be performed by providing an output from a logic module of another logic level, this framework enables the carry signal to be fed as an input to a logic chain within the same logic level, and significantly reduces the latency that would otherwise result from routing signals through the routing fabric between logic levels. As shown in FIG. 5, this combination of arbitrary combinational logic with the reduction logic function framework may be applied to both AND reduction and OR reduction configurations.

FIG. 6 illustrates another example logic chain 604 framework in accordance with one or more embodiments described herein. For example, FIG. 6 illustrates an example in which an AND/OR reduction is fed into an adder 520 (e.g., a counter). In this case, a carry-out signal may be fed from one adder 520 as the carry-in value of an adder 520 acting as a counter.

For example, a first logic function (e.g., a subtractor function) may supply a carry-out indicating the result of a comparison operation. This may be provided in combination with a different set of combined signals used to provide a carry-in value to another counter. Accordingly, this framework may be used to implement the entire logic of multiple logic functions using the carry chain configurations described herein.

As shown in FIG. 5, the "T" and "R" signals may refer to two inputs that are pointers into a set of values (e.g., a queue of inputs having a particular order, such as inputs issued by a processor). The carry-out signal may refer to the carry-out of the most significant stage of the subtraction and may be combined with another signal to produce a logic function. This logic function may be fed as an input to another function. In particular, as shown in FIG. 5, the carry-out signal may be used to drive additional logic on a common logic level (e.g., without routing the signal through the routing fabric between logic levels).

As a general example, the illustrated example shows a first logic function (e.g., a subtraction function). The result of that logic function is fed to an abort stage (e.g., a second logic function), which in turn feeds an adder function (e.g., a third logic function). This implementation enables the logic chain to implement more complex functions that would typically involve multiplexers (MUXs), or other functions involving multiple logic stages that could result in an unacceptable amount of delay. Indeed, by using the carry-out signal to drive additional logic functions, the configurations described herein enable the logic chain to consider multiple conditions within a single logic stage.
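The three-stage arrangement can be sketched in software as a subtraction whose carry-out acts as a comparison result, a gating stage, and an adder used as a counter. Everything here is a hypothetical illustration: the disclosure does not specify bit widths or the exact gating, so the `width` parameter, the `abort` flag, and the function names are assumptions, and T and R are treated as unsigned values.

```python
from typing import Tuple


def ripple_add(x: int, y: int, carry_in: int, width: int) -> Tuple[int, int]:
    """Ripple-carry add of two width-bit values; returns (sum, carry_out)."""
    carry = carry_in
    result = 0
    for i in range(width):
        total = ((x >> i) & 1) + ((y >> i) & 1) + carry
        result |= (total & 1) << i
        carry = total >> 1
    return result, carry


def t_at_least_r(t: int, r: int, width: int) -> int:
    """First function: compute T - R as T + ~R + 1 and keep the carry-out.

    For unsigned operands, the carry-out is 1 exactly when T >= R.
    """
    mask = (1 << width) - 1
    _, carry = ripple_add(t & mask, ~r & mask, 1, width)
    return carry


def next_count(count: int, t: int, r: int, abort: int, width: int) -> int:
    """Second and third functions: gate the comparison, then count."""
    advance = t_at_least_r(t, r, width) & (1 - abort)  # gating stage
    # The gated signal enters the counter as a carry-in: count + 0 + advance.
    new_count, _ = ripple_add(count, 0, advance, width)
    return new_count


print(next_count(7, t=5, r=3, abort=0, width=4))  # prints 8
print(next_count(7, t=5, r=3, abort=1, width=4))  # prints 7
print(next_count(7, t=2, r=3, abort=0, width=4))  # prints 7
```

Feeding the gated comparison into the counter as a carry-in means the condition and the increment share one chain of adders, which is the kind of single-level composition the paragraph above describes.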

From a more general perspective, principles similar to those of the above example may be implemented to combine multiple logic functions before or after a reduction logic function (e.g., AND reduction, OR reduction), such as before or after the subtraction function shown in FIG. 5. Whereas conventional systems typically involve complex or extensive combinations of logic modules across multiple logic stages, the implementations described herein enable a wide variety of logic functions (including unrelated logic functions) to be combined without incurring routing penalties.

Example implementations may include incrementing a queue, performing a comparison and using the result as a logic function without a routing penalty, updating a read pointer, an abort function, or any other combinational logic.

FIG. 7 is a flowchart of a method 740 implemented on programmable hardware in accordance with at least one embodiment of the present disclosure. At 742, the method 740 includes receiving, at a logic module, an input vector including a plurality of input bits. At 744, the logic module receives a bit vector. The bit vector includes a primer bit and a plurality of vector bits. At 746, the logic module provides the primer bit, a first vector bit of the plurality of vector bits, and a first input bit of the plurality of input bits to a first adder in a carry chain.

At 748, the first adder generates a first carry-out bit based on the primer bit, the first vector bit, and the first input bit. At 750, the logic module provides each vector bit of the plurality of vector bits and the associated input bit of the plurality of input bits as inputs to additional adders in the carry chain. At 752, the programmable hardware provides the carry-out bit from each adder to the next adder in the carry chain to generate an output based on the last carry-out bit from the last adder in the carry chain.
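The flow of method 740 can be traced with a short software model. This is a sketch under stated assumptions rather than the claimed method itself: it follows FIG. 3-1 in treating the primer bit as the first vector bit b1, assumes the first adder has a carry-in of 0, and models each adder as a one-bit full adder. The function name `run_method` and the `gate` parameter are hypothetical.

```python
from typing import List, Sequence


def run_method(input_vector: Sequence[int], gate: str) -> int:
    """Trace steps 742 to 752 for an "or" or "and" reduction."""
    if gate not in ("or", "and"):
        raise ValueError("gate must be 'or' or 'and'")
    if not input_vector:
        raise ValueError("input_vector must contain at least one bit")

    # 742: receive the input vector (the function argument).
    # 744: receive the bit vector; the primer bit is assumed to be b1 = 1.
    fill = 1 if gate == "or" else 0
    bit_vector: List[int] = [1] + [fill] * (len(input_vector) - 1)

    # 746 and 748: the first adder takes the primer bit and a1.
    carry = (0 + input_vector[0] + bit_vector[0]) >> 1

    # 750 and 752: each later adder takes its vector bit, its input bit,
    # and the carry-out of the previous adder.
    for input_bit, vector_bit in zip(input_vector[1:], bit_vector[1:]):
        carry = (carry + input_bit + vector_bit) >> 1

    return carry  # output based on the last carry-out bit


print(run_method([0, 1, 0, 0], "or"))   # prints 1
print(run_method([1, 1, 0, 1], "and"))  # prints 0
print(run_method([1, 1, 1, 1], "and"))  # prints 1
```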

In some embodiments, the input vector includes input values based on outputs of combinational logic from an additional logic module implemented on the programmable hardware. In some embodiments, the additional logic module is implemented on the same logic level as the carry chain of the logic module. In some embodiments, the input values include a carry-out signal from an adder of the additional logic module.

In some embodiments, the values of the bit vector are based on the logic function that the logic module is configured to act as. In some embodiments, the values of the bit vector include the primer bit and a set of 1 values based on the logic module being configured to act as an OR reduction logic function. In some embodiments, the values of the bit vector include the primer bit, a value of 1 for the first vector bit, and a set of 0 values based on the logic module being configured to act as an AND reduction logic function. In some embodiments, the primer bit has a value of 1.

The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof, unless specifically described as being implemented in a particular manner. Any features described as modules, components, or the like may also be implemented together in an integrated logic device or separately as discrete but interoperable logic devices. If implemented in software, the techniques may be realized at least in part by a non-transitory processor-readable storage medium containing instructions that, when executed by at least one processor, perform one or more of the methods described herein. The instructions may be organized into routines, programs, objects, components, data structures, and so on, which may perform particular tasks and/or implement particular data types, and which may be combined or distributed as desired in various embodiments.

The steps and/or actions of the methods described herein may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of steps or actions is required for proper operation of the method being described, the order and/or use of specific steps and/or actions may be modified without departing from the scope of the claims.

The term "determining" encompasses a wide variety of actions, and therefore "determining" can include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database, or another data structure), ascertaining, and the like. In addition, "determining" can include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory), and the like. In addition, "determining" can include resolving, selecting, choosing, establishing, and the like.

The terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements other than the listed elements. Furthermore, it should be understood that references to "one embodiment" or "an embodiment" of the present disclosure are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features. For example, any element or feature described in relation to an embodiment herein may be combined with any element or feature of any other embodiment described herein, where compatible.

The present disclosure may be embodied in other specific forms without departing from its spirit or characteristics. The described embodiments are to be considered illustrative and not restrictive. The scope of the disclosure is therefore indicated by the appended claims rather than by the foregoing description. Changes that come within the meaning and range of equivalency of the claims are to be embraced within their scope.

100: programmable hardware device
102: carry chain logic system
104: logic chain
106: logic module
106-1: first logic module
106-2: second logic module
106-n: nth logic module
108-1: first carry signal
108-2: second carry signal
108-n: nth carry signal
110-1: first input
110-2: second input
110-n: nth input
112-1: first output
112-2: second output
112-n: nth output
114: AND/OR reduction function output
206: logic module
215: LUT inputs
216: LUT
218-1: first logic portion
218-2: second logic portion
220-1: first adder
220-2: second adder
222-1: first carry-in bit
222-2: second carry-in bit
224-1: first input
224-2: second input
226-1: first vector bit
226-2: second vector bit
228-1: first carry-out bit
228-2: second carry-out bit
230: register
304: general logic chain
320-1: first adder
320-2: second adder
320-3: third adder
320-4: fourth adder
324: input bit
326: vector bit
332: bit vector
334: input vector
336: reduce output
404: general logic chain
420-1: first adder
420-2: second adder
420-3: third adder
420-4: fourth adder
436: reduce output
438: LUT reduction function
504: logic chain
604: logic chain
740: method
a1: first input bit
a2: second input bit
a3: third input bit
a4: fourth input bit
b1: first vector bit
b2: second vector bit
b3: third vector bit
b4: fourth vector bit
C1: first carry bit
C2: second carry bit
C3: third carry bit
C4: fourth carry bit

Figure 1A illustrates an example environment including an example carry chain logic system in accordance with one or more embodiments.

Figure 1B illustrates an example logic chain implemented within the carry chain logic system in accordance with one or more embodiments.

Figure 2 illustrates an example implementation of a logic unit that may be implemented within a logic chain in accordance with one or more embodiments.

Figure 3-1 illustrates a bit vector and an input vector in accordance with one or more embodiments.

Figure 3-2 illustrates a general carry chain in accordance with one or more embodiments.

Figure 3-3 illustrates an example implementation of an OR reduce logic chain.

Figure 3-4 illustrates an example implementation of an AND reduce logic chain.

Figures 4-1 through 4-3 illustrate other example implementations of logic chains, including an OR reduce logic chain and an AND reduce logic chain, in accordance with one or more embodiments.

Figure 5 illustrates another example implementation of an OR reduce logic chain and an AND reduce logic chain in accordance with one or more embodiments.

Figure 6 illustrates an example configuration of a logic function implemented in conjunction with an AND/OR reduce chain in accordance with one or more embodiments.

Figure 7 illustrates a flowchart of a method implemented on programmable hardware in accordance with one or more embodiments.

Domestic deposit information (listed in order of depository institution, date, and number): None.
Foreign deposit information (listed in order of depository country, institution, date, and number): None.


Claims (20)

1. A method implemented on a carry chain of logic modules, the method comprising:
receiving an input vector including a plurality of input bits;
receiving a bit vector including a primer bit and a plurality of vector bits;
providing the primer bit and a first input bit of the plurality of input bits to a first adder in the carry chain;
generating, at the first adder, a first carry-out bit based on the primer bit and the first input bit;
providing each vector bit of the plurality of vector bits and an associated input bit of the plurality of input bits as inputs to additional adders in the carry chain; and
providing a carry-out bit from each adder to a next adder in the carry chain to generate an output based on a last carry-out bit from a last adder in the carry chain.

2. The method of claim 1, wherein the input vector includes input values based on outputs of combinational logic from additional logic modules implemented on the programmable module.

3. The method of claim 2, wherein the additional logic modules are implemented on the same logic level as the carry chain of logic modules.

4. The method of claim 2, wherein the input values include carry-out signals from adders of the additional logic modules.

5. The method of claim 1, wherein values of the bit vector are determined based on a configuration of the logic modules to function as an associated logic function.
6. The method of claim 5, wherein the values of the bit vector include the primer bit and a set of bit values of one, based on the logic modules being configured to function as an AND reduce logic function.

7. The method of claim 5, wherein the values of the bit vector include the primer bit having a bit value of one and a set of bit values of zero, based on the logic modules being configured to function as an OR reduce logic function.

8. The method of claim 1, wherein the primer bit has a bit value of one.

9. A carry chain logic function, comprising:
a first logic module including a first adder and a second adder, the first logic module being configured to:
receive, at the first adder, a first input and a primer bit of a bit vector to generate a first carry-out signal to feed to the second adder; and
receive, at the second adder, the first carry-out signal as a first carry-in signal, a second input, and an associated second vector bit of the bit vector to generate a second carry-out signal; and
a second logic module including a third adder and a fourth adder, the second logic module being configured to:
receive, at the third adder, the second carry-out signal as a second carry-in signal, a third input, and an associated third vector bit of the bit vector to generate a third carry-out signal; and
receive, at the fourth adder, the third carry-out signal as a third carry-in signal, a fourth input, and an associated third vector bit of the bit vector to generate a fourth carry-out signal.

10. The carry chain logic function of claim 9, wherein a last carry-out signal of the carry chain logic function including the first logic module and the second logic module is an output of a logic gate based on a combination of the first input, the second input, the third input, and the fourth input received by respective adders of the carry chain logic function.

11. The carry chain logic function of claim 9, wherein one or more of the first input, the second input, the third input, and the fourth input received at the first adder, the second adder, the third adder, and the fourth adder refer to outputs of combinational logic functions on the same logic level as, or a different logic level from, the carry chain logic function.

12. The carry chain logic function of claim 9, wherein the vector bits of the bit vector are based on a particular logic function that the carry chain logic function is configured to emulate.

13. The carry chain logic function of claim 12, wherein, based on the carry chain logic function being implemented as an OR reduce logic gate, the vector bits of the bit vector are all one values.

14. The carry chain logic function of claim 12, wherein, based on the carry chain logic function being implemented as an AND reduce logic gate, the first vector bit of the bit vector is a one value and the remaining vector bits of the bit vector are all zero values.
15. A programmable hardware device, comprising:
a plurality of logic modules programmed in a logic chain and including a plurality of adders, the plurality of logic modules being programmed to:
receive an input bit at a first adder of the plurality of adders, the first adder being located in a first logic module of the plurality of logic modules;
receive a vector bit at the first adder of the plurality of adders;
generate a carry-out bit at the first adder of the plurality of adders using the input bit and the vector bit; and
provide the carry-out bit to a next adder of the plurality of adders, the next adder receiving the carry-out bit as a carry-in bit, the next adder being in a second logic module of the plurality of logic modules.

16. The programmable hardware device of claim 15, wherein the input bit is received from a LUT in the first logic module.

17. The programmable hardware device of claim 16, wherein the LUT performs a reduction on a plurality of inputs from an input vector.

18. The programmable hardware device of claim 16, wherein the LUT performs combinational logic on a plurality of inputs from an input vector.

19. The programmable hardware device of claim 15, wherein the next adder is in the same logic level as the first adder.

20. The programmable hardware device of claim 15, wherein the plurality of hardware devices are programmed to provide a last carry-out bit from a last adder of the plurality of adders to a reduce function.
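The carry chain recited in the claims can be illustrated with a small software model. The following Python sketch is not taken from the disclosure. It assumes that each adder behaves as a standard 1-bit full adder, so that its carry-out is the majority of its two operand bits and its carry-in, and that the carry-in to the first adder is zero. The function names `full_adder_carry`, `carry_chain`, `or_reduce`, and `and_reduce` are hypothetical. The bit-vector values used for the OR and AND cases follow the all-ones and one-then-zeros patterns of claims 13 and 14; the adders of an actual device may use a different polarity or wiring.

```python
def full_adder_carry(a: int, b: int, carry_in: int) -> int:
    """Carry-out of a 1-bit full adder (assumed model): 1 when at least two inputs are 1."""
    return (a & b) | (a & carry_in) | (b & carry_in)


def carry_chain(input_bits, vector_bits, carry_in=0):
    """Ripple a carry through one adder per (input bit, vector bit) pair.

    Returns the carry-out bit of the last adder in the chain.
    The initial carry_in of 0 is an assumption of this sketch.
    """
    if len(input_bits) != len(vector_bits):
        raise ValueError("input vector and bit vector must have equal length")
    carry = carry_in
    for a, b in zip(input_bits, vector_bits):
        carry = full_adder_carry(a, b, carry)
    return carry


def or_reduce(input_bits):
    """All vector bits are 1, so each stage computes (input bit OR carry-in)."""
    return carry_chain(input_bits, [1] * len(input_bits))


def and_reduce(input_bits):
    """First vector bit is 1 and the rest are 0, so the first stage passes its
    input bit through and each later stage computes (input bit AND carry-in).
    Requires at least one input bit."""
    return carry_chain(input_bits, [1] + [0] * (len(input_bits) - 1))


if __name__ == "__main__":
    for bits in ([0, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 1]):
        print(bits, "OR:", or_reduce(bits), "AND:", and_reduce(bits))
```

In this model the final carry-out depends on every input bit while passing only through adder-to-adder carry signals, which is the property the abstract relies on to avoid routing intermediate results through the routing fabric.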
TW111144744A 2021-12-30 2022-11-23 Using and/or reduce carry chains on programmable hardware TW202331575A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163295323P 2021-12-30 2021-12-30
US63/295,323 2021-12-30
US17/740,831 US20230214180A1 (en) 2021-12-30 2022-05-10 Using and/or reduce carry chains on programmable hardware
US17/740,831 2022-05-10

Publications (1)

Publication Number Publication Date
TW202331575A true TW202331575A (en) 2023-08-01

Family

ID=84358214

Family Applications (1)

Application Number Title Priority Date Filing Date
TW111144744A TW202331575A (en) 2021-12-30 2022-11-23 Using and/or reduce carry chains on programmable hardware

Country Status (2)

Country Link
TW (1) TW202331575A (en)
WO (1) WO2023129261A1 (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7167022B1 (en) * 2004-03-25 2007-01-23 Altera Corporation Omnibus logic element including look up table based logic elements
US10715144B2 (en) * 2019-06-06 2020-07-14 Intel Corporation Logic circuits with augmented arithmetic densities

Also Published As

Publication number Publication date
WO2023129261A1 (en) 2023-07-06
