TW202307647A - Floating point multiply-add, accumulate unit with carry-save accumulator - Google Patents
Floating point multiply-add, accumulate unit with carry-save accumulator Download PDFInfo
- Publication number
- TW202307647A TW202307647A TW111110603A TW111110603A TW202307647A TW 202307647 A TW202307647 A TW 202307647A TW 111110603 A TW111110603 A TW 111110603A TW 111110603 A TW111110603 A TW 111110603A TW 202307647 A TW202307647 A TW 202307647A
- Authority
- TW
- Taiwan
- Prior art keywords
- exponent
- circuit
- multiplier
- sum
- significand
- Prior art date
Links
Images
Abstract
Description
本揭露的領域是算術邏輯電路的實現,包括浮點乘加累加電路,有時也稱為乘法及累加電路,用於高速處理器,包括其配置成有效執行訓練和推理的處理器。 優先權申請案的參照 The field of the present disclosure is the implementation of arithmetic logic circuits, including floating-point multiply-accumulate circuits, sometimes referred to as multiply-and-accumulate circuits, for high-speed processors, including processors configured to efficiently perform training and inference. References to Priority Applications
本申請案請求下列優先權:2021年11月23日提交的美國專利申請案第17/534,376號、2021年9月2日提交的美國專利申請案第17/465,558號、2021年8月9日提交的美國專利申請案第17/397,241號、2021年5月19日提交的美國臨時專利申請案第63/190,749號、2021年4月13日提交的美國臨時專利申請案第63/174,460號、2021年3月25日提交的美國臨時專利申請案第63/166,221號、2021年3月23日提交的美國臨時專利申請案第63/165,073號及2021年8月31日提交的美國臨時專利申請案第63/239,384號。上述所有八件申請案均以參照方式併入本文。This application claims priority to: U.S. Patent Application No. 17/534,376, filed November 23, 2021; U.S. Patent Application No. 17/465,558, filed September 2, 2021; U.S. Patent Application No. 17/397,241 filed May 19, 2021 U.S. Provisional Patent Application No. 63/190,749 filed May 19, 2021 U.S. Provisional Patent Application No. 63/174,460 filed April 13, 2021, U.S. Provisional Patent Application No. 63/166,221, filed March 25, 2021, U.S. Provisional Patent Application No. 63/165,073, filed March 23, 2021, and U.S. Provisional Patent Application, filed August 31, 2021 Case No. 63/239,384. All eight of the aforementioned applications are hereby incorporated by reference.
如在高性能處理器中實現的包括浮點、乘法及累加單元的算術邏輯電路是相對複雜的邏輯電路。乘法及累加電路應用於矩陣乘法和其它複雜的數學運算,其應用於機器學習和推理引擎。Arithmetic logic circuits including floating point, multiply and accumulate units as implemented in high performance processors are relatively complex logic circuits. Multiply and accumulate circuits are used for matrix multiplication and other complex mathematical operations, which are used in machine learning and inference engines.
實質上,乘法及累加電路生成項A(i)*B(i)的序列的總和S(i),通常表示如下: In essence, the multiply-and-accumulate circuit generates the sum S(i) of a sequence of terms A(i)*B(i), usually expressed as follows:
在此,週期(i)處的總和S(i)等於將項A(i)*B(i)加到總和S(i-1)上,即項A(0)*B(0)到A(i-1)*B(i-1)的累加。最終的總和S(N-1)對於N個週期(從0到N-1)的乘法及累加運算的總和輸出。Here, the sum S(i) at period (i) is equal to adding the term A(i)*B(i) to the sum S(i-1), i.e. the term A(0)*B(0) to A Accumulation of (i-1)*B(i-1). The final sum S(N-1) is the sum output of multiplication and accumulation operations for N cycles (from 0 to N-1).
在浮點實現中,每個週期將包括指數值和有效值的兩個輸入浮點運算元A(i)和B(i)相乘,以產生乘法器輸出項A(i)*B(i),接著透過將當前週期的乘法器輸出項A(i)*B(i)與前一個週期的累加器輸出總和S(i-1)相加來計算累加器輸出總和S(i)。In a floating-point implementation, two input floating-point operands A(i) and B(i) including exponent and rms values are multiplied each cycle to produce a multiplier output term A(i)*B(i ), then calculate the accumulator output sum S(i) by adding the multiplier output term A(i)*B(i) of the current cycle to the accumulator output sum S(i−1) of the previous cycle.
在計算中用於對浮點數進行編碼的浮點編碼格式中,可以對數字進行正規化,以使有效數在二進制點的左側包括一位整數(在二進制中始終為「1」),以及由二進制點右側的一些位元表示的小數,而所述數字僅使用小數進行編碼。編碼中省略了二進制1整數,因為它可以透過正規化形式隱含。以這種方式編碼的浮點編碼格式數字的運算考慮到二進制點左側的整數,稱為「隱含1」。In the floating-point encoding format used to encode floating-point numbers in computing, the number can be normalized so that the significand includes an integer bit to the left of the binary point (always "1" in binary), and A decimal represented by bits to the right of the binary point, and the number in question is encoded using only the decimal. The binary 1 integer is omitted from the encoding because it can be implied by the normalized form. Operations on floating point encoded format numbers encoded in this way take into account the integers to the left of the binary point, known as "implicit 1".
浮點數的乘法可以透過將指數相加,乘以有效數,接著正規化結果來實現,透過將輸出的結果有效數移位並調整輸出的指數以適應這種移位。Multiplication of floating-point numbers can be accomplished by adding the exponents, multiplying by the significand, and then normalizing the result by shifting the resulting significand on the output and adjusting the output exponent to accommodate this shift.
浮點數的加法可以透過首先識別較大的指數,以及運算元的指數之間的差異,接著將具有最小指數的運算元的有效數移位以與較大的指數對齊來實現。最後,將結果正規化,這可能涉及有效數的額外移位和指數的調整。Addition of floating-point numbers can be accomplished by first identifying the larger exponent, and the difference between the exponents of the operands, and then shifting the significand of the operand with the smallest exponent to align with the larger exponent. Finally, the result is normalized, which may involve additional shifting of the significand and adjustment of the exponent.
致使格式不支援的數字的計算,諸如浮點編碼格式,會致使異常訊號。在資料流架構和其它執行複雜演算法(如機器學習演算法)的架構中,這些異常可能致使演算法停止或失敗。即時系統中致使演算法停止或失敗的異常可能致使系統故障或其它性能問題。Calculations with numbers that result in formats not supported, such as floating-point encoding formats, result in exception signals. In dataflow architectures and other architectures that implement complex algorithms such as machine learning algorithms, these anomalies can cause the algorithm to stall or fail. Anomalies in real-time systems that cause algorithms to stall or fail can lead to system failures or other performance problems.
需要提供可應用於複雜資料處理設置的異常處理系統。There is a need to provide an exception handling system that can be applied to complex data processing settings.
與and
提供了實現具有異常處理的可配置和可重新配置資料流架構的算術單元的技術的詳細描述。Shah等人於2020年11月10日發布的美國專利第10,831,507號中描述了範例可重構資料流架構,所述專利透過參照併入,如同在本文中完整闡述一樣。算術單元可以使用輸入運算元執行複數個浮點算術運算並生成至少一個輸出運算元,其中輸入運算元的來源、輸出運算元的目的地和運算是可配置的,並且可透過可以在資料流運算期間保持靜態的配置資料重新配置。A detailed description of techniques for implementing an arithmetic unit of a configurable and reconfigurable dataflow architecture with exception handling is provided. An example reconfigurable dataflow architecture is described in US Patent No. 10,831,507 issued November 10, 2020 to Shah et al., which is incorporated by reference as if fully set forth herein. The arithmetic unit can perform a plurality of floating-point arithmetic operations using input operands and generate at least one output operand, where the source of the input operands, the destination of the output operands, and the operation are configurable and can be operated on through the data stream The configuration data remains static during reconfiguration.
在至少一個浮點算術運算的執行中,檢測到與非法運算相關的異常以及與所使用的浮點編碼格式不正常表示的結果的生成相關的異常,並將運算結果設置為在運算期間可用於進一步處理的值,不需要由例如運行時處理器進行特殊的中斷處理。結果,資料流運算能夠在不因至少一些異常而中斷的情況下完成。During the execution of at least one floating-point arithmetic operation, anomalies related to illegal operations and to the generation of results not normally represented by the floating-point encoding format used are detected, and the result of the operation is made available during the operation Values for further processing that do not require special interrupt handling by, for example, the runtime processor. As a result, dataflow operations can be completed without interruption due to at least some exceptions.
在一些實施例中,在控制流架構上使用的算術運算和算術單元可以實現本文中描述的異常處理技術。 浮點進位保留 MAC(FP-CS-MAC) In some embodiments, the arithmetic operations and arithmetic units used on the control flow architecture may implement the exception handling techniques described herein. Floating-point carry-save MAC (FP-CS-MAC)
描述了可以在三種運算模式下運算的FP-CS-MAC,諸如: 輸入 A(BF16)x 輸入 B(BF16)+ 累加迴路 輸入 A(BF16)x 輸入 B(BF16)+ 輸入 C(FP32)或單一32位元浮點加法,諸如: 輸入 A(FP32)+ 輸入 C(FP32)運算元A可以是任何格式,而在此實現中,其為以下兩種格式之一:BF16或FP32,其中BF16是一種包含8位元指數、1符號位元、7位元有效數的格式,其中隱含1整數位元,共有8有效數位元。FP32被稱為單精確度32位元、IEEE浮點754標準。 Describes the FP-CS-MAC that can operate in three operation modes, such as: input A (BF16) x input B (BF16) + accumulation loop input A (BF16) x input B (BF16) + input C (FP32) or A single 32-bit floating-point addition, such as: input A (FP32) + input C (FP32) operand A can be in any format, and in this implementation it is one of the following two formats: BF16 or FP32, where BF16 It is a format that contains 8-bit exponent, 1 sign bit, and 7-bit significand, in which 1 integer bit is implied, and there are 8 significand bits in total. FP32 is known as the single-precision 32-bit, IEEE floating-point 754 standard.
可以使用其它編碼格式,並且可以對所描述的實現進行適當的調整。Other encoding formats may be used and appropriate adaptations of the described implementations may be made.
描述了一種三模式浮點進位保留MAC(FP-CS-MAC)單元,包含實現為管線、響應於管線時脈運行的電路。在一些實現中,管線時脈可以是千兆赫(GHz)量級或更快。當管線時脈運行時,時脈的每個週期對應於一個管線週期。因此,在一些實施例中,管線週期可以小於一奈秒。在管線中,管線的階段包括在第一管線時脈脈衝(例如,時脈脈衝的前沿)保持階段輸入資料的輸入暫存器或資料儲存,以及在下一個管線時脈脈衝的階段(例如,下一個時脈脈衝的前沿,定義一個管線時脈週期)的暫存器階段輸出資料的輸出暫存器或資料儲存。在第一管線時脈脈衝開始一個管線週期(i)時,所述階段的輸出暫存器保存前一個管線週期(i-1)的階段輸出資料,而管線中一個階段的階段輸出資料至少是下一個階段的階段輸入資料的一部分。每個階段的電路必須在管線週期內可靠地穩定下來,因此快速的管線時脈對時序關鍵階段造成了很大的困難。A triple-mode floating-point carry-save MAC (FP-CS-MAC) unit is described that includes circuitry implemented as a pipeline that operates in response to a pipeline clock. In some implementations, the pipeline clock can be on the order of gigahertz (GHz) or faster. When the pipeline clock is running, each cycle of the clock corresponds to a pipeline cycle. Thus, in some embodiments, the pipeline cycle time may be less than one nanosecond. In a pipeline, stages of the pipeline include input registers or data stores that hold input data at the stage of the first pipeline clock pulse (e.g., the leading edge of the clock pulse), and hold input data at the stage of the next pipeline The leading edge of a clock pulse, defining a pipeline clock period) of the register stage outputs data to an output register or data store. When the first pipeline clock pulse starts a pipeline cycle (i), the output register of the stage holds the stage output data of the previous pipeline cycle (i-1), and the stage output data of a stage in the pipeline is at least Part of the stage input data for the next stage. The circuitry at each stage must settle reliably within the pipeline cycle, so fast pipeline clocks pose significant difficulties for timing-critical stages.
三模式浮點進位保留MAC(FP-CS-MAC)單元的一種實現包含6個管線階段。透過增加管線階段數可以進一步提高速度。透過減少管線階段數可以進一步降低功率。一般來說,管線階段的最佳數量取決於特定的技術和設計要求。第一主要單元是BF16乘法器,在本例中它在兩個管線階段中實現,並包括用於將乘法器結果轉換為16位元的2的補數有效數和指數的轉換單元。第三管線階段是進位保留累加階段。接下來的兩個階段將進位總和格式的結果轉換回傳統正規化符號數值格式,諸如輸出編碼格式所需的BF16或FP32。One implementation of a triple-mode floating-point carry-save MAC (FP-CS-MAC) unit contains 6 pipeline stages. Speed can be further increased by increasing the number of pipeline stages. Power can be further reduced by reducing the number of pipeline stages. In general, the optimal number of pipeline stages depends on the specific technology and design requirements. The first main unit is the BF16 multiplier, which in this example is implemented in two pipeline stages and includes a conversion unit for converting the multiplier result to a 16-
最後一個管線階段執行正規化和捨入以產生結果。在這種情況下,最終格式為BF16或FP32格式。輸入運算元有效數介於1 |a|<2之間,因為它們在十進制點左側包含隱含的1,並且僅包括有效數的小數部分。所述單元不支援非正規化數字並將它們截斷為零。因此,使用BF16或FP32,輸入運算元的範圍為±2−126到(2−2−7)×2127。如果小於±2−126,則超出此範圍的數字截斷為零,或者如果大於±(2−2−7)×2127,則轉換為±無窮大。 浮點編碼格式 The last pipeline stage performs normalization and rounding to produce the result. In this case the final format is BF16 or FP32 format. Input operand valid number is between 1 |a|<2 because they include an implied 1 to the left of the decimal point and only the fractional part of the significand. The unit in question does not support denormalized numbers and truncates them to zero. Therefore, with BF16 or FP32, the range of input operands is ±2−126 to (2−2−7)×2127. Numbers outside this range are truncated to zero if less than ±2−126, or converted to ±infinity if greater than ±(2−2−7)×2127. floating point encoding format
圖1說明了兩種編碼格式的位元模式。第一位元格式的第一範例圖說明了Bfloat16 110。Bfloat16浮點編碼格式(有時「BF16」)是16位元數字格式。BF16保留了IEEE單精確度數的近似動態範圍。說明的BF16格式包括7位元小數、用以完成有效數的「隱含位元」或「隱藏位元」、8位元指數和一個符號位元。Figure 1 illustrates the bit patterns for the two encoding formats. The first example diagram of the first bit format illustrates
第二張圖說明了IEEE 754單精確度32位元浮點(FP32)130編碼格式。說明的IEEE 754單精確度32位元浮點130包括23位元小數、用以完成有效數的「隱含位元」或「隱藏位元」、8位元指數和和一個符號位元。這兩種編碼格式的特徵是FP32格式中的數字可以透過丟棄23位元部分的16個較低有效數位元來轉換為BF16格式,在一些實施例中進行捨入以選擇較低階位元。
系統方塊圖 The second figure illustrates the IEEE 754 single-precision 32-bit floating-point (FP32) 130 encoding format. The illustrated IEEE 754 single-precision 32-bit
圖2是BF16和FP32格式的具有進位保留累加器的浮點乘加累加單元的高階方塊圖。運算元A 213被說明為BF16格式或FP32格式217。運算元B 214是BF16格式並且是乘法器電路202的第一輸入。第二輸入是BF16運算元A 213。當運算元A和運算元B都為BF16格式時,運算元A和運算元B可以佔用單一32位元暫存器,每個使用16位元,表示乘法器和乘法器的被乘數輸入。乘法器電路210的乘積(A*B)輸出在線218以進位及(Carry-Sum)形式產生,其為方塊220中最終加法器的輸入。方塊220還將結果轉換為2的補數形式,並且包括基底8轉換器電路以支援基底8運算。Figure 2 is a high-level block diagram of a floating-point multiply-accumulate unit with a carry-save accumulator in BF16 and FP32 formats. Operand A 213 is illustrated as BF16 format or
當管線以單一32位元加法運算時,(一個運算元)運算元A可以繞過乘法器電路202,而用於加法的第二運算元C來自線216。When the pipeline operates with a single 32-bit addition, (one operand) operand A can bypass the
在這個範例中,運算元C 216是32位元運算元,其被輸入到基底8轉換器215,其在線219上將結果輸出到多工器210和211之一的第一輸入。多工器210和211的第二輸入為從累加器240的輸出反饋的線224和226上用於進位及總和值C/S-ACC的兩條匯流排(以及未顯示的指數)。多工器211和212輸出指數和有效數作為匯流排223的兩個值。In this example,
進位保留加法器230在線221上接收方塊220的輸出,並在匯流排223上接收多工器211、212的輸出。進位保留加法器230在進入累加器240的雙匯流排222上輸出總和的指數和C/S值。累加器240在輸出匯流排224和225上以進位保留形式提供C/S-ACC指數和有效數,輸出匯流排224和225反饋給多工器211、多工器212,並在匯流排226上以進位保留形式提供C/S-ACC指數和有效數到進位保留到符號數值轉換方塊250,所述方塊執行匯流排226上的有效數的進位及總和值的最終相加,並在匯流排227上將得到的有效數轉換為符號數值格式。匯流排252和251將來自累加器240的資料傳送到進位保留到符號數值轉換方塊250。Carry-
基底8到基底2轉換和正規化方塊260在匯流排227上具有輸入,並在匯流排228上將正規化結果輸出以供後正規化、捨入和轉換到FP32或BF16方塊270,其在匯流排229上將輸出轉換為FP32或BF16格式。運算在匯流排229上以32位元FP32格式或16位元BF16格式輸出結果「Z」。The
因此,圖2說明了可以實現為多級管線的電路範例,所述多級管線配置成以三種模式執行,包括用於輸入浮點運算元序列的乘法及累加運算。在此範例中,電路可以配置成管線,包括含有具有和及進位輸出的浮點乘法器的第一級、包括用於乘法器的和及進位輸出的乘法器輸出加法器與用以將乘法器加法器輸出轉換為具有2的補數有效數的基底8格式的電路的第二級、包括有效數電路和累加器加法器的指數電路的第三級、將累加器符號位元、累加器指數和累加器有效數和及進位值轉換為符號數值有效數格式的第四級,將符號數值有效數格式從基底8對齊轉換為基底2對齊並產生正規化的指數和有效數的第五級,以及用以執行捨入和轉換為標準浮點表示的第六級。Accordingly, FIG. 2 illustrates an example of a circuit that may be implemented as a multi-stage pipeline configured to perform in three modes, including multiply and accumulate operations for input sequences of floating-point operands. In this example, the circuit can be configured as a pipeline comprising a first stage comprising a floating point multiplier with a sum and carry output, a multiplier output adder comprising a sum and carry output for the multiplier, and a The second stage of the circuit that converts the adder output to base-8 format with a 2's complement significand, the third stage of the exponent circuit including the significand circuit and the accumulator adder, the accumulator sign bit, the accumulator exponent A fourth stage that converts the sum-accumulator significand and carry value to signed numeric significand format, a fifth stage that converts signed numeric significand format from base-8 alignment to
本文描述的技術提供了一種乘法累加法來計算項A(i)*B(i)的總和S(i),其中(i)從0到N-1,N為總和中的項數。所述方法可以包含以浮點編碼格式接收運算元A(i)和運算元B(i)的序列,其中(i)從0到N-1;將運算元A(i)和運算元B(i)相乘以產生包含乘法器輸出指數和乘法器輸出有效數的格式的項A(i)*B(i),並將乘法器輸出有效數轉換為2的補數格式;使用進位保留加法器將項A(i)*B(i)的2的補數格式有效數相加到總和S(i-1)的有效數,並為總和S(i)產生和及進位值;從A(i)*B(i)的乘法器輸出指數與總和S(i-1)的指數中選擇總和S(i)的指數,以產生總和S(i)的指數;以及將和及進位值與總和S(i)的指數轉換為正規化浮點編碼格式。The technique described herein provides a multiply-accumulate method to compute a sum S(i) of terms A(i)*B(i), where (i) ranges from 0 to N-1, where N is the number of terms in the sum. The method may comprise receiving a sequence of operands A(i) and operands B(i) in floating-point encoded format, where (i) ranges from 0 to N-1; combining operands A(i) and operands B( i) multiply to produce terms A(i)*B(i) in the format containing the multiplier output exponent and the multiplier output significand, and convert the multiplier output significand to 2's complement format; use carry-save addition Adds the 2's complement significand of the term A(i)*B(i) to the significand of the sum S(i-1) and produces a sum and carry value for the sum S(i); from A( Select the index of the sum S(i) from the multiplier output index of *B(i) and the index of the sum S(i-1) to generate the index of the sum S(i); and combine the sum and the carry value with the sum The exponent of S(i) is converted to normalized floating-point encoding format.
此外,所述方法可以包括以基底8格式提供乘法器輸出指數和項A(i)*B(i)的乘法器輸出有效數,以及在轉換為可以是基底2的正規化浮點編碼格式之前以基底8格式產生和及進位值與總和S(i)的指數。Additionally, the method may include providing the multiplier output exponent and the multiplier output significand of the term A(i)*B(i) in
累加加法階段所需的對齊取決於許多狀況,包括總和S(i-1)有效數溢位、總和S(i-1)符號擴展以及加數:項A(i)*B(i)及總和S(i-1)的指數之間的差異。可以確定這些狀況並將其組合用於在同一管線週期(例如,六級範例中的第三級)中進行對齊,從而實現快速執行和更快的管線時脈。在本文提供的實施例中,所述單元執行計算項A(i)*B(i)的總和S(i)的方法,其中(i)從0到N-1,並且N為總和的項數,所述方法包含: 接收浮點編碼格式的運算元A(i)和運算元B(i)序列,其中(i)從0到N-1; 將運算元A(i)和運算元B(i)相乘以在第一個管線週期期間產生項A(i)*B(i),其格式包括項A(i)*B(i)的乘法器輸出指數和項A(i)*B(i)的乘法器輸出有效數,並在第一個管線週期期間將項A(i)*B(i)的乘法器輸出指數與總和S(i-1)的累加器輸出指數進行比較以針對總和S(i)產生比較訊號; 將項A(i)*B(i)相加到總和S(i-1)以在下一個管線週期期間產生總和S(i),其格式包括總和S(i)的累加器輸出指數及總和S(i)的累加器輸出有效數,其中所述加法包括 確定總和S(i)的累加器輸出指數,並將總和S(i-1)的累加器輸出有效數和項A(i)*B(i)的乘法器輸出有效數中的一者或兩者移位作為總和S(i)的比較訊號的結果。 The alignment required for the accumulate-add phase depends on a number of conditions, including sum S(i-1) sign-extended, sum S(i-1) sign-extended, and addends: term A(i)*B(i) and sum The difference between the indices of S(i-1). These conditions can be determined and combined for alignment within the same pipeline cycle (eg, third stage in a six-stage example), resulting in fast execution and faster pipeline clocks. In the example provided herein, the unit performs a method of computing a sum S(i) of terms A(i)*B(i), where (i) ranges from 0 to N-1, and N is the number of terms of the sum , the method contains: Receive a sequence of operands A(i) and operands B(i) in floating-point encoding format, where (i) is from 0 to N-1; Multiply operand A(i) and operand B(i) to produce term A(i)*B(i) during the first pipeline cycle in a format that includes the term A(i)*B(i) The multiplier output exponent and the multiplier output significand of the term A(i)*B(i) and the sum S( The accumulator output indices of i-1) are compared to generate a comparison signal for the sum S(i); Adds the term A(i)*B(i) to the sum S(i-1) to produce the sum S(i) during the next pipeline cycle in a format including the accumulator output index of the sum S(i) and the sum S The accumulator of (i) outputs a significand, wherein the addition includes Determine the accumulator output exponent of the sum S(i) and assign either or both of the accumulator output significand of the sum S(i-1) and the multiplier output significand of the term A(i)*B(i) or shifted as a result of the comparison signal of the sum S(i).
執行在第一個管線週期期間將項A(i)*B(i)的乘法器輸出指數與總和S(i-1)的累加器輸出指數進行比較的步驟,以產生用於總和S(i)的比較訊號,同時在下一個管線週期(早期指數比較)中執行對運算元的調整使得能夠使用具有較短關鍵時序路徑並且可在較高時脈速度下運算的累加器級的管線。 浮點乘數 The step of comparing the multiplier output exponent of the term A(i)*B(i) with the accumulator output exponent of the sum S(i-1) during the first pipeline cycle is performed to produce a sum for the sum S(i ) while performing adjustments to operands in the next pipeline cycle (early exponent compare) enables the use of pipelines with accumulator stages that have shorter critical timing paths and can operate at higher clock speeds. floating point multiplier
浮點乘法器包括指數電路和有效數電路。指數部分執行運算元指數的加法,而有效數部分執行運算元有效數的二進制乘法。進入乘法器的運算元是「正規化」浮點數,其中第一位元是1。因此,運算元有效數(m)介於1≤m<2之間,即大於或等於1且小於2。因此,兩個運算元有效數的乘積在1≤p<4的範圍內,並且永遠不會等於或大於4。The floating-point multiplier includes an exponent circuit and a significand circuit. The exponent section performs addition of the operands' exponents, and the significand section performs binary multiplication of the operands' significands. The operands that go into the multiplier are "normalized" floating point numbers, where the first bit is 1. Therefore, the effective number (m) of the operand is between 1≤m<2, that is, greater than or equal to 1 and less than 2. Therefore, the product of the significands of the two operands is in the
如果作為有效數乘法的結果的乘積p在2≤p<4的範圍內,則指數將遞增,而有效數向右移位一個二進制位置以進行正規化。If the product p as a result of significand multiplication is in the
第一管線階段使用8x8位元整數乘法器執行指數的加法和運算元有效數的乘法,包括用於部分乘積的進位保留加法器。在使用進位保留加法器對所有部分乘積加總之後,乘法器陣列的結果可以包括兩部分:乘法器陣列的最高有效部分中部分乘積的進位保留加法器的8位元總和與9位元進位,以及乘法器陣列的最低有效部分的8位元乘積。在此範例中,使用漣波進位加法器將最低有效部分中8位元的部分乘積相加,因為這些位元來自部分乘積簡化樹。這種總和可以使用漣波進位加法器來完成,因為來自乘法器的最低有效部分的時間到達分佈是從最低有效數位元(LSB)到最高有效數位元(MSB)及時到達的那部分,足夠做漣波進位加法器。施加漣波進位加法器(RCA)顯著降低了乘法器的複雜度(圖4a)。The first pipeline stage performs addition of exponents and multiplication of operand significands using 8x8-bit integer multipliers, including a carry-save adder for partial products. After summing all the partial products using the carry-save adders, the result of the multiplier array can consist of two parts: the 8-bit sum of the carry-save adders of the partial products in the most significant part of the multiplier array and the 9-bit carry, and the 8-bit product of the least significant part of the multiplier array. In this example, the 8-bit partial product in the least significant part is added using a ripple-carry adder, since these bits come from the partial product reduction tree. This summation can be done using a ripple-carry adder, because the time-arrival distribution from the least significant part of the multiplier is the part that arrives in time from the least significant bit (LSB) to the most significant bit (MSB), enough to do Ripple carry adder. Applying a ripple-carry adder (RCA) significantly reduces the complexity of the multiplier (Figure 4a).
所述級包括響應於在管線時脈上暫存的第一和第二輸入運算元,在管線時脈之前提供乘法器有效數和乘法器指數值的乘法器電路。乘法器電路包括有效數乘法器電路和指數加法器電路,所述有效數乘法器電路具有用以產生進位及總和值的部分乘積以產生乘法器輸出有效數的高階位元的進位保留加法器,和用以產生有效數進位及總和值輸出的低階位元的部分乘積的漣波進位加法器。此外,乘法器電路包括基底8轉換電路,用以將乘法器有效數和乘法器指數值轉換為乘法器輸出指數和有效數的基底8格式;以及2的補數轉換電路,用以將乘法器的有效數值轉換為乘法器輸出有效數的2的補數表示。The stage includes a multiplier circuit that provides multiplier significand and multiplier exponent values prior to the pipeline clock in response to the first and second input operands being buffered at the pipeline clock. The multiplier circuit includes a significand multiplier circuit and an exponent adder circuit, the significand multiplier circuit having a carry-save adder for generating a partial product of the carry and sum values to produce the high-order bits of the multiplier output significand, and a ripple-carry adder to generate the partial product of the significand carry and low-order bits of the sum output. In addition, the multiplier circuit includes a base-8 conversion circuit for converting the multiplier significand and multiplier exponent values into a base-8 format for the multiplier output exponent and significand; and a 2's complement conversion circuit for converting the multiplier The significand value of the multiplier is converted to the 2's complement representation of the multiplier output significand.
指數是單獨相加的。兩個指數都為大於零的正數。當加法結果是大於256的數字時,指示是來自指數加法器的進位輸出訊號。如果結果指數等於255,則判定正無窮大指示。如果指數等於0,則根據IEEE 754標準規則將有效數設置為零。在此實現中,如果乘積的指數為0,則結果的有效數強制為0,因此表示+/-零浮點數(圖4b)。在其它實施例中,可以不同地對待次正規數。Exponents are added individually. Both exponents are positive numbers greater than zero. When the addition result is a number greater than 256, the indication is the carry output signal from the exponent adder. If the resulting exponent is equal to 255, then a positive infinity indication is determined. If the exponent is equal to 0, the significand is set to zero according to the IEEE 754 standard rules. In this implementation, if the exponent of the product is 0, the significand of the result is forced to 0, thus representing a +/- zero float (Figure 4b). In other embodiments, subnormal numbers may be treated differently.
指數加法需要從結果中減去127,因為在BF16和FP32編碼格式中,兩個運算元都包含127偏置。將結果加129可以加快轉換程序,這是透過反相輸入之一的指數的MSB並將1引入加法器的進位輸入來實現的。這極大地簡化了電路並且可以減少管線階段所需的時間(圖4b)。Exponential addition requires subtracting 127 from the result, because in BF16 and FP32 encoding formats, both operands include a 127 offset. Adding 129 to the result speeds up the conversion process by inverting the MSB of the exponent on one of the inputs and introducing a 1 to the adder's carry input. This greatly simplifies the circuit and can reduce the time required for pipeline stages (Figure 4b).
我們透過以下方式證明這個程序的正確性:加法致使127的兩個偏置相加,使偏置為254。然而,由於加法器的進位輸出(即256)被忽略,結果偏置將是-2。我們可以透過在運算結果中加上129來得到127。這是透過反相運算元的MSB來實現的,在負運算元的情況下,這相當於加128,因為MSB位置包含零。在MSB等於1的正運算元的情況下,這也相當於加128。進位輸入處的額外1使結果偏置:-2+129,等於所需的127偏置。We prove the correctness of this procedure in the following way: The addition causes the two offsets of 127 to add, making the offset 254. However, since the carry out of the adder (ie 256) is ignored, the resulting bias will be -2. We can get 127 by adding 129 to the result of the operation. This is accomplished by inverting the MSB of the operand, which is equivalent to adding 128 in the case of a negative operand, since the MSB position contains a zero. In the case of positive operands with MSB equal to 1, this is also equivalent to adding 128. The extra 1 at the carry input biases the result: -2+129, which equals the desired bias of 127.
相同的管線階段將結果轉換為基底8的數字,其中包含5位元指數和適當地向右移位7個位置的有效數。對於剩餘3指數位元的值表示的數量,轉換成5位元指數需要從第7位置左移。這需要有效數穿過左移位器,所述左移位器將根據8位元指數的3-LSB位元的需求將有效數從0位元向左移位到7位元位置。(圖5b)The same pipeline stage converts the result to a base-8 number with a 5-bit exponent and a significand shifted right by 7 positions appropriately. For the quantity represented by the value of the remaining 3 exponent bits, conversion to a 5-bit exponent requires a left shift from the 7th position. This requires the significand to pass through a left shifter which will shift the significand left from the 0 bit to the 7 bit position as required by the 3-LSB bits of the 8 bit exponent. (Figure 5b)
乘法器透過識別源自部分乘積簡化樹(PPRT)的訊號到達分佈不均勻來節省計算時間。LSB位元首先到達,接著是下一個位元,對於PPRT的前8個最低有效數位元(LSB)依此類推。由於不相等的到達分佈,LSB部分的相加可以在乘法器陣列的延遲下被屏蔽(「隱藏」),從而為管線階段(例如,在上面概述的範例中的第二管線階段)提供節省(在時間上)。對LSB部分進行加總使用8位元漣波進位加法器(RCA),以將使用用於部分乘積的進位保留加法器的進位傳播加法器(CPA)的大小從17位元減少到9位元。用於下一個管線階段的MSB部分,包括只有9位元長的最終加法器。乘積的有效數在管線階段中透過將最終加法器的最高有效數9位元相加並利用先前使用前一個管線階段的漣波進位加法器形成的最低有效數8位元進行擴充來形成(圖4a)。The multiplier saves computation time by identifying uneven arrival distributions of signals originating from Partial Product Reduction Trees (PPRTs). The LSB bit arrives first, followed by the next bit, and so on for the first 8 least significant bit bits (LSB) of the PPRT. Due to the unequal arrival distribution, the addition of the LSB part can be masked ("hidden") at the delay of the multiplier array, thereby providing savings for pipeline stages (e.g., the second pipeline stage in the example outlined above) ( in time). Adding the LSB part uses an 8-bit ripple-carry adder (RCA) to reduce the size of the carry-propagate adder (CPA) from 17 bits to 9 bits using a carry-save adder for partial products . For the MSB part of the next pipeline stage, including the final adder which is only 9 bits long. The significand of the product is formed in the pipeline stage by summing the most significant 9 bits of the final adder and augmenting it with the least significant 8 bits previously formed using the ripple-carry adder of the previous pipeline stage (Fig. 4a).
圖3是具有兩個輸入(線213上的運算元A和線214上的運算元B)的乘法器電路202的簡化方塊圖300。乘法器電路202包含兩個方塊(乘法器及加法器方塊210a和指數方塊210b)。3 is a simplified block diagram 300 of a
圖4a說明了乘法器及加法器方塊210a的範例,其顯示8x8乘法器部分乘積簡化樹,具有用於更高有效數位元的部分乘積的進位保留加法器,沒有具有用於較低有效數位元的部分乘積加法的7-LSB漣波進位加法器方塊的最終16位元加法器(在下一階段提供)。運算元A 213儲存在包括三個欄位:Sa、Ea和Fa的暫存器420中。Sa是符號位元。Ea是八個指數位元,而Fa是有效數的小數部分。Fa欄位在線422上施加到8X8 BF16乘法器電路410的第一輸入。運算元B 214儲存在暫存器421中,暫存器421包含三個欄位:Sb、Eb和Fb。Sb是符號位元。Eb是八個指數位元,而Fb是有效數的小數部分。Fb欄位在線423上施加到8X8 BF16乘法器電路410的第二個輸入。在線440上,乘法器電路410的輸入是強制零位元,當為零時,強制8X8 BF16乘法器電路產生零輸出。Figure 4a illustrates an example of a multiplier and
8X8 BF16乘法器電路410輸出兩個7位元LSB匯流排428和429,其為7位元漣波進位加法器430的輸入。此外,8X8 BF16乘法器電路410輸出8個和位元S8 426和9個進位位元C9 427。在線424上,7位元漣波進位加法器430輸出7個位元,而在線425上,進位輸出位元COUT輸出到暫存器450。暫存器450具有以下映射:線424映射到PL[6:0],線425上的COUT映射到C7,線426上的S8映射到Sp[14:7],線427上的C9映射到Cp[14:6]。8×8
圖4b說明了具有特殊指數檢測方塊467的範例指數單元(例如圖3的210b)。如圖4a中的運算元A 213在暫存器420中,而如圖4a中的運算元B在暫存器421中。線465上的Ea是特殊指數檢測方塊和指數加法器電路464的一個輸入。線462上的Eb是特殊指數檢測方塊的第二輸入。線462上的Eb的7個最低有效數位元被輸入到指數加法器電路464,而第8位元在第8位元位置進入指數加法器電路464之前由反相器461反相。指數加法器電路464的進位值設置為「1」。FIG. 4b illustrates an example index unit (eg, 210b of FIG. 3 ) with a special
指數加法器電路464對Ea 465和Eb 462進行運算,將它們加在一起並減去127的偏置值。輸出是到暫存器470的10位元值466。兩個額外位元,超出了編碼指數所需的8位元,用於檢測指數溢位情況。這10位元在指數異常處理電路524中進一步檢查,如圖5C所示。
輸入指數訊號在特殊指數檢測方塊467中被檢查為零,如線468上的訊號所示,或無效,如線469上的訊號所示。來自暫存器420和421的符號位元Sa及Sb輸入到反互斥或閘471a,其輸出被施加到反互斥或閘471b。此外,線469上的無效訊號被輸入到反互斥或閘471c。如果無效訊號為零,則結果符號是Sa及Sb的互斥或函數。如果Invalid為真(等於「1」),則乘積符號Sp設置為「零」,如編碼標準中所指定。
基底 8 轉換 The input index signal is checked in special
圖5A是顯示基底8轉換器方塊592(例如圖2的方塊220)的簡化圖。基底8轉換器方塊592包含兩個子方塊,在此範例中,最終加法有效數選擇和基底8轉換子方塊592a和指數異常處理子方塊592b。
轉換為基底 8 , 2 的補數有效數 FIG. 5A is a simplified diagram showing a
外部輸入運算元A在第二管線階段轉換為基底8編碼。運算元A有效數被轉換為2的補數有效數。有效數擴展為34位元,包括兩個有效數符號位元。如圖5b所示,得到的管線暫存器520包含5位元指數、34位元有效數和兩個額外的狀態位元,總共41位元。The external input operand A is converted to base-8 encoding in the second pipeline stage. The operand A significand is converted to a 2's complement significand. The significand is extended to 34 bits, including two significand sign bits. As shown in FIG. 5b, the resulting pipeline register 520 contains a 5-bit exponent, a 34-bit significand, and two additional status bits, for a total of 41 bits.
使用指數的最後3位元實現到基底8的轉換,以將來自圖4a的暫存器450的24位元運算元有效數對齊為32位元基底8有效數,其中有效數的LSB與32位元有效數的LSB對齊,如果指數的3-LSB等於0(從二進制點向右移位8個位置)。由指數的3-LSB表示的任何值都為有效數向左移位的量(從第8位元位置開始),以補償從指數中截斷的那些位元。直到二進制點的其餘位元,以及超過的兩位元,都用符號擴展位元填充。在所有三個指數LSB都為b'1的情況下,也就是說,等於十進制7,32位元有效數的第一個有效數位元將是非零位元,也就是說,正規化有效數。由於有效數表示為2的補數,有效數點左側的兩個額外位元將用於儲存符號位元(包括擴展符號位元)。使用額外的第二符號位元,而不是一個,以便保留符號,因為可能的溢位情況會致使2位元整數覆蓋較低符號位元(圖5b)。The conversion to base-8 is implemented using the last 3 bits of the exponent to align the 24-bit operand significand from the
根據乘積的符號,有效數被穿過或被反相,以建立有效數的2的補數負表示。此實現不同於IEEE 754,其中有效數可以是正數或負數。此運算是透過在24位元有效數上相加一個符號位元並在符號等於1(負數)時反相這些位元來執行的。Depending on the sign of the product, the significand is crossed or inverted to create a 2's complement negative representation of the significand. This implementation differs from IEEE 754, where the significand can be positive or negative. This operation is performed by adding a sign bit to the 24-bit significand and inverting these bits when the sign is equal to 1 (negative number).
指數在-126到126之間的值檢查。如果大於126,則將其視為無窮大;如果小於-126,則將其視為非正規化數(小於-126)並轉換為零(圖5c)。Check for values with exponent between -126 and 126. If it is greater than 126, it is treated as infinity; if it is less than -126, it is treated as a denormalized number (less than -126) and converted to zero (Fig. 5c).
在某些實現中,這一階段管線的最終暫存器包含正規化浮點乘積,所述乘積具有5位元指數和34位元2的補數有效數,(含有乘積的重複符號,並且沒有隱含的1)和三個指數狀態位元。In some implementations, the final scratchpad at this stage of the pipeline contains a normalized floating-point product with a 5-bit exponent and a 34-
圖5B說明了最終部分乘積加法、有效數選擇和基底8轉換子方塊592a的範例性示意圖。暫存器450(圖4a)包含PL[6:0]、C7、Sp[14:7]和Cp[14:6]的欄位。暫存器470(圖4b)包含10位元乘積指數(Ep)值的欄位。暫存器504包括狀態位元。FIG. 5B illustrates an exemplary schematic diagram of the final partial product addition, significand selection, and base-8
在管線以FP32加法模式運行的情況下,運算元A為FP32格式並繞過乘法器。在這種情況下,運算元A源自暫存器460,佔用兩個組合的16位元暫存器420和421。線511上的add_op控制訊號指示何時將管線模式設置為加法(本例中為單精確度浮點)或累加。In cases where the pipeline runs in FP32 add mode, operand A is in FP32 format and the multiplier is bypassed. In this case, operand A originates from
有效數最終加法器電路502在線503上接收Sp[14:7]、線501上的Cp[14:6]和線507上的進位位元C7作為輸入,將溢位訊號519輸出到溢位選擇電路506。溢位選擇電路506具有輸入匯流排523,其是線509上的PL[6:0]和線521上的有效數最終加法器電路502輸出的組合。反或閘522具有線525上的輸入指數溢位位元及線468上的零強制位元並輸出線527上的訊號。溢位選擇電路506輸出的線527和匯流排529上的訊號路由到及閘544,其在指數溢位的情況下以及在有效數被強制為零的情況下將有效數設置為全零。此外,有效數選擇電路512使用線511上的add_op控制訊號在匯流排515上的旁路有效數Fa[22:0]或匯流排553上的及閘544輸出之間進行選擇。Significant
指數選擇電路510在8指數位元、線517上的Ep[7:0]或線513上的旁路指數位元Ea[30:23]位元之間進行選擇,並將線533上的選定指數輸出到暫存器520的E_mult欄位。符號位元選擇電路508接收在線473上的Sp符號位元(圖4b)和旁路符號位元Sa作為輸入,並將符號位元531輸出到暫存器520中的S_mult欄位。The
線511上的add_op控制訊號路由到有效數選擇電路512、指數選擇電路510和符號位元選擇電路508作為它們的控制輸入。The add_op control signal on
有效數選擇電路512的輸出進入8位元左移位器電路514。來自指數選擇電路510的線533的較低三位元[2:0]在線533上輸出以控制8位元左移位器電路514。8位元左移位器電路514的輸出匯流排537饋入多工器電路518,所述多工器電路518在線537上的輸入(如果有效數為正的情況)和線539上的輸入(有效數為負的情況)之間進行選擇。這由符號位元531選擇。2的補數反相加1電路516在線537上建立移位器輸出的2的補數,並在線539上輸出補數值。線541上的多工器電路518的輸出以34位元F_mult有效數進入管線暫存器520。所述程序將選定的有效數轉換為2的補數表示的有效數,所述有效數為32位元長,具有2個符號位元,儲存在管線暫存器520中。The output of significand
圖5C說明了指數異常處理子方塊592b的方塊圖。如參考圖5b所述,有效數最終加法器電路502從暫存器450接收輸入Sp[14:7]503、Cp[14:6]501和進位位元C7 507。有效數最終加法器電路502的溢位輸出連接到指數異常處理電路524。在檢測到溢位情況時,有效數最終加法器電路502判定溢位訊號519作為指數異常處理電路524的第一輸入。指數異常處理電路524的第二輸入是來自匯流排517上的暫存器470的指數位元Ep[9:0](輸入運算元指數的和)。第三輸入是線523上的指數溢位檢測電路522的輸出訊號。接著將指數異常處理電路524的輸出在線549上輸入到指數異常檢測電路526和指數選擇電路510(參照圖5b描述)。Figure 5C illustrates a block diagram of the exponential exception handling sub-block 592b. Significant
匯流排517上的指數位元Ep[9:0]被輸入到指數溢位檢測電路522,其檢測溢位狀況:
exp_ovf=Ec[8],:表示如果位元8為1,則檢測到指數溢位,
exp_povf=~Ec[9]&Ec[8]:如果位元9為0,而位元8為1;正溢位,
exp_novf=Ec[9] & Ec[8]:如果位元9和位元8都為1;負溢位。
電路522的線523上的第一輸出路由到指數異常處理電路524,線543上的第二輸出路由到輸出異常控制訊號產生電路528,第三輸出包括到圖5B中的閘522的線525上的指數溢位位元。
The exponent bits Ep[9:0] on the
指數異常檢測電路526例如透過匯流排547向暫存器532輸出包括以下三位元的異常:of(溢位);uf(欠位);以及nv(無效)。
檢測到以下情況時會發生這種情況:
of(溢位)-意思是如果Ec為11111111,並且未檢測到無窮大,則將其解釋為溢位。
uf(欠位)-意思是如果Ec是00000000並且零(有效數)未發出訊號,則為欠位情況。
nv(無效)「1」-意思是結果無效。
The index
輸出異常控制訊號產生電路528具有四個輸入。第一輸入是線511上的add_op控制訊號,其指示累加或旁路加法模式,第二輸入是線509上的狀態位元(無窮大、零或無效),線545上的第三輸入從在暫存器460的Ea[30:23]位元或指數異常處理電路524的輸出之間多工的指數選擇電路510路由,而第四輸入來自指數溢位檢測電路522的線543上的第二輸出。輸出異常控制訊號產生電路528在線551上輸出五個位元,表示儲存到暫存器530中的exp_mul_zero、exp_mul_inf、exp_zero_en、exp_inf_en和f_zero_en。
exp_mul_zero 含義:乘數乘積指數為零,
exp_mul_inf 含義:乘數乘積指數為無窮大
exp_zero_en 含義:當(乘數輸入指數之一為零,且乘數輸入指數均不為零)或乘數乘積指數有負溢位時致能,
exp_inf_en 含義:當乘數輸入指數之一為無限大或乘數乘積指數有正溢位時致能
f_zero_en 含義:當exp_zero_en訊號致能或乘數乘積指數溢位(正或負)或當乘數乘積指數為零時致能。
進位保留累加單元 The output abnormality control
圖6說明了用於有效數的進位保留累加器(例如圖2的240)的方塊圖600。基底8轉換器215接收運算元C作為輸入,並在線219上以基底8格式輸出運算元C到多工器210和多工器211。多工器210和211的兩個額外的輸入是從累加器總和暫存器242和累加器進位暫存器241反饋的匯流排224和225。多工器210和多工器211的輸出路由到移位器電路609和610,移位器電路609和610執行右移8/16/24位元或左移8位元。移位器電路609和610的輸出路由到進位保留加法器電路(CSA)614。進位保留加法器電路614具有來自8/16/24電路608的右移電路的第三輸入,其輸入是A*B(BF16)或A(FP32)運算元單獨的乘積602。線667和669上的進位保留加法器電路614的輸出路由到向S位元暫存器636提供輸出的LZA電路606,並且路由到向O位元暫存器634提供輸出的溢位檢測方塊605。FIG. 6 illustrates a block diagram 600 of a carry-save accumulator (eg, 240 of FIG. 2 ) for a significand. Base-8
進位保留累加單元包括在周期(i)的第一管線時脈接收項A(i)*B(i)的乘法器輸出有效數和表示總和值S(i-1)的先前累加器輸出的反饋總和及進位值的有效數電路。有效數電路包括2的補數、進位保留加法器,用以在第二管線時脈上為總和S(i)產生總和及進位累加器輸出有效數值。進位保留累加單元包括在第一管線時脈接收項A(i)*B(i)的乘法器輸出指數以及表示總和值S(i-1)的先前累加器輸出的反饋指數值的指數電路,以針對總和值S(i)在第二管線時脈上產生累加器輸出指數值。有效數電路包括有效數移位器,響應於儲存在第一管線時脈處的指數比較訊號,以對齊乘法器輸出有效數和用於加法的反饋總和及進位值。指數電路響應於儲存在第一管線時脈處的指數比較訊號,以產生累加器輸出指數值。管線包括指數比較電路,以在第一管線時脈之前將項A(i)*B(i)的乘法器輸出指數與總和S(i-1)的反饋指數值進行比較,以產生儲存在第一管線時脈處的指數比較訊號。The carry-save-accumulate unit includes the significand of the multiplier output receiving the term A(i)*B(i) at the first pipeline clock of cycle (i) and the feedback representing the previous accumulator output of the sum value S(i-1) Significant circuit for sum and carry values. The significand circuit includes a 2's complement, carry-save adder for generating a sum and a carry accumulator to output a significand value for the sum S(i) at the second pipeline clock. a carry save accumulator unit comprising an exponent circuit receiving at a first pipeline clock the multiplier output exponent of the term A(i)*B(i) and the feedback exponent value representing the previous accumulator output of the sum value S(i-1), The accumulator output index value is generated on the second pipeline clock for the sum value S(i). The significand circuit includes a significand shifter responsive to an exponent comparison signal stored at the first pipeline clock to align the multiplier output significand and a feedback sum and carry value for addition. The exponential circuit generates an accumulator output exponential value in response to the exponential comparison signal stored at the first pipeline clock. The pipeline includes exponent comparison circuitry to compare the multiplier output exponent of term A(i)*B(i) with the feedback exponent value of sum S(i-1) prior to the first pipeline clock to generate the value stored at An exponential comparison signal at a pipeline clock.
本實施例中的進位保留累加單元包括溢位檢測器電路,以在第一管線時脈處為反饋的總和及進位值中的至少一個產生指示溢位狀況的第一狀況訊號,以及前導符號位元檢測器電路,以產生指示反饋的總和及進位值中的至少一個在第一管線時脈處具有多於或等於數量8的擴展符號位元的第二狀況訊號。指數電路和有效數電路也響應於第一狀況訊號和第二狀況訊號。溢位和前導符號位元調整以及指數比較調整被組合以由移位器在同一管線週期中實現,如參考下面的表格1:CSA單元控制所述。The carry-save-accumulate unit in this embodiment includes an overflow detector circuit to generate a first condition signal indicating an overflow condition for at least one of the sum and the carry value fed back at the first pipeline clock, and a leading sign bit and a meta detector circuit to generate a second condition signal indicating that at least one of the fed back sum and carry values has more than or equal to a number of eight extended sign bits at the first pipeline clock. The exponent circuit and the significand circuit are also responsive to the first condition signal and the second condition signal. Overflow and leading sign bit adjustment and exponent compare adjustment are combined to be implemented by the shifter in the same pipeline cycle as described with reference to Table 1: CSA unit control below.
此外,管線的這一級具有累加器模式及加總模式,並包括選擇器,以在累加器模式下提供反饋的累加器輸出,並在加總模式下將第三浮點輸入運算元提供給有效數電路及指數電路。有效數電路可以包括有效數移位器,響應於儲存在第一管線時脈處的指數比較訊號,以在累加器模式中對齊乘法器輸出有效數及反饋的總和及進位值以進行加法,並在加總模式中對齊乘法器輸出有效數和第三輸入運算元的有效數以進行加法。指數電路響應於儲存在第一管線時脈的指數比較訊號,以產生累加器輸出指數值。管線包括指數比較電路,以在累加器模式下在第一管線時脈之前將乘法器輸出指數與反饋的指數值進行比較,並在加總模式下將乘法器輸出指數與第三輸入運算元的指數進行比較,以產生儲存於第一管線時脈的指數比較訊號。 有效數電路: Additionally, this stage of the pipeline features accumulator mode and sum mode and includes selectors to provide the fed back accumulator output in accumulator mode and a third floating-point input operand in sum mode to an active Number circuit and index circuit. The significand circuit may include a significand shifter responsive to the exponent compare signal stored at the first pipeline clock to align the sum and carry values of the multiplier output significand and feedback for addition in accumulator mode, and Aligns the multiplier output significand and the significand of the third input operand for addition in sum mode. The exponential circuit generates an accumulator output exponential value in response to the exponential comparison signal stored in the first pipeline clock. The pipeline includes exponent compare circuitry to compare the multiplier output exponent with the fed back exponent value in accumulator mode prior to the first pipeline clock, and to compare the multiplier output exponent with the value of the third input operand in sum mode The indices are compared to generate an index comparison signal stored in the first pipeline clock. Significant number circuit:
進位保留加法器(CSA)有效數階段有兩條路徑:累加器路徑,其中來自累加器的運算元可以向右移位8、16或24位元,並且可以向左移位8位元,以及乘法器路徑,其中來自乘法器的運算元可以向右移位8、16或24位元。當使用基底8指數時,右移8、16或24位元對應於運算元之間的指數差1、2或3。當進位保留加法器輸出符號擴展超過8位元的數字時,完成8位元左移。The carry-save adder (CSA) significand stage has two paths: the accumulator path, where operands from the accumulator can be shifted right by 8, 16, or 24 bits, and can be shifted left by 8 bits, and Multiplier path, where operands from the multiplier can be shifted right by 8, 16, or 24 bits. When using base-8 exponents, a right shift of 8, 16, or 24 bits corresponds to an exponent difference of 1, 2, or 3 between the operands. An 8-bit left shift is done when the carry-save adder outputs a sign-extended number beyond 8 bits.
如果運算元指數之間的差大於3,則意味著其中一個運算元向右移位超過了24位元,這使運算元向右對齊太遠,無法在較大運算元的範圍內。這種情況相當於將更大的運算元加零,或者使用旁路多工器將更大的運算元不變地簡單傳遞給累加器(圖7c)。If the difference between the operand exponents is greater than 3, it means that one of the operands has been shifted to the right by more than 24 bits, which makes the operand too far right aligned to be within the bounds of the larger operand. This case is equivalent to adding zero to the larger operand, or using a bypass multiplexer to simply pass the larger operand unchanged to the accumulator (Figure 7c).
此實現透過在指數差大於3時將CSA與零相加來消除旁路多工器,相當於繞過運算元。CSA的輸入來自乘法器和累加器,並由及閘來閘控。移位器和指數控制單元檢測到這種情況並將適當的運算元設置為零。這種實現方式在每條路徑中節省了一個多工器階段。This implementation eliminates the bypass multiplexer by adding CSA to zero when the exponent difference is greater than 3, equivalent to bypassing the operand. The input to the CSA comes from the multiplier and accumulator, and is gated by an AND gate. The shifter and exponent control unit detects this condition and sets the appropriate operands to zero. This implementation saves a multiplexer stage in each path.
符號擴展的檢測發生在3:2進位保留加法器階段之後。如果檢測到這種情況,則設置符號擴展位元S和溢位位元O,並在隨後的管線時脈中進行處理。為了不因溢位而丟失符號位元,計算時攜帶重複符號。引入了額外的複雜度以提高精確度。這涉及將累加器擴展到36或40位元。在另一種實現中,引入檢測邏輯改進了時序和準確性。檢測邏輯從三個輸入683、685、689獲取進位保留加法器電路(CSA)614的輸入,而不是進位保留加法器電路(CSA)614的兩個輸出,並且是另一個相關揭露的標的。
指數電路: The detection of sign extension occurs after the 3:2 carry-save adder stage. If this condition is detected, the sign extension bit S and the overflow bit O are set and processed on subsequent pipeline clocks. In order not to lose the sign bit due to overflow, the calculation carries repeated signs. Additional complexity is introduced to improve accuracy. This involves extending the accumulator to 36 or 40 bits. In another implementation, the introduction of detection logic improves timing and accuracy. The detection logic takes the inputs of the carry-save adder circuit (CSA) 614 from three
「指數控制單元」比較來自乘法器的第一指數運算元和來自累加器的第二指數運算元之間的指數差。指數控制單元檢查由比較乘法器和累加器指數生成的狀況,並根據表1選擇運算元路徑。同時,確定新的累加器指數並將其儲存到指數累加器(Eacc)暫存器654(圖7b)。An "exponent control unit" compares the exponent difference between the first exponent operand from the multiplier and the second exponent operand from the accumulator. The index control unit checks the conditions generated by comparing the multiplier and accumulator indices and selects the operand path according to Table 1. At the same time, a new accumulator index is determined and stored in the index accumulator (Eacc) register 654 (FIG. 7b).
指數部分有兩個分支:左分支和右分支。左分支由輸入671和673(進入或閘)組成,選擇兩個指數中較大的一個,接著成為結果指數。根據表1選擇此狀況。由輸入675和677(進入指數輸出或閘)組成的右分支將根據表1中描述的狀況選擇Ea+1或Ea-1。如果發出有效數溢位訊號,累加器有效數應向右移位8位元(SHR_8),並且指數加1。The exponential part has two branches: the left branch and the right branch. The left branch consists of
在CS加法期間執行溢位(O)檢測。如果檢測到溢位,則O位元被鎖存到輸出管線暫存器中。溢位情況將在下一個週期根據表1進行校正。 CS-累加的實現 Overflow (O) detection is performed during CS addition. If an overflow is detected, the O bit is latched into the output pipeline register. The overflow condition will be corrected according to Table 1 in the next cycle. Implementation of CS-cumulative
指數路徑和有效數路徑的功能是相互依賴的,其取決於有效數部分產生的指數和「符號擴展」(SE)和「溢位」(O)訊號的狀態。有兩個累加器,一個用於進位,另一個用於總和。它們使用3:2進位保留加法器(CSA)與乘積相加,並穿透過兩條單獨的路徑,一條用於進位,另一條用於總和。The functions of the exponent path and the significand path are interdependent, depending on the exponent generated by the significand part and the state of the "sign extend" (SE) and "overflow" (O) signals. There are two accumulators, one for carry and one for sum. They are added to the product using a 3:2 carry-save adder (CSA) and threaded through two separate paths, one for the carry and the other for the sum.
管線階段的目標暫存器是一個累加器,其包含進位及總及(兩個暫存器)。在接下來的管線階段(管線4和管線5)中執行到傳統格式的轉換。進位保留階段可以是時序關鍵階段。因此,特別注意本節中描述的指導設計決策的時序和區域。所述管線階段的關鍵路徑包含:指數控制、三個2:1多工器、一個5位元減法器、一個5位元減法器和比較單元,在指數部分,而在有效數部分,指數控制、5位元遞增器、3:2進位保留加法器(CSA)和一個及閘。關鍵路徑可以穿過指數路徑和有效數路徑,就像本設計中的情況一樣。 累加器設計 The target register of the pipeline stage is an accumulator containing the carry and sum (two registers). The conversion to legacy format is performed in the next pipeline stages (Pipeline 4 and Pipeline 5). The carry save phase can be a timing critical phase. Therefore, pay special attention to the timing and regions described in this section to guide design decisions. The critical path of the pipeline stage contains: exponent control, three 2:1 multiplexers, a 5-bit subtractor, a 5-bit subtractor and compare unit, in the exponent part, and in the significand part, the exponent control , 5-bit incrementer, 3:2 carry-save adder (CSA) and an AND gate. The critical path can traverse both the exponential path and the significand path, as is the case in this design. accumulator design
圖7A說明了累加器240的簡化方塊圖610,其包括三個電路方塊:指數控制單元240A、指數比較器單元240B和有效數部分240C。7A illustrates a simplified block diagram 610 of
圖7B說明了指數控制單元240A和指數比較器單元240B的範例性階層式方塊圖和示意圖。移位器指數控制訊號產生/旁路控制電路630接收來自:accum_ld、exp_zero_en、f_zero_en、e_cin_zero、551、csa_ovf位元634 O及Signext的輸入,其為S位元636以及16位元乘法器指數比較電路652的輸出,其存入16位元狀況暫存器650,十六個指數比較位元*:
*其中:
emult: 是乘積指數
eaccu: 是累加器指數
emmp 含義:emult大3以上
eamp 含義:eaccu大4以上
還有其它控制訊號:
accum_ld -含義:累加器接收輸入C值。
exp_zero_en-含義:將乘積指數設置為零。
f_zero_en
-含義:如果Exponent=0,則將乘積有效數設置為0(因為不允許非正規)
e_cin_zero
-含義:輸入C指數等於零
FIG. 7B illustrates an exemplary hierarchical block diagram and schematic diagram of the
移位器指數控制訊號產生/旁路控制電路630的輸出是控制訊號:線638上的累加移位器控制、線636上的累加旁路控制、線634上的乘法器旁路控制、以及線632上的乘法器移位器控制、線646上的Ea_sel、線642上的Ea1m_sel、線648上的Em_sel、線645上的Ea1p_sel。The outputs of the shifter index control signal generation/
比較電路652比較來自以下的兩個運算元的指數:(1)線521上的乘法器指數E_mult及線679上的累加器指數;(2)或輸入A(在旁路模式下來自線521上的指數E_mult)及線679上的累加器指數;(3)或輸入A(在旁路模式下來自線521上的指數E_mult)及輸入C(來自線460上的指數Ec)。比較電路652產生儲存在16位元狀況暫存器650中的以下狀況位元:emult:乘法器指數;eaccu:累加器指數;z_diff - emult與eaccu相同;mgrt - emult大於eaccu;agrt - eaccu大於emult;em1p - emult大1;em2p - emult大2;em3p - emult大3;emmp - emult大3;ea1p - eaccu大1;ea2p - eaccu大2;ea3p - eaccu大3;ea4p - eaccu大4;eamp - eaccu大4以上;emz - emult為零;eaz - eaccu為零;eminf - emult是無窮大;並且eainf - eaccu是無窮大。16位元狀況暫存器650透過匯流排621與移位元指數控制訊號產生/旁路控制電路630介面。16位元狀況暫存器650儲存來自總和S(i-1)的Eacc的比較結果,並且在為總和S(i)產生Eacc期間,E_mult暫存器520在累加模式中儲存項A(i)*B(i)。
比較電路652在線647上的輸入來自減法器電路646。減法器電路646在線521上接收來自管線暫存器520的E_mult和多工器642的輸出,多工器642在暫存器460的Ec和或閘670的線679上的新指數輸出之間進行選擇,其中多工器642線665上指示模式的accum_en訊號控制。(圖7b)The input to compare
線618上的Exp_Zero_En被施加到反相器619,其輸出被用作及閘617的輸入。來自管線暫存器520的E_mult指數位元也被輸入到及閘617,其在線681上的輸出饋送到及閘668,其中線648上的Em_sal位元來自移位器指數控制訊號產生/旁路控制電路630以穿過或阻擋E_mult。及閘668的線671上的輸出連接到四輸入或閘670。或閘670具有三個其它輸入,包括及閘615的輸出,其由訊號Ea_sel選擇以穿過或阻擋Eaccum,而線616上遞增器660的輸出和線663上的遞減器661的輸出各自由輸出Ea1p_sel和Ea1m_sel在及閘664和665處分別控制。根據選擇訊號(其中只有一個可以是1)、648、646、645、642,選擇適當的指數作為或閘670的輸出。這個輸出是Eacc訊號,也稱為新指數,其為指數累加器(Eacc)暫存器654的輸入,也是多工器642的輸入。Exp_Zero_En on
多工器665的輸出(「總和S(i)的新指數,或運算元C的指數,取決於模式)在此實施例中也記錄在暫存器460中,所述暫存器460在線644上連接為遞增器660和遞減器661的輸入。The output of the multiplexer 665 ("the new index of the sum S(i), or the index of the operand C, depending on the mode) is also recorded in this embodiment in the
因此,當線679上的新指數使用總和S(i-1)產生的比較位元表示總和S(i)時,新指數將在暫存器520中的項A(i-1)*B(i-1)的E-mult值進行比較,以產生將與總和S(i)一起鎖存的比較訊號,並用於在產生總和S(i+1)期間的移位器控制。Thus, when the new index on
圖7C是有效數部分240C的示意圖。說明了移位器指數控制訊號產生/旁路控制電路630,其顯示四個輸出控制訊號。第一控制訊號是在線638上的累加移位器控制,其為移位器電路SHR8/16/24/SHL8 609和610的選擇訊號。兩個移位器電路:SHR8/16/24/SHL8 609和610在線682上和線683上接收來自一組多工器210和211的輸入。線665上的accum_en訊號控制多工器210和多工器211,以在線224上的總和、線225上的進位或線219上的值之間進行選擇,源自暫存器Fcin 560或邏輯「0」作為多工器211的另一個輸入。多工器210和多工器211將匯流排682和匯流排683上的選定值輸出到移位器電路SHR8/16/24/SHL8 609和610。移位器電路609和610在匯流排692和693上輸出移位值。匯流排692可以直接介接到匯流排613或可以遍歷可選的進位捨入方塊604,所述進位捨入方塊604將捨入位元額添加到匯流排613。匯流排693可以直接介接到匯流排611或可以遍歷可選的總和捨入方塊612,並將捨入位元添加到693。匯流排611和613是及閘687和688的輸入,其輸出在線689和685上用作進位保留加法器電路614的輸入(AND符號表示匯流排上每條訊號線的及閘的多樣性:613、611和607。及閘688和及閘686的輸入由線633上及線634上的控制訊號分別選擇。FIG. 7C is a schematic diagram of the
管線暫存器520中的F_mult值被輸入到簡單乘積捨入方塊684,其在線603上直接輸入到移位暫存器SHR 8/16/24電路608。SHR 8/16/24電路608的選擇訊號是線632上的多移位器控制,其在線601上的F_mult輸入和線603上的捨入乘積之間進行選擇。輸出是匯流排607上包含42位元(34+8)的乘積,其施加於及閘686的輸入,其輸出是進位保留加法器電路614的輸入。The F_mult value in
42位元3:2進位保留加法器電路614具有三個輸入,包括及閘686的輸出683、及閘687的輸出689和及閘688的輸出685。42位元3:2進位保留加法器電路614的輸出是兩條匯流排:總和匯流排669和進位匯流排667。兩個輸出669和667分別透過匯流排669進入42位元小數總和暫存器242,而透過匯流排667進入42位元小數進位暫存器241。匯流排669和匯流排667也是溢位檢測方塊605和符號擴展檢測單元662的輸入。溢位檢測方塊605和符號擴展檢測單元662這兩個方塊向O位元634提供輸出,即csa_ovf訊號及是符號擴展訊號的S-bit 636。符號擴展檢測單元662具有在線665上的致能位元accum_en訊號,當運算為累加時,所述訊號被設置為邏輯「1」。符號擴展檢測模組僅在致能「accum_en」訊號時才可運算。
共有三種運算模式可供選擇,其為:
輸入A(BF16)x輸入B(BF16)+輸入C(FP32),
輸入A(BF16)x輸入B(BF16)+累加迴圈(總和),
輸入A(FP32)+輸入C(FP32)。
「accum_en」訊號僅在第二模式狀況(累加)期間致能。在加法模式中,不需要符號擴展檢測。僅在累加模式下才需要,因為符號擴展位元的逐漸增長只能在累加運算期間發生。
符號擴展檢測單元 662 : The 42-bit 3:2 carry-save
根據一些態樣,符號擴展檢測單元662附接到總和及進位兩者的累加器輸出。當檢測到10位元符號(包括兩個符號位元,以及總和或進位的第一個位元組中的額外8位元)時,輸出將在下一個週期(SHL_8)中左移以保持運算元的準確性。如果不進行符號擴展檢測,則在正常運算期間,運算元的有效數位元會逐漸向右移位,直到被擴展符號位元替換,從而致使精確度損失。在所述實現中,每次運算元之一檢測到至少10個前導符號位元時,調整執行將運算元左移8位元位置。透過將指數值減一來相應地調整指數,這在同一周期中執行。當在累加器的進位或總和部分檢測到S時,S位元被鎖存在輸出管線暫存器636中,用於在下一個週期進行校正。校正運算將累加器向左移位8位元位置(SHL_8)。有時這種情況可能會在下一個動作(需要SHR_8)時自行取消,通常保持不變,如表1所示。
正規化和轉換為符號數值格式 According to some aspects, sign
圖8A說明了正規化轉換到符號數值格式方塊270,其包含兩個子方塊,第一子方塊是從進位保留轉換到符號數值格式方塊270a,第二子方塊是從基底8轉換到基底2浮點數方塊270b。FIG. 8A illustrates a normalized conversion to signed
圖8B說明了從進位保留轉換到符號數值格式方塊270a的範例性示意圖。兩個暫存器,42位元小數總和暫存器242和42位元小數進位暫存器241輸出移位進位[42:0]匯流排704及符號擴展總和[42:0]匯流排702作為43位元加法器電路708的輸入。第二電路LZA/LOA 710接收輸入匯流排702和匯流排704。第二電路LZA/LOA 710在線711上輸出兩條匯流排POS_P[5:0],而在線712上輸出POS_N[5:0]到第三LZA POS選擇電路714。LZA POS選擇電路714的輸出,例如,透過匯流排715,是POS[5:0],其映射到暫存器730作為6位元位置,指定正規化有效數所需的向左移位數量。FIG. 8B illustrates an exemplary schematic diagram of the convert from carry-save to signed
43位元加法器電路708在線719上輸出訊號SIGN以控制LZA POS選擇電路714,將匯流排716路由到「0」分支輸入上的有效數選擇多工器電路720,並將匯流排716路由到負的輸入:反相加1電路718。2的有效數選擇多工器電路720的「1」分支接收匯流排717,其表示負有效數,轉換為正一。符號719控制有效數選擇多工器,使得輸出738總是包含正有效數。2的補數選擇多工器電路720的輸出是匯流排738,其作為41位元正有效數映射到暫存器730。線706上的5位元指數直接映射到暫存器730以及符號位元726。此步驟完成了以進位保留格式表示的累加器有效數轉換到符號數值基底8格式。43-
在這個階段,總和匯流排702和進位匯流排704上的兩個值(表示進位保留格式的有效數)在43位元加法器電路708中相加,以產生有效數的符號數值格式。前導零/前導一預測器(第二LZA/LOA電路)710將計算兩個數字:前導零的數字711(在有效數716為正的情況下)和前導一的數字712(在有效數716為負的情況下)。根據有效數符號位元719,多工器LZA POS選擇電路714將選擇正確的位置並將其儲存到暫存器730中。LZ和LO位置POS_P和POS_N都為6位元長數字,預計包含32個前導零或一的情況。At this stage, the two values on
如果43位元加法器電路708的輸出的有效數為負,則所述負數被轉換為正數(因為IEEE 754使用符號數值表示,也就是說,正有效數)。為此目的,使用了2的補數轉換器718。符號位元719控制多工器電路720,因此如果數字為正,則將其直接儲存到41位元有效數暫存器730。如果輸出為負,則線717上的輸出,即716上的值轉換為正值,將在線738上傳遞到暫存器730。If the significand of the output of the 43-
有效數的預測6位元位置將被相加到5位元指數以產生符合標準浮點數表示的新8位元指數,而有效數將相對於浮點有效數對齊,使用相同的6位元預測位置(圖8c)。The predicted 6-bit position of the significand will be added to the 5-bit exponent to produce a new 8-bit exponent conforming to the standard floating-point representation, and the significand will be aligned relative to the floating-point significand, using the same 6-bit Predicted location (Fig. 8c).
圖8C說明了從基底8到基底2浮點數轉換的方塊270b的範例性示意圖。暫存器730透過41位元匯流排731連接到SHL左移位器電路735。暫存器730透過41位元匯流排734介接到有效數零檢測電路728。暫存器730的6位元位置欄位在線723上提供Pos[5:0]以控制SHL左移位器電路735。線723上的Pos[5:0]也是指數加法器電路740的輸入。暫存器730的指數5位元欄位是線721上到指數加法器電路740的第二輸入。指數加法器電路740調整(增加)由POS[5:0]指示的有效數向左移位的位置數量的指數,並在線736上提供輸出到暫存器748。然而,鑑於預測器可能在一個位置上出錯,移位器的輸出被傳遞到溢/欠檢測電路752,所述電路將透過發出訊號739來發出錯誤訊號,訊號739施加於指數加法器電路740的進位輸入且於欠檢測多工器760的控制輸入。欠檢測多工器760具有線746上的輸入,其中來自移位器電路735的線742上的有效數處於相同位置(未檢測到錯誤),以及線747上的輸入,其中在線742上的有效數被向左移位一位元(檢測到錯誤)。如果訊號739指示欠檢測,則正確的輸出將透過匯流排745鎖存到暫存器770。透過進位輸入將1輸入至加法器來校正指數調整中的錯誤。符號值從暫存器730複製到暫存器747。FIG. 8C illustrates an exemplary schematic diagram of
暫存器730中的符號位元透過線744傳遞到暫存器747。The sign bit in
異常控制(8位元)暫存器750在線724上將其值傳遞給異常控制暫存器751。異常控制暫存器位元的含義如下:
圖9A說明了執行700最終轉換為BF16或IEEE 754 32位元單精確度格式的方塊270,包括執行捨入及轉換為BF16或IEEE 754 32位元單精確度格式的符號數值格式方塊270a和執行指數和異常處理的子方塊270b。
捨入並轉換為 FP32 格式 FIG. 9A illustrates performing 700 the final conversion to BF16 or IEEE 754 32-bit single-
根據某些態樣,最後階段是管線6,其執行將結果捨入為標準浮點符號/數值數,其具有下列:符號位元、8位元指數、23位元、正規化有效數(+1隱含整數位元)。在將31位元有效數轉換為具有一隱含位元的24位元正規化有效數的期間,將結果從31位元捨入到24位元。在此實現中,實現了兩種捨入模式:向零捨入(RTZ)截斷、捨入到最近偶數(RNE)。然而,任何其它捨入模式,例如,捨入到最近奇數(RNO)都很容易被併入。According to certain aspects, the final stage is
根據一些態樣,捨入邏輯檢查來自暫存器770的39個有效數位元中的最後15個LSB位元(不計算構成總42位元(39+3個GRS位元)的GRS位元),並確定剩餘的24位元是否需要捨入(根據施加的規則:RNE或RTZ)。RNE所需的遞增器包含在捨入框中。從累加器(CSA)運算中攜帶的三個位元GRS在此實現中被忽略。它們可以併入到其它可能實現的最終捨入中。
捨入是透過以下幾種方式之一完成的:
(a) 在CS累加運算期間,施加捨入到最接近奇數(RNO),
- 針對總和訊號,僅在將捨入位元插入進位LSB打開位置的情況下,
- 分別針對每個總和訊號及進位訊號,
(b) 在CS累加期間,以及在管線6階段(最終捨入),以及
(c) 僅在管線6階段,而CSA被禁用。
根據特定應用程式施加的精確度和特定要求施加每種捨入模式。
According to some aspects, the rounding logic checks the last 15 LSB bits of the 39 significand bits from scratchpad 770 (not counting the GRS bits that make up the total 42 bits (39+3 GRS bits)) , and determine whether the remaining 24 bits need to be rounded (according to the rules imposed: RNE or RTZ). The incrementers required by the RNE are included in the rounding box. The three bits GRS carried from the accumulator (CSA) operation are ignored in this implementation. They can be incorporated into the final rounding of other possible implementations.
Rounding is done in one of several ways:
(a) apply round-to-nearest-odd (RNO) during CS accumulation operations,
- For sum signals, only if the round bit is inserted into the carry LSB on position,
- For each sum signal and carry signal respectively,
(b) during CS accumulation, and at the
根據需要,管線6的輸出是FP32或BF16。因此,有效數長度是24(23+隱含位元)或8位元(7+隱含位元)。這由施加到第一多工器的「Out_FP32」訊號控制。如果捨入致使25位元有效數,有效數將右移一個位置,指數將遞增1。The output of
正確正規化和捨入的結果儲存在管線6的輸出暫存器中,作為由1位元符號、8位元指數和7位元小數的有效數組成的BF16數,或由1位元符號、8位元指數和23位元小數的有效數組成的FP32數。The properly normalized and rounded result is stored in the output register of
圖9B說明了顯示捨入和轉換為BF16或IEEE 754 32位元SP格式的範例性示意圖。39位元有效數暫存器770匯流排fpst_l[38:0]837將fpst_l[32:0]819或fpst_l[16:0]819提供給包含保護位元、捨入位元和黏性位元的捨入電路830。控制Out_FP32選擇線819上將由捨入電路830捨入的有效數部分。在選擇32位元SP格式的情況下,較高24位元[38:16]增加3個捨入位元。多工器840在線835上的「0」輸入或線825上的捨入電路830輸出之間進行選擇,其中多工器840由捨入到零選擇線823控制。這種情況在指數超過-126並且有效數變為非正規時發生,在此實現中致使捨入為零。線827上的輸出將23位元(一個隱含的)和3捨入位元路由到第一捨入遞增電路860,從而在線819上以IEEE 754 SP 32位元格式適當地捨入有效數。FIG. 9B illustrates an exemplary schematic diagram showing rounding and conversion to BF16 or IEEE 754 32-bit SP format. The 39-
當選擇BF16輸出格式時,第二捨入遞增電路850用於捨入成BF-16格式。線817上的39位元有效數從fpst_l[38:0]817轉換到BF-16輸出在捨入遞增電路850中完成,得到7位元(隱含一個),增加了1個捨入決策位元,以及額外的16個零。這表示線821上的輸出處的有效數作為多工器802的輸入之一。When the BF16 output format is selected, the second
第一多工器802使用控制線訊號Out_FP32 801選擇FP-32或BF16輸出。當Out_FP32 801有效時,其在線805上輸出FP-32格式有效數。當Out_FP32 801控制訊號無效時,線805上的輸出是BF-16格式的有效數。第一多工器802的輸出是匯流排805,其分為匯流排807和匯流排809,進入第二多工器810。多工器810由線805上輸出匯流排的第24位元控制,線803上的frnd[23]訊號。如[0093]中所述,在捨入產生25位元有效數的情況下,第24位元將為1。在這種情況下,線803上的訊號frnd[23]位元選擇輸入匯流排807,其為匯流排805向右移位一位元位置。(frnd [23]訊號還將結果的指數遞增1,以調整右移。如果frnd [23]等於0,則不需要右移,並且匯流排805將透過選定的輸入匯流排809直接穿過線831。The
第三多工器(零)820在線829輸入上的全「0」或線831上的fnorm[22:0]之間進行選擇。如果輸出指數為無窮大或非正規,則輸出有效數被強制為零,這是透過線788上的控制訊號Zero_Sel完成,其選擇所有「0」輸入829。如果沒有例外,則正規化有效數匯流排831被路由為匯流排833,並映射到23位元有效數(IEEE 754)暫存器930。A third multiplexer (zero) 820 selects between all "0"s on
圖9C說明了顯示指數和異常處理方塊270b的範例性示意圖。8位元指數748提供EPST_L[9:0]964匯流排,並且如果frnd[23]=1則遞增1。這是透過將frnd[23]路由到遞增器982的進位位置來完成的。指數遞增器982的輸出是9位元Enorm[8:0]匯流排968,其為第一多工器(零)974的第一輸入。第一多工器(零)974的第二輸入是全「0」匯流排829。第一多工器(零)974的目的是將指數設置為全「0」,以防異常需要,由異常控制(8位元)暫存器750透過零控制邏輯970指示。FIG. 9C illustrates an exemplary schematic diagram showing index and
異常控制(8位元)暫存器750在以下八狀況953上運算:exp_inf_en、pos_zero、s_mult、s_cin、e_mul_zero、e_cin_zero、e_mul_zero和e_cin_inf。The exception control (8-bit)
零控制邏輯970具有三個輸入:線961上來自互斥或閘的sign_diff和線959上的e_cin_zero、線957上來自異常控制(8位元)暫存器750的e_mul_zero,以及線972上的輸出控制第一多工器(零)974。第一多工器(零)974的線975上的輸出穿過多工器976,提供儲存在指數暫存器980中的指數訊號匯流排979。如果無窮大控制962在線963上發出無窮大訊號,則線907上的全「1」輸入穿過多工器976,將所有指數位元設置為「1」,如IEEE 754標準所推薦的。
訊號的含義是:(在第089段中解釋)
異常控制(8位元)暫存器750在匯流排965上向符號生成和異常處理電路988以及欠位/溢位檢測和指數異常檢測電路986提供以下訊號:s_cin、s_mult、exp_inf_en、e_cin_inf、e_mul_inf、pos_zero、sign_diff。電路986的控制訊號是線969上的norm_en 758和線971上的小數零暫存器756。電路986的輸出是三個訊號ov(溢位)、uf(欠位)及(無效)nv。線991上的第四輸出從無窮大檢測電路992發送到符號生成和異常處理電路988指示溢位。Exception control (8-bit)
訊號具有以下縮寫,如下所示:s_cin(輸入C符號)、s_mult(乘積符號)、e_mul_zero(乘積指數零)、e_cin_zero(輸入C指數零)、e_mul_inf(乘積指數無窮大)和e_cin_inf(輸入C指數無窮大)。The signals have the following abbreviations, as follows: s_cin (enter C symbol), s_mult (multiply sign), e_mul_zero (multiply exponent zero), e_cin_zero (enter C exponent zero), e_mul_inf (multiply exponent infinite), and e_cin_inf (enter C exponent infinity).
兩個相關事件致使欠位。一個是在±2-126[其中-126是最小指數值]之間建立一個微小的非零結果,因為它非常小,稍後可能會致使一些其它異常,諸如除法時溢位。另一個事件是在對如此小的數字進行近似的期間出現了準確性的異常損失。當傳遞的結果與在指數範圍和精確度無界的情況下計算的結果不同時,可能會檢測到精確度損失。除了要求單精確度和雙精確度外,IEEE標準754不追蹤精確度。在此揭露的實現中,不使用「非正規」數,並且指數值為-126且有效數小於1的任何值將被轉換為零。透過將所有有效數設置為零並將指數值設置為零來表示零,這在我們揭露的實現中由異常處理電路處理。Two related events cause an underbit. One is to create a tiny non-zero result between ±2-126 [where -126 is the minimum exponent value], since it is so small that it might later cause some other anomaly, such as overflow in division. Another incident is the unusual loss of accuracy during approximations to such small numbers. Loss of precision may be detected when the passed result differs from the result computed with the exponent range and precision unbounded. IEEE Std 754 does not track precision other than requiring single and double precision. In this disclosed implementation, no "denormal" numbers are used, and any value with an exponent of -126 and a significand less than 1 will be converted to zero. Zero is represented by setting all significands to zero and the exponent value to zero, which is handled by the exception handling circuitry in our disclosed implementation.
符號生成和異常處理電路988透過匯流排965和無窮大檢測電路992接收來自異常控制(8位元)暫存器750的輸入。符號生成和異常處理電路988的輸出是一符號位元,其透過訊號線983被儲存到暫存器990。Symbol generation and
無窮大檢測電路992在指數訊號匯流排979輸入上運行,並且如果它檢測到所有指數位元為1,其將向或閘987提供「1」,或閘987又將其輸出788設置為「1」。這設置了ZERO-SEL訊號788,所述ZERO-SEL訊號788將有效數設置為全零(Mux 820,圖9B)。The
當指數訊號匯流排979上的指數值超出範圍時,非正規電路994檢測到這種情況,並在訊號線967上發出欠位狀況的訊號。這種情況也被發訊號給或閘987,其在線788上產生訊號ZERO-SEL。線788(圖9B)上的ZERO-SEL訊號將指示多工器820將所有「0」插入有效數,從而建立正確的IEEE 754「零」表示(指數和有效數都包含所有「0」)。
When the index value on
描述了,使用具有基底為8的指數的進位保留加法和累加的浮點乘加累加單元。這平衡了指數單位的關鍵時序與有效數單位的關鍵時序。此外,與浮點IEEE-754標準中提出的使用符號數值表示不同,使用2的補數系統來表示也具有運算元符號的正和負有效數。當指數相等以確定IEEE-754標準規定的兩者中的較大者時,這避免了不必要的有效數減法。引入2的補數表示需要適用於正數和負數兩者的新穎前導零(前導一)檢測器(預測器)。這同樣適用於溢位(OV)檢測。此外,有必要確定何時將進位及總和相加會致使長符號擴展(SE),這需要引入新穎的設計特徵。A floating-point multiply-accumulate unit using carry-save add and accumulate with base-8 exponents is described. This balances critical timing for exponential units with critical timing for significand units. Also, unlike the signed numerical representation proposed in the floating-point IEEE-754 standard, the 2's complement number system is used to represent positive and negative significands that also have operand signs. This avoids unnecessary significand subtraction when exponents are equal to determine the greater of the two as specified by the IEEE-754 standard. The introduction of 2's complement representation requires a novel leading zero (leading one) detector (predictor) suitable for both positive and negative numbers. The same applies to overflow (OV) detection. Furthermore, it is necessary to determine when adding the carry and sum results in a long sign extension (SE), which requires the introduction of novel design features.
描述了浮點乘加累加單元,支援乘法累加運算的BF16格式,以及符合IEEE 754標準的FP32單精確度加法。乘法累加單元使用更高的內部精確度和更長的累加器,透過將運算元轉換為更高基底和更長的內部2的補數有效數表示,以促進精確度以及與負數的比較和運算。使用進位保留格式執行加法,以避免長進位傳播並加快運算。2的補數和進位保留格式採用溢位檢測、零檢測和符號擴展等運算。溢位和符號擴展的處理允許相對獨立於累加器大小的快速運算。在不影響時間的累加運算中引入了適合機器學習的捨入,大大提高了計算的準確性。 異常處理 Describes the floating-point multiply-accumulate unit, supporting BF16 format for multiply-accumulate operations, and FP32 single-precision add in accordance with the IEEE 754 standard. The multiply-accumulate unit uses a higher internal precision and longer accumulator to facilitate precision and comparisons and operations with negative numbers by converting the operands to a higher base and longer internal 2's complement significand representation . Addition is performed using a carry-save format to avoid long carry propagation and speed up operations. The 2's complement and carry-save formats employ operations such as overflow detection, zero detection, and sign extension. Handling of overflow and sign extension allows fast operations relatively independent of the size of the accumulator. The rounding suitable for machine learning is introduced in the accumulation operation that does not affect the time, which greatly improves the accuracy of the calculation. exception handling
圖10至21與上述電路和方法中的異常處理有關。根據用於表示有效數和指數的特定編碼格式,浮點數可以採用特殊情況下的值,諸如:正無窮大或負無窮大、零和非正規數字。10 to 21 relate to exception handling in the circuits and methods described above. Depending on the particular encoding format used to represent the significand and exponent, floating-point numbers can take on special-case values such as: positive or negative infinity, zero, and denormal numbers.
圖10將方塊1010中所示的浮點數範圍1000說明為由許多項劃分為感興趣區域的水平數線。用語的定義在表1中定義。
浮點特殊數
FIG. 10 illustrates the range of floating
表1中顯示的以下列表包含三行。第一行列出了特殊浮點數的定義。第二行列出了BF16浮點編碼格式的值。第三行顯示浮點編碼FP32格式的值。The following list shown in Table 1 contains three rows. The first line lists the definitions of special floating-point numbers. The second line lists the values in BF16 floating point encoding format. The third line shows the value in floating point encoded FP32 format.
在表1的註釋欄中,用語(+)Nan包括BF16中的值7F81和FP32中的值7F800001,作為表示NaN的兩種約定。用語(+)範數(Norm)被列為(+)Pi,(3.14…)並且用語(-)範數被列為(-)Pi以用於測試目的。項(+)非範數及(-)非範數是最小的可表示值。(+)Zero及(-)Zero有兩種表示形式,區別在於最高有效符號位元。對於與符號位元有關的所有其它用語也是如此。 In the Note column of Table 1, the term (+)Nan includes the value 7F81 in BF16 and the value 7F800001 in FP32 as two conventions for expressing NaN. The term (+) norm (Norm) is listed as (+)Pi, (3.14...) and the term (-) norm is listed as (-)Pi for testing purposes. The terms (+) non-norm and (-) non-norm are the smallest representable values. (+)Zero and (-)Zero have two representations, the difference lies in the most significant sign bit. The same is true for all other terms related to sign bits.
BFloat16浮點格式,也稱為大腦浮點格式,(有時也稱為「BF16」)是一種16位元數字編碼格式。BF16保留了IEEE單精確度數的近似動態範圍。BF16格式包括一7位元小數(也稱為尾數或有效數)、一「隱含位元」或「隱藏位元」、一8位元指數和一符號位元。單精確度浮點值可以轉換為BF16以加速機器學習。動態範圍與使用8位元精確度而非24位元小數的單精確度FP32(8位元指數)相同。BFloat16可以降低記憶體需求、可以降低儲存需求、可以提高機器學習演算法的計算速度。BF16是32位元單精確度IEEE 754格式的截斷16位元版本,其意於加速機器學習。The BFloat16 floating-point format, also known as the brain floating-point format, (sometimes referred to as "BF16") is a 16-bit numeric encoding format. BF16 preserves the approximate dynamic range of IEEE single-degree precision. The BF16 format includes a 7-bit fraction (also known as a mantissa or significand), an "implicit bit" or "hidden bit", an 8-bit exponent, and a sign bit. Single precision floating point values can be converted to BF16 to speed up machine learning. Dynamic range is the same as single-precision FP32 (8-bit exponent) using 8-bit precision instead of 24-bit fractional. BFloat16 can reduce memory requirements, reduce storage requirements, and increase the calculation speed of machine learning algorithms. BF16 is a truncated 16-bit version of the 32-bit single-precision IEEE 754 format, which is intended to speed up machine learning.
第二種數字格式是IEEE 754單精確度32位元浮點(FP32)。IEEE 754單精確度32位元浮點包括一23位元小數、一「隱含」位元或「隱藏位元」、一8位元指數和一符號位元。The second number format is IEEE 754 single precision 32-bit floating point (FP32). IEEE 754 single-precision 32-bit floating point includes a 23-bit fraction, an "implicit" or "hidden bit," an 8-bit exponent, and a sign bit.
表2的以下內容列出了BFloat16用語及其數值定義。 The following content of Table 2 lists BFloat16 terms and their numerical definitions.
表3的以下內容列出了額外的BFloat16用語及其二進制格式的數字定義。正無窮大和負無窮大定義為當所有指數位元都等於一時且當所有小數位都等於零時。正負NaN(非數字)定義為當所有指數位元都等於一時且當並非所有小數位元都等於零時。正負非範數定義為當所有指數位元都等於零且當並非所有小數位元都等於零時。正無窮大或負無窮大、NaN或非範數取決於符號位元。 The following in Table 3 lists additional BFloat16 terms and their numerical definitions in binary format. Positive infinity and negative infinity are defined when all exponent bits are equal to one and when all scale bits are equal to zero. A positive or negative NaN (not a number) is defined as when all exponent bits are equal to one and when not all scale bits are equal to zero. Positive or negative non-norm is defined when all exponent bits are equal to zero and when not all scale bits are equal to zero. Positive or negative infinity, NaN or non-norm depending on the sign bit.
在一些實施例中,用於機器學習運算的異常處理單元不支援非正規或NaN運算。非正規數被視為零,而NaN數被視為無窮大。 異常 In some embodiments, the exception handling unit for machine learning operations does not support denormal or NaN operations. Denormal numbers are treated as zero, and NaN numbers are treated as infinity. abnormal
IEEE標準754-2019規範的第7章描述了下面列出的五類浮點異常。根據一個實施例,實現以下類別中的三個:(1)無效運算、(3)溢位及(4)欠位。本實施例不支援除以零及不精確。
1) 無效運算
2) 除以零
3) 溢位
4) 欠位
5) 不精確
根據一些其它實施例,實現了四個類別:(1)無效運算、(2)除以零、(3)溢位及(4)欠位。根據其它實施例,實現了所有五個類別。 無效運算 According to some other embodiments, four categories are implemented: (1) invalid operation, (2) division by zero, (3) overflow, and (4) underbit. According to other embodiments, all five categories are implemented. invalid operation
IEEE標準754-2019規範將以下內容描述為無效運算: a)對信令NaN的任何通用計算運算; b)乘法:乘法(0, ∞)或乘法(∞, 0); c)積和熔加(fusedMultiplyAdd):積和熔加(0, ∞, c)或積和熔加(∞, 0, c); d)加法或減法或積和熔加:無窮大的幅度減法,如加法(+∞, -∞); e)除法:除法(0, 0)或除法(∞, ∞); f)餘數:餘數(x, y),當y為零或x為無窮大,且兩者都不是NaN時; g)如果運算元為負數,則平方根;以及 h)當結果不適合目標格式或當一個運算元是有限的而另一個是無限的時,量化。 The IEEE Std 754-2019 specification describes the following as invalid operations: a) Any general calculation operation on signaling NaN; b) Multiplication: multiplication (0, ∞) or multiplication (∞, 0); c) fusedMultiplyAdd (fusedMultiplyAdd): fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c); d) Addition or subtraction or product and fusion addition: infinite amplitude subtraction, such as addition (+∞, -∞); e) Division: division (0, 0) or division (∞, ∞); f) Remainder: Remainder (x, y), when y is zero or x is infinite, and neither of them is NaN; g) if the operand is negative, the square root; and h) Quantization when the result does not fit in the target format or when one operand is finite and the other is infinite.
根據一個實施例,進位保留累加單元中的異常處理實現上面列出的無效運算a/b/c/d。根據另一實施例,類別(a)至(h)的任何組合都為可能的。 無效運算異常 According to one embodiment, exception handling in the carry-save-accumulate unit implements the invalid operations a/b/c/d listed above. According to another embodiment, any combination of categories (a) to (h) is possible. invalid operation exception
表4的以下內容列出了產生異常的無效運算。
表4
表5的以下內容列出了兩個運算元最大範數(最大範數)乘以最大範數的一個範例的溢位異常。這些溢位異常會產生結果「帶正負號無窮大(Signed Infinity)」。當結果大於帶正負號最大範數(Signed最大範數)並且僅當運算元的輸入上不存在無限大值時,才會發生此溢位異常。
表5
在一個實施例中,存在幾種情況,其中結果小於帶正負號最小範數。當運算元輸入上沒有精確的零值時,會發生此異常。當加法器範數/捨入:(+)範數+(-)範數發生時,結果是非正規(非正規)值(但不是精確零),但實際結果將是「帶正負號零」。參見表6,其中顯示兩個運算元最小範數(最小範數)乘以最小範數的範例。
表6
在一些實施例中,異常處理可以分為「異常旗標生成」及「異常結果生成」。例如在圖12A的浮點乘法器方塊1110和浮點進位保留加法器方塊1130上處理異常。
異常旗標生成
In some embodiments, exception handling can be divided into "exception flag generation" and "exception result generation". Exceptions are handled, for example, at floating
在一些實施例中,浮點乘法器異常旗標被提供用於:(1)溢位、(2)欠位,以及(3)無效。In some embodiments, floating point multiplier exception flags are provided for: (1) overflow, (2) underbit, and (3) invalid.
在一些實施例中,提供了以下浮點加法器異常旗標:(1)溢位,(2)欠位,以及(3)無效。 異常結果生成 In some embodiments, the following floating point adder exception flags are provided: (1) overflow, (2) underrun, and (3) invalid. abnormal result generation
異常處理的運算在IEEE標準754-2019第6章中進行了解釋。以下兩個表格總結了乘法和加法運算的一個實施例和實現。
乘法器運算
The operation of exception handling is explained in
根據一個實施例,表7的內容列出了具有無效、欠位和溢位運算的註釋的乘法運算。
表7
根據一個實施例,表8的加法器運算列表的內容具有無效及溢位運算的註釋。
表8
圖11說明了一個範例高階架構方塊圖1100,其描繪了用於機器學習的進位保留累加單元中的異常處理的元件。FIG. 11 illustrates an example high-level architectural block diagram 1100 depicting elements of exception handling in a carry-save-accumulate unit for machine learning.
在一個實施例中,進位保留累加單元設計中的異常處理包含三個不同的輸入訊號。這些是運算元A 1113、運算元B 1114和運算元C 1116。運算元A 1113可以是BF16和FP32格式。運算元B 1114是BF16格式,運算元C 1116是FP32格式。運算元也稱為輸入。In one embodiment, the exception handling in the carry-save-accumulation unit design involves three different input signals. These are operand A 1113 ,
乘法器異常處理方塊1102匯入運算元A 1113和運算元B 1114。乘法器異常處理方塊1102的輸出連接到以下:(1)經由匯流排1106到乘法器異常旗標1104,(2)經由匯流排乘法器異常狀況訊號1108到異常輸出控制訊號生成1126,及(3)乘法器異常結果1115。The multiplier
FP32格式的運算元C 1116進入運算元C基本轉換方塊1118並輸出到:(1)透過運算元C異常狀況訊號匯流排1120到異常輸出控制訊號生成1126,及(2)透過匯流排1122到進位保存加法器方塊1130。The
進位保留加法器(CSA)方塊1130透過匯流排1122處理兩個輸入:(1)乘法器異常結果1115,及(2)來自運算元C基本轉換方塊1118的輸出。在一個實施例中,CSA方塊1130具有累加器迴圈1124,其將僅在迴圈結束時輸出資料。CSA模組1130透過匯流排1132輸出。Carry save adder (CSA) block 1130 processes two inputs via bus 1122 : (1) the
加法器正規化異常處理方塊1134匯入兩個輸入。第一輸入是透過匯流排1132輸出的CSA方塊1130。第二輸入是來自異常輸出控制訊號生成1126方塊的異常控制1128。Adder normalization
加法器正規化異常處理方塊1134的輸出是加法器異常結果1139和路由到加法器異常旗標方塊1140的匯流排1138。The output of adder normalized
運算元A是32位元匯流排,並且為乘法運算提供BF16輸入,而為加法運算提供FP32輸入。運算元B僅用於乘法運算,始終具有BF16 16位元輸入格式。運算元C用於加法運算或累加器初始化,始終具有FP32 32位元輸入格式。乘法器部分具有分離的輸出旗標(溢位、欠位和無效),而乘法器異常結果成為加法器的直接輸入。加法器的異常狀況訊號由乘法器產生並連接到「異常控制訊號產生」方塊,並具有來自運算元C和累加器的異常狀況訊號,為加法器正規化方塊產生「異常控制」訊號以用於加法器異常處理。 運算模式 Operand A is a 32-bit bus and provides BF16 inputs for multiplication and FP32 inputs for addition. Operand B is used for multiplication only and always has the BF16 16-bit input format. Operand C is used for addition or accumulator initialization and always has FP32 32-bit input format. The multiplier section has separate output flags (overflow, underbit, and invalid), while multiplier exception results become direct inputs to the adder. The abnormal condition signal of the adder is generated by the multiplier and connected to the "exception control signal generation" block, and has the abnormal condition signal from the operand C and the accumulator, and generates the "exception control" signal for the normalization block of the adder for use in Adder exception handling. operation mode
根據一個實施例,進位保留累加單元設計中的異常處理支援三種不同的運算模式。According to one embodiment, the exception handling in the carry-save-accumulation unit design supports three different operation modes.
圖12A說明了根據圖11的高階方塊圖架構的第一運算模式1200。乘法累加運算顯示BF16格式的運算元A 1113,其首先與BF16格式的運算元B 1114相乘,接著乘積以FP32格式相加到運算元C 1116。乘加運算在單一運算中完成。此運算產生乘法和加法異常旗標和結果。FIG. 12A illustrates a first mode of
乘法器方塊1110匯入運算元A 1113和運算元B 1114。乘法器方塊1110的輸出連接到以下:(1)透過匯流排1106到乘法器異常旗標1104方塊,以及(2)連接乘法器異常結果1115到進位保留加法器(CSA)方塊1130。在一些實施例中,第一運算模式使用乘法器異常旗標1104以用於統計目的。The
進位保留加法器(CSA)方塊1130處理以下兩個輸入:(1)乘法器異常結果1115及(2)運算元C 1116。CSA方塊1130具有兩個輸出。第一輸出是加法器異常結果1139,其以BF16或FP32格式路由到輸出方塊1129。第二輸出是路由到加法器異常旗標方塊1140的匯流排1138。Carry save adder (CSA) block 1130 processes two inputs: (1)
圖12B說明了根據圖11的高階方塊圖架構的第二運算模式1202。乘法累加運算顯示為BF16格式的運算元A 1113與BF16中的運算元B 1114在單一運算中相乘。它在累加迴圈結束時產生輸出結果。在累加期間,加法器輸出和加法器異常被禁用。此運算產生乘法和加法異常旗標和結果。FIG. 12B illustrates a second mode of
乘法器方塊1110匯入運算元A 1113和運算元B 1114。乘法器方塊1110的輸出連接到以下方塊:(1)透過匯流排1106到乘法器異常旗標1104,及(2)透過乘法器異常結果1115到進位保留加法器(CSA)1130。在一些實施例中,第二運算模式使用乘數異常旗標1104以用於統計目的。The
進位保留加法器(CSA)方塊1130處理以下兩個輸入:(1)乘法器異常結果1115和累加器迴圈1124。CSA方塊1130具有兩個輸出。第一輸出是加法器異常結果1139,其僅在累加器迴圈1124結束時以BF16或FP32格式路由到輸出方塊1131。第二輸出是匯流排1138,其路由到加法器異常旗標方塊1140。Carry save adder (CSA) block 1130 processes the following two inputs: (1)
圖12C說明了根據圖11的高階方塊圖架構的第三運算模式1203。加法運算顯示為FP32格式的運算元A 1113與FP32格式的運算元C 1116相加。此運算僅產生相加異常旗標和結果。乘法器異常處理被禁用。FIG. 12C illustrates a third mode of
進位保留加法器(CSA)方塊1130將兩個輸入相加。FP32格式的運算元A 1113與FP32格式的運算元C 1116相加。CSA方塊1130具有兩個輸出。第一輸出是加法器異常結果1139,其以BF16或FP32格式路由到輸出方塊1129。第二輸出是路由到加法器異常旗標方塊1140的匯流排1138。
異常處理結構
A carry save adder (CSA)
根據一些實施例,異常處理分為「異常旗標生成」和「異常結果生成」,兩者也可以分為乘法器異常和加法器異常。According to some embodiments, exception handling is divided into "exception flag generation" and "exception result generation", which can also be divided into multiplier exception and adder exception.
乘法器及加法器旗標生成產生溢位、欠位和無效旗標。這些旗標在下面的應用中顯示為一組電路實現。Multiplier and adder flag generation generates overflow, underbit, and invalid flags. These flags are shown as a set of circuit implementations in the application below.
乘法器及加法器異常結果生成包括三狀況:(1)符號生成,(2)指數生成,以及(3)小數生成。所述符號具有正輸出或負輸出。當異常情況發生時,指數可以具有全「0」和全「1」狀況,而當不發生異常情況時,指數可以具有正規輸出。小數值可以有兩狀況;對於所有異常情況,其都為「0」,而對於非異常情況,其為正常的。The multiplier and adder abnormal result generation includes three states: (1) sign generation, (2) exponent generation, and (3) decimal generation. The symbols have positive or negative outputs. When abnormal conditions occur, the index can have all "0" and all "1" conditions, and when abnormal conditions do not occur, the index can have normal output. Fractional values can have two states; for all abnormal cases, it is "0", and for non-exceptional cases, it is normal.
圖13說明了在高階方塊圖中描繪的異常處理結構1300。浮點乘法累加器異常方塊1308將旗標輸出到異常旗標生成方塊1304並將結果輸出到異常結果生成方塊1312。旗標包含狀態資訊或用於由專用異常處理方塊處理的資料,如下文將進一步描述的。FIG. 13 illustrates an
異常旗標生成方塊1304可以將旗標輸出到乘法器異常旗標生成方塊1302或加法器異常旗標生成方塊1306。乘法器異常旗標生成方塊1302驅動乘法器溢位旗標狀況方塊1381、乘法器欠位旗標狀況方塊1382,以及乘法器無效旗標狀況方塊1383。Exception
加法器異常旗標生成方塊1306驅動加法器溢位旗標狀況方塊1387、加法器欠位旗標狀況方塊1388和加法器無效旗標狀況方塊1389。Adder exception
異常結果生成方塊1312將結果提供給乘法器異常結果生成方塊1310和加法器異常結果生成方塊1314。Exception
乘法器異常結果生成方塊1310將結果輸出到:(1)乘法器符號生成狀況方塊1320A,(2)乘法器指數生成狀況方塊1322A,以及(3)乘法小數生成狀況方塊1324A。Multiplier exception
加法器異常結果生成方塊1314將結果輸出到:(1)加法器符號生成狀況方塊1326A,(2)加法器指數生成狀況方塊1328A,以及(3)加法器小數生成狀況方塊1330A。Adder exception result
乘法器符號生成狀況方塊1320A將狀況輸出到方塊1320B和1320C。乘法器指數生成狀況方塊1322A將狀況輸出到方塊1322B、1322C和1322D。乘法小數生成狀況方塊1324A將狀況輸出到方塊1324B和1324C。The multiplier symbol
加法器符號生成狀況方塊1326A將狀況輸出到方塊1326B和1326C。加法器指數生成狀況方塊1328A將狀況輸出到方塊1328B、1328C和1328D。加法器小數生成狀況方塊1330A將狀況輸出到方塊1330B和1330C。Adder symbol
以下從圖14A到圖21B的圖說明了圖13中所示的高階方塊的示意性實現。例如,圖13中的方塊1381在圖14A中實現,方塊1382在圖14B中實現。下圖標題對應於圖13的方塊名稱。
狀況電路
The following diagrams from FIG. 14A to FIG. 21B illustrate a schematic implementation of the high-level blocks shown in FIG. 13 . For example,
圖14A描繪了乘法器溢位旗標狀況電路1381的一種實現1400。示意圖顯示了乘法器溢位旗標狀況1446在來自乘法器溢位及閘1444的高階輸出上有效。及閘1444具有以下三個輸入:(1)乘法運算致能1414,(2)反或閘1445的輸出,以及(3)1442,其為乘積指數及閘1440的輸出,也稱為Ep(指數積)和乘法器乘積指數。FIG. 14A depicts one
反或閘1445具有兩個輸入eainf和ebinf。如果輸入A為無窮大,則出現訊號指數A無窮大(eainf),並且在指數等於0xFF時被檢測到(意味著全「1」)。如果輸入B為無窮大,則出現訊號指數B無窮大(ebinf),並且在指數等於0xFF(表示全「1」)時被檢測到。上面顯示的表2定義了BFloat16用語及其數值定義。Eainf是及閘1420的輸出。及閘1420的輸入是運算元A的指數,並且被顯示具有最低有效數位元(LSB)1402和最高有效數位元(MSB)1404,包含八個指數位元。The NOR
Ebinf是及閘1430的輸出。及閘1430的輸入是運算元B的指數,並且被顯示具有最低有效數位元(LSB)1406和最高有效數位元(MSB)1408,包含八個指數位元。Ebinf is the output of AND
乘積指數及閘1440的輸入是最低有效數位元(LSB)1410和最高有效數位元(MSB)1412,包含八個指數位元。The inputs to the product
圖14B描繪了乘法器欠位旗標狀況電路1382的一種實現1400。乘法器欠位旗標狀況1482在來自乘法器欠位及閘1480的高階輸出上有效。及閘1480具有以下三個輸入:(1)乘法運算致能1418,(2)反或閘1475的輸出,及(3)1472,其為乘積指數反或閘1470的輸出。FIG. 14B depicts one
反或閘1475有兩個輸入,eaz(輸入A指數為零)和ebz(輸入B指數為零)。Eaz是反或閘1450的輸出。反或閘1450的輸入是運算元A的指數,並且被顯示為具有最低有效數位元(LSB)1422和最高有效數位元(MSB)1424,包含八個指數位元。Inverse OR
Ebz是反或閘1460的輸出。反或閘1460的輸入是運算元B的指數,並且被顯示為具有最低有效數位元(LSB)1426和最高有效數位元(MSB)1428,包含八個指數位元。Ebz is the output of the NOR
乘積指數反或閘1470的輸入是最低有效數位元(LSB)1432和最高有效數位元(MSB)1434,包含八個指數位元。The inputs to product exponent OR
當乘法運算被致能、Ep(乘積指數)為0x00,並且任何乘法器指數輸入不為零時,乘法器欠位旗標的判定發生。不為零的乘法器指數輸入意味著運算元A或運算元B的任何指數都不是0x00。The assertion of the multiplier underbit flag occurs when the multiplication operation is enabled, Ep (product exponent) is 0x00, and any multiplier exponent input is non-zero. A non-zero multiplier exponent input means that neither operand A nor operand B has any exponent other than 0x00.
乘法器無效旗標的判定根據以下兩狀況發生:(1)乘法運算被致能及(2)無效為「1」。無效的「1」定義為當運算元A的指數為無窮大(0xFF)且運算元B的指數為零(0x00)時或當運算元B的指數為無窮大(0xFF)且運算元A的指數為零(0x00)時的狀況。The determination of the multiplier invalid flag occurs according to the following two conditions: (1) the multiplication operation is enabled and (2) invalid is "1". Invalid "1" is defined as when the exponent of operand A is infinity (0xFF) and the exponent of operand B is zero (0x00) or when the exponent of operand B is infinity (0xFF) and the exponent of operand A is zero (0x00) status.
圖15描繪了方塊1383中乘法器無效旗標狀況電路的一種實現1500。乘法器無效旗標狀況1582在來自乘法器無效及閘1580的高階輸出上有效。及閘1580具有以下兩個輸入:(1)乘法運算致能1501,及(2)1572,或閘1570的輸出。FIG. 15 depicts one
或閘1570有兩個輸入,第一輸入是從及閘1550導出的1552:第二輸入是從及閘1560導出的1562。OR
及閘1550有兩個輸入:eainf和ebz。訊號ebz出現在輸入B為零時,即(+)零或(-)零,狀況是指數為0x00(表示全為「0」)。Eainf是及閘1510的輸出。及閘1510的輸入是運算元A的指數,並且顯示為具有最低有效數位元(LSB)1502和最高有效數位元(MSB)八位元1504。Ebz是反或閘1520的輸出。反或閘1520的輸入是運算元B的八位元指數,並且顯示為具有最低有效數位元(LSB)1506和最高有效數位元(MSB)1508。And
及閘1560有兩個輸入:ebinf和eaz。Ebinf是及閘1530的輸出。及閘1530的輸入是運算元B的八位元指數,並顯示為具有最低有效數位元(LSB)1506和最高有效數位元(MSB)1508。Eaz是反或閘1540的輸出。訊號eaz出現在輸入A為零時,即(+)零或(-)零),而指數為0x00(表示全「0」)。反或閘1540的輸入是運算元A的指數,並且被說明具有最低有效數位元(LSB)1502和最高有效數位元(MSB)八位元1504,包含八個指數位元。And
圖16描繪了乘法器符號生成狀況電路1320的一種實現1600。乘法器旗標狀況1632由及閘1630產生。圖15的乘法器無效旗標狀況682是及閘1630的第一輸入。及閘1630的第二輸入是從EX-或閘1620的輸出導出的1632。符號A 1618和符號B 1622包含EX-或閘1620的輸入。FIG. 16 depicts one
圖17A顯示了乘法器指數生成狀況電路1322的一種實現1700。乘法器指數1752根據三狀況生成。它們是:(1)全「0」(0x00=零),(2)全「1」(0xFF=無窮大),及(3)正規指數。FIG. 17A shows an
第一狀況是指數全「0」。如果運算元A的任何指數或運算元B的任何指數為零,則可能出現第一狀況。如果乘法器指數計算結果為負溢位,也可能出現第一狀況,這意味著乘數指數計算結果小於-126。The first situation is that the index is all "0". The first condition may arise if any exponent of operand A or any exponent of operand B is zero. The first condition can also occur if the multiplier exponent evaluates to a negative overflow, which means that the multiplier exponent evaluates to less than -126.
第二狀況是全「1」。如果運算元A的任何指數或運算元B的任何指數為無窮大,則可能出現第二狀況。如果乘法器指數計算結果為正溢位,也可能出現第二狀況,即乘法器指數計算結果大於Emax(+127)。The second condition is all "1". The second condition may arise if any exponent of operand A or any exponent of operand B is infinity. If the calculation result of the multiplier exponent is a positive overflow, the second situation may also occur, that is, the calculation result of the multiplier exponent is greater than Emax (+127).
第三狀況是正規輸出,定義為沒有第一狀況或第二狀況正在發生。這被定義為除全「0」或全「1」以外的狀況。The third condition is a normal output, defined as neither the first condition nor the second condition is occurring. This is defined as anything other than all '0's or all '1's.
八位元寬的多工器1750具有三個8位元匯流排作為輸入。這些是全「0」、全「1」和其它匯流排。使用或閘1710、及閘1720、或閘1730和反或閘1740的控制閘控制了多工器1750的輸出。Eight-bit wide multiplexer 1750 has three 8-bit buses as inputs. These are all '0's, all '1's, and other buses. The output of multiplexer 1750 is controlled using the control gates of
圖17B顯示了乘法器小數生成狀況電路1324的一種實現1700。乘法器小數匯流排1772是16位元寬並且根據兩狀況產生:(1)異常,(2)正規小數。兩個多工器輸入是:(1)全「0」狀況(0x00=零),及(2)正規小數狀況。One
多工器1770有兩個16位元輸入匯流排。這些是全「0」和小數匯流排。或閘1760控制1770多工16位元寬多工輸出的閘控。Multiplexer 1770 has two 16-bit input buses. These are all "0" and decimal buses. The
訊號exp_overflow和ezero是或閘1760的兩個輸入。訊號exp_overflow是正溢位或負溢位的邏輯或,其中溢位在上面的圖17A中進行了描述。在一些實施例中,在指數計算之後提供一個狀態位元以檢測指數溢位。用語ezero定義為eaz(輸入A指數為零)或ebz(輸入B指數為零)。或閘1760的輸出是1762,其為在兩個16位元寬多工器1770輸入匯流排之間進行選擇的控制。全「0」發生在以下情況:(1)乘積指數為正溢位,或(2)存在負溢位,或(3)運算元A的指數為零,或(4)運算元B的指數為零。Signals exp_overflow and zero are two inputs of
在一些實施例中,進位保留累加單元中的異常處理不支援非正規或NaN運算。在這種情況下,非正規數被視為零,而NaN數被視為無窮大。當任何這些異常發生時,小數輸出將為全「0」。否則,多工器1770的輸出是正規化小數。In some embodiments, the exception handling in the carry save accumulation unit does not support denormal or NaN operations. In this case, denormal numbers are treated as zero, and NaN numbers are treated as infinity. When any of these exceptions occur, the fractional output will be all "0". Otherwise, the output of multiplexer 1770 is a normalized decimal.
圖18A說明了方塊1387中的加法器溢位旗標狀況電路的一個示意性實現1800。名為加法器溢位1832的訊號演示了溢位旗標狀況並且是及閘1830的輸出。當及閘1830具有以下三個輸入時,發生加法器溢位旗標狀況:(1)正規化_致能,(2)及閘1810輸出命名為溢位,及(3)反或閘1820輸出命名為非輸入指數無窮大1822。及閘1810的輸入是八個正規化指數位元。反或閘1820的輸入是乘積指數無窮大和輸入C指數無窮大。總之,在正規化期間,如果正規化指數等於0xFF並且沒有具有指數無窮大狀況的輸入,則可能發生加法器溢位1832旗標狀況。FIG. 18A illustrates an
圖18B顯示了方塊1388中加法器欠位旗標狀況電路的一種實現1800。加法器欠位1892在具有以下三個輸入的及閘1890高階輸出上有效:(1)正規化_致能,(2)及閘1850輸出命名為欠位,及(3)反或閘1880輸出命名為非輸入精確零1882。及閘1850的輸入是八個正規化指數位元及小數輸出零。總之,當正規化指數為0x00(等於0)且輸入不精確為零時,通常會在正規化期間發生加法器欠位旗標情況。以下段落將描述可以檢查非輸入精確零1882狀況的電路的一種實現。FIG. 18B shows an
進一步描述圖18B的電路,非輸入精確零訊號1882的生成是透過反或閘1880的三個輸入。這三個輸入是:(1)乘法器乘積指數零,(2)輸入C指數零,以及(3)輸出精確零1872,其源自及閘1870的輸出。及閘1870具有兩個輸入。第一輸入是名為sign_diff的EX-或1860閘輸出。EX-或1860閘在兩個輸入(乘法器乘積符號和輸入C符號)上運行。及閘1870的第二輸入是(+)零,其中(+)零是輸入C指數大於或等於乘法器乘積指數與小數出零訊號邏輯上AND。Further describing the circuit of FIG. 18B , the non-input exact-zero
加法器欠位是以下狀況的組合,如電路所示。第一狀況;(1)是最終正規化時致能。第二狀況;(2)是最終加法指數結果時,即正規化指數等於0x00並且最終加法小數結果(小數輸出零訊號)不完全為零時。第三狀況;(3)是乘法器乘積指數零、輸入C指數零和輸出非精確零均未致能時,這意味著其中一個為零。如果三狀況中的任何一個為零,則不會發生加法器欠位。An adder underbit is a combination of the following conditions, as shown in the circuit. The first condition; (1) is enabled at the time of final normalization. The second condition; (2) is when the final addition exponent result, that is, when the normalized exponent is equal to 0x00 and the final addition decimal result (decimal output zero signal) is not completely zero. The third condition; (3) is when the multiplier product index zero, input C index zero and output non-exact zero are all disabled, which means one of them is zero. If any of the three conditions is zero, no adder underbit occurs.
圖19A顯示了方塊1389中的加法器無效旗標狀況電路的一種實現1900。參考表8,當將(+)無窮大及(-)無窮大相加時會出現一種無效狀況。將(+)無限大及(+)無限大相加或將(-)無限大及(-)無限大相加不會致使無效旗標狀況1389。所述電路檢查具有相反符號的無限大的兩個運算元。加法器無效1932旗標是及閘1930的輸出。及閘1930具有以下三個輸入:(1)正規化致能,(2)名為1912的或閘1910輸出,及(3)來自EX-或閘1920的sign_diff輸出。One
加法器無效或閘1910的兩個輸入是乘法器乘積指數無窮大和輸入C指數無窮大。EX-或閘1920對兩個輸入、乘法器乘積符號和輸入C符號進行運算。The two inputs to the adder inactive OR
總結電路功能,當(1)致能最終正規化,(2)如果來自輸入A(或乘法器乘積)和輸入C的輸入符號不同(正及負,或負及正)時,以及(3)兩個輸入的指數是無窮大(0xFF),則會出現加法器無效旗標狀況。 加法器符號生成電路 To summarize the circuit function, when (1) final normalization is enabled, (2) if the inputs from input A (or multiplier product) and input C are of different signs (positive and negative, or negative and positive), and (3) If the exponents of both inputs are infinity (0xFF), an adder invalid flag condition occurs. Adder symbol generation circuit
圖19B顯示了用於生成加法器符號正輸出的加法器符號正狀況電路810A的一種實現1900。如前述,異常結果生成具有三個部分:符號生成;指數生成;以及小數生成。產生的符號輸出可以是正的或負的。符號輸出的一種實現具有兩個電路,其結合起來為異常結果生成提供正確的符號輸出。第一電路是加法器符號正狀況電路810A。當加法器符號正1992位元等於「1」時,加法器符號輸出具有正狀況(+)。Figure 19B shows one
符號輸出函數根據以下相加將正狀況強制為零:(1)(+)Zero+(-)Zero,(2)(+)非範數+(-)非範數(反之亦然),(3)符號不同且指數輸入之一為無窮大((+)Inf+(-)Inf反之亦然),(4)符號不同且乘法器乘積為正且指數為無窮大((+)零x(+)Inf=(+)Inf),(5)符號不同且運算元C指數大於運算元A(或乘法器乘積)指數,(6)等式和小數輸出為零,以及(7)當乘法器乘積符號和輸入C符號均為正時。The sign output function forces the positive case to zero according to the addition of: (1) (+) Zero + (-) Zero, (2) (+) non-norm + (-) non-norm (and vice versa), (3 ) are different signs and one of the exponent inputs is infinity ((+)Inf+(-)Inf and vice versa), (4) are different signs and the multiplier product is positive and the exponent is infinity ((+)zero x(+)Inf = (+)Inf), (5) the sign is different and the operand C exponent is greater than the operand A (or multiplier product) exponent, (6) the equation and decimal output is zero, and (7) when the multiplier product sign and input The C symbols are all positive.
圖19B中顯示用於生成強制為「0」的符號位元以用於正狀況的第一電路的一種實現,其中正號是訊號加法器符號正1992,其為及閘1990的輸出。及閘1990有兩個輸入。第一輸入是名為1982的或閘1980輸出,第二輸入是sign_diff,其來自EX-或閘1960。EX-或閘1960閘對兩個輸入進行運算,乘法器乘積符號及輸入C符號。或閘1980有四個輸入。這些是欠位1967、1972以及加零及小數輸出零。One implementation of the first circuit for generating a sign bit forced to "0" for a positive condition is shown in FIG. And
及閘1950的輸出稱為欠位,其為反或閘1940輸出1942與AND小數輸出零的組合。反或閘1940的輸入是8位元正規化指數向量。或閘1965的兩個輸入是乘法器乘積指數無窮大和輸入C指數無窮大。及閘1970具有第一輸入乘法器指數無窮大致能,而第二輸入是反相閘1968的輸出,其輸入是乘法器乘積符號。The output of the AND
圖20A描繪了用於產生加法符號負(-)狀況輸出的加法符號負狀況電路方塊1326C的一種實現2000。這是用於生成如上所述的符號輸出的第二電路。符號輸出包括兩個電路,其結合起來產生正(+)狀況或負(-)狀況。當加法器符號負狀況2042位元等於「1」時,加法符號負狀況電路810B生成負(-)狀況。首先將第二電路描述為示意圖實現,並在下面的幾段中總結了所述功能。FIG. 20A depicts one
圖20A中顯示針對負(-)狀況產生強制為1'的符號輸出位元的電路的一種實現。或閘2040輸出加法器符號負狀況2042。或閘2040的兩個輸入是及閘2020輸出2022及及閘2030輸出2032。當及閘2020被判定時,可能出現第一加法符號負狀況。及閘2020對輸入乘法器乘積符號和輸入C指數進行運算,如果兩個符號都為負,則判定輸出。One implementation of a circuit that produces a sign output bit that is forced to 1' for a negative (-) condition is shown in FIG. 20A. OR
及閘2030的判定可能會發生第二加法器符號負狀況。這將在以下及閘2030的輸入被判定時發生:乘法器指數無窮大致能、乘法器乘積符號和2012。反及閘2010的輸出是2012由乘法器乘積指數無窮大及以運算元C符號作為輸入的反相閘2005輸出致能。The decision of the AND
總結加法器符號負狀況,根據以下狀況,符號輸出對於負狀況(-)被強制為邏輯「1」:(1)當運算元A(或乘法器乘積)和運算元C符號均為「1」時(均為負數);(2)或當運算元A(或乘法器乘積)為負無窮大且運算元C不是正無窮大時。 加法指數電路 Summarizing the adder signed negative condition, the signed output is forced to a logic "1" for the negative condition (-) according to the following conditions: (1) When operand A (or multiplier product) and operand C are both signed "1" (both negative numbers); (2) or when the operand A (or multiplier product) is negative infinity and the operand C is not positive infinity. Adding Exponent Circuit
加法器指數輸出是使用三狀況生成的。第一狀況是全「0」(0x00=零)。第二狀況是全「1」,其中0xFF將等於無窮大狀況。第三狀況是正規指數。The adder exponential output is generated using three conditions. The first condition is all "0"s (0x00=zero). The second condition is all "1", where 0xFF will equal the infinity condition. The third state is the regular index.
圖20B說明了在全「0」狀況電路方塊1328B中的加法器指數生成的一種實現2000。加法器指數全「0」選擇2082是具有三個輸入的或閘2080的輸出。每個輸入代表單獨的狀況。FIG. 20B illustrates an
當乘法器乘積指數零或輸入C指數零進入或閘2050時,產生第一加法器指數全「0」狀況。所述狀況判定2052以觸發加法器指數全「0」選擇2082。When the multiplier product index zero or the input C index zero enters the
當及閘2060具有(+)零和sign_diff輸入時,產生第二加法器指數全「0」狀況,其中(+)零是輸入C指數大於或等於乘法器乘積指數與小數輸出零訊號邏輯上AND。EX-或閘2070對乘法器乘積符號及輸入C符號進行運算以生成sign_diff。當及閘2060被判定時,2062可運算以觸發加法器指數全「0」選擇2082。When AND
第三加法器指數全「0」狀況由輸入到或閘2080的小數輸出零生成以觸發加法器指數全「0」選擇2082。The third adder exponent all "0" condition is generated by the fractional output zero input to OR
圖21A說明了用於所有「1」狀況電路方塊1328B的加法器指數生成的一種示意性實現2100。具有三個輸入狀況的或(OR)輸出函數確定全「1」輸出狀況,並在以下段落中進行描述。狀況電路811B將加法器指數全「1」選擇2122顯示為包含三個輸入的或閘2120的輸出。每個輸入代表單獨的狀況。FIG. 21A illustrates an
當乘法器乘積指數無窮大或輸入C指數無窮大是或閘2110的有效輸入時,生成第一個全「1」狀況。所述狀況判定2112。The first all "1" condition is generated when the multiplier product exponent infinity or input C exponent infinity is a valid input to OR
當加法器正規化指數[8]的第8位元出現溢位結果時,生成第二個全「1」狀況。A second all "1" condition is generated when the 8th bit of the normalized exponent [8] of the adder has an overflow result.
第三個全「1」狀況由乘法器指數無窮大致能生成,其中乘法器輸出指數為無窮大或具有正溢位。The third all "1" condition can be generated approximately by the multiplier exponent being infinite, where the multiplier output exponent is infinite or has a positive overflow.
總而言之,當輸入A(或乘法器乘積)或輸入C的任何指數為無窮大,或最終加法器正規化指數溢位,或乘法器輸出指數為無窮大或正溢位((+)零x(+)Inf=(+)Inf),生成全「1」狀況。In summary, when either the input A (or the multiplier product) or any exponent of the input C is infinity, or the final adder normalized exponent overflows, or the multiplier output exponent is infinity or positive overflow ((+) zero x(+) Inf=(+)Inf), generate all "1" situations.
圖21B顯示了加法器小數生成狀況電路1330A的一種示意性實現2100。所述電路根據控制多工器的三個選擇器狀況路由實際正規化小數值或23位元零。強制前23位元匯流排為零的三個狀況是:(1)任何正規化指數溢位;(2)正規化指數欠位;或(3)或乘法器輸出指數為無窮大或正溢位否則,這種情況是正常的,並且包含23位元正規化小數匯流排的第二匯流排路由到加法器小數2132輸出。FIG. 21B shows an
電路812A的示意圖顯示加法器小數2132輸出為由多工器2150提供的23位元匯流排。或閘2130輸出選擇器控制2131以在多工器2150的兩個23位元輸入匯流排之間進行選擇。多工器2150具有兩個23位元匯流排作為輸入。第一輸入匯流排是23個全零位元匯流排。第二輸入匯流排是23位元正規化小數匯流排。或閘2130根據上面段落中描述的三個輸入狀況將控制2131輸出到多工器2150。The schematic diagram of circuit 812A shows the adder fractional 2132 output as a 23-bit bus provided by
加法器小數2132的輸出有兩狀況,即全「0」(0x00=零)和正規小數。在一個實施例中,進位保留累加單元中的異常處理不支援非正規或NaN運算,將非正規數視為零,而將NaN數視為無窮大。當任何這些異常發生時,小數輸出將為全「0」。The output of the adder decimal 2132 has two states, that is, all "0" (0x00=zero) and normal decimal. In one embodiment, the exception handling in the carry save accumulation unit does not support denormal or NaN operations, treating denormal numbers as zero and NaN numbers as infinity. When any of these exceptions occur, the fractional output will be all "0".
當發生溢位或欠位異常,或乘法器輸出指數為無窮大或正溢位((+)零乘以(+)Inf=(+)Inf)時,出現全「0」。 乘加器運算案例考量 When an overflow or underbit exception occurs, or the output exponent of the multiplier is infinity or a positive overflow ((+) zero multiplied by (+)Inf=(+)Inf), all "0"s appear. Multiplier Adder Operation Case Considerations
在下一段中,x是乘法運算元,而+是總和運算元,=是等於運算元。
雖然本發明是透過參考上面詳述的各種實施例和範例來揭露的,但是應當理解,這些範例意於說明性的而不是限制性的。預期本領域技術人員將容易想到修改和組合,這些修改和組合將在本發明的精神和所附請求的範圍內。While the invention has been disclosed by reference to various embodiments and examples detailed above, it should be understood that these examples are intended to be illustrative and not restrictive. It is expected that those skilled in the art will readily devise modifications and combinations which will be within the spirit of the invention and scope of the appended claims.
110:Bfloat16 130:IEEE754單精確度32位元浮點(FP32) 202:乘法器電路 210a:乘法器及加法器方塊 210b:指數方塊 210:多工器 211:多工器 212:多工器 213:運算元A 214:運算元B 215:基底8轉換器 216:運算元C;線 217:BF16格式或FP32格式 218:線 219:線 220:方塊 221:線 222:雙匯流排 223:匯流排 224:輸出匯流排 225:輸出匯流排 226:匯流排 227:匯流排 228:匯流排 229:匯流排 230:進位保留加法器 240:累加器 240A:指數控制單元 240B:指數比較器單元 240C:有效數部分 241:42位元小數進位暫存器 242:42位元小數總和暫存器 250:進位保留到符號數值轉換方塊 251:匯流排 252:匯流排 260:基底8到基底2轉換和正規化方塊 270:FP32或BF16方塊 300:簡化方塊圖 410:8X8 BF16乘法器電路 420:暫存器 421:暫存器 422:線 423:線 424:線 425:線 426:線 427:線 428:7位元LSB匯流排 429:7位元LSB匯流排 430:7位元漣波進位加法器 440:線 450:暫存器 460:暫存器 461:反相器 462:線 464:指數加法器電路 465:線 466:10位元值 467:特殊指數檢測方塊 468:線 469:線 470:暫存器 471a:反互斥或閘 471b:反互斥或閘 471c:反互斥或閘 473:線 501:線 502:有效數最終加法器電路 503:線 504:暫存器 506:溢位選擇電路 507:線 508:符號位元選擇電路 509:線 510:指數選擇電路 511:線 512:有效數選擇電路 513:線 514:8位元左移位器電路 515:匯流排 516:2的補數反相加1電路 517:匯流排 518:多工器電路 519:溢位訊號 520:暫存器 521:線 522:指數溢位檢測電路 523:輸入匯流排;線 524:指數異常處理電路 525:線 526:指數異常檢測電路 527:線 528:輸出異常控制訊號產生電路 529:匯流排 530:暫存器 531:符號位元 532:暫存器 533:線 537:輸出匯流排;線 539:線 541:線 543:線 544:及閘 545:線 547:匯流排 549:線 551:線 553:匯流排 560:暫存器Fcin 592:基底8轉換器方塊 592a:最終加法有效數選擇和基底8轉換子方塊 592b:指數異常處理子方塊 602:乘積 604:進位捨入方塊 605:溢位檢測方塊 606:LZA電路 607:匯流排 608:移位暫存器SHR8/16/24電路 609:移位器電路 610:移位器電路 611:匯流排 612:總和捨入方塊 613:匯流排 614:進位保留加法器電路 617:及閘 619:反相器 630:移位器指數控制訊號產生/旁路控制電路 633:線 634:線 636:輸出管線暫存器 644:線 646:線 647:線 648:線 650:16位元狀況暫存器 652:16位元乘法器指數比較電路 654:指數累加器(Eacc)暫存器 660:遞增器 661:遞減器 662:符號擴展檢測單元 664:及閘 665:及閘 667:進位匯流排 668:及閘 669:總和匯流排 670:或閘 671:線;輸入 673:輸入 675:輸入 677:輸入 679:線 681:線 682:線 683:線 684:簡單乘積捨入方塊 685:輸出 686:及閘 687:及閘 688:及閘 689:輸出 692:匯流排 693:匯流排 700:最終轉換 270:正規化轉換到符號數值格式方塊 270a:從進位保留轉換到符號數值格式方塊 270b:從基底8轉換到基底2浮點數方塊 702:匯流排 704:匯流排 706:線 708:43位元加法器電路 710:第二電路LZA/LOA 711:線 712:線 714:LZA POS選擇電路 715:匯流排 716:匯流排 717:匯流排 718:反相加1電路 719:線 720:有效數選擇多工器電路 726:符號位元 728:有效數零檢測電路 730:暫存器 735:SHL左移位器電路 736:線 738:匯流排 739:訊號 740:指數加法器電路 742:線 744:線 745:匯流排 746:線 747:線 748:暫存器 750:異常控制(8位元)暫存器 751:異常控制暫存器 752:溢/欠檢測電路 753:線 755:線 756:小數零暫存器 758:norm_en 760:欠檢測多工器 770:39位元有效數 788:線 801:控制線訊號Out_FP32 802:第一多工器 803:線 805:匯流排 807:匯流排 809:匯流排 810A:加法器符號正狀況電路 811B:狀況電路 812A:電路 817:線 819:線 820:多工器 821:線 823:捨入到零選擇線 825:線 827:線 829:線 830:捨入電路 831:線 833:匯流排 835:線 837:39位元有效數暫存器770匯流排fpst_l[38:0] 840:多工器 850:第二捨入遞增電路 860:第一捨入遞增電路 907:線 930:23位元有效數(IEEE 754)暫存器 953:八狀況 957:線 959:線 961:線 962:無窮大控制 963:線 964:EPST_L[9:0]匯流排 965:匯流排 967:訊號線 968:9位元Enorm[8:0]匯流排 969:線 970:零控制邏輯 971:線 972:線 974:第一多工器(零) 975:線 976:多工器 979:指數訊號匯流排 980:指數暫存器 982:遞增器 983:訊號線 986:欠位/溢位檢測和指數異常檢測電路 987:或閘 988:符號生成和異常處理電路 990:暫存器 991:線 992:無窮大檢測電路 994:非正規電路 1000:浮點數範圍 1010:方塊 1100:範例高階架構方塊圖 1102:乘法器異常處理方塊 1104:乘法器異常旗標 1106:匯流排 1108:匯流排乘法器異常狀況訊號 1110:乘法器方塊 1113:運算元A 1114:運算元B 1115:乘法器異常結果 1116:運算元C 1118:運算元C基本轉換方塊 1120:運算元C異常狀況訊號匯流排 1122:匯流排 1124:累加器迴圈 1126:異常輸出控制訊號生成方塊 1128:異常控制 1129:輸出方塊 1130:進位保留加法器(CSA)方塊 1131:輸出方塊 1132:匯流排 1134:加法器正規化異常處理方塊 1138:匯流排 1139:加法器異常結果 1140:加法器異常旗標方塊 1300:異常處理結構 1302:乘法器異常旗標生成方塊 1304:異常旗標生成方塊 1306:加法器異常旗標生成方塊 1308:浮點乘法累加器異常方塊 1310:乘法器異常結果生成方塊 1312:異常結果生成方塊 1314:加法器異常結果生成方塊 1320A:乘法器符號生成狀況方塊 1320B:方塊 1320C:方塊 1322A:乘法器指數生成狀況方塊 1322B:方塊 1322C:方塊 1322D:方塊 1324A:乘法小數生成狀況方塊 1324B:方塊 1324C:方塊 1326A:加法器符號生成狀況方塊 1326B:方塊 1326C:方塊 1328A:加法器指數生成狀況方塊 1328B:方塊 1328C:方塊 1328D:方塊 1330A:加法器小數生成狀況方塊 1330B:方塊 1330C:方塊 1381:乘法器溢位旗標狀況方塊 1382:乘法器欠位旗標狀況方塊 1383:乘法器無效旗標狀況方塊 1387:加法器溢位旗標狀況方塊 1388:加法器欠位旗標狀況方塊 1389:加法器無效旗標狀況方塊 1400:實現 1402:最低有效數位元(LSB) 1404:最高有效數位元(MSB) 1406:最低有效數位元(LSB) 1408:最高有效數位元(MSB) 1410:最低有效數位元(LSB) 1412:最高有效數位元(MSB) 1414:乘法運算致能 1418:乘法運算致能 1420:及閘 1430:及閘 1440:乘積指數及閘 1442:輸出 1444:及閘 1445:反或閘 1446:乘法器溢位旗標狀況 1450:反或閘 1460:反或閘 1470:乘積指數反或閘 1472:輸出 1475:反或閘 1480:乘法器欠位及閘 1482:乘法器欠位旗標狀況 1500:實現 1501:乘法運算致能 1502:最低有效數位元(LSB) 1504:最高有效數位元(MSB)八位元 1506:最低有效數位元(LSB) 1508:最高有效數位元(MSB) 1510:及閘 1520:反或閘 1530:及閘 1540:反或閘 1550:及閘 1560:及閘 1570:或閘 1580:乘法器無效及閘 1552:第一輸入 1562:第二輸入 1572:輸出 1582:乘法器無效旗標狀況 1600:實現 1618:符號A 1620:EX-或閘 1622:符號B 1630:及閘 1632:第二輸入 1700:實現 1710:或閘 1720:及閘 1730:或閘 1740:反或閘 1750:多工器 1760:或閘 1770:多工器 1752:乘法器指數 1762:輸出 1772:乘法器小數匯流排 1800:示意性實現 1810:及閘 1820:反或閘 1830:及閘 1822:非輸入指數無窮大 1832:加法器溢位 1850:及閘 1860:EX-或閘 1870:及閘 1880:反或閘 1890:及閘 1872:輸出精確零 1882:非輸入精確零 1892:加法器欠位 1900:實現 1910:或閘 1920:EX-或閘 1930:及閘 1912:輸出 1932:加法器無效旗標 1940:反或閘 1942:輸出 1950:及閘 1960:EX-或閘 1965:或閘 1967:欠位 1968:反相閘 1970:及閘 1972:欠位 1980:或閘 1990:及閘 1982:輸出 1992:訊號加法器符號正位元 2000:實現 2005:反相閘 2010:反及閘 2020:及閘 2030:及閘 2040:或閘 2050:或閘 2060:及閘 2070:EX-或閘 2080:或閘 2012:輸出 2022:輸出 2032:輸出 2042:加法器符號負狀況位元 2052:判定 2082:加法器指數全「0」選擇 2100:示意性實現 2110:或閘 2120:或閘 2112:判定 2122:加法器指數全「1」選擇 2130:或閘 2131:選擇器控制 2132:加法器小數輸出 2150:多工器 110:Bfloat16 130: IEEE754 single-precision 32-bit floating point (FP32) 202: Multiplier circuit 210a: Multiplier and adder blocks 210b: Index block 210: multiplexer 211: multiplexer 212: multiplexer 213: Operand A 214:Operator B 215: base 8 converter 216: Operator C; line 217: BF16 format or FP32 format 218: line 219: line 220: block 221: line 222: double busbar 223: busbar 224: output bus 225: output bus 226: busbar 227: busbar 228: busbar 229: busbar 230: Carry save adder 240: accumulator 240A: Index control unit 240B: Exponent comparator unit 240C: Significant part 241: 42-bit decimal carry register 242: 42-bit fractional sum register 250: Carry reserved to sign value conversion box 251: busbar 252: busbar 260:Base-8 to base-2 conversion and normalizing squares 270: FP32 or BF16 block 300:Simplified Block Diagram 410:8X8 BF16 multiplier circuit 420: scratchpad 421: scratchpad 422: line 423: line 424: line 425: line 426: line 427: line 428: 7-bit LSB bus 429: 7-bit LSB bus 430: 7-bit ripple-carry adder 440: line 450: scratchpad 460: scratchpad 461: Inverter 462: line 464:Exponent adder circuit 465: line 466: 10-bit value 467:Special index detection block 468: line 469: line 470: scratchpad 471a: Anti-Mutual Exclusion OR Gate 471b: Anti-mutex OR gate 471c: Anti-Mutual Exclusion OR Gate 473: line 501: line 502: Effective number final adder circuit 503: line 504: scratchpad 506: overflow selection circuit 507: line 508: Sign bit selection circuit 509: line 510: Index selection circuit 511: line 512: Significant number selection circuit 513: line 514: 8-bit left shifter circuit 515: busbar 516:2's complement inverting plus 1 circuit 517: busbar 518:Multiplexer circuit 519: overflow signal 520: scratchpad 521: line 522: Exponent overflow detection circuit 523: input bus; line 524: Index exception processing circuit 525: line 526: Index anomaly detection circuit 527: line 528: Output abnormal control signal generation circuit 529: busbar 530: scratchpad 531: sign bit 532: scratchpad 533: line 537: output bus bar; line 539: line 541: line 543: line 544: and gate 545: line 547: busbar 549: line 551: line 553: busbar 560: Register Fcin 592: Base 8 Converter Block 592a: Final addition significand selection and base-8 conversion sub-block 592b: Exponential exception handling sub-block 602: product 604: Carry rounding box 605: Overflow detection block 606: LZA circuit 607: busbar 608: Shift register SHR8/16/24 circuit 609: Shifter circuit 610: Shifter circuit 611: busbar 612:Sum rounding box 613: busbar 614: Carry save adder circuit 617: and gate 619: Inverter 630: Shifter index control signal generation/bypass control circuit 633: line 634: line 636:Output pipeline register 644: line 646: line 647: line 648: line 650: 16-bit status register 652: 16-bit multiplier exponent comparison circuit 654: Index accumulator (Eacc) scratchpad 660: incrementer 661:Decrementer 662: sign extension detection unit 664: and gate 665: and gate 667: carry bus 668: and gate 669: sum bus 670: OR gate 671: line; input 673:Input 675:Enter 677:Input 679: line 681: line 682: line 683: line 684: Simple Product Rounding Blocks 685:Output 686: and gate 687: and gate 688: and gate 689:Output 692: busbar 693: busbar 700: final conversion 270:Normalize Convert to Symbolic Numeric Format Box 270a: Convert from Carry-Saving to Signed Numeric Format Block 270b: Converting from base-8 to base-2 floating point squares 702: Bus 704: busbar 706: line 708: 43-bit adder circuit 710: second circuit LZA/LOA 711: line 712: line 714: LZA POS selection circuit 715: busbar 716: busbar 717: busbar 718: Inverting plus 1 circuit 719: line 720: Valid number selection multiplexer circuit 726: sign bit 728: Significant zero detection circuit 730: scratchpad 735:SHL left shifter circuit 736: line 738: busbar 739:Signal 740:Exponent adder circuit 742: line 744: line 745: Bus 746: line 747: line 748: scratchpad 750: exception control (8-bit) scratchpad 751: Exception control register 752: Overflow/under detection circuit 753: line 755: line 756: Decimal zero register 758: norm_en 760: under detection multiplexer 770: 39-bit significant number 788: line 801: Control line signal Out_FP32 802: The first multiplexer 803: line 805: Bus 807: Bus 809: Bus 810A: Adder Sign Positive Condition Circuit 811B: Status circuit 812A: circuit 817: line 819: line 820: multiplexer 821: line 823: Round to zero selection line 825: line 827: line 829: line 830: Rounding circuit 831: line 833: busbar 835: line 837: 39-bit significand register 770 bus fpst_l[38:0] 840: multiplexer 850: second rounding increment circuit 860: first round increment circuit 907: line 930: 23-bit significand (IEEE 754) register 953: eight conditions 957: line 959: line 961: line 962: infinite control 963: line 964: EPST_L[9:0] bus 965: busbar 967: signal line 968: 9-bit Enorm[8:0] bus 969: line 970: Zero control logic 971: line 972: line 974: first multiplexer (zero) 975: line 976: multiplexer 979: index signal bus 980: index register 982: incrementer 983: signal line 986: Under/overflow detection and exponent anomaly detection circuit 987: OR gate 988:Symbol generation and exception handling circuits 990: scratchpad 991: line 992: Infinity detection circuit 994: informal circuit 1000: range of floating point numbers 1010: block 1100:Example High-Level Architecture Block Diagram 1102: Multiplier exception handling block 1104: Multiplier exception flag 1106: busbar 1108: Bus multiplier abnormal condition signal 1110: Multiplier block 1113: Operand A 1114:Operator B 1115: Abnormal multiplier result 1116:Operator C 1118:Operator C basic conversion block 1120: Operation element C abnormal condition signal bus 1122: busbar 1124: accumulator loop 1126: Abnormal output control signal generation block 1128: Abnormal control 1129: output block 1130: Carry save adder (CSA) block 1131: output block 1132: busbar 1134: Adder normalization exception handling block 1138: busbar 1139: Abnormal adder result 1140: Adder exception flag block 1300: exception handling structure 1302: Multiplier exception flag generation block 1304: Abnormal flag generation block 1306: Adder exception flag generation block 1308: Floating-point multiplication accumulator exception block 1310: The abnormal result of the multiplier generates a block 1312: Abnormal result generation block 1314: The abnormal result of the adder generates a block 1320A: Multiplier Symbol Generation Status Block 1320B: block 1320C: block 1322A: Multiplier Exponent Generation Status Block 1322B: block 1322C: block 1322D: block 1324A: Multiplication decimal generation status block 1324B: block 1324C: block 1326A: Adder Symbol Generation Status Block 1326B: block 1326C: block 1328A: Adder Exponent Generation Status Block 1328B: block 1328C: block 1328D: block 1330A: Adder decimal generation status block 1330B: block 1330C: block 1381: Multiplier overflow flag status block 1382:Multiplier underbit flag status box 1383: Multiplier invalid flag status block 1387: Adder overflow flag status block 1388: Adder underbit flag status block 1389: Adder invalid flag status block 1400: Implementation 1402: Least Significant Bit (LSB) 1404: most significant bit (MSB) 1406: least significant bit (LSB) 1408: Most significant bit (MSB) 1410: least significant bit (LSB) 1412: Most significant bit (MSB) 1414: Enable multiplication operation 1418: Enable multiplication operation 1420: and gate 1430: and gate 1440: Product index and gate 1442: output 1444: and gate 1445: Inverse OR gate 1446: Multiplier overflow flag status 1450: Inverse OR gate 1460: Inverse OR gate 1470: product index inverse OR gate 1472: output 1475: Inverse OR gate 1480: Multiplier underbit and gate 1482: Multiplier underbit flag status 1500: Achieved 1501: Enable multiplication operation 1502: Least Significant Bit (LSB) 1504: Most significant bit (MSB) octet 1506: Least Significant Bit (LSB) 1508: Most significant bit (MSB) 1510: and gate 1520: reverse OR gate 1530: and gate 1540: Inverse OR gate 1550: and gate 1560: and gate 1570: OR gate 1580: Multiplier invalid and gate 1552: first input 1562: second input 1572: output 1582: Multiplier invalid flag condition 1600: Achieved 1618: Symbol A 1620: EX-OR gate 1622: Symbol B 1630: and gate 1632: second input 1700: Achieved 1710: OR gate 1720: and gate 1730: OR gate 1740: Inverse OR gate 1750: multiplexer 1760: OR gate 1770: multiplexer 1752: Multiplier exponent 1762: output 1772: Multiplier Fractional Bus 1800: Schematic implementation 1810: and gate 1820: Inverse OR gate 1830: and gate 1822: Non-input exponent is infinity 1832: Adder overflow 1850: and gate 1860:EX-OR gate 1870: and gate 1880: Inverse OR gate 1890: and gate 1872: Output exact zero 1882: Non-input exact zero 1892: Adder underbit 1900: Implementation 1910: OR gate 1920: EX-OR gate 1930: and gate 1912: output 1932: Adder invalid flag 1940: Inverse OR gate 1942: Export 1950: and gate 1960: EX-OR gate 1965: OR gate 1967: Underbit 1968: Inverting gate 1970: And gate 1972: Underbit 1980: OR gate 1990: And gate 1982: Export 1992: Sign positive bit of signal adder 2000: realized 2005: Reversing gate 2010: Reverse and gate 2020: and gate 2030: and gate 2040: OR gate 2050: OR gate 2060: and gate 2070: EX-OR gate 2080: OR gate 2012: Output 2022: Export 2032: output 2042: adder sign negative status bit 2052: Judgment 2082: Adder index all "0" selection 2100: Schematic implementation 2110: OR gate 2120: OR gate 2112: Judgment 2122: Adder exponent all "1" selection 2130: OR gate 2131: selector control 2132: adder decimal output 2150: multiplexer
[圖1]說明了BFloat16和浮點IEEE-754標準的編碼格式。[Figure 1] illustrates the encoding format of BFloat16 and floating-point IEEE-754 standards.
[圖2]說明了BF16和FP32格式的具有進位保留累加器的浮點乘加累加單元的高階方塊圖。[Fig. 2] A high-level block diagram illustrating a floating-point multiply-accumulate unit with a carry-save accumulator in BF16 and FP32 formats.
[圖3]說明了具有兩個輸入(運算元A和運算元B)的乘法器電路的階層式方塊圖。[ Fig. 3 ] A hierarchical block diagram illustrating a multiplier circuit having two inputs (operator A and operand B).
[圖4a]說明了包含8x8乘法器部分乘積簡化樹的範例乘法器及加法器方塊。[FIG. 4a] illustrates an example multiplier and adder block comprising a partial product reduction tree of 8x8 multipliers.
[圖4b]說明了具有特殊指數檢測方塊的範例指數單元。[FIG. 4b] illustrates an example index unit with a special index detection block.
[圖5A]說明了顯示包含範例最終加法、有效數選擇和基底8轉換方塊和範例指數異常處理方塊的基底8轉換器的階層式方塊圖。[FIG. 5A] illustrates a hierarchical block diagram showing a base-8 converter including example final addition, significand selection, and base-8 conversion blocks and example exponent exception handling blocks.
[圖5B]說明了最終部分乘積加法、有效數選擇和基底8轉換方塊的範例性示意圖表示。[FIG. 5B] An exemplary schematic representation illustrating final partial product addition, significand selection, and base-8 conversion blocks.
[圖5C]說明了異常處理方塊的範例性示意圖表示。[FIG. 5C] illustrates an exemplary schematic representation of an exception handling block.
[圖6]說明了進位保留累加單元的高階階層式方塊圖。[FIG. 6] A high-level hierarchical block diagram illustrating a carry-save-accumulate unit.
[圖7A]說明了包含兩個階層式塊:指數控制單元和有效數單元的累加器的高階階層式方塊圖。[FIG. 7A] A high-level hierarchical block diagram illustrating an accumulator comprising two hierarchical blocks: an exponent control unit and a significand unit.
[圖7B]說明了指數控制單元的範例性階層式方塊圖和示意圖。[FIG. 7B] An exemplary hierarchical block diagram and schematic diagram illustrating an index control unit.
[圖7C]說明了有效數單位的範例性階層式方塊圖和示意圖。[FIG. 7C] Exemplary hierarchical block and schematic diagrams illustrating significand units.
[圖8A]說明了範例性階層式方塊,其顯示了包含兩個子方塊的正規化轉換到符號數值格式方塊,從進位保留到符號數值子方塊的第一轉換及從基底8到基底2的浮點數子方塊的第二轉換。[FIG. 8A] illustrates an exemplary hierarchical block showing a normalized conversion to signed value format block containing two sub-blocks, a first conversion from carry-save to signed value sub-block and
[圖8B]說明了從進位保留到符號數值方塊的轉換的範例性示意圖。[FIG. 8B] An exemplary diagram illustrating conversion from carry-save to signed value blocks.
[圖8C]說明了從基底8到基底2的浮點數方塊的轉換的範例性示意圖。[ FIG. 8C ] An exemplary diagram illustrating conversion of a floating-point number block from
[圖9A]說明了顯示捨入及轉換為BF16或IEEE 754 32位元單精確度格式子方塊以及指數和異常處理子方塊的範例性階層式方塊。[FIG. 9A] illustrates an exemplary hierarchical block showing rounding and conversion to BF16 or IEEE 754 32-bit single precision format sub-blocks and exponent and exception handling sub-blocks.
[圖9B]說明了顯示捨入和轉換為BF16或IEEE 754 32位元SP格式方塊的範例性示意圖。[FIG. 9B] illustrates an exemplary diagram showing rounding and conversion to BF16 or IEEE 754 32-bit SP format blocks.
[圖9C]說明了顯示指數和異常處理方塊的範例性示意圖。[FIG. 9C] illustrates an exemplary schematic diagram showing indices and exception handling blocks.
[圖10]說明了用於機器學習的進位保留累加單元中的異常處理所處理的浮點數範圍。[Fig. 10] illustrates the range of floating-point numbers handled by exception handling in the carry-save accumulation unit for machine learning.
[圖11]顯示了描繪用於機器學習的進位保留累加單元中的異常處理元素的高階架構方塊圖。[FIG. 11] shows a high-level architectural block diagram depicting exception handling elements in a carry-save-accumulate unit for machine learning.
[圖12A]說明了包含BF16格式的輸入A、BF16格式的輸入B和FP32格式的輸入C的第一運算模式高階方塊圖架構,其中BF16指定16位元機器學習浮點編碼格式,稱為「B-float」,或Google開發的(腦浮點(Brain Floating Point)),而FP32指定32位元單精確度IEEE 754標準表示。[FIG. 12A] illustrates a first mode of operation high-level block diagram architecture comprising input A in BF16 format, input B in BF16 format, and input C in FP32 format, where BF16 specifies a 16-bit machine learning floating-point encoding format called " B-float", or developed by Google (Brain Floating Point (Brain Floating Point)), and FP32 specifies a 32-bit single-precision IEEE 754 standard representation.
[圖12B]說明了包含BF16格式的輸入A、BF16格式的輸入B以及執行累加的第二運算模式高階方塊圖架構。[ FIG. 12B ] illustrates a high-level block diagram architecture of the second operation mode including input A in BF16 format, input B in BF16 format, and performing accumulation.
[圖12C]說明了包含FP32格式的輸入A和FP32格式的輸入C的第三運算模式高階方塊圖架構。[ FIG. 12C ] illustrates the high-level block diagram architecture of the third operation mode including input A in FP32 format and input C in FP32 format.
[圖13]說明了異常處理結構的高階方塊圖。[Fig. 13] A high-level block diagram illustrating the exception handling structure.
[圖14A]描繪了乘法器溢位旗標狀況電路。[FIG. 14A] depicts the multiplier overflow flag condition circuit.
[圖14B]顯示了乘法器欠位旗標狀況電路。[FIG. 14B] shows the multiplier under-bit flag condition circuit.
[圖15]說明了乘法器無效旗標狀況電路。[Fig. 15] illustrates the multiplier invalid flag condition circuit.
[圖16]描繪了乘法符號生成狀況電路。[ Fig. 16 ] depicts a multiplication sign generation condition circuit.
[圖17A]顯示了乘法指數生成狀況電路。[FIG. 17A] shows a multiplication index generating condition circuit.
[圖17B]描繪了乘法小數生成狀況電路。[FIG. 17B] The multiplication decimal number generation condition circuit is depicted.
[圖18A]說明了加法器溢位旗標狀況電路。[FIG. 18A] illustrates the adder overflow flag condition circuit.
[圖18B]顯示了加法器欠位旗標狀況電路。[FIG. 18B] shows the adder underbit flag condition circuit.
[圖19A]顯示了加法器無效旗標狀況電路。[FIG. 19A] shows the adder invalid flag condition circuit.
[圖19B]描繪了加法器符號正狀況電路。[ Fig. 19B ] The adder sign positive condition circuit is depicted.
[圖20A]描繪了加法器符號負電路。[ Fig. 20A ] An adder sign negative circuit is depicted.
[圖20B]說明了加法器指數生成全「0」狀況電路。[FIG. 20B] illustrates the adder index generating all "0" condition circuit.
[圖21A]說明了加法器指數生成全「1」狀況電路。[FIG. 21A] illustrates the adder index generating all "1" state circuit.
[圖21B]顯示了加法小數生成狀況電路。[FIG. 21B] shows the additive decimal generation condition circuit.
210:多工器 210: multiplexer
211:多工器 211: multiplexer
212:多工器 212: multiplexer
213:運算元A 213: Operand A
214:運算元B 214:Operator B
215:基底8轉換器
215:
216:運算元C;線 216: Operator C; line
217:BF16格式或FP32格式 217: BF16 format or FP32 format
218:線 218: line
219:線 219: line
220:方塊 220: block
221:線 221: line
222:雙匯流排 222: double busbar
223:匯流排 223: busbar
224:輸出匯流排 224: output bus
225:輸出匯流排 225: output bus
226:匯流排 226: busbar
227:匯流排 227: busbar
228:匯流排 228: busbar
229:匯流排 229: busbar
230:進位保留加法器 230: Carry save adder
240:累加器 240: accumulator
250:進位保留到符號數值轉換方塊 250: Carry reserved to sign value conversion box
251:匯流排 251: busbar
252:匯流排 252: busbar
260:基底8到基底2轉換和正規化方塊 260:Base-8 to base-2 conversion and normalizing squares
270:FP32或BF16方塊 270: FP32 or BF16 block
Claims (22)
Applications Claiming Priority (16)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163165073P | 2021-03-23 | 2021-03-23 | |
US63/165,073 | 2021-03-23 | ||
US202163166221P | 2021-03-25 | 2021-03-25 | |
US63/166,221 | 2021-03-25 | ||
US202163174460P | 2021-04-13 | 2021-04-13 | |
US63/174,460 | 2021-04-13 | ||
US202163190749P | 2021-05-19 | 2021-05-19 | |
US63/190,749 | 2021-05-19 | ||
US17/397,241 US11429349B1 (en) | 2021-03-23 | 2021-08-09 | Floating point multiply-add, accumulate unit with carry-save accumulator |
US17/397,241 | 2021-08-09 | ||
US202163239384P | 2021-08-31 | 2021-08-31 | |
US63/239,384 | 2021-08-31 | ||
US17/465,558 US11366638B1 (en) | 2021-03-23 | 2021-09-02 | Floating point multiply-add, accumulate unit with combined alignment circuits |
US17/465,558 | 2021-09-02 | ||
US17/534,376 | 2021-11-23 | ||
US17/534,376 US11442696B1 (en) | 2021-03-23 | 2021-11-23 | Floating point multiply-add, accumulate unit with exception processing |
Publications (1)
Publication Number | Publication Date |
---|---|
TW202307647A true TW202307647A (en) | 2023-02-16 |
Family
ID=86669900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW111110603A TW202307647A (en) | 2021-03-23 | 2022-03-22 | Floating point multiply-add, accumulate unit with carry-save accumulator |
Country Status (1)
Country | Link |
---|---|
TW (1) | TW202307647A (en) |
-
2022
- 2022-03-22 TW TW111110603A patent/TW202307647A/en unknown
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Huang et al. | A new architecture for multiple-precision floating-point multiply-add fused unit design | |
US5027308A (en) | Circuit for adding/subtracting two floating point operands | |
US7395304B2 (en) | Method and apparatus for performing single-cycle addition or subtraction and comparison in redundant form arithmetic | |
JPH02196328A (en) | Floating point computing apparatus | |
JPH04227531A (en) | Pipeline floating-point processor | |
US5148386A (en) | Adder-subtracter for signed absolute values | |
Huang et al. | Low-cost binary128 floating-point FMA unit design with SIMD support | |
US11119729B2 (en) | Alignment shifting and incrementing to determine a rounded result of adding first and second floating-point operands | |
US8214416B2 (en) | Floating-point addition acceleration | |
Lichtenau et al. | Quad precision floating point on the IBM z13 | |
US20140074903A1 (en) | Dual-Path Fused Floating-Point Add-Subtract | |
Samy et al. | A decimal floating-point fused-multiply-add unit | |
Boersma et al. | The POWER7 binary floating-point unit | |
US5260889A (en) | Computation of sticky-bit in parallel with partial products in a floating point multiplier unit | |
US7437400B2 (en) | Data processing apparatus and method for performing floating point addition | |
US6205462B1 (en) | Digital multiply-accumulate circuit that can operate on both integer and floating point numbers simultaneously | |
Erle et al. | Decimal floating-point multiplication | |
US7716264B2 (en) | Method and apparatus for performing alignment shifting in a floating-point unit | |
US6061707A (en) | Method and apparatus for generating an end-around carry in a floating-point pipeline within a computer system | |
US11366638B1 (en) | Floating point multiply-add, accumulate unit with combined alignment circuits | |
US5170371A (en) | Method and apparatus for rounding in high-speed multipliers | |
TW202307647A (en) | Floating point multiply-add, accumulate unit with carry-save accumulator | |
Mathis et al. | A novel single/double precision normalized IEEE 754 floating-point adder/subtracter | |
Schwarz | Binary Floating-Point Unit Design: the fused multiply-add dataflow | |
KR100627993B1 (en) | Three input split-adder |