TWI776580B - Adders, arithmetic circuits, chips and computing devices

Info

Publication number: TWI776580B
Application number: TW110124879A
Authority: TW
Inventors: 劉建波; 范志軍; 李楠; 郭海豐
Original assignee: 大陸商深圳比特微電子科技有限公司
Priority date: 2020-07-22
Filing date: 2021-07-07
Publication date: 2022-09-01
Also published as: WO2022017179A1; CN111708512A; TW202143024A

Abstract

The present disclosure relates to adders, arithmetic circuits, chips, and computing devices. An adder is disclosed for calculating the sum of two input numbers. The adder has two inputs respectively representing the two numbers. Each input is divided into a plurality of mutually corresponding sub-parts, and the sub-parts represent, in order from low-order to high-order, portions of the bits of the input. The adder includes: a plurality of first-stage addition modules, each for summing corresponding sub-parts of the two inputs; a plurality of intermediate registers, each coupled to a corresponding first-stage addition module and storing the sum of the corresponding sub-parts of the two inputs; one or more carry registers, each coupled to a corresponding first-stage addition module and storing the carry of the corresponding sub-parts of the two inputs; and a second-stage addition module, coupled to the plurality of intermediate registers and the one or more carry registers, for summing the sum from each intermediate register with the carry from the corresponding preceding carry register.

Description

Adders, arithmetic circuits, chips and computing devices

The present disclosure relates generally to digital circuits. Specifically, it relates to an adder, an arithmetic circuit including the adder, and a chip and a computing device.

Adders, which perform addition operations, are an important component of many arithmetic circuits. In the related art, when the operating speed of an adder needs to be increased, the adder is usually implemented with high-speed devices.

According to one aspect of the present disclosure, there is provided an adder for calculating the sum of two input numbers. The adder has two inputs respectively representing the two numbers, wherein each input is divided into a plurality of mutually corresponding sub-parts, and the sub-parts represent, in order from low-order to high-order, portions of the bits of the input. The adder includes: a plurality of first-stage addition modules, each first-stage addition module being used to sum corresponding sub-parts of the two inputs; a plurality of intermediate registers, each intermediate register being coupled to a corresponding first-stage addition module and used to store the sum of the corresponding sub-parts of the two inputs; one or more carry registers, each carry register being coupled to a corresponding first-stage addition module and used to store the carry of the corresponding sub-parts of the two inputs; and a second-stage addition module, coupled to the plurality of intermediate registers and the one or more carry registers, for summing the sum from each intermediate register with the carry from the corresponding preceding carry register.

According to another aspect of the present disclosure, there is provided an adder for calculating the sum of one input number and a predetermined constant. The adder has one input representing the number, the input is divided into a plurality of sub-parts, and the sub-parts represent, in order from low-order to high-order, portions of the bits of the input. The adder includes: one or more first-stage addition modules, each first-stage addition module being used to sum a corresponding sub-part of the input with the corresponding bits of the constant; a plurality of intermediate registers, each intermediate register being coupled to a corresponding first-stage addition module and used to store the sum of the corresponding sub-part of the input and the corresponding bits of the constant; one or more carry registers, each carry register being coupled to a corresponding first-stage addition module and used to store the carry of the corresponding sub-part of the input and the corresponding bits of the constant; and a second-stage addition module, coupled to the plurality of intermediate registers and the one or more carry registers, for summing the sum from each intermediate register with the carry from the corresponding preceding carry register.

According to another aspect of the present disclosure, there is provided an arithmetic circuit including an adder as described above, and at least one of a pre-combinational logic module coupled to an input of the adder and a post-combinational logic module coupled to an output of the adder.

According to another aspect of the present disclosure, there is provided a chip including the arithmetic circuit as described above.

According to yet another aspect of the present disclosure, there is provided a computing device including the chip as described above.

Other features and advantages of the present disclosure will become clearer from the following detailed description of exemplary embodiments of the present disclosure with reference to the accompanying drawings.

Various exemplary embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.

The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present disclosure, its application, or its uses. That is, the structures and methods herein are shown by way of example to illustrate different embodiments of the structures and methods of the present disclosure. Those skilled in the art will appreciate, however, that they merely illustrate exemplary, not exhaustive, ways in which the present disclosure may be practiced. Furthermore, the drawings are not necessarily drawn to scale, and some features may be enlarged to show details of particular components.

Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be considered part of the granted specification.

In all examples shown and discussed herein, any specific value should be construed as merely illustrative and not as limiting. Accordingly, other examples of the exemplary embodiments may have different values.

In the related art, when the operating speed of an adder needs to be increased, the adder is usually implemented with high-speed devices. However, high-speed devices occupy a larger area and consume more power, so the area and power consumption of the adder, and of any arithmetic circuit that includes it, increase correspondingly. This significantly increases the manufacturing cost and power consumption of the chip. It is therefore desirable to increase the operating speed of the adder at lower manufacturing cost and power consumption, and an improved adder is needed.

FIG. 1 shows a schematic diagram of an adder 100 according to one or more exemplary embodiments of the present disclosure. The adder 100 is used to calculate the sum of two input numbers.

As shown in FIG. 1, the adder 100 has two inputs 111, 112 and two outputs 161, 162. The two inputs 111, 112 respectively represent the two input numbers, and the outputs 161, 162 respectively represent the sum and the carry of the result of adding the two numbers.

Those skilled in the art should understand that the configuration of the inputs and outputs of the adder 100 is not limited to the embodiment shown in FIG. 1. The configuration of the inputs and outputs of the adder 100 can be adjusted as appropriate according to the function of the adder and the needs of the arithmetic circuit, and the configuration of each module of the adder 100 can be adjusted accordingly. For example, in some embodiments, the adder may have only one output; that is, it outputs only the sum of the addition result and not the carry.

As shown in FIG. 1, each input 111, 112 is divided into N mutually corresponding sub-parts, and these N sub-parts represent, in order from low-order to high-order, portions of the bits of that input. For example, input 111 is divided from low-order to high-order into sub-parts 111-1, 111-2, …, 111-N, and input 112 is divided from low-order to high-order into sub-parts 112-1, 112-2, …, 112-N.

Specifically, the first sub-parts 111-1 and 112-1 respectively represent the least significant bit or bits of the inputs 111 and 112, and 111-1 and 112-1 represent the same bit positions. Correspondingly, the second sub-parts 111-2 and 112-2 respectively represent one or more bits of the inputs 111 and 112 that are higher than those of 111-1 and 112-1, and 111-2 and 112-2 represent the same bit positions. By analogy, the N-th sub-parts 111-N and 112-N respectively represent the most significant bit or bits of the inputs 111 and 112, and 111-N and 112-N represent the same bit positions.

Here, N should be an integer greater than or equal to 2. That is, each input 111, 112 has at least two sub-parts.
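
As an illustration only, the division of an input into sub-parts from low-order to high-order can be modeled in software as below. The function name `split_subparts` and the `widths` list (the bit-width of each sub-part, lowest sub-part first) are hypothetical names chosen for this sketch and do not appear in the disclosure.

```python
def split_subparts(value, widths):
    """Split a non-negative integer into sub-parts, lowest-order sub-part first.

    widths[i] is the assumed bit-width of sub-part i+1 (hypothetical parameter).
    """
    parts = []
    for w in widths:
        parts.append(value & ((1 << w) - 1))  # keep the lowest w bits
        value >>= w                           # move on to the next higher sub-part
    return parts


# Both inputs are split with the same widths so that sub-parts correspond.
parts_a = split_subparts(0xABCD, [8, 8])
parts_b = split_subparts(0x1234, [8, 8])
```

Because both inputs use the same `widths`, sub-part i of one input covers the same bit positions as sub-part i of the other, as the description requires.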

In some embodiments, as shown in FIG. 1, the output 161 representing the sum of the inputs 111 and 112 is divided into two sub-parts 161-1, 161-2. Sub-part 161-1 corresponds to the first sub-parts 111-1 and 112-1 of the inputs 111 and 112 and represents the least significant bit or bits of the output 161; sub-part 161-2 corresponds to the other sub-parts of the inputs 111 and 112 and represents the remaining bit or bits of the output 161.

Those skilled in the art should understand that the inputs and outputs are divided into multiple sub-parts herein only to facilitate describing the different coupling relationships of the sub-parts; this does not mean or imply that the sub-parts are necessarily physically separated or partitioned. In particular, those skilled in the art should understand that dividing the inputs and outputs into multiple sub-parts with different coupling relationships does not require introducing additional components into the digital circuit or incurring additional cost.

As shown in FIG. 1, the adder 100 includes a first-stage addition module group 120, an intermediate register group 130, a carry register group 140, and a second-stage addition module 150.

The first-stage addition module group 120 is coupled to the inputs 111, 112 and includes a plurality of first-stage addition modules 120-1, 120-2, …, 120-N. Each first-stage addition module 120-1, 120-2, …, 120-N is used to sum corresponding sub-parts of the two inputs 111, 112.

For example, the first first-stage addition module 120-1 is coupled to the first sub-parts 111-1 and 112-1 of the two inputs 111, 112 and is used to sum 111-1 and 112-1.

The number of first-stage addition modules equals the number of sub-parts of the inputs, and the configuration of each first-stage addition module can be determined according to the number of bits of the corresponding sub-part of the inputs.

The outputs of the first-stage addition module group 120 are coupled to the intermediate register group 130 and the carry register group 140, so that the sum and the carry of each addition result are output to the intermediate register group 130 and the carry register group 140, respectively.

The intermediate register group 130 includes a plurality of intermediate registers 130-1, 130-2, …, 130-N. Each intermediate register 130-1, 130-2, …, 130-N is coupled to the corresponding first-stage addition module 120-1, 120-2, …, 120-N and is used to store the sum of the addition result of the corresponding sub-parts of the two inputs 111, 112.

For example, the first intermediate register 130-1 is coupled to the first first-stage addition module 120-1 and is used to store the sum of the addition result of 111-1 and 112-1. That is, the first intermediate register 130-1 corresponds to the first sub-parts 111-1 and 112-1 of the two inputs 111, 112 and stores the sum obtained by adding the least significant bit or bits of the two inputs 111, 112, as represented by 111-1 and 112-1.

The number of intermediate registers equals the number of sub-parts of the inputs, and the configuration of each intermediate register can be determined according to the number of bits of the corresponding sub-part of the inputs.

The carry register group 140 includes a plurality of carry registers 140-1, 140-2, …, 140-N. Each carry register 140-1, 140-2, …, 140-N is coupled to the corresponding first-stage addition module 120-1, 120-2, …, 120-N and is used to store the carry of the addition result of the corresponding sub-parts of the two inputs 111, 112.

For example, the first carry register 140-1 is coupled to the first first-stage addition module 120-1 and is used to store the carry of the addition result of 111-1 and 112-1. That is, the first carry register 140-1 stores the carry obtained by adding the least significant bit or bits of the two inputs 111, 112, as represented by 111-1 and 112-1.
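
The first stage described above can be sketched as a small software model, assuming each first-stage addition module adds its pair of sub-parts independently and produces a sum (for the intermediate register) and a 1-bit carry (for the carry register). The name `first_stage` and the `widths` parameter are assumptions made for this sketch, not terms from the disclosure.

```python
def first_stage(a, b, widths):
    """Model of the first-stage addition modules.

    Returns (sums, carries): sums[i] models the value latched by intermediate
    register i+1 and carries[i] the bit latched by carry register i+1.
    widths[i] is the assumed bit-width of sub-part i+1, lowest first.
    """
    sums, carries = [], []
    for w in widths:
        mask = (1 << w) - 1
        total = (a & mask) + (b & mask)  # add one pair of corresponding sub-parts
        sums.append(total & mask)        # low w bits go to the intermediate register
        carries.append(total >> w)       # overflow bit goes to the carry register
        a >>= w
        b >>= w
    return sums, carries


sums, carries = first_stage(0x00FF, 0x0101, [8, 8])
```

In this model no sub-part waits for a carry from a lower sub-part, which is why each first-stage module only has to be as wide as its own sub-part.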

The number of carry registers can be determined from the number of sub-parts of the inputs. In the embodiment shown in FIG. 1, the number of carry registers equals the number of sub-parts of the inputs. In other embodiments, the number of carry registers may be one less than the number of sub-parts of the inputs. That is, the carry register group 140 may omit the N-th carry register 140-N, so that the carry obtained by adding the most significant bit or bits of the two inputs 111, 112, as represented by 111-N and 112-N, is not stored.

The number of carry registers can be determined as needed. In embodiments where the adder 100 needs to output the carry of the addition result of the two inputs 111, 112 (that is, the adder 100 includes the output 162), the number of carry registers may be set equal to the number of sub-parts of the inputs. In embodiments where the adder 100 does not need to output the carry of the addition result of the two inputs 111, 112 (that is, the adder 100 does not include the output 162), the number of carry registers may be set to one less than the number of sub-parts of the inputs, which reduces the additional cost of the adder 100. For example, in an embodiment where the two inputs 111, 112 each have two sub-parts, the adder 100 may include only one carry register.

A carry register only stores a carry, so each carry register can be implemented as a 1-bit register.

The outputs of the intermediate register group 130 and the carry register group 140 are coupled to the second-stage addition module 150, so that the results of separately adding each pair of corresponding sub-parts of the two inputs 111, 112 (both sums and carries) are output to the second-stage addition module 150.

The second-stage addition module 150 is used to sum the sum from each intermediate register with the carry from the corresponding preceding carry register, thereby obtaining the sum of the two inputs 111, 112.

Specifically, the second-stage addition module 150 may take the sum output by the first intermediate register 130-1, which is the sum obtained by adding the least significant bit or bits of the two inputs 111, 112 as represented by 111-1 and 112-1, and output it as the corresponding least significant bit or bits of the sum of the two inputs 111, 112 (that is, sub-part 161-1 of the output 161).

Further, the second-stage addition module 150 may add the sum output by the second intermediate register 130-2 (the sum obtained by adding the bit or bits of the two inputs 111, 112 represented by 111-2 and 112-2) to the carry output by the first carry register 140-1 (the carry obtained by adding the least significant bit or bits of the two inputs 111, 112 represented by 111-1 and 112-1). It then outputs the sum of this addition as the corresponding bit or bits of the sum of the two inputs 111, 112, and uses the carry of this addition for further operations within the second-stage addition module 150.

By analogy, the second-stage addition module 150 may add the sum output by the N-th intermediate register 130-N (the sum obtained by adding the most significant bit or bits of the two inputs 111, 112 represented by 111-N and 112-N) to the carry output by the (N-1)-th carry register 140-(N-1) (the carry obtained by adding the bit or bits of the two inputs 111, 112 represented by 111-(N-1) and 112-(N-1)), and then output the sum of this addition as the corresponding most significant bit or bits of the sum of the two inputs 111, 112.

Further, the second-stage addition module 150 may add the carry of the above addition to the carry output by the N-th carry register 140-N (the carry obtained by adding the most significant bit or bits of the two inputs 111, 112 represented by 111-N and 112-N), and then output the result as the carry of the addition result of the two inputs 111, 112, that is, as the output 162.
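
The two stages described above can be put together in a minimal behavioral sketch. It is a software model under stated assumptions, not the circuit of FIG. 1: the name `two_stage_add`, the `widths` parameter, and the modeling of the second stage as a loop over sub-parts are all choices made for this illustration, and the inputs are assumed to fit within `sum(widths)` bits.

```python
def two_stage_add(a, b, widths):
    """Behavioral model of a two-stage adder; returns (sum, carry_out)."""
    # Stage 1: add each pair of sub-parts independently.
    sums, carries = [], []
    for w in widths:
        mask = (1 << w) - 1
        total = (a & mask) + (b & mask)
        sums.append(total & mask)   # modeled content of an intermediate register
        carries.append(total >> w)  # modeled content of a 1-bit carry register
        a >>= w
        b >>= w

    # Stage 2: the lowest sum passes through unchanged; every higher sum is
    # added to the carry stored for the preceding sub-part, plus any carry
    # produced inside stage 2 by the sub-part below it.
    result = sums[0]
    shift = widths[0]
    ripple = 0
    for i in range(1, len(widths)):
        w = widths[i]
        total = sums[i] + carries[i - 1] + ripple
        result |= (total & ((1 << w) - 1)) << shift
        ripple = total >> w
        shift += w

    # The last stage-2 carry and the last stored carry cannot both be 1,
    # so their sum is the single carry-out bit.
    carry_out = ripple + carries[-1]
    return result, carry_out


total, carry = two_stage_add(0xFFFF, 0x0001, [8, 8])
```

For the 16-bit example shown, stage 1 produces a carry out of the low byte, and stage 2 propagates it through the high byte, giving a sum of 0 with a carry-out of 1.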

Those skilled in the art should understand that the processing performed by the second-stage addition module 150 is not limited to the processing described above. The configuration of the second-stage addition module 150 may be determined according to the function of the adder 100. For example, in embodiments that do not need to output the carry of the addition result of the two inputs 111, 112 (that is, the adder does not include the output 162), the second-stage addition module 150 may omit the processing for calculating and outputting the carry of the addition result of the two inputs 111, 112.

The output of the second-stage addition module 150 is coupled to the outputs 161 and 162, which respectively represent the sum and the carry of the addition result of the two inputs 111, 112.

In some embodiments, the output 161 may be divided into two sub-parts: a first sub-part 161-1, representing the least significant bit or bits of the sum of the two inputs 111, 112 that correspond to 111-1 and 112-1; and a second sub-part 161-2, representing the remaining bit or bits of the sum of the two inputs 111, 112.

As shown in FIG. 1, in some embodiments, the second-stage addition module 150 may couple the output of the first intermediate register 130-1 directly to the first sub-part 161-1 of the output 161.

Those skilled in the art should understand that the registers mentioned herein may be edge-triggered registers (for example, D-type flip-flops) or level-triggered registers (for example, latches).

The calculation speed of the adder 100 depends mainly on the calculation speeds of the first-stage addition module group 120 and the second-stage addition module 150, and these in turn depend on the number of sub-parts of the two inputs 111, 112 and on their bit-widths. The number and bit-widths of the sub-parts of the two inputs 111, 112 can therefore be chosen appropriately to increase the calculation speed of the adder 100.

The overall calculation delay of the first-stage addition module group 120 is determined by the calculation delay of whichever of the first-stage addition modules 120-1, 120-2, …, 120-N has the longest calculation delay. Each first-stage addition module 120-1, 120-2, …, 120-N sums corresponding sub-parts of the two inputs 111, 112; the more bits the sub-part has, the longer the calculation delay of the corresponding first-stage addition module.

Therefore, the overall calculation delay of the first-stage addition module group 120 depends on the largest bit-width among the sub-parts of the inputs 111, 112. The larger this maximum bit-width, the longer the overall calculation delay of the first-stage addition module group 120.

The second-stage addition module 150 is used to sum the sums from the plurality of intermediate registers with the carries from the corresponding preceding carry registers. The output of the first intermediate register 130-1 already represents the corresponding least significant bit or bits of the sum, so the second-stage addition module 150 need not perform additional processing on the output of the first intermediate register 130-1. In some embodiments, the second-stage addition module 150 may perform some processing on the output of the first intermediate register 130-1 as needed, but such processing takes far less time than the summation that the second-stage addition module 150 must perform on the outputs of the other intermediate registers and the carry registers, as described above. In other words, the calculation delay of the second-stage addition module 150 is determined by the calculation delay of that summation.

Therefore, the calculation delay of the second-stage addition module 150 may depend on the number of sub-parts of the inputs 111, 112 other than the first sub-parts 111-1, 112-1, and on the sum of their bit-widths. The greater the number of these other sub-parts, or the larger the sum of their bit-widths, the longer the calculation delay of the second-stage addition module 150. In other words, the fewer sub-parts the inputs 111, 112 have, or the more bits the first sub-parts 111-1, 112-1 have, the shorter the calculation delay of the second-stage addition module 150.

Therefore, it is desirable to reduce the largest bit-width among the sub-parts of the two inputs 111, 112. At the same time, it is desirable to reduce the number of sub-parts of the inputs 111, 112 and to increase the bit-width of the first sub-part.
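
The trade-off above can be explored with a deliberately simple delay model. The model is an assumption made for illustration and is not taken from the disclosure: each adder is treated as a ripple-carry adder whose delay is one unit per bit, the first-stage delay is the widest sub-part, and the second-stage delay is the total width of all sub-parts other than the first. Real delays depend on the adder architecture and the process technology. The names `stage_delays` and `critical_delay` are hypothetical.

```python
def stage_delays(widths):
    """Return (first_stage_delay, second_stage_delay) in abstract delay units.

    Assumes one delay unit per bit of ripple-carry addition (toy model).
    widths[i] is the bit-width of sub-part i+1, lowest first.
    """
    first = max(widths)       # the slowest first-stage module sets the stage delay
    second = sum(widths[1:])  # a carry may ripple through all higher sub-parts
    return first, second


def critical_delay(widths):
    """Delay of the slower of the two stages under the same toy model."""
    return max(stage_delays(widths))


# Compare a few ways of splitting a 32-bit input.
candidates = [[16, 16], [24, 8], [8, 24], [8, 8, 8, 8]]
best = min(candidates, key=critical_delay)
```

Under this model a 16/16 split of a 32-bit input gives a critical delay of 16 units, whereas the 24/8, 8/24, and 8/8/8/8 splits each give 24 units, which is in line with the preference for few sub-parts and a first sub-part that is not smaller than the others.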

In some embodiments, the bit-width of the first sub-parts 111-1, 112-1 of the inputs 111, 112 is greater than or equal to the bit-widths of the other sub-parts. In other embodiments, the bit-widths of the sub-parts of the inputs 111, 112 are substantially equal. This helps reduce the calculation delays of the first-stage addition module group 120 and the second-stage addition module 150, which increases the calculation speed of the adder 100 and in turn lowers the chip's ratio of power consumption to computing power.

In some embodiments, the inputs 111, 112 have two sub-parts, and the bit-width of the first sub-parts 111-1, 112-1 is greater than or equal to half the bit-width of the inputs 111, 112. This helps increase the calculation speed of the adder 100 while reducing the additional cost.

It should be noted that the expression "substantially equal" herein means that the two are approximately equal, but not necessarily strictly and exactly equal. Those skilled in the art should understand that this is consistent with technical principles and engineering practice. For example, the two may differ by about 5% or 10%. In some contexts, the two may differ by about 15% or 20%.

FIG. 2 shows a schematic diagram of an adder 200 according to one or more exemplary embodiments of the present disclosure. The adder 200 is used to calculate the sum of one input number and a predetermined constant.

As shown in FIG. 2, the adder 200 has one input 210 and two outputs 261, 262. The input 210 represents the input number, and the outputs 261, 262 respectively represent the sum and the carry of the result of adding this number to the predetermined constant.

Similarly, in some embodiments, the adder 200 may have only one output 261; that is, it outputs only the sum of the addition result and not the carry.

The configuration of the adder 200 is similar to that of the adder 100 and can be adjusted as appropriate according to the predetermined constant.

As shown in FIG. 2, the input 210 is divided into N sub-parts, and these N sub-parts represent, in order from low-order to high-order, portions of the bits of the input. That is, the input 210 is divided from low-order to high-order into sub-parts 210-1, 210-2, …, 210-N. Here, N should be an integer greater than or equal to 2. That is, the input 210 has at least two sub-parts.

In some embodiments, as shown in FIG. 2, the output 261 representing the sum of the input 210 and the constant is divided into two sub-parts 261-1, 261-2. Sub-part 261-1 corresponds to the first sub-part 210-1 of the input 210 and represents the least significant bit or bits of the output 261; sub-part 261-2 corresponds to the other sub-parts 210-2, …, 210-N of the input 210 and represents the remaining bit or bits of the output 261.

如圖2所示，加法器200包括第一級加法模塊組220、中間寄存器組230、進位寄存器組240以及第二級加法模塊250。As shown in FIG. 2 , the adder 200 includes a first-stage addition module group 220 , an intermediate register group 230 , a carry register group 240 and a second-stage addition module 250 .

第一級加法模塊組220耦接到輸入210，包括多個第一級加法模塊220-1, 220-2, …, 220-N。每個第一級加法模塊220-1, 220-2, …, 220-N用於對輸入210的對應子部分與該常數的對應位進行求和。The first-stage addition module group 220 is coupled to the input 210 and includes a plurality of first-stage addition modules 220-1, 220-2, . . . , 220-N. Each of the first stage addition modules 220-1, 220-2, . . . , 220-N is used to sum the corresponding sub-portion of the input 210 with the corresponding bits of the constant.

The configuration of each of the first-stage addition modules 220-1, 220-2, …, 220-N may depend on the corresponding bits of the constant. In some embodiments, the number and configuration of the first-stage addition modules 220-1, 220-2, …, 220-N may be determined or adjusted based at least in part on the constant. For example, for any subsection of the input 210, if the corresponding bits of the predetermined constant are known to be all zero, that subsection need not be summed with the corresponding bits of the constant, so the first-stage addition module group 220 may omit the corresponding first-stage addition module. This helps reduce the manufacturing cost of the adder 200.

For example, in some embodiments, if the constant is small (that is, its higher bits are all zero), the first-stage addition module group 220 may include only one first-stage addition module, namely the first first-stage addition module 220-1, which corresponds to the first subsection 210-1 of the input 210. In particular, when the constant is 1, the adder 200 is an increment-by-one adder and may include only one first-stage addition module.

第一級加法模塊組220的輸出耦接到中間寄存器組230和進位寄存器組240，將求和結果（包括和數與進位）分別輸出到中間寄存器組230和進位寄存器組240。The outputs of the first-stage addition module group 220 are coupled to the intermediate register group 230 and the carry register group 240, and output the summation result (including the sum and carry) to the intermediate register group 230 and the carry register group 240, respectively.

The intermediate register group 230 includes a plurality of intermediate registers 230-1, 230-2, …, 230-N. As shown in FIG. 2, each of the intermediate registers 230-1, 230-2, …, 230-N is coupled to the corresponding first-stage addition module 220-1, 220-2, …, 220-N and stores the sum obtained by adding the corresponding subsection of the input 210 to the corresponding bits of the constant.

The carry register group 240 includes a plurality of carry registers 240-1, 240-2, …, 240-N. Each of the carry registers 240-1, 240-2, …, 240-N is coupled to the corresponding first-stage addition module 220-1, 220-2, …, 220-N and stores the carry obtained by adding the corresponding subsection of the input 210 to the corresponding bits of the constant.

The configuration of the intermediate register group 230 and the carry register group 240 may be adjusted appropriately according to the configuration of the first-stage addition module group 220. For example, when the first-stage addition module group 220 does not include a first-stage addition module corresponding to a given subsection of the input 210, the corresponding intermediate register in the intermediate register group 230 may be coupled directly to that subsection of the input 210, and the carry register group 240 may omit the corresponding carry register.
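The pruning rule above can be sketched in a few lines of Python. This is only an illustrative software model, not the disclosed circuit: the function name `needs_first_stage_module` and the representation of subsections as a list of bit widths (least significant first) are assumptions made for the example.

```python
def needs_first_stage_module(const, widths):
    """For each subsection (least significant first), report whether the
    constant has any non-zero bit in that subsection.

    A False entry marks a subsection whose first-stage addition module and
    carry register could be omitted, with the intermediate register fed
    directly from the input. `widths` is a hypothetical list of subsection
    bit widths.
    """
    flags = []
    for w in widths:
        flags.append((const & ((1 << w) - 1)) != 0)
        const >>= w
    return flags


# Increment-by-one adder on a 32-bit input split as 8 + 24 bits:
# only the first subsection needs a first-stage addition module.
print(needs_first_stage_module(1, [8, 24]))
```

In this model, the number of `True` entries equals the number of first-stage addition modules (and carry registers) that remain.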

在圖2所示的實施例中，進位寄存器的數量與輸入的子部分的數量相等。在其他實施例中，進位寄存器的數量可以比輸入的子部分的數量少1，即不存在第N個進位寄存器240-N。In the embodiment shown in Figure 2, the number of carry registers is equal to the number of input subsections. In other embodiments, the number of carry registers may be one less than the number of input subsections, ie there is no Nth carry register 240-N.

The outputs of the intermediate register group 230 and the carry register group 240 are coupled to the second-stage addition module 250, so that the results (sums and carries) of adding each subsection of the input 210 to the corresponding bits of the constant are output to the second-stage addition module 250.

The second-stage addition module 250 sums the sum from each intermediate register with the carry from the corresponding preceding carry register. In other words, it combines the results (sums and carries) of adding each subsection of the input 210 to the corresponding bits of the constant, thereby obtaining the sum of the input 210 and the constant.

Specifically, the second-stage addition module 250 may take the value output from the first intermediate register 230-1, namely the sum of the lowest one or more bits of the input 210 (represented by 210-1) and the corresponding lowest one or more bits of the constant, and output it as the corresponding lowest one or more bits of the sum of the input 210 and the constant (that is, subsection 261-1 of the output 261).

Further, the second-stage addition module 250 may add the value output from the second intermediate register 230-2, namely the sum of the one or more bits of the input 210 represented by 210-2 and the corresponding one or more bits of the constant, to the carry output from the first carry register 240-1. It then outputs the sum of this addition as the corresponding one or more bits of the sum of the input 210 and the constant, and uses the carry of this addition for further operations within the second-stage addition module 250.

Likewise, the second-stage addition module 250 may add the value output from the N-th intermediate register 230-N, namely the sum of the highest one or more bits of the input 210 (represented by 210-N) and the corresponding highest one or more bits of the constant, to the carry output from the (N-1)-th carry register 240-(N-1). It then outputs the sum of this addition as the corresponding highest one or more bits of the sum of the input 210 and the constant.

進一步地，第二級加法模塊250可以將上述求和結果的進位與從第N個進位寄存器240-N輸出的進位進行求和，而後將求和結果輸出為輸入210與該常數的求和結果的進位（即輸出262）。Further, the second-level addition module 250 may sum the carry of the above summation result and the carry output from the Nth carry register 240-N, and then output the summation result as the summation result of the input 210 and the constant. carry (i.e. output 262).
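The two-stage flow described above can be modelled in software as follows. This is a behavioural sketch only, under the assumption that subsections are given as a list of bit widths (least significant first); the function names `split_bits`, `stage1`, `stage2`, and `add_constant` are hypothetical and do not come from the disclosure.

```python
def split_bits(value, widths):
    """Split value into subsections, least significant first."""
    parts = []
    for w in widths:
        parts.append(value & ((1 << w) - 1))
        value >>= w
    return parts


def stage1(x, const, widths):
    """First-stage modules: add each subsection of x to the matching bits
    of const. The returned lists model the intermediate registers (sums)
    and the carry registers (carries)."""
    sums, carries = [], []
    for xp, cp, w in zip(split_bits(x, widths), split_bits(const, widths), widths):
        total = xp + cp
        sums.append(total & ((1 << w) - 1))
        carries.append(total >> w)
    return sums, carries


def stage2(sums, carries, widths):
    """Second-stage module: add each stored sum to the carry arriving
    from the preceding subsection."""
    result = sums[0]          # first subsection passes straight through
    shift = widths[0]
    carry_in = carries[0]
    for i in range(1, len(widths)):
        total = sums[i] + carry_in
        result |= (total & ((1 << widths[i]) - 1)) << shift
        shift += widths[i]
        # Carry of this addition plus the stored carry of subsection i;
        # at most one of the two can be 1.
        carry_in = (total >> widths[i]) + carries[i]
    return result, carry_in   # (sum output, carry output)


def add_constant(x, const, widths):
    sums, carries = stage1(x, const, widths)
    return stage2(sums, carries, widths)


print(add_constant(0x00FF, 1, [4, 4, 8]))  # (256, 0)
```

In hardware the two stages are separated by the register boundary; here they are simply called one after the other to check the arithmetic.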

本領域技術人員應當理解，第二級加法模塊250所執行的處理不限於以上所述。可以根據加法器的功能來確定第二級加法模塊250的配置。例如，在加法器200不包括輸出262的實施例中，可以對第二級加法模塊250的配置及其所執行的處理進行相應的調整。Those skilled in the art should understand that the processing performed by the second-level addition module 250 is not limited to the above. The configuration of the second-stage adding module 250 may be determined according to the function of the adder. For example, in embodiments where adder 200 does not include output 262, the configuration of second stage adder module 250 and the processing performed by it may be adjusted accordingly.

第二級加法模塊250的輸出耦接到輸出261和262，輸出261和262分別表示輸入210與該常數的求和結果的和數及進位。The output of the second stage addition module 250 is coupled to outputs 261 and 262, which respectively represent the sum and carry of the summation result of the input 210 and the constant.

In some embodiments, the output 261 may be divided into two subsections: a first subsection 261-1, representing the lowest one or more bits, corresponding to 210-1, of the sum of the input 210 and the constant; and a second subsection 261-2, representing the remaining one or more bits of that sum. As shown in FIG. 2, in some embodiments the second-stage addition module 250 may couple the output of the first intermediate register 230-1 directly to the first subsection 261-1 of the output 261.

加法器200的計算速度主要依賴於第一級加法模塊組220和第二級加法模塊250的計算速度。The calculation speed of the adder 200 mainly depends on the calculation speed of the first-stage addition module group 220 and the second-stage addition module 250 .

關於第二級加法模塊250，類似地，輸入210的子部分的數量越少，或者第一個子部分210-1的位數越大，則第二級加法模塊250的計算延時越短。因此，期望減少輸入210的子部分的數量，並且增加第一個子部分210-1的位數。Regarding the second stage addition module 250, similarly, the smaller the number of subsections of the input 210, or the larger the number of bits of the first subsection 210-1, the shorter the computation delay of the second stage addition module 250. Therefore, it is desirable to reduce the number of subsections of the input 210 and increase the number of bits of the first subsection 210-1.

On the other hand, unlike in the adder 100, in the adder 200 the configuration of the first-stage addition modules 220-1, 220-2, …, 220-N may depend on the constant. As described above, if the bits of the constant corresponding to a given subsection of the input 210 are all zero, the first-stage addition module group 220 may omit the corresponding first-stage addition module. The calculation delay of the first-stage addition module group 220 is therefore independent of the number of bits of such subsections and depends only on the number of bits of the other subsections of the input 210 (that is, those for which the corresponding bits of the constant are not all zero). Specifically, it is desirable to appropriately increase the number of bits of the subsections that correspond to all-zero bits of the constant, and to reduce the maximum number of bits among the other subsections.

在一些實施例中，輸入210的子部分的數量和每個子部分的位數至少部分地根據該常數來確定。例如，如果該常數較小（即該常數的較高位全部為零，例如該常數為1），則輸入210可以具有兩個子部分，使得該常數的與第二個子部分對應的位全部為零。例如，如果該常數中包括全部為零的連續多個位，則可以與這連續多個位的至少一部分對應地劃分出輸入210的一個子部分。In some embodiments, the number of subsections of input 210 and the number of bits per subsection are determined based at least in part on the constant. For example, if the constant is small (ie, the higher bits of the constant are all zeros, eg, the constant is 1), then the input 210 may have two subsections such that the bits of the constant corresponding to the second subsection are all zeros . For example, if the constant includes a plurality of consecutive bits that are all zeros, a sub-portion of the input 210 may be divided corresponding to at least a portion of the consecutive plurality of bits.
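One way to picture a constant-driven division is the following sketch, which carves the input into subsections at long runs of zero bits in the constant. It is an illustration under stated assumptions, not the method of the disclosure: the function name `split_at_zero_runs` and the threshold `min_run` (the shortest zero run worth its own subsection) are hypothetical.

```python
def split_at_zero_runs(const, total_bits, min_run):
    """Return subsections as (low_bit, width, all_zero) tuples.

    Every run of at least `min_run` consecutive zero bits in `const`
    becomes its own subsection (all_zero=True); the bits in between are
    grouped into subsections that still need a first-stage module.
    """
    bits = [(const >> i) & 1 for i in range(total_bits)]
    segments = []
    start = 0   # low bit of the pending non-zero subsection
    i = 0
    while i < total_bits:
        if bits[i] == 0:
            j = i
            while j < total_bits and bits[j] == 0:
                j += 1
            if j - i >= min_run:
                if i > start:
                    segments.append((start, i - start, False))
                segments.append((i, j - i, True))
                start = j
            i = j
        else:
            i += 1
    if start < total_bits:
        segments.append((start, total_bits - start, False))
    return segments


# Constant 1 on a 32-bit input: two subsections, the upper one all zero.
print(split_at_zero_runs(1, 32, 8))
```

A practical division would also weigh the second-stage delay considerations discussed above, which favour a wider first subsection; this sketch ignores them.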

圖3示出了包括根據本公開一個或多個示例性實施例的加法器300的運算電路3000的一部分。FIG. 3 illustrates a portion of an arithmetic circuit 3000 including an adder 300 according to one or more exemplary embodiments of the present disclosure.

By way of example only, FIG. 3 shows the adder 300 as an adder of the kind shown in FIG. 1, which calculates the sum of two input numbers. However, those skilled in the art should understand that the adder 300 may be replaced with an adder of the kind shown in FIG. 2, which calculates the sum of one input number and a predetermined constant, with only appropriate adjustments to the arithmetic circuit 3000.

運算電路3000包括前一級寄存器3101、3102，加法器300，以及後一級寄存器3200。此外，在一些實施例中，運算電路3000還可以包括前置組合邏輯模塊3110和後置組合邏輯模塊3120。The arithmetic circuit 3000 includes registers 3101 and 3102 of the previous stage, an adder 300 , and a register 3200 of the subsequent stage. In addition, in some embodiments, the operation circuit 3000 may further include a pre-combination logic module 3110 and a post-combination logic module 3120 .

在一些實施例中，前一級寄存器3101、3102可以直接耦接到加法器300。在一些實施例中，前一級寄存器3101、3102可以經由前置組合邏輯模塊3110耦接到加法器300。在一些實施例中，加法器300可以直接耦接到後一級寄存器3200。在一些實施例中，加法器300可以經由後置組合邏輯模塊3120耦接到後一級寄存器3200。In some embodiments, the previous stage registers 3101 , 3102 may be directly coupled to the adder 300 . In some embodiments, the previous stage registers 3101 , 3102 may be coupled to the adder 300 via a pre-combination logic module 3110 . In some embodiments, the adder 300 may be directly coupled to the subsequent stage register 3200 . In some embodiments, the adder 300 may be coupled to the post-stage register 3200 via the post-combination logic module 3120 .

本領域技術人員應當理解，前一級寄存器3101、3102和後一級寄存器3200的數量和配置不限於圖3中的實施例。例如，在一些實施例中，運算電路3000可以僅包括一個前一級寄存器3101，該前一級寄存器3101經由前置組合邏輯模塊3110來向加法器300提供兩個輸入311、312。Those skilled in the art should understand that the number and configuration of the previous-level registers 3101 and 3102 and the subsequent-level registers 3200 are not limited to the embodiment shown in FIG. 3 . For example, in some embodiments, the arithmetic circuit 3000 may include only one previous stage register 3101 , which provides two inputs 311 , 312 to the adder 300 via the pre-combination logic module 3110 .

僅作為示例，圖3示出了運算電路3000包括前置組合邏輯模塊3110和後置組合邏輯模塊3120的實施例。本領域技術人員應當理解，以下描述同樣可以適用於運算電路3000不包括前置組合邏輯模塊3110或後置組合邏輯模塊3120的實施例，只需進行適當的調整。For example only, FIG. 3 shows an embodiment in which the arithmetic circuit 3000 includes a pre-combination logic module 3110 and a post-combination logic module 3120 . Those skilled in the art should understand that the following description is also applicable to the embodiment in which the operation circuit 3000 does not include the pre-combination logic module 3110 or the post-combination logic module 3120, and only needs to make appropriate adjustments.

加法器300與圖1所示的加法器100的配置類似。The adder 300 is similar in configuration to the adder 100 shown in FIG. 1 .

加法器300具有分別表示輸入的兩個數字的兩個輸入311、312，以及分別表示這兩個數字的求和結果的和數及進位兩個輸出361、362。其中，輸出361具有兩個子部分361-1、361-2。The adder 300 has two inputs 311, 312 respectively representing the two numbers input, and two outputs 361, 362 representing the sum and carry of the summation result of the two numbers, respectively. Therein, the output 361 has two subsections 361-1, 361-2.

加法器300包括：第一級加法模塊組320，包括多個中間寄存器330-1, 330-2, …, 330-N的中間寄存器組330，進位寄存器組340，以及第二級加法模塊350。其中，在一些實施例中，第二級加法模塊350將第一個中間寄存器330-1的輸出直接耦接到輸出361的第一個子部分361-1。The adder 300 includes: a first-stage addition module group 320 , an intermediate register group 330 including a plurality of intermediate registers 330 - 1 , 330 - 2 , . . . , 330 -N, a carry register group 340 , and a second-stage addition module 350 . Wherein, in some embodiments, the second stage addition module 350 directly couples the output of the first intermediate register 330-1 to the first subsection 361-1 of the output 361.

在運算電路3000中，用於前一級寄存器3101和3102、中間寄存器組330以及後一級寄存器3200的時鐘的頻率相同。因此，期望前置組合邏輯模塊3110和第一級加法模塊組320的運算能夠在一個時鐘週期內完成，並且第二級加法模塊350和後置組合邏輯模塊3120的運算能夠在一個時鐘週期內完成。In the arithmetic circuit 3000, the frequencies of the clocks used for the preceding-stage registers 3101 and 3102, the intermediate register group 330, and the succeeding-stage register 3200 are the same. Therefore, it is expected that the operations of the pre-combination logic module 3110 and the first-stage addition module group 320 can be completed in one clock cycle, and the operations of the second-stage addition module 350 and the post-combination logic module 3120 can be completed in one clock cycle. .

Therefore, it is desirable that the calculation delay of the first-stage addition module group 320 be less than the difference between the clock period and the calculation delay of the pre-combination logic module 3110, and that the calculation delay of the second-stage addition module 350 be less than the difference between the clock period and the calculation delay of the post-combination logic module 3120.
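The two timing conditions can be written compactly. The symbols here are introduced only for this illustration and do not appear in the original: $T_{\mathrm{clk}}$ is the clock period, $t_{\mathrm{pre}}$ and $t_{\mathrm{post}}$ are the calculation delays of the pre-combination logic module 3110 and the post-combination logic module 3120, and $t_{\mathrm{add1}}$ and $t_{\mathrm{add2}}$ are the calculation delays of the first-stage addition module group 320 and the second-stage addition module 350.

```latex
\begin{aligned}
t_{\mathrm{add1}} &< T_{\mathrm{clk}} - t_{\mathrm{pre}}, \\
t_{\mathrm{add2}} &< T_{\mathrm{clk}} - t_{\mathrm{post}}.
\end{aligned}
```

Equivalently, each register-to-register path must fit in one clock period: $t_{\mathrm{pre}} + t_{\mathrm{add1}} < T_{\mathrm{clk}}$ and $t_{\mathrm{add2}} + t_{\mathrm{post}} < T_{\mathrm{clk}}$. Register setup and clock-to-output times are ignored in this simplified form.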

Regarding the first-stage addition module group 320, as described above, in the adder 300 the larger the maximum number of bits among the subsections of the inputs 311, 312, the longer the calculation delay of the first-stage addition module group 320. Accordingly, the upper limit of that maximum number of bits may be determined at least in part from the difference between the clock period and the calculation delay of the pre-combination logic module 3110.

在一些實施例中，可以至少部分地根據時鐘週期與前置組合邏輯模塊3110的計算延時之差來確定加法器300的輸入311、312的多個子部分的位數中的最大位數。具體而言，可以將該最大位數確定為：使得第一級加法模塊組320的計算延時小於時鐘週期與前置組合邏輯模塊3110的計算延時之差。在一些實施例中，可以將該最大位數確定為：使得第一級加法模塊組320的計算延時基本等於時鐘週期與前置組合邏輯模塊3110的計算延時之差。In some embodiments, the maximum number of bits of the plurality of subsections of the inputs 311 , 312 of the adder 300 may be determined based at least in part on the difference between the clock period and the computation delay of the pre-combination logic module 3110 . Specifically, the maximum number of bits can be determined so that the calculation delay of the first-stage addition module group 320 is smaller than the difference between the clock cycle and the calculation delay of the pre-combination logic module 3110 . In some embodiments, the maximum number of bits may be determined such that the calculation delay of the first-stage addition module group 320 is substantially equal to the difference between the clock period and the calculation delay of the pre-combination logic module 3110 .

Furthermore, in embodiments in which the arithmetic circuit 3000 does not include the pre-combination logic module 3110, in some examples the maximum number of bits among the subsections of the inputs 311, 312 of the adder 300 is determined at least in part from the clock period.

另一方面，在加法器300為圖2所示的用於計算輸入的一個數字與預定的常數之和的加法器的情況下，第一級加法模塊組320的計算延時還與該常數有關。如上所述，可以根據該常數來調整多個子部分的位數。On the other hand, when the adder 300 is the adder shown in FIG. 2 for calculating the sum of an input number and a predetermined constant, the calculation delay of the first-stage adding module group 320 is also related to the constant. As mentioned above, the number of bits of the subsections can be adjusted according to this constant.

Therefore, in some embodiments, the maximum number of bits among the subsections of the input of the adder 300 may first be determined from the difference between the clock period and the calculation delay of the pre-combination logic module 3110, and the numbers of bits of the subsections may then be adjusted according to the constant. For example, if the constant includes a run of consecutive bits that are all zero, a subsection of the input may be defined to correspond to that run, regardless of whether its number of bits exceeds the determined maximum. In some embodiments, the numbers of bits of the subsections may be adjusted so that the maximum number of bits among the subsections that do not correspond to all-zero bits of the constant is substantially equal to the determined maximum.

Regarding the second-stage addition module 350, as described above, in the adder 300 the greater the number of subsections of the inputs 311, 312, or the smaller the number of bits of the first subsection, the longer the calculation delay of the second-stage addition module 350. Accordingly, the lower limit of the number of bits of the first subsection of the inputs 311, 312 of the adder 300 may be determined at least in part from the difference between the clock period and the calculation delay of the post-combination logic module 3120.

在一些實施例中，至少部分地根據時鐘週期與後置組合邏輯模塊3120的計算延時之差來確定加法器300的輸入311、312的第一個子部分的位數。具體而言，可以將該第一個子部分的位數確定為：使得第二級加法模塊350的計算延時小於時鐘週期與後置組合邏輯模塊3120的計算延時之差。在一些實施例中，可以將該第一個子部分的位數確定為：使得第二級加法模塊350的計算延時基本等於時鐘週期與後置組合邏輯模塊3120的計算延時之差。In some embodiments, the number of bits of the first subsection of the inputs 311 , 312 of the adder 300 is determined based at least in part on the difference between the clock period and the computation delay of the post-combination logic module 3120 . Specifically, the number of bits of the first subsection can be determined such that the calculation delay of the second-stage addition module 350 is smaller than the difference between the clock cycle and the calculation delay of the post-combination logic module 3120 . In some embodiments, the number of bits of the first subsection may be determined such that the calculation delay of the second stage addition module 350 is substantially equal to the difference between the clock period and the calculation delay of the post-combination logic module 3120 .

此外，在運算電路3000不包括後置組合邏輯模塊3120的實施例中，在一些示例中，至少部分地根據時鐘週期來確定加法器300的輸入311、312的第一個子部分的位數。Furthermore, in embodiments where the operational circuit 3000 does not include the post-combination logic module 3120, in some examples, the number of bits of the first subsection of the inputs 311, 312 of the adder 300 is determined based at least in part on the clock cycle.

另一方面，如上面所提到的，在一些實施例中，將輸入311、312的第一個子部分的位數確定為大於或等於其他子部分的位數。在一些實施例中，將輸入311、312的多個子部分的位數確定為基本相等。On the other hand, as mentioned above, in some embodiments, the number of bits of the first subsection of the input 311, 312 is determined to be greater than or equal to the number of bits of the other subsections. In some embodiments, the number of bits of the multiple sub-portions of the inputs 311, 312 are determined to be substantially equal.

在一些實施例中，可以將以上所述的策略結合起來以確定加法器300的輸入311、312的子部分的數量和位數。In some embodiments, the strategies described above may be combined to determine the number and number of bits of subsections of the inputs 311 , 312 of the adder 300 .

For example, in some embodiments, the number of bits of the first subsection of the inputs 311, 312 of the adder 300 may first be determined from the difference between the clock period and the calculation delay of the pre-combination logic module 3110 together with the difference between the clock period and the calculation delay of the post-combination logic module 3120. For instance, the upper limit of the number of bits of the first subsection may be determined from the former difference, and the lower limit from the latter.

而後，如果確定的第一個子部分的位數大於或等於輸入311、312的位數的一半，則可以將輸入311、312的其他位劃分為第二個子部分。這樣，輸入311、312被劃分為兩個子部分，其中第一個子部分的位數大於或等於第二個子部分的位數。Then, if the determined number of bits of the first subsection is greater than or equal to half of the number of bits of the inputs 311, 312, the other bits of the inputs 311, 312 can be divided into the second subsection. In this way, the input 311, 312 is divided into two subsections, wherein the number of bits in the first subsection is greater than or equal to the number of bits in the second subsection.

If the determined number of bits of the first subsection is less than half the number of bits of the inputs 311, 312, the remaining bits of the inputs 311, 312 may be divided into several subsections such that the number of these subsections is as small as possible and each of them has no more bits than the first subsection. For example, the numbers of bits of these subsections may be set substantially equal to that of the first subsection.
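The division rule in the two cases above can be sketched as a small helper. This is only one possible reading, assuming the first-subsection width has already been chosen from the timing bounds; the function name `partition_widths` and the choice to spread the remaining bits as evenly as possible are assumptions made for the example.

```python
import math


def partition_widths(total_bits, first_width):
    """Return subsection widths, least significant first.

    The first subsection has `first_width` bits; the remaining bits are
    split into as few subsections as possible, none wider than the first,
    and spread as evenly as possible among them.
    """
    if not 0 < first_width < total_bits:
        raise ValueError("need at least two subsections")
    remaining = total_bits - first_width
    count = math.ceil(remaining / first_width)
    base, extra = divmod(remaining, count)
    return [first_width] + [base + 1] * extra + [base] * (count - extra)


print(partition_widths(32, 20))  # [20, 12]
print(partition_widths(32, 10))  # [10, 8, 7, 7]
```

When the first subsection covers at least half of the input, `count` evaluates to 1 and the result is the two-subsection case; otherwise the remaining bits are split into the fewest subsections that respect the width limit.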

When the adder 300 is the adder 200 shown in FIG. 2, which calculates the sum of one input number and a predetermined constant, the number of subsections and their numbers of bits may be further adjusted according to the constant. For example, if the constant includes a run of consecutive bits that are all zero, a subsection of the input may be defined to correspond to that run, regardless of whether its number of bits exceeds that of the determined first subsection.

Those skilled in the art should understand that the manner of determining the number of subsections of the adder's input and their numbers of bits is not limited to the specific embodiments described above. The strategies described herein may be used independently or in combination, taking into account factors such as the function, configuration, area, cost, speed, and power consumption of the adder and the arithmetic circuit, to determine the number of subsections of the adder's input and their numbers of bits.

與需要使用高速器件的相關技術相比，本公開以較低的成本和較低的功耗實現了加法器的運算速度的提升。Compared with the related art that needs to use high-speed devices, the present disclosure achieves an increase in the operation speed of the adder with lower cost and lower power consumption.

For comparison, FIG. 4 shows a part of an arithmetic circuit 4000 including an adder 4120 according to the related art.

運算電路4000包括第一級寄存器4101、4102，前置組合邏輯模塊4110，加法器4120，第二級寄存器4200，後置組合邏輯模塊4210，以及第三級寄存器4300。The arithmetic circuit 4000 includes first-level registers 4101 and 4102 , a pre-combination logic module 4110 , an adder 4120 , a second-level register 4200 , a post-combination logic module 4210 , and a third-level register 4300 .

其中，第一級寄存器4101、4102經由前置組合邏輯模塊4110和加法器4120耦接到第二級寄存器4200。第二級寄存器4200經由後置組合邏輯模塊4210耦接到第三級寄存器4300。The first-level registers 4101 and 4102 are coupled to the second-level register 4200 via the pre-combination logic module 4110 and the adder 4120 . The second level register 4200 is coupled to the third level register 4300 via the post-combination logic module 4210 .

It can be seen that the first-level registers 4101 and 4102, the second-level register 4200, and the third-level register 4300 in the arithmetic circuit 4000 shown in FIG. 4 correspond, respectively, to the previous-stage registers 3101 and 3102, the intermediate register group 330, and the subsequent-stage register 3200 in the arithmetic circuit 3000 shown in FIG. 3. Correspondingly, the pre-combination logic module 4110 and the post-combination logic module 4210 in the arithmetic circuit 4000 correspond, respectively, to the pre-combination logic module 3110 and the post-combination logic module 3120 in the arithmetic circuit 3000.

An important difference between the present disclosure and the related art is that the adder 4120 in the related-art arithmetic circuit 4000 shown in FIG. 4 is coupled between the first-level registers 4101, 4102 and the second-level register 4200, whereas the adder 300 in the arithmetic circuit 3000 of the present disclosure shown in FIG. 3 is coupled between the previous-stage registers 3101, 3102 and the subsequent-stage register 3200, spanning the intermediate register group 330.

In the related art shown in FIG. 4, the adder 4120, together with the pre-combination logic module 4110, can operate only within the single clock cycle between the first-level registers 4101, 4102 and the second-level register 4200. By contrast, in the technical solution of the present disclosure shown in FIG. 3, the adder 300, together with the pre-combination logic module 3110 and the post-combination logic module 3120, can operate across two clock cycles: one between the previous-stage registers 3101, 3102 and the intermediate register group 330, and one between the intermediate register group 330 and the subsequent-stage register 3200. In the technical solution of the present disclosure, the configuration of the adder 300 can be adjusted appropriately according to the configurations of the pre-combination logic module 3110 and the post-combination logic module 3120, so that the time of the two clock cycles is used more fully and more flexibly to complete the addition.
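The effect of splitting the addition across the register boundary can be illustrated with a small clocked model of a two-input adder. This is a behavioural sketch under simplifying assumptions: the class name `PipelinedAdder` and its `tick` method are hypothetical, the pre- and post-combination logic is omitted, and one call to `tick` stands for the clock edge at which the intermediate and carry registers latch.

```python
class PipelinedAdder:
    """Two-stage adder with intermediate and carry registers in between."""

    def __init__(self, widths):
        self.widths = widths   # subsection bit widths, least significant first
        self.regs = None       # (sums, carries) held by the registers

    def _stage1(self, a, b):
        sums, carries = [], []
        for w in self.widths:
            mask = (1 << w) - 1
            total = (a & mask) + (b & mask)
            sums.append(total & mask)
            carries.append(total >> w)
            a >>= w
            b >>= w
        return sums, carries

    def _stage2(self, sums, carries):
        result, shift, carry = 0, 0, 0
        for s, c, w in zip(sums, carries, self.widths):
            total = s + carry
            result |= (total & ((1 << w) - 1)) << shift
            carry = (total >> w) + c   # at most one of the two terms is 1
            shift += w
        return result, carry

    def tick(self, a, b):
        """One clock edge: finish the addition latched on the previous
        edge, then latch the first-stage results for the new operands."""
        out = None if self.regs is None else self._stage2(*self.regs)
        self.regs = self._stage1(a, b)
        return out


adder = PipelinedAdder([8, 8])
print(adder.tick(0x00FF, 0x0001))  # None: nothing latched yet
print(adder.tick(0xFFFF, 0x0001))  # (256, 0): result of the first pair
print(adder.tick(0, 0))            # (0, 1): result of the second pair
```

Each result appears one call after its operands were presented, while a new pair of operands can still be accepted on every call, which mirrors the idea of spreading the addition over two adjacent clock cycles.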

In addition, those skilled in the art should understand that the operations performed by the first-stage addition module group 320 and the second-stage addition module 350 in the adder 300 shown in FIG. 3 are substantially equivalent to the operation performed by the adder 4120 shown in FIG. 4. In other words, the adder 4120 also contains modules or units equivalent or corresponding to the first-stage addition module group 320 and the second-stage addition module 350 in the adder 300. Therefore, compared with the related art, the first-stage addition module group 320 and the second-stage addition module 350 in the adder 300 introduce no additional cost.

也就是說，與圖4所示的相關技術相比，實現圖3所示的運算電路3000中的加法器300所需要的額外模塊或單元僅是進位寄存器組340。如上所述，進位寄存器組340中的每個進位寄存器均由1比特寄存器來實現，其製造成本較低。換言之，與圖4所示的相關技術相比，實現本公開的加法器的額外成本基本上僅僅是若干個1比特寄存器的製造成本。That is, compared with the related art shown in FIG. 4 , the additional module or unit required to implement the adder 300 in the arithmetic circuit 3000 shown in FIG. 3 is only the carry register group 340 . As described above, each carry register in the carry register group 340 is implemented by a 1-bit register, which has a low manufacturing cost. In other words, compared to the related art shown in FIG. 4 , the additional cost of implementing the adder of the present disclosure is substantially only the manufacturing cost of several 1-bit registers.

Therefore, compared with the related art, the adder proposed by the present disclosure makes use of the clock cycle of an adjacent stage to complete part of its operation, thereby effectively increasing the operation speed of the adder, and of an arithmetic circuit including the adder, at a relatively low cost.

The adder and the arithmetic circuit according to the present disclosure can be implemented in various appropriate manners, such as software, hardware, or a combination of software and hardware. In one implementation, a chip may include the arithmetic circuit described above, and the chip may in turn be included in a computing device.

The words "front," "rear," "top," "bottom," "over," "under," and the like in the specification and claims, if present, are used for descriptive purposes and not necessarily to describe invariant relative positions. It is to be understood that the words so used are interchangeable under appropriate circumstances, such that the embodiments of the present disclosure described herein are, for example, capable of operating in orientations other than those illustrated or otherwise described herein.

As used herein, the word "exemplary" means "serving as an example, instance, or illustration" rather than as a "model" to be exactly reproduced. Any implementation described herein as exemplary is not necessarily to be construed as preferred or advantageous over other implementations. Furthermore, the present disclosure is not bound by any expressed or implied theory presented in the preceding technical field, background, summary, or detailed description.

As used herein, the word "substantially" is meant to encompass any minor variation due to design or manufacturing imperfections, tolerances of devices or elements, environmental influences, and/or other factors. The word "substantially" also allows for differences from a perfect or ideal situation due to parasitic effects, noise, and other practical considerations that may exist in an actual implementation.

Additionally, the preceding description may refer to elements, nodes, or features being "connected" or "coupled" together. As used herein, unless expressly stated otherwise, "connected" means that one element/node/feature is electrically, mechanically, logically, or otherwise directly connected to (or directly communicates with) another element/node/feature. Similarly, unless expressly stated otherwise, "coupled" means that one element/node/feature can be mechanically, electrically, logically, or otherwise linked, directly or indirectly, with another element/node/feature to allow interaction, even though the two features may not be directly connected. That is, "coupled" is intended to encompass both direct and indirect links between elements or other features, including connections made through one or more intervening elements.

In addition, terms such as "first" and "second" may be used herein for reference purposes only and are thus not intended to be limiting. For example, the words "first," "second," and other such numerical terms referring to structures or elements do not imply a sequence or order unless the context clearly indicates otherwise.

It should also be understood that the term "comprising/including," when used herein, indicates the presence of the stated features, integers, steps, operations, units, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, units, and/or components, and/or combinations thereof.

In the present disclosure, the term "providing" is used in a broad sense to cover all ways of obtaining an object; thus "providing an object" includes, but is not limited to, "purchasing," "preparing/manufacturing," "arranging/setting up," "installing/assembling," and/or "ordering" the object.

Those skilled in the art will appreciate that the boundaries between the operations described above are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be performed at least partially overlapping in time. Furthermore, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be changed in various other embodiments. Other modifications, variations, and substitutions are likewise possible. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

While some specific embodiments of the present disclosure have been described in detail by way of example, those skilled in the art should understand that the above examples are provided for illustration only and are not intended to limit the scope of the present disclosure. The embodiments disclosed herein may be combined arbitrarily without departing from the spirit and scope of the present disclosure. Those skilled in the art should also understand that various modifications may be made to the embodiments without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

100: Adder
111, 112: Inputs
111-1~111-N: Subsections
112-1~112-N: Subsections
120: First-stage addition module group
120-1~120-N: First-stage addition modules
130: Intermediate register group
130-1~130-N: Intermediate registers
140: Carry register group
140-1~140-N: Carry registers
150: Second-stage addition module
161, 162: Outputs
161-1, 161-2: Subsections
200: Adder
210: Input
210-1~210-N: Subsections
220: First-stage addition module group
220-1~220-N: First-stage addition modules
230: Intermediate register group
230-1~230-N: Intermediate registers
240: Carry register group
240-1~240-N: Carry registers
250: Second-stage addition module
261, 262: Outputs
261-1, 261-2: Subsections
300: Adder
311, 312: Inputs
320: First-stage addition module group
330: Intermediate register group
330-1~330-N: Intermediate registers
340: Carry register group
350: Second-stage addition module
361, 362: Outputs
361-1, 361-2: Subsections
3000: Arithmetic circuit
3101, 3102: Preceding-level registers
3110: Pre-combination logic module
3120: Post-combination logic module
3200: Next-level register
4000: Arithmetic circuit
4101, 4102: First-level registers
4110: Pre-combination logic module
4120: Adder
4200: Second-level register
4210: Post-combination logic module
4300: Third-level register

The drawings, which form a part of the specification, illustrate embodiments of the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.

The present disclosure may be more clearly understood from the following detailed description with reference to the drawings, in which:

FIG. 1 shows a schematic diagram of an adder for calculating the sum of two input numbers according to one or more exemplary embodiments of the present disclosure.

FIG. 2 shows a schematic diagram of an adder for calculating the sum of one input number and a predetermined constant according to one or more exemplary embodiments of the present disclosure.

FIG. 3 shows a part of an arithmetic circuit including an adder according to one or more exemplary embodiments of the present disclosure.

FIG. 4 shows a part of an arithmetic circuit including an adder according to the related art.

Note that, in the embodiments described below, the same reference numerals are sometimes used in common between different drawings to denote the same parts or parts having the same function, and repeated descriptions thereof are omitted. In some cases, similar numerals and letters are used to denote similar items, so once an item is defined in one drawing, it does not need to be discussed further in subsequent drawings.

For ease of understanding, the position, size, range, and the like of each structure shown in the drawings and the like may not represent the actual position, size, range, and the like. Therefore, the present disclosure is not limited to the positions, sizes, ranges, and the like disclosed in the drawings and the like.


Claims

An adder for calculating the sum of two input numbers, the adder having two inputs respectively representing the two numbers, wherein each of the two inputs is divided into a plurality of mutually corresponding subsections, the plurality of subsections representing partial bits of the input in order from low-order bits to high-order bits, the adder comprising: a plurality of first-stage addition modules operating in parallel with each other, each first-stage addition module being used for summing the corresponding subsections of the two inputs; a plurality of intermediate registers, each intermediate register being coupled to a corresponding first-stage addition module and used for storing the sum of the corresponding subsections of the two inputs output by the corresponding first-stage addition module; one or more carry registers, each carry register being coupled to a corresponding first-stage addition module and used for storing the carry of the corresponding subsections of the two inputs output by the corresponding first-stage addition module; and a second-stage addition module, coupled to the plurality of intermediate registers and the one or more carry registers, for summing the sum from each intermediate register with the carry from the corresponding preceding carry register; wherein the number of bits of each subsection of each of the two inputs is determined at least in part according to the overall calculation delay of the plurality of first-stage addition modules.

The adder of claim 1, wherein the second-stage addition module directly couples the output of a first intermediate register, among the plurality of intermediate registers, that corresponds to the first subsection of the two inputs to the output of the adder, wherein the first subsection of each of the two inputs represents the lowest one or more bits of that input.

The adder of claim 1 or 2, wherein the number of bits of the first subsection of each of the two inputs is greater than or equal to the number of bits of the other subsections of the input.

The adder of claim 1 or 2, wherein each of the two inputs has two subsections.

An adder for calculating the sum of one input number and a predetermined constant, the adder having an input representing the number, the input being divided into a plurality of subsections, the plurality of subsections representing partial bits of the input in order from low-order bits to high-order bits, the adder comprising: one or more first-stage addition modules operating in parallel with each other, each first-stage addition module being used for summing a corresponding subsection of the input with the corresponding bits of the constant; a plurality of intermediate registers, each intermediate register being coupled to a corresponding first-stage addition module and used for storing the sum of the corresponding subsection of the input and the corresponding bits of the constant output by the corresponding first-stage addition module; one or more carry registers, each carry register being coupled to a corresponding first-stage addition module and used for storing the carry of the corresponding subsection of the input and the corresponding bits of the constant output by the corresponding first-stage addition module; and a second-stage addition module, coupled to the plurality of intermediate registers and the one or more carry registers, for summing the sum from each intermediate register with the carry from the corresponding preceding carry register; wherein the number of bits of each subsection of the input is determined at least in part according to the overall calculation delay of the plurality of first-stage addition modules.

The adder of claim 5, wherein the second-stage addition module directly couples the output of a first intermediate register, among the plurality of intermediate registers, that corresponds to the first subsection of the input to the output of the adder, wherein the first subsection represents the lowest one or more bits of the input.

The adder of claim 5 or 6, wherein the number of subsections of the input and the number of bits per subsection are determined at least in part from the constant.

The adder of claim 5 or 6, wherein the number and configuration of first stage adding modules is determined at least in part from the constant.

The adder of claim 8, wherein the constant is one.

The adder of claim 5 or 6, wherein the input has two subsections.

An arithmetic circuit comprising: the adder of any one of claims 1 to 10; and at least one of a pre-combination logic module coupled to an input of the adder and a post-combination logic module coupled to an output of the adder.

The arithmetic circuit of claim 11, wherein the number of subsections of the input of the adder and the number of bits of each subsection are determined at least in part according to the calculation delay of at least one of the pre-combination logic module and the post-combination logic module and the period of the clock used for the arithmetic circuit.

The arithmetic circuit of claim 12, wherein, if the arithmetic circuit includes a pre-combination logic module, the largest of the numbers of bits of the plurality of subsections of the input of the adder is determined at least in part according to the difference between the period of the clock used for the arithmetic circuit and the calculation delay of the pre-combination logic module; and if the arithmetic circuit does not include a pre-combination logic module, the largest of the numbers of bits of the plurality of subsections of the input of the adder is determined at least in part according to the period of the clock used for the arithmetic circuit.

The arithmetic circuit of claim 12, wherein, if the arithmetic circuit includes a post-combination logic module, the number of bits of the first subsection of the input of the adder is determined at least in part according to the difference between the period of the clock used for the arithmetic circuit and the calculation delay of the post-combination logic module; and if the arithmetic circuit does not include a post-combination logic module, the number of bits of the first subsection of the input of the adder is determined at least in part according to the period of the clock used for the arithmetic circuit.

A chip comprising the arithmetic circuit of any one of claims 11 to 14.

A computing device comprising the chip of claim 15.