WO2022170811A1 - Fixed-point multiply-add operation unit and method suitable for mixed-precision neural network - Google Patents
Fixed-point multiply-add operation unit and method suitable for mixed-precision neural network
- Publication number
- WO2022170811A1 (PCT/CN2021/131800)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/57—Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the invention relates to the field of digital circuits, in particular to a fixed-point multiply-add operation unit and method suitable for a mixed-precision neural network.
- artificial intelligence algorithms are widely used in many commercial fields.
- the quantization of different layers of the network is one of the important methods to improve the efficiency of network computing.
- artificial intelligence chips have an increasing demand for mixed-precision computing in the process of data processing in order to meet the characteristics of network design.
- Conventional processors use a variety of processing units of different precisions to handle mixed-precision operations. This approach incurs excessive hardware overhead, leaves redundant resources idle, and introduces long delays when switching between hardware of different precisions, which reduces throughput and prevents application-specific optimization.
- the technical problem to be solved by the present invention is to provide a fixed-point multiply-add operation unit and method suitable for mixed-precision neural networks in view of the above-mentioned defects of the prior art, aiming to solve the prior-art problems of excessive hardware overhead and redundant idle resources caused by using a variety of processing units of different precisions to handle mixed-precision operations.
- an embodiment of the present invention provides a fixed-point multiply-add operation method suitable for mixed-precision neural networks, wherein the method includes:
- the target sum is truncated, and the data obtained after the truncation is used as the dot-product result of the input data.
- acquiring the mode signal and the input data, determining the data input position according to the mode signal, and inputting the input data from the data input position into the multiplier includes:
- the number of called multipliers is greater than 1;
- the number of called multipliers is 1;
- a data input location is determined based on the mode signal, and the input data is input into a multiplier from the data input location.
- acquiring the mode signal, processing the partial products generated by the multiplier according to the mode signal, performing a summation operation, and using the data obtained after the summation operation as the target sum includes:
- a summation operation is performed on the first partial product generation part and the second partial product generation part, and the data obtained after the summation operation is used as a target sum.
- the mode signal is determined by the precision of the input data; the processing includes at least one of the following operations:
- performing a summation operation on the first partial product generation part and the second partial product generation part, and obtaining the target sum based on the summation operation, includes:
- the output result of the second stage compressor c is input into an adder, and the output result of the adder is used as a target sum.
- performing a summation operation on the first partial product generation part and the second partial product generation part, and obtaining the target sum based on the summation operation, includes:
- when the highest bit number of the input data is equal to the highest bit number of the multiplier, the first partial product generation part and the second partial product generation part are input into the first-stage compressor a and the first-stage compressor b, respectively;
- the output results of the first-stage compressor a and the first-stage compressor b are respectively input into the first adder and the second adder, and the sum of the output results of the first adder and the second adder is used as the target sum.
- performing a summation operation on the first partial product generation part and the second partial product generation part, and obtaining the target sum based on the summation operation, includes:
- when the highest bit number of the input data is greater than the highest bit number of the multiplier, the multiplier includes a first multiplier and a second multiplier, the second multiplier being the multiplier that performs the low-order operation; the first multiplier outputs the first partial product generation part, and the second multiplier outputs the second partial product generation part;
- the sum of the output results of the first adder and the second adder is used as the target sum.
- truncating the target sum and using the data obtained after the truncation as the dot-product result of the input data includes:
- a truncation operation is performed on the target sum starting from the 0th bit, and the data obtained after the truncation operation is used as the dot-product result of the input data.
- the method further includes:
- an embodiment of the present invention further provides a fixed-point multiply-add operation unit suitable for a mixed-precision neural network, wherein the operation unit includes:
- a position determination module for acquiring a mode signal and input data, determining a data input position according to the mode signal, and inputting the input data into a multiplier from the data input position;
- a partial product processing module for processing the partial product generated by the multiplier according to the mode signal, and performing a summation operation, using the data obtained after the summation operation as a target sum;
- the result generation module is used for truncating the target sum, and using the data obtained after the truncation as the dot-product result of the input data.
- the present invention inputs data of different precisions into the multiplier from positions determined by their precision, controls the multiplier according to the mode signal to mask the partial products of designated areas and output the partial product generation parts, and performs the summation operation on the output partial product generation parts according to the methods corresponding to the different precisions, thereby realizing mixed-precision dot-product operations.
- a single multiplier can thus be used to realize the dot-product operations of a mixed-precision neural network, which solves the prior-art problems of excessive hardware overhead and redundant idle resources caused by using a variety of processing units of different precisions to handle mixed-precision operations.
- FIG. 1 is a schematic flowchart of a fixed-point multiply-add operation method suitable for a mixed-precision neural network provided by an embodiment of the present invention.
- FIG. 2 is a schematic diagram of the partial products generated in a conventional 8bit×8bit multiplier provided by an embodiment of the present invention.
- FIG. 3 is the addition tree structure used by a conventional 8bit×8bit multiplier provided by an embodiment of the present invention.
- FIG. 4 is a reference diagram for implementing multiplication operations of four groups of input data with a precision of 2bit×2bit based on a group of 8bit×8bit multipliers provided by an embodiment of the present invention.
- FIG. 5 is a reference diagram for implementing multiplication operations of two groups of input data with a precision of 4bit×4bit based on a group of 8bit×8bit multipliers provided by an embodiment of the present invention.
- FIG. 6 is a reference diagram for implementing a multiplication operation of input data with a precision of 1bit×1bit based on an 8bit×8bit multiplier according to an embodiment of the present invention.
- FIG. 7 is a reference diagram for implementing a multiplication operation of input data with a precision of 3bit×3bit based on an 8bit×8bit multiplier according to an embodiment of the present invention.
- FIG. 8 is a reference diagram for implementing a multiplication operation of input data with a precision of 5bit×5bit based on an 8bit×8bit multiplier according to an embodiment of the present invention.
- FIG. 9 is a reference diagram for implementing a multiplication operation of input data with a precision of 6bit×6bit based on an 8bit×8bit multiplier according to an embodiment of the present invention.
- FIG. 10 is a reference diagram for implementing a multiplication operation of input data with a precision of 7bit×7bit based on an 8bit×8bit multiplier according to an embodiment of the present invention.
- FIG. 11 is a reference diagram for realizing the multiplication of two groups of 4bit×8bit mixed-precision input data by splitting and separately summing the partial product generation parts based on an 8bit×8bit multiplier according to an embodiment of the present invention.
- FIG. 12 is a reference diagram for implementing a multiplication operation of input data with a mixed precision of 8bit×16bit based on two sets of 8bit×8bit multipliers according to an embodiment of the present invention.
- FIG. 13 is a schematic diagram of accumulating the output data of the first multiplier and the second multiplier under mixed precision provided by an embodiment of the present invention.
- FIG. 14 is a reference diagram for implementing a multiplication operation of input data with a mixed precision of 8bit×xbit (x = 9–15) based on two sets of 8bit×8bit multipliers provided by an embodiment of the present invention.
- FIG. 15 is a reference diagram for implementing a multiplication operation of input data with a mixed precision of 8bit×15bit based on two sets of 8bit×8bit multipliers provided by an embodiment of the present invention.
- FIG. 16 is a schematic diagram of the partial products, including the sign bit, in an 8bit×8bit multiplier according to an embodiment of the present invention.
- FIG. 17 is a reference diagram of the internal modules of an operation unit provided by an embodiment of the present invention.
- artificial intelligence algorithms are widely used in many commercial fields.
- the quantization of different layers of the network is one of the important methods to improve the efficiency of network computing.
- artificial intelligence chips have an increasing demand for mixed-precision computing in the process of data processing in order to meet the characteristics of network design.
- Conventional processors use a variety of processing units of different precisions to handle mixed-precision operations. This approach incurs excessive hardware overhead, leaves redundant resources idle, and introduces long delays when switching between hardware of different precisions, which reduces throughput and prevents application-specific optimization.
- the present invention provides a fixed-point multiply-add operation method suitable for mixed-precision neural networks.
- the multiplier is controlled according to the mode signal to mask the partial products of the specified areas and then output the partial product generation parts, and the summation operation is performed on the output partial product generation parts according to the methods corresponding to different precisions, thereby realizing mixed-precision dot-product operations.
- a single multiplier can thus be used to realize the dot-product operations of a mixed-precision neural network, which solves the prior-art problems of excessive hardware overhead and redundant idle resources caused by using a variety of processing units of different precisions to handle mixed-precision operations.
- the method includes the following steps:
- Step S100: acquire a mode signal and input data, determine the data input position according to the mode signal, and input the input data into the multiplier from the data input position.
- since this embodiment uses a unified multiplier to perform the dot-product operations of the mixed-precision neural network, and the bit width of the multiplier's input positions is fixed, the precision of the input data may not match the highest bit width of the multiplier.
- this embodiment therefore needs to acquire a mode signal and input data, determine the data input position according to the mode signal, and then input the input data into the multiplier from the data input position.
- input data of different precisions are input into the multiplier from different data input positions, thereby realizing the dot-product operation of the mixed-precision neural network with a unified multiplier.
- step S100 specifically includes the following steps:
- Step S110: obtain the mode signal and input data, and determine the number of multipliers to be called according to the precision of the input data;
- Step S120: when the highest precision of the input data is higher than the highest bit of the multiplier, the number of called multipliers is greater than 1;
- Step S130: determine the data input position according to the mode signal, split the data with the highest precision in the input data, and input the input data obtained after the splitting into the multipliers from the data input position;
- Step S140: when the highest precision of the input data is lower than or equal to the highest bit of the multiplier, the number of called multipliers is 1;
- Step S150: determine the data input position according to the mode signal, and input the input data into the multiplier from the data input position.
- since this embodiment adopts a unified multiplier and the highest bit number of the multiplier is fixed, the precision of the multiplier may not match the precision of the input data.
- for example, the multiplier is an 8bit×8bit multiplier and the precision of the input data is 3bit×3bit, or the multiplier is an 8bit×8bit multiplier and the precision of the input data is 8bit×16bit. Therefore, the number of multipliers to be called needs to be determined according to the precision of the input data. It can be understood that if the precision of the input data exceeds the precision of the multiplier, the input data cannot be multiplied by a single multiplier, and in this case multiple multipliers need to be called.
- in that case the number of called multipliers is greater than 1; the data input position is then determined according to the mode signal, the data with the highest precision in the input data is split, and the input data obtained after the splitting is input into the multipliers from the data input position.
- for example, when the input data is mixed-precision 8bit×16bit and the multiplier is an 8bit×8bit multiplier, two 8bit×8bit multipliers need to be called to realize the multiplication of the mixed-precision 8bit×16bit input data.
- the 8-bit part of the data can be directly input into the multipliers from the specified data input position, while the 16-bit part of the input data needs to be split before being input into the two multipliers respectively (as shown in Figure 12).
- when the highest precision of the input data is lower than or equal to the highest bit of the multiplier, the data input position is determined according to the mode signal, and the input data is input into the multiplier from the data input position.
- for example, when the precision of the input data is 3bit×3bit and an 8bit×8bit multiplier is used, only one 8bit×8bit multiplier needs to be called to realize the multiplication of the input data.
- here the highest precision of the data does not exceed the highest bit of the multiplier, so the input data can be directly input into the multiplier from the specified data input position for the operation (as shown in Figure 7).
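- The following minimal Python sketch illustrates this input-planning step. The function names, the fixed 8-bit multiplier width, and the high/low split of the wide operand are illustrative assumptions, not the patent's literal implementation.

```python
# Hypothetical sketch of step S100: decide how many 8bit x 8bit multipliers
# to call for a given input precision, and split a wide operand if needed.

MULT_WIDTH = 8  # highest bit width of the unified multiplier (assumed)

def plan_inputs(a_bits: int, b_bits: int):
    """Return (number of multipliers to call, whether the wide operand splits)."""
    if max(a_bits, b_bits) <= MULT_WIDTH:
        return 1, False           # e.g. 3bit x 3bit: one multiplier, no split
    return 2, True                # e.g. 8bit x 16bit: two multipliers, split

def split_operand(b: int):
    """Split a 9- to 16-bit operand into its high part and its 8-bit low part."""
    return b >> 8, b & 0xFF

print(plan_inputs(3, 3))          # (1, False)
print(plan_inputs(8, 16))         # (2, True)
print(split_operand(0xBEEF))      # (0xBE, 0xEF)
```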
- the method further includes the following steps:
- Step S200: obtain a mode signal, process the partial products generated by the multiplier according to the mode signal, perform a summation operation, and use the data obtained after the summation operation as the target sum.
- the highest bit number of the multiplier may not be equal to the precision of the input data.
- the concept of a mode signal is introduced, and the partial product generated by the multiplier is processed by the mode signal, so that only the partial product generation part corresponding to the input data is left.
- the mode signal is equivalent to a control command, and the control system performs different processing on the partial products of different regions generated by the multiplier.
- the mode signal is determined by the precision of the input data, and the processing includes at least one of the following two operations: 1) masking the partial products of the preset area generated by the multiplier; 2) when the number of called multipliers is greater than 1, shifting the partial product generation part output by the multiplier that performs the low-order operation.
- for example, when the multiplier is an 8bit×8bit multiplier, the partial products generated by the 8bit×8bit multiplier are gated by the mode signal, and the partial products not needed under that mode signal are masked.
- the masking process can be implemented by setting the output of the unneeded partial product generation parts to 0, or to 1 for the high-order complement (sign-extension) bits.
- Figure 4 shows the multiplication and accumulation operations of 4 groups of 2bit×2bit input data.
- the blocks of the same shade represent the same group of multiplier input data, the multiplicand input data, or the partial product generation parts corresponding to that group of input data. For these 4 groups of input data, a specific mode signal is generated, and all partial products other than those corresponding to the 4 groups of input data are masked.
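- As a concrete illustration, the sketch below models this masking in Python. It assumes unsigned operands and the diagonal block layout suggested by FIG. 4; the function names and the mode encoding are invented for illustration only.

```python
# A minimal sketch of how a mode signal can mask unwanted partial products so
# that one 8x8 partial-product array carries four independent 2bit x 2bit
# products (the real unit would then accumulate the four group products).

def pp_matrix(a: int, b: int, n: int = 8):
    """Row r holds the bits of A when bit r of B is set, before shifting."""
    return [[(a >> c) & 1 if (b >> r) & 1 else 0 for c in range(n)]
            for r in range(n)]

def masked_dot_2x2(a_ops, b_ops):
    """Four 2-bit groups packed along the diagonal; mask, then sum all bits."""
    a = sum(x << (2 * g) for g, x in enumerate(a_ops))
    b = sum(y << (2 * g) for g, y in enumerate(b_ops))
    pp = pp_matrix(a, b)
    total = 0
    for r in range(8):
        for c in range(8):
            keep = (c // 2 == r // 2)     # mode signal: keep diagonal blocks
            if keep and pp[r][c]:
                total += 1 << (r + c)     # row shift + column weight
    return total

a_ops, b_ops = [1, 2, 3, 2], [3, 1, 2, 2]
res = masked_dot_2x2(a_ops, b_ops)
for g in range(4):  # each group's product lands at bit offset 4*g
    assert (res >> (4 * g)) & 0xF == a_ops[g] * b_ops[g]
```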
- the partial product generating part obtained after the processing needs to be split into a first partial product generating part and a second partial product generating part. Then, a summation operation is performed on the first partial product generation part and the second partial product generation part, and the data obtained after the summation operation is used as a target sum.
- the sum operation performed on the first partial product generation part and the second partial product generation part is mainly divided into the following three cases:
- the first partial product generation part and the second partial product generation part can be input into the first compressor and the second compressor respectively; the output results of the first compressor and the second compressor are then input into the same adder, and the output result of the adder is used as the target sum.
- the speed of floating-point multiplication is largely determined by the speed of mantissa processing.
- a large number of partial products are generated in the process of mantissa processing.
- the multiplier used in this embodiment is an 8bit×8bit multiplier. As shown in Figure 2 and Figure 3, a conventional 8bit×8bit multiplier generates a total of 8 groups of progressively shifted partial products. The 8 partial products PP0–PP7 are divided into two parts, each of which passes through one of the two first-stage 4:2 compressors (CSA42); the output results of the two first-stage 4:2 compressors are jointly input into the second-stage 4:2 compressor (CSA42), and the output result of the second-stage 4:2 compressor is input into the final carry-propagate adder (CPA) to obtain the final sum, that is, the target sum.
- the first 4 partial products form one part, namely the first partial product generation part, and the last 4 partial products form the other part, namely the second partial product generation part. The first partial product generation part and the second partial product generation part are then input into the first-stage compressor a and the first-stage compressor b respectively; the output results of the first-stage compressor a and the first-stage compressor b are jointly input into the second-stage compressor c; the output result of the second-stage compressor c is input into the adder; and finally the output result of the adder is used as the target sum.
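- A compact Python model of this compression tree is sketched below. The 4:2 compressor built from two 3:2 carry-save stages is a standard construction assumed here; bit widths are left unbounded (Python ints), whereas hardware would size them explicitly.

```python
# Sketch of the FIG. 3 addition-tree path: eight shifted partial products
# PP0..PP7 pass through two first-stage 4:2 compressors, one second-stage
# 4:2 compressor, and a final carry-propagate adder (CPA).

def csa32(a, b, c):
    """3:2 carry-save adder: three operands -> (sum, carry)."""
    return a ^ b ^ c, ((a & b) | (b & c) | (a & c)) << 1

def csa42(a, b, c, d):
    """4:2 compressor built from two 3:2 stages: four operands -> two."""
    s1, c1 = csa32(a, b, c)
    return csa32(s1, c1, d)

def multiply_8x8(a: int, b: int) -> int:
    pps = [(a << i) if (b >> i) & 1 else 0 for i in range(8)]  # PP0..PP7
    s_a, c_a = csa42(*pps[:4])            # first-stage compressor a
    s_b, c_b = csa42(*pps[4:])            # first-stage compressor b
    s_c, c_c = csa42(s_a, c_a, s_b, c_b)  # second-stage compressor c
    return s_c + c_c                      # final CPA gives the target sum

assert multiply_8x8(173, 94) == 173 * 94
```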
- Figure 6 shows the distribution of the partial product generation parts in the multiplier when the input data is 1bit×1bit;
- Figure 7 shows the distribution of the partial product generation parts in the multiplier when the input data is 3bit×3bit;
- Figure 8 shows the distribution of the partial product generation parts in the multiplier when the input data is 5bit×5bit;
- Figure 9 shows the distribution of the partial product generation parts in the multiplier when the input data is 6bit×6bit;
- Figure 10 shows the distribution of the partial product generation parts in the multiplier when the input data is 7bit×7bit.
- the embodiments corresponding to these figures all meet the condition that the precision of the input data is the same, so the steps of splitting, compressing and summing the partial product generation part are similar to those of the embodiment shown in FIG. 5 .
- this embodiment adopts another method to obtain the target sum corresponding to the input data.
- when the highest bit number of the input data is equal to the highest bit number of the multiplier, the highest precision of the input data does not exceed the highest bit of the multiplier, and only one multiplier needs to be called for the multiplication.
- in this case, this embodiment sums the compressed partial product generation parts separately, that is, the two partial product generation parts obtained after compression are input into different adders respectively for summation.
- this embodiment adopts a conventional 8bit×8bit multiplier to realize the multiplication of two groups of 4bit×8bit mixed-precision input data, and the 8 partial products generated in this case can be divided from top to bottom.
- the first four partial products form the first partial product generation part and are summed separately, that is, the first partial product generation part is input into one compressor for compression and then into one adder for summation;
- the last four partial products form the second partial product generation part and are summed separately, that is, the second partial product generation part is input into another compressor for compression and then into another adder for summation; the output results of the two adders are then summed.
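- The sketch below models this separate-summation path under the assumption of unsigned operands and the FIG. 11 row grouping; the right shift by 4 that re-aligns the high group is an inferred detail, and the names are illustrative.

```python
# One 8x8 partial-product array realizes two 4bit x 8bit products by summing
# the first and last four rows independently (compressor + adder per group).

def dual_4x8(b: int, a_low: int, a_high: int):
    """b is the shared 8-bit multiplicand; a_low/a_high are 4-bit multipliers
    packed into the low and high nibbles of the 8-bit multiplier operand."""
    a = (a_high << 4) | a_low
    rows = [(b << i) if (a >> i) & 1 else 0 for i in range(8)]
    p_low = sum(rows[:4])             # compressor + adder 1: rows PP0..PP3
    p_high = sum(rows[4:]) >> 4       # compressor + adder 2: rows PP4..PP7,
    return p_low, p_high              # right-shifted by the nibble offset

p0, p1 = dual_4x8(b=201, a_low=5, a_high=12)
assert (p0, p1) == (201 * 5, 201 * 12)
```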
- the mixed-precision input data may also present a situation where the highest bit number of the input data is greater than the highest bit number of the multiplier. It can be understood that when this happens, the input data cannot be multiplied with only one multiplier, and two multipliers must be called for the operation.
- the highest bit number of the input data is obtained, and the highest bit number of the input data is compared with the highest bit number of the multiplier.
- this embodiment divides the two multipliers called into a first multiplier and a second multiplier, where the second multiplier is a multiplier that performs low-order operations.
- the partial product generation part generated by the first multiplier is used as the first partial product generation part, and the partial product generation part generated by the second multiplier is used as the second partial product generation part.
- the first partial product generation part can be directly input into the first adder (CPA1), while the second partial product generation part must be split and input into the first adder (CPA1) and the second adder (CPA2) respectively.
- the sum of the output results of the first adder and the second adder is used as the target sum.
- in this case, the data can be directly input into the adders for calculation without being compressed by a compressor. And since 2 multipliers need to be called, at the system accumulation level the partial product generation part generated by the multiplier performing the low-order operation must be right-shifted to realize the subsequent correct summation; therefore, on the basis of the conventionally used adder, another adder needs to be called to perform the summation on the excess part after the right shift.
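- A minimal sketch of this two-multiplier recombination, assuming unsigned operands and an 8-bit split of the wide operand (the helper name and the explicit shift are illustrative, not the patent's literal datapath):

```python
# Two 8x8 multipliers realize 8bit x 16bit: the low-order multiplier's output
# is weighted 2^8 lower, so the accumulation re-aligns it before the final add.

def mul_8x16(a: int, b: int) -> int:
    """a: 8-bit operand; b: 16-bit operand split across two 8x8 multipliers."""
    b_high, b_low = b >> 8, b & 0xFF
    p_high = a * b_high               # first multiplier (high-order operation)
    p_low = a * b_low                 # second multiplier (low-order operation)
    return (p_high << 8) + p_low      # shift-align, then sum with the adders

assert mul_8x16(0x9C, 0xBEEF) == 0x9C * 0xBEEF
```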
- Figure 12 shows the multiplication operation of 8bit×16bit input data based on two sets of 8bit×8bit multipliers;
- Figure 14 shows the multiplication operation of 8bit×xbit input data (x = 9–15) based on two sets of 8bit×8bit multipliers;
- Figure 15 shows the multiplication operation of 8bit×15bit input data based on two sets of 8bit×8bit multipliers.
- all the above cases belong to the situation where the highest bit number of the input data is greater than the highest bit number of the multiplier, and the above method needs to be adopted to realize the summation of the partial product generation parts.
- the method further includes the following steps:
- Step S300: truncate the target sum, and use the data obtained after the truncation as the dot-product result of the input data.
- in this way, the dot-product result consistent with the mode signal and the input data is finally obtained.
- step S300 specifically includes the following steps:
- Step S310: determine the truncation bit width according to the precision of the input data;
- Step S320: perform a truncation operation on the target sum starting from the 0th bit according to the truncation bit width, and use the data obtained after the truncation operation as the dot-product result of the input data.
- the truncation bit width is related to the precision of the input data. Specifically, for input data of the same precision, the truncation bit width runs from the 0th bit to the (8−n)th bit, where n is the precision of the input data.
- for example, for 3bit×3bit input data, the truncation bit width runs from the 0th bit to the 5th bit;
- for mixed-precision input data, the truncation bit width runs from the 0th bit to the (16−x)th bit, where x is the highest bit width of the input data and takes a value of 9–15; for example, for 8bit×12bit input data, the truncation bit width runs from the 0th bit to the 4th bit.
- the truncation operation is performed on the target sum starting from the 0th bit, and finally the data obtained after the truncation operation is used as the dot-product result of the input data.
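- The bit-selection rule above can be sketched as follows; it is a direct transcription of the stated widths, and the function names are illustrative assumptions.

```python
# Step S310/S320: keep bits 0..(8 - n) for same-precision nbit x nbit inputs,
# and bits 0..(16 - x) for mixed 8bit x xbit inputs (x = 9..15).

def truncate_same_precision(target_sum: int, n: int) -> int:
    width = (8 - n) + 1               # e.g. n = 3 -> bits 0..5, width 6
    return target_sum & ((1 << width) - 1)

def truncate_mixed(target_sum: int, x: int) -> int:
    width = (16 - x) + 1              # e.g. x = 12 -> bits 0..4, width 5
    return target_sum & ((1 << width) - 1)

print(truncate_same_precision(0b111111, 3))  # keeps bits 0..5
print(truncate_mixed(0b11111, 12))           # keeps bits 0..4
```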
- this embodiment can not only support dot-product operations of different precisions, but can also handle both signed and unsigned operations. Therefore, the method further comprises the following steps:
- Step S1: determine the partial product generation part corresponding to the most significant bit of the input data, and use it as the partial product generation part to be adjusted;
- Step S2: when the most significant bit of the input data indicates a negative number, invert the partial product generation part to be adjusted and add one.
- this embodiment first determines the partial product generation part related to the signed bit operation.
- the signed fixed-point multiplication is implemented on two's-complement inputs, where the complement of a positive number is the number itself, and the complement of a negative number is obtained by inverting the signed binary representation (including the sign bit) bit by bit and then adding one.
- the partial product generating part corresponding to the most significant bit of the input data is used as the partial product generating part to be adjusted.
- the partial product generation part to be adjusted is inverted and incremented by one, thereby realizing the signed operation.
- Figure 16 shows a schematic diagram of the generation of the partial products of an 8bit×8bit multiplier, wherein the first 7 partial products PP0–PP6 are generated in the conventional way, while the generation of the last partial product (PP7) requires special processing: when the sign bit B7 is 0, the operand is positive and PP7 is 0; when the sign bit B7 is 1, the operand is negative and PP7 is obtained by inverting A7A6A5A4A3A2A1A0 and adding one.
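- A Python sketch of this signed partial-product rule is given below. The 16-bit sign extension of A and the modular arithmetic are assumptions consistent with the sign-extension note later in the text; the function name is illustrative.

```python
# PP0..PP6 are the ordinary AND rows; PP7 (the sign-bit row) is 0 for B7 = 0
# and the invert-plus-one of A for B7 = 1, per the FIG. 16 rule.

def signed_mul_8x8(a: int, b: int) -> int:
    """a, b in [-128, 127]; returns the exact signed product."""
    A16 = a & 0xFFFF                       # two's-complement, sign-extended A
    total = 0
    for i in range(7):                     # PP0..PP6: conventional rows
        if (b >> i) & 1:
            total = (total + (A16 << i)) & 0xFFFF
    if (b >> 7) & 1:                       # PP7: invert A and add one
        pp7 = ((~A16 + 1) & 0xFFFF) << 7
        total = (total + pp7) & 0xFFFF
    return total - 0x10000 if total & 0x8000 else total

for a, b in [(-128, 127), (57, -3), (-90, -90), (0, -1)]:
    assert signed_mul_8x8(a, b) == a * b
```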
- for four groups of 2bit×2bit signed input data, PP1, PP3, PP5, and PP7 (the sign-bit row of each group) need such processing.
- for two groups of 4bit×4bit signed input data, PP3 and PP7 need such processing.
- the 8bit×16bit operation needs fewer such adjustments: the generation of PP7 in the second multiplier, which performs the low-order operation, does not require this treatment; only the generation of PP7 in the first multiplier needs to be carried out in this way.
- when the bit width needs to be extended on the left side of the data in an addition operation, the added bits must be identical to the most significant bit of the original data (sign extension) to keep the numerical value unchanged.
- the present invention also provides a fixed-point multiply-add operation unit suitable for mixed-precision neural networks.
- the operation unit includes:
- a position determination module 01, for acquiring a mode signal and input data, determining a data input position according to the mode signal, and inputting the input data into the multiplier from the data input position;
- a partial product processing module 02, for processing the partial products generated by the multiplier according to the mode signal, and performing a summation operation, using the data obtained after the summation operation as the target sum;
- a result generation module 03, configured to truncate the target sum, and use the data obtained after the truncation as the dot-product result of the input data.
- a unified multiplier is used for the operation, but the number of multipliers is not fixed; the number of multipliers called by the operation unit changes adaptively with the precision of the input data. It can be understood that when the most significant bit of the input data is less than or equal to the most significant bit of the multiplier, the operation unit only needs to call one multiplier to implement the operation on the input data; when the most significant bit of the input data is greater than the most significant bit of the multiplier, the operation unit needs to call more than one multiplier.
- in the former case, the operation unit calls only one multiplier and then, according to the mode signal, controls the multiplier to mask the partial products of the specified area and output the partial product generation parts, and performs the summation operation on the output partial product generation parts according to the methods corresponding to different precisions.
- in the latter case, the operation unit calls two multipliers, controls the two multipliers according to the mode signal to mask the partial products of the specified areas and then output the partial product generation parts, and performs the summation operation on the output partial product generation parts according to the methods corresponding to different precisions.
- the present invention discloses a fixed-point multiply-add operation unit and method suitable for mixed-precision neural networks. The multiplier is controlled according to the mode signal to mask the partial products of the specified areas and then output the partial product generation parts, and the summation operation is performed on the output partial product generation parts according to the methods corresponding to different precisions, thereby realizing mixed-precision dot-product operations.
- a single multiplier can thus be used to realize the dot-product operations of a mixed-precision neural network, which solves the prior-art problems of excessive hardware overhead and redundant idle resources caused by using a variety of processing units of different precisions to handle mixed-precision operations.
Claims (10)
- 1. A fixed-point multiply-add operation method suitable for mixed-precision neural networks, characterized in that the method comprises: acquiring a mode signal and input data, determining a data input position according to the mode signal, and inputting the input data into a multiplier from the data input position; processing the partial products generated by the multiplier according to the mode signal, performing a summation operation, and using the data obtained after the summation operation as a target sum; truncating the target sum, and using the data obtained after the truncation as the dot-product result of the input data.
- 2. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 1, characterized in that acquiring the mode signal and the input data, determining the data input position according to the mode signal, and inputting the input data into the multiplier from the data input position comprises: acquiring a mode signal and input data, and determining the number of multipliers to be called according to the precision of the input data; when the highest precision of the input data is higher than the highest bit of the multiplier, the number of called multipliers is greater than 1; determining the data input position according to the mode signal, splitting the data with the highest precision in the input data, and inputting the input data obtained after the splitting into the multipliers from the data input position; when the highest precision of the input data is lower than or equal to the highest bit of the multiplier, the number of called multipliers is 1; determining the data input position according to the mode signal, and inputting the input data into the multiplier from the data input position.
- 3. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 2, characterized in that acquiring the mode signal, processing the partial products generated by the multiplier according to the mode signal, performing a summation operation, and using the data obtained after the summation operation as the target sum comprises: acquiring a mode signal, and processing the partial products generated by the multiplier according to the mode signal; splitting the partial product generation parts obtained after the processing into a first partial product generation part and a second partial product generation part; performing a summation operation on the first partial product generation part and the second partial product generation part, and using the data obtained after the summation operation as the target sum.
- 4. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 3, characterized in that the mode signal is determined by the precision of the input data, and the processing comprises at least one of the following operations: masking the partial products of the preset area generated by the multiplier; when the number of called multipliers is greater than 1, shifting the partial product generation part output by the multiplier that performs the low-order operation.
- 5. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 3, characterized in that, when the input data is of the same precision and the highest bit of the input data is less than or equal to the highest bit of the multiplier, performing the summation operation on the first partial product generation part and the second partial product generation part and obtaining the target sum based on the summation operation comprises: inputting the first partial product generation part and the second partial product generation part into a first-stage compressor a and a first-stage compressor b respectively; inputting the output results of the first-stage compressor a and the first-stage compressor b jointly into a second-stage compressor c; inputting the output result of the second-stage compressor c into an adder, and using the output result of the adder as the target sum.
- 6. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 3, characterized in that, when the input data is of mixed precision, performing the summation operation on the first partial product generation part and the second partial product generation part and obtaining the target sum based on the summation operation comprises: obtaining the highest bit number of the input data, and comparing the highest bit number of the input data with the highest bit number of the multiplier; when the highest bit number of the input data is equal to the highest bit number of the multiplier, inputting the first partial product generation part and the second partial product generation part into a first-stage compressor a and a first-stage compressor b respectively; inputting the output results of the first-stage compressor a and the first-stage compressor b into a first adder and a second adder respectively, and using the sum of the output results of the first adder and the second adder as the target sum.
- 7. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 3, characterized in that, when the input data is of mixed precision, performing the summation operation on the first partial product generation part and the second partial product generation part and obtaining the target sum based on the summation operation comprises: obtaining the highest bit number of the input data, and comparing the highest bit number of the input data with the highest bit number of the multiplier; when the highest bit number of the input data is greater than the highest bit number of the multiplier, the multiplier comprises a first multiplier and a second multiplier, the second multiplier being a low-order operation multiplier, the first multiplier outputting the first partial product generation part and the second multiplier outputting the second partial product generation part; inputting the first partial product generation part directly into a first adder; splitting the second partial product generation part and inputting it into the first adder and a second adder respectively; and using the sum of the output results of the first adder and the second adder as the target sum.
- 8. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 1, characterized in that truncating the target sum and using the data obtained after the truncation as the dot-product result of the input data comprises: determining the truncation bit width according to the precision of the input data; performing a truncation operation on the target sum starting from the 0th bit according to the truncation bit width, and using the data obtained after the truncation operation as the dot-product result of the input data.
- 9. The fixed-point multiply-add operation method suitable for mixed-precision neural networks according to claim 1, characterized in that the method further comprises: determining the partial product generation part corresponding to the most significant bit of the input data, and using it as the partial product generation part to be adjusted; when the most significant bit of the input data indicates a negative number, inverting the partial product generation part to be adjusted and adding one.
- 10. A fixed-point multiply-add operation unit suitable for mixed-precision neural networks, characterized in that the operation unit comprises: a position determination module, configured to acquire a mode signal and input data, determine a data input position according to the mode signal, and input the input data into a multiplier from the data input position; a partial product processing module, configured to process the partial products generated by the multiplier according to the mode signal, perform a summation operation, and use the data obtained after the summation operation as a target sum; and a result generation module, configured to truncate the target sum, and use the data obtained after the truncation as the dot-product result of the input data.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110178992.7 | 2021-02-09 | ||
CN202110178992.7A CN113010148B (en) | 2021-02-09 | 2021-02-09 | Fixed-point multiply-add operation unit and method suitable for mixed precision neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022170811A1 true WO2022170811A1 (en) | 2022-08-18 |
Family
ID=76383947
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/131800 WO2022170811A1 (en) | 2021-02-09 | 2021-11-19 | Fixed-point multiply-add operation unit and method suitable for mixed-precision neural network |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113010148B (en) |
WO (1) | WO2022170811A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113010148B (en) * | 2021-02-09 | 2022-11-11 | 南方科技大学 | Fixed-point multiply-add operation unit and method suitable for mixed precision neural network |
US20240004952A1 (en) * | 2022-06-29 | 2024-01-04 | Mediatek Singapore Pte. Ltd. | Hardware-Aware Mixed-Precision Quantization |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101916177A (en) * | 2010-07-26 | 2010-12-15 | 清华大学 | Configurable multi-precision fixed point multiplying and adding device |
US8706790B1 (en) * | 2009-03-03 | 2014-04-22 | Altera Corporation | Implementing mixed-precision floating-point operations in a programmable integrated circuit device |
CN108287681A (en) * | 2018-02-14 | 2018-07-17 | 中国科学院电子学研究所 | A kind of single-precision floating point fusion point multiplication operation unit |
CN108459840A (en) * | 2018-02-14 | 2018-08-28 | 中国科学院电子学研究所 | A kind of SIMD architecture floating-point fusion point multiplication operation unit |
CN108694038A (en) * | 2017-04-12 | 2018-10-23 | 英特尔公司 | Dedicated processes mixed-precision floating-point operation circuit in the block |
CN113010148A (en) * | 2021-02-09 | 2021-06-22 | 南方科技大学 | Fixed-point multiply-add operation unit and method suitable for mixed precision neural network |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100405289C (en) * | 2005-03-08 | 2008-07-23 | 中国科学院计算技术研究所 | Floating-point multiplicator and method of compatible double-prepcision and double-single precision computing |
CN102591615A (en) * | 2012-01-16 | 2012-07-18 | 中国人民解放军国防科学技术大学 | Structured mixed bit-width multiplying method and structured mixed bit-width multiplying device |
US11068238B2 (en) * | 2019-05-21 | 2021-07-20 | Arm Limited | Multiplier circuit |
CN110531954B (en) * | 2019-08-30 | 2024-07-19 | 上海寒武纪信息科技有限公司 | Multiplier, data processing method, chip and electronic equipment |
CN210109863U (en) * | 2019-08-30 | 2020-02-21 | 上海寒武纪信息科技有限公司 | Multiplier, device, neural network chip and electronic equipment |
CN110780845B (en) * | 2019-10-17 | 2021-11-30 | 浙江大学 | Configurable approximate multiplier for quantization convolutional neural network and implementation method thereof |
CN111522528B (en) * | 2020-04-22 | 2023-03-28 | 星宸科技股份有限公司 | Multiplier, multiplication method, operation chip, electronic device, and storage medium |
- 2021-02-09: CN application CN202110178992.7A granted as patent CN113010148B (active)
- 2021-11-19: PCT application PCT/CN2021/131800 filed, published as WO2022170811A1
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8706790B1 (en) * | 2009-03-03 | 2014-04-22 | Altera Corporation | Implementing mixed-precision floating-point operations in a programmable integrated circuit device |
CN101916177A (en) * | 2010-07-26 | 2010-12-15 | 清华大学 | Configurable multi-precision fixed point multiplying and adding device |
CN108694038A (en) * | 2017-04-12 | 2018-10-23 | 英特尔公司 | Dedicated processes mixed-precision floating-point operation circuit in the block |
CN108287681A (en) * | 2018-02-14 | 2018-07-17 | 中国科学院电子学研究所 | A kind of single-precision floating point fusion point multiplication operation unit |
CN108459840A (en) * | 2018-02-14 | 2018-08-28 | 中国科学院电子学研究所 | A kind of SIMD architecture floating-point fusion point multiplication operation unit |
CN113010148A (en) * | 2021-02-09 | 2021-06-22 | 南方科技大学 | Fixed-point multiply-add operation unit and method suitable for mixed precision neural network |
Also Published As
Publication number | Publication date |
---|---|
CN113010148B (en) | 2022-11-11 |
CN113010148A (en) | 2021-06-22 |
Legal Events
- 121: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21925458; Country of ref document: EP; Kind code of ref document: A1.
- NENP: non-entry into the national phase. Ref country code: DE.
- 122: PCT application non-entry into the European phase. Ref document number: 21925458; Country of ref document: EP; Kind code of ref document: A1.
- 32PN: public notification in the EP bulletin as the address of the addressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20.02.2024).