CN110688090B - Floating point multiplication method, circuit and equipment for AI (artificial intelligence) calculation - Google Patents

Floating point multiplication method, circuit and equipment for AI (artificial intelligence) calculation

Info

Publication number
CN110688090B
Authority
CN
China
Prior art keywords
data
floating
format
formatted data
floating point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910860148.5A
Other languages
Chinese (zh)
Other versions
CN110688090A (en)
Inventor
周韧研
Current Assignee
Beijing Intengine Technology Co Ltd
Original Assignee
Beijing Intengine Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Intengine Technology Co Ltd filed Critical Beijing Intengine Technology Co Ltd
Priority to CN201910860148.5A
Publication of CN110688090A
Application granted
Publication of CN110688090B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The embodiment of the invention provides a floating-point multiplication method, circuit, and device for AI calculation, wherein the method includes the following steps: preprocessing the input first data and second data respectively to obtain first formatted data and second formatted data in a preset format; inputting the first formatted data and the second formatted data into a multiplier for sign-magnitude signed integer multiplication to obtain a signed product f; inputting the first formatted data and the second formatted data into an adder for unsigned addition to obtain a sum e; and compressing the product f and the sum e into floating-point data in floating-point format F(p, q) for output. The preset format is the M format, a mapping from a parameterized pair of binary numbers to a real number, expressed as the function M(S, (f, e)) := (f << e) >> S, where f is a signed number, e is an unsigned number, and S is a fixed parameter, S = (1 << (p-1)) - 2 + q. The scheme has strong universality.

Description

Floating point multiplication method, circuit and equipment for AI (artificial intelligence) calculation
Technical Field
The embodiment of the invention relates to the technical field of computer data processing, in particular to a floating-point multiplication method, circuit, and device for AI (artificial intelligence) calculation.
Background
Floating-point multiplication conforming to the IEEE 754 standard has a wide range of applications. In addition to the Normal floating-point format, the IEEE 754 standard defines several special formats: SubNormal, Inf, and NaN. However, current floating-point multiplier IP is mostly designed for FPUs in processors and therefore must cover all cases of the IEEE 754 standard. There are also floating-point multipliers for the data path that support only the Normal format and none of the special formats.
AI calculation has special requirements for floating-point multiplication, which can be summarized as follows: the Normal floating-point format must be supported; the SubNormal floating-point format must be supported; and Inf and NaN must be treated as the maximum value of the same sign. At present, no IP on the market meets these requirements.
Therefore, how to provide a multiplication circuit scheme that supports the Normal and SubNormal floating-point formats, treats Inf and NaN as the maximum value of the same sign, is suitable for AI calculation, and has strong universality is a problem to be solved.
Disclosure of Invention
Therefore, embodiments of the present invention provide a floating-point multiplication method, circuit, and device for AI calculation that support the Normal and SubNormal floating-point formats and treat Inf and NaN as the maximum value of the same sign, making them suitable for AI calculation and highly universal.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
in a first aspect, an embodiment of the present invention provides a floating-point multiplication method for AI calculation, including:
preprocessing the input first data and the input second data respectively to obtain first formatted data and second formatted data in a preset format;
inputting the first formatted data and the second formatted data into a multiplier for sign-magnitude signed integer multiplication to obtain a signed product f;
inputting the first formatted data and the second formatted data into an adder for unsigned addition to obtain a sum e;
compressing the product f and the sum e to obtain floating-point data in floating-point format F(p, q) for output;
the preset format is an M format, the M format is a preset format aiming at the multiplication results f and e, and is also a preset format formed by formatting the first data and the second data, and all data are converted into the M format when the floating point multiplication is calculated; the M format is a mapping from a binary ordinal number pair with parameters to a real number, f and e parts are converted to an integer domain according to the definition of S, and the function is expressed as M (S, (f, e)): ═ f < < e > > S; wherein F is a signed number, e is an unsigned number, S is a fixed parameter, S is (1< (P-1)) -2+ Q, and P and Q are parameters of floating point data F (P, Q); s is a mapping relation, based on bit width defined by p and q, the position M corresponds to original floating point data, and the obtained M is not the true value of the original floating point data.
Preferably, the preprocessing of the input first data and second data to obtain first formatted data and second formatted data in the preset format includes:
receiving two input data in F(p, q) floating-point format and performing the following processing on each:
judging whether P is all binary ones; if so, keeping the sign bit unchanged and setting Q to all binary ones;
judging whether P is all binary zeros; if not, prepending a 1 to the highest binary bit of Q, recording the result as Qs, and assigning P-1 to the current P; if P is all binary zeros, prepending a 0 to the highest binary bit of Q and recording the result as Qs; Qs is a binary number of q+1 bits;
generating f according to the sign bit and the value of Qs;
assigning P to e, thereby obtaining the f and e outputs for the M format: the first formatted data M1 = M(S, (f1, e1)) and the second formatted data M2 = M(S, (f2, e2)), where f has q+2 bits and e has p bits.
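A minimal sketch of these preprocessing steps, assuming the field layout sign/P/Q and using Python's native signed integers in place of the patent's q+2-bit encoding of f (the function name preprocess is illustrative):

```python
def preprocess(sign, P, Q, p, q):
    """Convert F(p,q) fields (sign, exponent P, mantissa Q) to M-format (f, e)."""
    if P == (1 << p) - 1:        # P all ones (Inf/NaN): clamp Q to all ones
        Q = (1 << q) - 1
    if P != 0:                   # Normal: prepend the hidden 1, then e = P - 1
        Qs = (1 << q) | Q
        e = P - 1
    else:                        # SubNormal: prepend a 0, e stays 0
        Qs = Q
        e = 0
    f = -Qs if sign else Qs      # Python signed int instead of q+2-bit two's complement
    return f, e

# Half-precision-like p=5, q=10: the Normal value 1.5 has P=15, Q=512,
# so Qs = 1024 | 512 = 1536 and preprocess(0, 15, 512, 5, 10) == (1536, 14).
```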
Preferably, said generating f according to the sign bit and the value of Qs comprises:
if the sign bit is 0, f is Qs with a 0 prepended to its highest bit; if the sign bit is 1, f is the low q+2 bits of the result of inverting Qs bitwise and adding 1.
Preferably, inputting the first formatted data and the second formatted data into a multiplier for sign-magnitude signed integer multiplication to obtain a signed product f includes:
performing sign-magnitude signed integer multiplication on f1 and f2 to obtain the signed product f, where the width of the product f is 2q+3 bits.
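In sign-magnitude terms this multiplication is a sign XOR plus an unsigned multiply of the magnitudes; a sketch using Python integers (signed_mul is an illustrative name):

```python
def signed_mul(f1, f2):
    # Sign-magnitude multiply: XOR the sign bits, multiply the q+1-bit
    # magnitudes unsigned; the result fits in 2q+3 bits including the sign.
    s = (f1 < 0) ^ (f2 < 0)
    mag = abs(f1) * abs(f2)      # up to 2q+2 magnitude bits
    return -mag if s else mag
```

In Python this is equivalent to f1 * f2; the decomposition only mirrors how a hardware multiplier would treat the sign and the magnitude separately.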
Preferably, the inputting the first formatted data and the second formatted data to an adder for unsigned addition to obtain a sum e includes:
performing unsigned addition on e1 and e2 to obtain the sum e, where the width of e is p+1 bits.
Preferably, compressing the product f and the sum e to obtain floating-point data in floating-point format F(p, q) for output includes:
taking the non-sign bits of f, scanning from the high bit to the low bit, and recording the zero-based position of the first non-zero bit as t;
calculating the intermediate value y = e + 1 - t - ((1 << (p-1)) - 2);
if y is greater than (1 << p) - 2, outputting the positive maximum value of F(p, q) when the sign bit of f is positive, and otherwise outputting the negative maximum value of F(p, q);
if y is less than (1 << p) - 1 and greater than or equal to 0, the output P is y + 1; taking q bits of f starting from bit t+2 (padding with zeros if fewer than q bits remain) as the output Q; and outputting the sign bit of f as the sign bit;
if y is less than 0, logically shifting f right by q + 1 + y bits, taking its low q bits as the output Q, setting the output P to 0, and outputting the sign bit of f as the sign bit.
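A sketch of this compression step under the conventions above. All names are illustrative, and the shift amount in the y < 0 branch is our value-preserving reconstruction: the translated step ("shift right by q + 1 + y bits") does not account for the leading-one position t, so we derive the shift that keeps the represented value consistent:

```python
def compress(f, e, p, q):
    """Pack the M-format product (f, e) into F(p,q) fields (sign, P, Q), truncating."""
    bias = (1 << (p - 1)) - 1
    sign = 1 if f < 0 else 0
    mag = abs(f)                          # the 2q+2 non-sign bits of f
    t = (2 * q + 2) - mag.bit_length()    # zero-based position of the first 1
    y = e + 1 - t - (bias - 1)
    mask = (1 << q) - 1
    if y > (1 << p) - 2:                  # overflow: saturate with the same sign
        return sign, (1 << p) - 1, mask
    if y >= 0:                            # Normal result: exponent field is y + 1
        sh = q + 1 - t                    # drop the leading 1, keep the next q bits
        Q = (mag >> sh) & mask if sh >= 0 else (mag << -sh) & mask
        return sign, y + 1, Q
    sh = q + 1 - t - y                    # SubNormal result: shift |y| further right
    Q = (mag >> sh) & mask if sh >= 0 else 0
    return sign, 0, Q
```

For p=5, q=10, the product of 1.5 and 2.5 arrives as f = 1536 * 1280, e = 29, and compress returns (0, 16, 896), the half-precision encoding of 3.75.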
In a second aspect, an embodiment of the present invention provides a floating-point multiplication circuit for AI calculation, including:
the preprocessing module is used for respectively preprocessing the input first data and the input second data to obtain first formatted data and second formatted data in preset formats;
the multiplier is used for performing sign-magnitude signed integer multiplication on the input first formatted data and second formatted data to obtain a signed product f;
the adder is used for inputting the first formatted data and the second formatted data into the adder to carry out unsigned addition to obtain a sum e;
the output processing module is used for compressing the product f and the sum e to obtain floating-point data in floating-point format F(p, q) for output;
the preset format is an M format, the M format is a preset format aiming at the multiplication results f and e, and is also a preset format formed by formatting the first data and the second data, and all data are converted into the M format when the floating point multiplication is calculated; the M format is a mapping from a binary ordinal number pair with parameters to a real number, f and e parts are converted to an integer domain according to the definition of S, and the function is expressed as M (S, (f, e)): ═ f < < e > > S; wherein F is a signed number, e is an unsigned number, S is a fixed parameter, S is (1< (P-1)) -2+ Q, and P and Q are parameters of floating point data F (P, Q); s is a mapping relation, based on bit width defined by p and q, the position M corresponds to original floating point data, and the obtained M is not the true value of the original floating point data.
Preferably, the output processing module includes:
a sign acquisition unit for taking the non-sign bits of f, scanning from the high bit to the low bit, and recording the zero-based position of the first non-zero bit as t;
a y-value calculation unit for calculating the intermediate value y = e + 1 - t - ((1 << (p-1)) - 2);
a first output unit for outputting, if y is greater than (1 << p) - 2, the positive maximum value of F(p, q) when the sign bit of f is positive, and otherwise the negative maximum value of F(p, q);
a second output unit for outputting, if y is less than (1 << p) - 1 and greater than or equal to 0, P as y + 1, taking q bits of f starting from bit t+2 (padding with zeros if fewer than q bits remain) as the output Q, and outputting the sign bit of f as the sign bit;
and a third output unit for logically shifting f right by q + 1 + y bits if y is less than 0, taking its low q bits as the output Q, setting the output P to 0, and outputting the sign bit of f as the sign bit.
In a third aspect, an embodiment of the present invention provides a floating-point multiplication apparatus for AI calculation, including:
a memory for storing a computer program;
a processor for implementing the steps of the floating-point multiplication method for AI computation according to any one of the first aspect described above when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the floating-point multiplication method for AI calculation according to any one of the above first aspects.
The embodiment of the invention provides a floating-point multiplication method for AI calculation, including: preprocessing the input first data and second data respectively to obtain first formatted data and second formatted data in a preset format; inputting the first formatted data and the second formatted data into a multiplier for sign-magnitude signed integer multiplication to obtain a signed product f; inputting the first formatted data and the second formatted data into an adder for unsigned addition to obtain a sum e; and compressing the product f and the sum e to obtain floating-point data in floating-point format F(p, q) for output. The preset format is the M format, which is the preset format of the f and e quantities used in the multiplication and also the preset format obtained by formatting the first data and the second data; all data are converted into the M format when the floating-point multiplication is calculated. The M format is a mapping from a parameterized pair of binary numbers to a real number: the f and e parts are mapped to the integer domain according to the definition of S, and the function is expressed as M(S, (f, e)) := (f << e) >> S, where f is a signed number, e is an unsigned number, and S is a fixed parameter, S = (1 << (p-1)) - 2 + q, with p and q being the parameters of the floating-point format F(p, q). S defines the mapping relation: under the bit widths defined by p and q, M corresponds positionally to the original floating-point data, and the resulting M is not the true value of the original floating-point data. The method can therefore support the Normal and SubNormal floating-point formats and treat Inf and NaN as the maximum value of the same sign, making it suitable for AI calculation.
The floating-point multiplication method, circuit, and device for AI calculation provided by the embodiments of the invention support the Normal and SubNormal floating-point formats, treat Inf and NaN as the maximum value of the same sign, are suitable for AI calculation, and have strong universality.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are merely exemplary, and those of ordinary skill in the art can derive other drawings from them without inventive effort.
The structures, ratios, sizes, and the like shown in this specification are used only to match the content disclosed herein for the understanding of those skilled in the art; they are not intended to limit the conditions under which the invention can be implemented and carry no technical significance in themselves. Any structural modification, change of ratio, or adjustment of size that does not affect the effects and objectives achievable by the invention shall still fall within the scope covered by the technical content disclosed herein.
Fig. 1 is a flowchart of a floating-point multiplication method for AI calculation according to an embodiment of the present invention;
fig. 2 is a flow chart illustrating a preset format conversion of a floating-point multiplication method for AI calculation according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating data output of a floating-point multiplication method for AI computation according to an embodiment of the present invention;
FIG. 4 is a block diagram of a floating-point multiply circuit for AI calculation according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an output processing module of a floating-point multiplication circuit for AI calculation according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a floating-point multiplication device for AI calculation according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
The present invention is described below through particular embodiments; other advantages and effects of the invention will become apparent to those skilled in the art from this disclosure. The described embodiments are merely some, not all, of the embodiments of the invention and are not intended to limit it. All other embodiments obtained by a person skilled in the art from the embodiments herein without creative effort shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 2, and fig. 3, fig. 1 is a flowchart of a floating-point multiplication method for AI calculation according to an embodiment of the present invention; fig. 2 is a flow chart illustrating a preset format conversion of a floating-point multiplication method for AI calculation according to an embodiment of the present invention; fig. 3 is a data output flow chart of a floating-point multiplication method for AI calculation according to an embodiment of the present invention.
The embodiment of the invention provides a floating point multiplication method for AI calculation, which comprises the following steps:
step S11: preprocessing the input first data and the input second data respectively to obtain first formatted data and second formatted data in a preset format;
step S12: inputting the first formatted data and the second formatted data into a multiplier to carry out original code signed integer multiplication to obtain an original code signed product f;
step S13: inputting the first formatted data and the second formatted data into an adder for unsigned addition to obtain a sum e;
step S14: compressing the product F and the sum e to obtain floating point data with a floating point format F (P, Q) so as to output;
the preset format is an M format, the M format is a preset format aiming at the multiplication results f and e, and is also a preset format formed by formatting the first data and the second data, and all data are converted into the M format when the floating point multiplication is calculated; the M format is a mapping from a binary ordinal number pair with parameters to a real number, f and e parts are converted to an integer domain according to the definition of S, and the function is expressed as M (S, (f, e)): ═ f < < e > > S; wherein F is a signed number, e is an unsigned number, S is a fixed parameter, S is (1< (P-1)) -2+ Q, and P and Q are parameters of floating point data F (P, Q); s is a mapping relation, based on bit width defined by p and q, the position M corresponds to original floating point data, and the obtained M is not the true value of the original floating point data.
Specifically, preprocessing the input first data and second data to obtain first formatted data and second formatted data in the preset format may be implemented by the following steps:
Step S21: receiving two input data in F(p, q) floating-point format and performing the following processing on each:
Step S22: judging whether P is all binary ones; if so, keeping the sign bit unchanged and setting Q to all binary ones;
Step S23: judging whether P is all binary zeros; if not, prepending a 1 to the highest binary bit of Q, recording the result as Qs, and assigning P-1 to the current P; if P is all binary zeros, prepending a 0 to the highest binary bit of Q and recording the result as Qs; Qs is a binary number of q+1 bits;
Step S24: generating f according to the sign bit and the value of Qs;
Step S25: assigning P to e, thereby obtaining the f and e outputs for the M format: the first formatted data M1 = M(S, (f1, e1)) and the second formatted data M2 = M(S, (f2, e2)), where f has q+2 bits and e has p bits.
Further, f is generated according to the sign bit and the value of Qs as follows: if the sign bit is 0, f is Qs with a 0 prepended to its highest bit; if the sign bit is 1, f is the low q+2 bits of the result of inverting Qs bitwise and adding 1. To obtain the signed product f, the first formatted data and the second formatted data are input to the multiplier, which performs sign-magnitude signed integer multiplication on f1 and f2: the sign bits are XORed and the magnitude bits are multiplied unsigned (i.e., the multiplier takes q+1-bit inputs and produces a 2q+2-bit output), so the product f has a width of 2q+3 bits.
Further, the first formatted data and the second formatted data are input to an adder for unsigned addition: e1 and e2 are added unsigned to obtain the sum e, whose width is p+1 bits.
On the basis of any of the above embodiments, in this embodiment, compressing the product f and the sum e into floating-point data in floating-point format F(p, q) for output may be implemented by the following steps:
Step S31: taking the non-sign bits of f, scanning from the high bit to the low bit, and recording the zero-based position of the first non-zero bit as t;
Step S32: calculating the intermediate value y = e + 1 - t - ((1 << (p-1)) - 2);
Step S33: if y is greater than (1 << p) - 2, outputting the positive maximum value of F(p, q) when the sign bit of f is positive, and otherwise outputting the negative maximum value of F(p, q);
Step S34: if y is less than (1 << p) - 1 and greater than or equal to 0, the output P is y + 1; taking q bits of f starting from bit t+2 (padding with zeros if fewer than q bits remain) as the output Q; and outputting the sign bit of f as the sign bit;
Step S35: if y is less than 0, logically shifting f right by q + 1 + y bits, taking its low q bits as the output Q, setting the output P to 0, and outputting the sign bit of f as the sign bit.
As to how each of the above steps is implemented, every step may be realized by a program, an analog circuit, a digital circuit, a programmable logic circuit, or the like; the algorithm as described is complete and can be implemented with a digital logic circuit, or a functionally equivalent analog circuit, using basic digital-circuit units: logic AND/OR gates, an integer multiplier, an integer adder, shift registers, and so on. Note that for given p and q, using a parameterized design approach, the algorithm can in principle be applied to all IEEE 754 floating-point formats.
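Putting the four stages together, here is a self-contained end-to-end sketch with half-precision-like parameters p=5, q=10. All names are illustrative; Python signed integers stand in for the patent's fixed-width encodings, the shift in the y < 0 branch is our value-preserving reconstruction of the garbled translated step, and results are truncated rather than rounded:

```python
def fp_mul(a, b, p=5, q=10):
    """Multiply two F(p,q) numbers given as (sign, P, Q) field triples."""
    bias = (1 << (p - 1)) - 1

    def pre(sign, P, Q):                       # step S11: preprocess to M format
        if P == (1 << p) - 1:                  # Inf/NaN -> maximum of same sign
            Q = (1 << q) - 1
        Qs = ((1 << q) | Q) if P else Q        # prepend the hidden 1 for Normals
        return (-Qs if sign else Qs), (P - 1 if P else 0)

    f1, e1 = pre(*a)
    f2, e2 = pre(*b)
    f, e = f1 * f2, e1 + e2                    # steps S12/S13: multiply and add
    sign = 1 if f < 0 else 0                   # step S14: compress back to F(p, q)
    mag = abs(f)
    t = (2 * q + 2) - mag.bit_length()
    y = e + 1 - t - (bias - 1)
    mask = (1 << q) - 1
    if y > (1 << p) - 2:
        return sign, (1 << p) - 1, mask        # saturate on overflow
    sh = (q + 1 - t) if y >= 0 else (q + 1 - t - y)
    Q = (mag >> sh) & mask if sh >= 0 else (mag << -sh) & mask
    return sign, (y + 1 if y >= 0 else 0), Q

def decode(sign, P, Q, p=5, q=10):             # reference value of an F(p,q) triple
    bias = (1 << (p - 1)) - 1
    v = (Q / 2**q) * 2.0 ** (1 - bias) if P == 0 else (1 + Q / 2**q) * 2.0 ** (P - bias)
    return -v if sign else v

# 1.5 * 2.5: (0,15,512) * (0,16,256) -> (0,16,896), i.e. 3.75
assert decode(*fp_mul((0, 15, 512), (0, 16, 256))) == 3.75
```

The SubNormal case works through the same path: multiplying the smallest Normal (0, 1, 0) by 0.5 = (0, 14, 0) yields the SubNormal triple (0, 0, 512).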
The embodiment of the invention provides a floating-point multiplication method for AI calculation, including: preprocessing the input first data and second data respectively to obtain first formatted data and second formatted data in a preset format; inputting the first formatted data and the second formatted data into a multiplier for sign-magnitude signed integer multiplication to obtain a signed product f; inputting the first formatted data and the second formatted data into an adder for unsigned addition to obtain a sum e; and compressing the product f and the sum e to obtain floating-point data in floating-point format F(p, q) for output. The preset format is the M format, which is the preset format of the f and e quantities used in the multiplication and also the preset format obtained by formatting the first data and the second data; all data are converted into the M format when the floating-point multiplication is calculated. The M format is a mapping from a parameterized pair of binary numbers to a real number: the f and e parts are mapped to the integer domain according to the definition of S, and the function is expressed as M(S, (f, e)) := (f << e) >> S, where f is a signed number, e is an unsigned number, and S is a fixed parameter, S = (1 << (p-1)) - 2 + q, with p and q being the parameters of the floating-point format F(p, q). S defines the mapping relation: under the bit widths defined by p and q, M corresponds positionally to the original floating-point data, and the resulting M is not the true value of the original floating-point data. The method can therefore support the Normal and SubNormal floating-point formats and treat Inf and NaN as the maximum value of the same sign, making it suitable for AI calculation.
Referring to fig. 4 and 5, fig. 4 is a schematic diagram illustrating a floating-point multiplication circuit for AI calculation according to an embodiment of the present invention; fig. 5 is a schematic diagram of output processing modules of a floating-point multiplication circuit for AI calculation according to an embodiment of the present invention.
An embodiment of the present invention provides a floating-point multiplication circuit for AI calculation, including:
the preprocessing module 410 is configured to respectively preprocess the input first data and second data to obtain first formatted data and second formatted data in a preset format;
a multiplier 420, configured to perform sign-magnitude signed integer multiplication on the input first formatted data and second formatted data to obtain a signed product f;
the adder 430 is configured to input the first formatted data and the second formatted data to the adder for unsigned addition, so as to obtain a sum e;
the output processing module 440 is configured to compress the product f and the sum e to obtain floating-point data in floating-point format F(p, q) for output;
the preset format is an M format, the M format is a preset format aiming at the multiplication results f and e, and is also a preset format formed by formatting the first data and the second data, and all data are converted into the M format when the floating point multiplication is calculated; the M format is a mapping from a binary ordinal number pair with parameters to a real number, f and e parts are converted to an integer domain according to the definition of S, and the function is expressed as M (S, (f, e)): ═ f < < e > > S; wherein F is a signed number, e is an unsigned number, S is a fixed parameter, S is (1< (P-1)) -2+ Q, and P and Q are parameters of floating point data F (P, Q); s is a mapping relation, based on bit width defined by p and q, the position M corresponds to original floating point data, and the obtained M is not the true value of the original floating point data.
Preferably, the output processing module 440 includes:
a sign acquisition unit 441, configured to take the non-sign bits of f, scan from the high bit to the low bit, and record the zero-based position of the first non-zero bit as t;
a y-value calculation unit 442, configured to calculate the intermediate value y = e + 1 - t - ((1 << (p-1)) - 2);
a first output unit 443, configured to output, if y is greater than (1 << p) - 2, the positive maximum value of F(p, q) when the sign bit of f is positive, and otherwise the negative maximum value of F(p, q);
a second output unit 444, configured to output, if y is less than (1 << p) - 1 and greater than or equal to 0, P as y + 1, take q bits of f starting from bit t+2 (padding with zeros if fewer than q bits remain) as the output Q, and output the sign bit of f as the sign bit;
and a third output unit 445, configured to logically shift f right by q + 1 + y bits if y is less than 0, take its low q bits as the output Q, set the output P to 0, and output the sign bit of f as the sign bit.
Referring to fig. 6 and 7, fig. 6 is a schematic structural diagram of a floating-point multiplication device for AI calculation according to an embodiment of the present invention; fig. 7 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
An embodiment of the present invention provides a floating-point multiplication apparatus 600 for AI calculation, including:
a memory 610 for storing a computer program;
a processor 620, configured to implement the steps of any one of the floating-point multiplication methods for AI calculation described in the first aspect when executing the computer program. The memory 610 provides space for storing program code which, when executed by the processor 620, implements any of the methods in the embodiments of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of any one of the floating-point multiplication methods for AI calculation according to any one of the above embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a logical division, and an actual implementation may divide them differently; a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing over the prior art, or in whole or in part, may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a function calling device, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application.

Although the invention has been described in detail above with reference to a general description and specific examples, it will be apparent to one skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (10)

1. A floating-point multiplication method for AI computation, comprising:
preprocessing the input first data and the input second data respectively to obtain first formatted data and second formatted data in a preset format;
inputting the first formatted data and the second formatted data into a multiplier to perform sign-magnitude ("original code") signed integer multiplication, obtaining a signed product f;
inputting the first formatted data and the second formatted data into an adder for unsigned addition to obtain a sum e;
compressing the product f and the sum e to obtain floating point data in the floating point format F(P, Q) for output;
wherein the preset format is an M format; the M format is the preset format of the multiplication results f and e and is also the preset format obtained by formatting the first data and the second data, and all data are converted into the M format when the floating point multiplication is calculated; the M format is a mapping from a parameterized pair of binary ordinal numbers to a real number, the pair (f, e) being converted to the integer domain according to the definition of S, with the function expressed as M(S, (f, e)) := (f << e) >> S; wherein f is a signed number, e is an unsigned number, S is a fixed parameter, S = (1 << (p - 1)) - 2 + q, and p and q are the parameters of the floating point format F(P, Q); S defines the mapping relation: based on the bit widths defined by p and q, M corresponds to the original floating point data, and the obtained M is not the true value of the original floating point data.
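For illustration only (this sketch is not part of the claims), the M-format mapping defined in claim 1 can be written in Python; the function name m_value and the example parameters p = 8, q = 7 are ours:

```python
def m_value(f: int, e: int, p: int, q: int) -> float:
    """Real value denoted by an M-format pair (f, e).

    Follows the claimed definition M(S, (f, e)) := (f << e) >> S with
    S = (1 << (p - 1)) - 2 + q, where f is signed and e is unsigned.
    The shift by S is modeled as multiplication by 2**(e - S) so that
    fractional results remain visible in the illustration.
    """
    S = (1 << (p - 1)) - 2 + q
    return f * 2.0 ** (e - S)
```

With p = 8 and q = 7, S = (1 << 7) - 2 + 7 = 133, so the pair (f, e) = (128, 126) denotes 128 * 2**(126 - 133) = 1.0.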
2. The floating-point multiplication computation method for AI computation of claim 1,
the preprocessing the input first data and second data to obtain first formatted data and second formatted data in a preset format includes:
receiving 2 input data in the F(P, Q) floating point format, and processing each of them as follows:
judging whether P is all binary 1s; if so, keeping the sign bit unchanged and setting Q to all binary 1s;
judging whether P is all binary 0s; if not, prepending one 1 bit to Q as its highest binary bit, denoting the result as Qs, and assigning P - 1 to the current P; if P is all binary 0s, prepending one 0 bit to Q as its highest binary bit, denoting the result as Qs; Qs is a binary number of q + 1 bits;
generating f according to the sign bit and the value of Qs;
assigning P to e, thereby obtaining the f and e outputs of the M format: the first formatted data M1 = M(S, (f1, e1)) and the second formatted data M2 = M(S, (f2, e2)), where f has q + 2 bits and e has p bits.
3. The floating-point multiplication computation method for AI computation of claim 2,
generating f according to the sign bit and the value of Qs comprises:
if the sign bit is 0, f is the result of prepending a 0 to the highest bit of Qs; if the sign bit is 1, f is the low q + 2 bits of the result of bitwise inverting Qs and then adding 1.
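The preprocessing of claims 2 and 3 can be sketched as follows (illustration only; the function name preprocess and the representation of an F(p, q) input as separate sign/P/Q integers are our assumptions):

```python
def preprocess(sign: int, P: int, Q: int, p: int, q: int):
    """Sketch of the claim-2/3 preprocessing of one F(p, q) input.

    Returns the (f, e) pair of the M format: f is a (q + 2)-bit
    significand (negative values in the low-q+2-bits encoding of
    claim 3), e is a p-bit exponent.
    """
    if P == (1 << p) - 1:          # exponent field all 1s: saturate mantissa
        Q = (1 << q) - 1
    if P != 0:                     # normal number: implicit leading 1
        Qs = (1 << q) | Q          # (q+1)-bit significand 1.Q
        e = P - 1
    else:                          # P all 0s (subnormal): leading 0
        Qs = Q
        e = 0
    if sign == 0:
        f = Qs                     # prepend a 0: non-negative f
    else:
        mask = (1 << (q + 2)) - 1
        f = (~Qs + 1) & mask       # low q+2 bits of bitwise-invert-plus-1
    return f, e
```

For example, with p = 8 and q = 7, the pattern sign = 0, P = 127, Q = 0 yields (f, e) = (128, 126), and the same pattern with sign = 1 yields f = 384, the low 9 bits of -128.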
4. The floating-point multiplication computation method for AI computation of claim 1,
the inputting of the first formatted data and the second formatted data into a multiplier to perform sign-magnitude ("original code") signed integer multiplication, obtaining a signed product f, includes:
performing signed integer multiplication on f1 and f2 to obtain the signed product f, wherein the width of the product f is 2q + 3 bits.
5. The floating-point multiplication computation method for AI computation of claim 1,
inputting the first formatted data and the second formatted data to an adder for unsigned addition to obtain a sum e, including:
performing unsigned addition on e1 and e2 to obtain the sum e, wherein the width of e is p + 1 bits.
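Claims 4 and 5 together form the arithmetic core of the method; a behavioral sketch (illustration only; the names multiply_stage and to_signed are ours, and Python integers stand in for the fixed-width operands) is:

```python
def multiply_stage(f1: int, e1: int, f2: int, e2: int, p: int, q: int):
    """Sketch of claims 4-5: multiply the (q+2)-bit signed significands
    (product 2q + 3 bits wide) and add the unsigned exponents
    (sum p + 1 bits wide)."""
    def to_signed(x: int, bits: int) -> int:
        # reinterpret a low-bits encoding as a signed integer
        return x - (1 << bits) if x & (1 << (bits - 1)) else x
    f = to_signed(f1, q + 2) * to_signed(f2, q + 2)
    e = e1 + e2
    return f, e
```

Continuing the p = 8, q = 7 example, the operands (128, 126) and (128, 126) (each denoting 1.0) give f = 16384 and e = 252.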
6. The floating-point multiplication computation method for AI computation according to any one of claims 1 to 5,
the compressing of the product f and the sum e to obtain floating point data in the floating point format F(P, Q) for output includes:
taking the non-sign bits of f, scanning from the high bit to the low bit, and recording the zero-based position of the first non-0 bit as t;
calculating an intermediate value y = e + 1 - t - ((1 << (p - 1)) - 2);
if y is greater than (1 << p) - 2, outputting the positive maximum value of F(P, Q) when the sign bit of f is positive, and otherwise outputting the negative maximum value of F(P, Q);
if y is less than (1 << p) - 1 and y is greater than or equal to 0, outputting P = y + 1, taking q bits of f starting from the (t + 2)-th bit (zero-padding if fewer than q bits remain) as the output Q, and outputting the sign bit of f as the sign bit;
if y is less than 0, logically right-shifting f by q + 1 + y bits, taking the low q bits as the output Q, outputting P = 0, and outputting the sign bit of f as the sign bit.
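The three output branches of claim 6 can be sketched behaviorally in Python. This is our reading, not the circuit itself: the function name compress is ours, Python integers stand in for the fixed-width product, and in particular the subnormal branch derives its shift amount from the value semantics of the M format rather than applying the claim's literal "q + 1 + y" expression:

```python
def compress(f: int, e: int, p: int, q: int):
    """Behavioral sketch of the claim-6 compression of the product
    (f, e) back into F(p, q) fields (sign, P, Q); truncating, as the
    claim specifies no rounding."""
    sign = 1 if f < 0 else 0
    mag = abs(f)                       # magnitude (non-sign) bits of f
    width = 2 * q + 2                  # product magnitude width
    t = width - mag.bit_length()       # zero-based position of first 1 from MSB
    y = e + 1 - t - ((1 << (p - 1)) - 2)
    qmask = (1 << q) - 1
    if y > (1 << p) - 2:               # overflow: saturate to +/- max
        return sign, (1 << p) - 1, qmask
    if y >= 0:                         # normal result: exponent field y + 1
        shift = mag.bit_length() - 1 - q
        Q = (mag >> shift) & qmask if shift >= 0 else (mag << -shift) & qmask
        return sign, y + 1, Q
    # subnormal: align the significand to the fixed subnormal scale
    shift = (mag.bit_length() - 1 - q) - y
    Q = (mag >> shift) if shift >= 0 else (mag << -shift)
    return sign, 0, Q & qmask
```

Chaining the earlier example (p = 8, q = 7): the product f = 16384, e = 252 of two operands denoting 1.0 compresses to sign = 0, P = 127, Q = 0, i.e. the F(8, 7) encoding of 1.0; lowering e to 125 drives y below 0 and produces the subnormal encoding P = 0, Q = 64.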
7. A floating-point multiplication circuit for AI computation, comprising:
the preprocessing module is used for respectively preprocessing the input first data and the input second data to obtain first formatted data and second formatted data in preset formats;
the multiplier is used for performing sign-magnitude ("original code") signed integer multiplication on the first formatted data and the second formatted data to obtain a signed product f;
the adder is used for inputting the first formatted data and the second formatted data into the adder to carry out unsigned addition to obtain a sum e;
the output processing module is used for compressing the product f and the sum e to obtain floating point data in the floating point format F(P, Q) for output;
wherein the preset format is an M format; the M format is the preset format of the multiplication results f and e and is also the preset format obtained by formatting the first data and the second data, and all data are converted into the M format when the floating point multiplication is calculated; the M format is a mapping from a parameterized pair of binary ordinal numbers to a real number, the pair (f, e) being converted to the integer domain according to the definition of S, with the function expressed as M(S, (f, e)) := (f << e) >> S; wherein f is a signed number, e is an unsigned number, S is a fixed parameter, S = (1 << (p - 1)) - 2 + q, and p and q are the parameters of the floating point format F(P, Q); S defines the mapping relation: based on the bit widths defined by p and q, M corresponds to the original floating point data, and the obtained M is not the true value of the original floating point data.
8. The floating-point multiplication circuit for AI computations of claim 7,
the output processing module comprises:
a sign acquisition unit for taking the non-sign bits of f, scanning from the high bit to the low bit, and recording the zero-based position of the first non-0 bit as t;
a y value calculation unit for calculating an intermediate value y = e + 1 - t - ((1 << (p - 1)) - 2);
a first output unit for outputting, if y is greater than (1 << p) - 2, the positive maximum value of F(P, Q) when the sign bit of f is positive, and otherwise the negative maximum value of F(P, Q);
a second output unit for outputting P = y + 1 if y is less than (1 << p) - 1 and y is greater than or equal to 0, taking q bits of f starting from the (t + 2)-th bit (zero-padding if fewer than q bits remain) as the output Q, and outputting the sign bit of f as the sign bit;
and a third output unit for logically right-shifting f by q + 1 + y bits if y is less than 0, taking the low q bits as the output Q, outputting P = 0, and outputting the sign bit of f as the sign bit.
9. A floating-point multiplication computation apparatus for AI computation, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the floating-point multiplication method for AI calculations according to any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which, when being executed by a processor, carries out the steps of the floating-point multiplication method for AI calculations according to any one of claims 1 to 6.
CN201910860148.5A 2019-09-11 2019-09-11 Floating point multiplication method, circuit and equipment for AI (artificial intelligence) calculation Active CN110688090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910860148.5A CN110688090B (en) 2019-09-11 2019-09-11 Floating point multiplication method, circuit and equipment for AI (artificial intelligence) calculation

Publications (2)

Publication Number Publication Date
CN110688090A CN110688090A (en) 2020-01-14
CN110688090B true CN110688090B (en) 2021-10-12

Family

ID=69109054


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107305484A * 2016-04-19 2017-10-31 北京中科寒武纪科技有限公司 Nonlinear function arithmetic unit and method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050228844A1 (en) * 2004-04-08 2005-10-13 International Business Machines Corporation Fast operand formatting for a high performance multiply-add floating point-unit
CN100570552C * 2007-12-20 2009-12-16 清华大学 Parallel floating-point multiply-add unit
CN107273090B (en) * 2017-05-05 2020-07-31 中国科学院计算技术研究所 Approximate floating-point multiplier and floating-point multiplication oriented to neural network processor
CN109284083A * 2018-09-14 2019-01-29 北京探境科技有限公司 Multiplier unit and method
CN110069240B (en) * 2019-04-30 2021-09-03 北京探境科技有限公司 Fixed point and floating point data calculation method and device



Similar Documents

Publication Publication Date Title
US7689639B2 (en) Complex logarithmic ALU
US10949168B2 (en) Compressing like-magnitude partial products in multiply accumulation
CN110888623B (en) Data conversion method, multiplier, adder, terminal device and storage medium
CN115658008A (en) Resource multiplexing type transcendental function operation implementation method
CN117972761A (en) Data processing method and device based on SM2 cryptographic algorithm
CN110688090B (en) Floating point multiplication method, circuit and equipment for AI (artificial intelligence) calculation
CN112558920B (en) Signed/unsigned multiply-accumulate device and method
Golomb et al. Integer Convolutions over the Finite Field GF(3⋅2^n+1)
JPH01209529A (en) Arithmetic unit for inverse trigonometrical function
CN111984226A (en) Cube root solving device and solving method based on hyperbolic CORDIC
CN113126954B (en) Method, device and arithmetic logic unit for floating point number multiplication calculation
CN115827555B (en) Data processing method, computer device, storage medium, and multiplier structure
Gustafsson et al. Low-complexity constant coefficient matrix multiplication using a minimum spanning tree approach
Panda Performance Analysis and Design of a Discreet Cosine Transform processor Using CORDIC algorithm
CN113625990B (en) Floating point-to-fixed point device, method, electronic equipment and storage medium
Vestias et al. Parallel decimal multipliers and squarers using Karatsuba-Ofman's algorithm
CN115034163A (en) Floating point number multiply-add computing device supporting two data format switching
US10447983B2 (en) Reciprocal approximation circuit
Zhang et al. Efficient Configurable Modular Multiplier For RNS
Pitchika et al. Fast Base Extension using Single Redundant Modulus in a Residue Number System
JP2609630B2 (en) Divider and division method
Dubois et al. Number theoretic transforms with modulus 2 2q-2 q+ 1
CN111752532B (en) Method, system and device for realizing 32-bit integer division with high precision
JP3206773B2 (en) Digital signal processing quadrature modulator
CN116865763A (en) High-linearity edge ramp wave generation device and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A floating-point multiplication calculation method, circuit, and device for AI computing

Granted publication date: 20211012

Pledgee: Jiang Wei

Pledgor: BEIJING INTENGINE TECHNOLOGY Co.,Ltd.

Registration number: Y2024980019734