CN112416295B - Arithmetic unit for floating point data and tensor data operation - Google Patents


Info

Publication number
CN112416295B
CN112416295B (application CN202011427161.0A)
Authority
CN
China
Prior art keywords
value
tensor data
shared
data
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011427161.0A
Other languages
Chinese (zh)
Other versions
CN112416295A (en)
Inventor
罗闳訚
何日辉
周志新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Yipu Intelligent Technology Co ltd
Original Assignee
Xiamen Yipu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Yipu Intelligent Technology Co ltd filed Critical Xiamen Yipu Intelligent Technology Co ltd
Priority to CN202011427161.0A
Publication of CN112416295A
Application granted
Publication of CN112416295B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F 7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F 7/48 Methods or arrangements for performing computations using exclusively denominational number representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F 7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F 7/487 Multiplying; Dividing
    • G06F 7/4876 Multiplying

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Nonlinear Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses an arithmetic unit for floating point data and tensor data operation. The arithmetic unit for tensor data operation comprises two input tensor data and their shared E values, and one output tensor data and its shared E value, wherein each number of the tensor data is represented by S+F in the EF16 data format, S being the sign value of the EF16 data and F the fraction value; the shared E value of the tensor data is the exponent value of the EF16 data. The numerical expression formula of EF16 data is: (-1)^signbit × 2^(-exponent) × fraction, where signbit is the sign value, exponent is the exponent value, and fraction is the fraction value. When the arithmetic unit executes operations such as multiplication and addition, the S+F parts and the shared E value parts of the two input tensor data can each be multiplied or added separately, without data conversion, which effectively simplifies tensor data operations and improves the computational efficiency of tensor data.

Description

Arithmetic unit for floating point data and tensor data operation
Technical Field
The invention relates to the field of computer mathematical computation, in particular to an arithmetic unit for floating point data and tensor data operation.
Background
The neural network algorithm uses tensor data in floating point format to perform operations such as addition, multiplication, multiply-accumulate and the like. To balance precision and speed of operation, a half-precision floating point format is typically used.
The most common half-precision floating point format is FP16, which is interpreted as follows:
(1) If the exponent bits are all 0:
if the fraction bits are all 0, the number 0 is represented;
if the fraction bits are not all 0, a very small number (a subnormal number) is represented, calculated as (-1)^signbit × 2^(-14) × (fraction/1024);
(2) If the exponent bits are all 1:
if the fraction bits are all 0, ±inf is represented;
if the fraction bits are not all 0, NaN is represented;
(3) In all other cases of the exponent bits:
the calculation formula is (-1)^signbit × 2^(exponent-15) × (1 + fraction/1024).
From the above FP16 rules, the data expression divides into three cases according to the content of the exponent: all 0, all 1, or otherwise. An FP16 calculation must therefore first examine the exponent value to determine which expression applies, and so contains a large amount of control logic.
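The three-way branch on the exponent field can be made concrete with a short decoder (a hedged sketch using the standard IEEE 754 binary16 constants; the helper name is ours, not the patent's):

```python
def decode_fp16(bits: int) -> float:
    """Decode a 16-bit half-precision pattern, branching on the exponent
    field exactly as the three cases above describe."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 10) & 0x1F   # 5 exponent bits
    fraction = bits & 0x3FF          # 10 fraction bits
    if exponent == 0:                # case (1): zero or subnormal
        value = (fraction / 1024) * 2.0 ** -14
    elif exponent == 0x1F:           # case (2): infinity or NaN
        value = float('inf') if fraction == 0 else float('nan')
    else:                            # case (3): normal numbers
        value = (1 + fraction / 1024) * 2.0 ** (exponent - 15)
    return -value if sign else value
```

The three branches are exactly the control logic the patent argues a hardware FP16 path must carry on every operation.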
In addition, the data expression used in most cases of FP16 is
(-1)^signbit × 2^(exponent-15) × (1 + fraction/1024).
Two FP16 data are then:
(-1)^s1 × 2^(e1-15) × (1 + f1/1024)
(-1)^s2 × 2^(e2-15) × (1 + f2/1024)
and the multiplication between them can be expressed as:
(-1)^(s1+s2) × 2^(e1+e2-30) × (1 + f1/1024) × (1 + f2/1024)
Finally, the multiplication of two FP16 data consists of the following calculations:
(1) s3 = s1 + s2
(2) e3 = (e1 + e2 - 15)
(3) (1 + f3/1024) = (1 + f1/1024) × (1 + f2/1024), with renormalization when the product reaches 2
Therefore, the multiplication of FP16 data consists of multiple addition, multiplication and division operations, and its hardware implementation is very complex. Similarly, the addition of FP16 data is very complex, which seriously affects the processing efficiency of neural network algorithms.
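The three-step decomposition above can be sketched for normal operands (hypothetical helper; no rounding, overflow or special-case handling, which is precisely what a real FP16 unit must add):

```python
def fp16_mul_fields(s1: int, e1: int, f1: int, s2: int, e2: int, f2: int):
    """Multiply two normal FP16 numbers given as raw (sign, exponent,
    10-bit fraction) fields; returns raw result fields, unrounded."""
    s3 = s1 ^ s2                        # step (1): sign bits add mod 2
    e3 = e1 + e2 - 15                   # step (2): remove one bias of 15
    sig = (1 + f1 / 1024) * (1 + f2 / 1024)  # step (3): significand product
    if sig >= 2:                        # product lies in [1, 4): renormalize
        sig /= 2
        e3 += 1
    f3 = round((sig - 1) * 1024)
    return s3, e3, f3
```

For example, 1.5 × 2.5 encodes as fields (0, 15, 512) and (0, 16, 256) and yields (0, 16, 896), i.e. 1.875 × 2^1 = 3.75.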
Disclosure of Invention
The invention aims to overcome at least one deficiency of the prior art by providing an arithmetic unit for floating point data and tensor data operation, based on a novel half-precision floating point expression and operation method, so as to restructure floating point operations, simplify tensor operation steps, and improve the processing capacity of large-data-volume workloads such as neural network algorithms.
To achieve the above object, the present invention proposes an arithmetic unit for floating point data operations, the arithmetic unit being used to perform a multiplication or addition operation and comprising two input floating point data and their exponent values and one output floating point data and its exponent value, the input floating point data and output floating point data being represented in the EF16 data format,
the numerical expression formula of the EF16 data is as follows:
(-1)^signbit × 2^(-exponent) × fraction
wherein signbit is the sign value; exponent is the exponent value; fraction is the fraction value;
in the arithmetic unit,
the first input floating point data is expressed as two parts of S1+F1 and E1;
the other input floating point data is expressed as two parts of S2+F2 and E2;
the output floating point data is expressed as S3+F3 and E3;
wherein S1, S2 and S3 are symbol values; e1, E2 and E3 are index values; f1, F2, F3 are fractional values.
Further, the data bit width of the EF16 data is 21 bits, comprising a sign bit of width 1 bit, exponent bits of width 5 bits, and fraction bits of width 15 bits.
Further, the operator is a multiplier for performing a multiplication operation, the multiplication operation of the multiplier being expressed as:
when E3 is not specified, s3=s1+s2; f3 =f1×f2; e3 =e1+e2; or (b)
At the time of E3 designation, s3=s1+s2; f3 =f1×f2> > E, where e=e3-E1-E2, > > represents a shift-to-right operation.
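A literal sketch of the two rules above (helper name ours; the shift amount E = E3 - E1 - E2 is taken verbatim from the patent's formula, and no saturation or bit-width clipping is modeled):

```python
def ef16_mul(s1: int, f1: int, e1: int, s2: int, f2: int, e2: int, e3=None):
    """EF16 multiply: sign values add (mod 2), integer fractions multiply,
    exponent values add; a specified E3 forces a right shift of the
    fraction product by E = E3 - E1 - E2."""
    s3 = (s1 + s2) % 2
    if e3 is None:
        return s3, f1 * f2, e1 + e2
    return s3, (f1 * f2) >> (e3 - e1 - e2), e3
```

Note there is no branch on special exponent patterns: unlike FP16, the same two integer operations apply to every operand.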
Further, the arithmetic unit is an adder for performing an addition operation, and the precondition for the adder to perform the addition is E1 = E2;
the addition operation of the adder is expressed as:
(S3+F3)=(S1+F1)+(S2+F2)。
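Under the precondition E1 = E2, the addition collapses to one integer add (sketch; S+F is treated as a single signed integer, as the embodiments later describe):

```python
def ef16_add(sf1: int, e1: int, sf2: int, e2: int):
    """EF16 add: when the shared exponents match, the signed (S+F)
    integers are added directly and the exponent passes through."""
    assert e1 == e2, "precondition: exponent values must be equal"
    return sf1 + sf2, e1
```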
the input and output of the arithmetic unit for floating point data operation are expressed in the EF16 data format. EF16 data has a better small-number expression range, while its maximum expression range is essentially the same as that of FP16 data; the exponent field directly represents the exponent value of the half-precision floating point data, so the (exponent - 15) bias subtraction need not be executed, which simplifies the mathematical expression of operations between two EF16 data; the fraction value need not be a true decimal fraction and is used directly as an integer. Provided the precondition of the operation is met, the arithmetic unit can directly add or multiply the fraction values of two EF16 data when performing an addition or multiplication, which effectively simplifies floating point operations and improves the computational efficiency of floating point data.
The invention also provides an arithmetic unit for tensor data operation, characterized by comprising two input tensor data and their shared E values and one output tensor data and its shared E value, wherein each number of the tensor data is represented by S+F in the EF16 data format, S being the sign value of the EF16 data and F the fraction value of the EF16 data; the shared E value of the tensor data is the exponent value of the EF16 data; according to the number of shared E values, the tensor data is expressed either as E-value-sharing tensor data or as split-channel E-value-sharing tensor data;
the numerical expression formula of the EF16 data is as follows:
(-1)^signbit × 2^(-exponent) × fraction
wherein signbit is the sign value; exponent is the exponent value; fraction is the fraction value;
the E-value sharing tensor data represents: all numbers in the tensor data share a shared E value;
the separation channel E value sharing tensor data representation: tensor data has c channels, each channel having a shared E value, each shared E value being shared only among the data within each channel;
the operator is used for executing multiplication operation, addition operation or multiplication accumulation operation of tensor data.
Further, the shared E value is transmitted as a parameter of the tensor data.
Further, the operator is a multiplier for performing tensor data multiplication; the input and output of the multiplier are E-value-sharing tensor data, and the multiplication operation of the multiplier is expressed as:
for two tensor data of the same size, each number is multiplied with its counterpart to obtain process tensor data of the same size;
the shared E values of the two input tensor data are added to obtain the shared E value of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E value are assigned to the output tensor data and its shared E value; when the shared E value of the output tensor data is specified, each number of the process tensor data is right-shifted by the difference between the specified shared E value and the shared E value of the process tensor data, to generate the output tensor data.
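The three steps above can be sketched as follows (flat Python lists stand in for (h, w, c) tensors of signed S+F integers; the function name is ours):

```python
def ef16_tensor_mul(t1, e1, t2, e2, e3=None):
    """E-value-sharing tensor multiply: one integer multiply per element
    plus a single exponent addition for the whole tensor."""
    process = [a * b for a, b in zip(t1, t2)]   # elementwise S+F products
    process_e = e1 + e2                          # shared E of process tensor
    if e3 is None:
        return process, process_e
    shift = e3 - process_e                       # right shift toward given E3
    return [x >> shift for x in process], e3
```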
Further, the arithmetic unit is a multiplier for performing tensor data multiplication; the input and output of the multiplier are split-channel E-value-sharing tensor data, and the multiplication operation of the multiplier is expressed as:
for two tensor data of the same size with c channels, each number is multiplied with its counterpart to obtain process tensor data of the same size;
for each of the c channels, the shared E values of the corresponding channels of the two input tensor data are added to obtain the c shared E values of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E values are assigned to the output tensor data and its shared E values; when the shared E value of the output tensor data is specified, each number of the process tensor data is right-shifted by the difference between the corresponding specified shared E value and the shared E value of the process tensor data, to generate the output tensor data.
Further, the arithmetic unit is an adder for executing tensor data addition; the input and output of the adder are E-value-sharing tensor data,
and the addition operation of the adder is expressed as:
subject to the operation precondition that the shared E values of the two input tensor data of the same size are equal:
each number of the two tensor data of the same size is added to its counterpart to obtain process tensor data of the same size;
the shared E value of the input tensor data is taken as the shared E value of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E value are assigned to the output tensor data and its shared E value; when the shared E value of the output tensor data is specified, each number of the process tensor data is right-shifted by the difference between the specified shared E value and the shared E value of the process tensor data, to generate the output tensor data.
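A minimal sketch of this adder (flat lists for tensors; the equal-E precondition is checked explicitly):

```python
def ef16_tensor_add(t1, e1, t2, e2):
    """E-value-sharing tensor add: with equal shared E values the signed
    S+F integers add elementwise and the exponent passes through."""
    assert e1 == e2, "precondition: the two shared E values must match"
    return [a + b for a, b in zip(t1, t2)], e1
```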
Further, the arithmetic unit is an adder for executing tensor data addition; the input and output of the adder are split-channel E-value-sharing tensor data,
and the addition operation of the adder is expressed as:
subject to the operation precondition that, channel by channel, the shared E values of the two input tensor data of the same size are equal:
each number of the two tensor data of the same size is added to its counterpart to obtain process tensor data of the same size;
the shared E value of each channel of the input tensor data is taken as the shared E value of the corresponding channel of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E values are assigned to the output tensor data and its shared E values; when the shared E value of the output tensor data is specified, each number of the process tensor data is right-shifted by the difference between the corresponding specified shared E value and the shared E value of the process tensor data, to generate the output tensor data.
Further, the operator is a multiply accumulator for performing a tensor data multiply-accumulate operation; the input and output of the multiply accumulator are E-value-sharing tensor data, and the multiply-accumulate operation of the multiply accumulator is expressed as:
for two tensor data of the same size, each number is multiplied with its counterpart to obtain first process tensor data of the same size;
the numbers of the first process tensor data are accumulated to form second process tensor data whose size is 1 in every dimension;
the shared E values of the two input tensor data are added to obtain the shared E value of the second process tensor data;
when the shared E value of the output tensor data is not specified, the second process tensor data and its shared E value are assigned to the output tensor data and its shared E value; when the shared E value of the output tensor data is specified, each number of the second process tensor data is right-shifted by the difference between the specified shared E value and the shared E value of the second process tensor data, to generate the output tensor data.
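The multiply-accumulate path reduces to an integer dot product plus one exponent addition (sketch with flat lists; the all-ones-size output is represented here by a single S+F value):

```python
def ef16_tensor_mac(t1, e1, t2, e2, e3=None):
    """E-value-sharing multiply-accumulate: elementwise products are
    summed into one S+F value whose shared E is E1 + E2."""
    acc = sum(a * b for a, b in zip(t1, t2))   # integer dot product
    acc_e = e1 + e2
    if e3 is None:
        return acc, acc_e
    return acc >> (e3 - acc_e), e3             # align to the specified E3
```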
Further, the operator is a multiply accumulator for performing a tensor data multiply-accumulate operation; the input and output of the multiply accumulator are split-channel E-value-sharing tensor data, and the multiply-accumulate operation of the multiply accumulator is expressed as:
for two tensor data of the same size with c channels, each number is multiplied with its counterpart to obtain first process tensor data of the same size;
the numbers of the first process tensor data are accumulated within each channel to form second process tensor data whose channel dimension is c and whose other dimensions are 1;
the shared E values of the corresponding channels of the two input tensor data are added to obtain the c shared E values of the second process tensor data;
when the shared E value of the output tensor data is not specified, the second process tensor data and its shared E values are assigned to the output tensor data and its shared E values; when the shared E value of the output tensor data is specified, each number of the second process tensor data is right-shifted by the difference between the corresponding specified shared E value and the shared E value of the second process tensor data, to generate the output tensor data.
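In the split-channel variant each channel accumulates independently with its own exponent pair (sketch; channels are given as lists of flat per-channel lists, and all names are ours):

```python
def ef16_tensor_mac_split(ch1, e1_per_ch, ch2, e2_per_ch):
    """Split-channel multiply-accumulate: c independent dot products and
    c exponent additions, one per channel."""
    sums = [sum(a * b for a, b in zip(c1, c2)) for c1, c2 in zip(ch1, ch2)]
    es = [ea + eb for ea, eb in zip(e1_per_ch, e2_per_ch)]
    return sums, es
```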
By extracting the shared E value (exponent value), the tensor data is divided into an integer part and a shared exponent part, which greatly simplifies the tensor data format; the arithmetic unit reduces floating point operations between tensor data to integer multiplication, addition and multiply-accumulate operations plus exponent additions, so the operation speed of the neural network can be greatly improved.
Drawings
FIG. 1 is a schematic diagram of an E-value sharing EF16 tensor data structure according to the present invention;
FIG. 2 is a diagram of a split channel E-value sharing EF16 tensor data structure according to the present invention;
FIG. 3 is a schematic diagram of an E-value sharing EF16 tensor multiplication operation according to the present invention;
FIG. 4 is a simplified diagram of an E-value sharing EF16 tensor multiplication operation of the present invention;
FIG. 5 is a schematic diagram of E-value sharing EF16 tensor addition according to the present invention;
FIG. 6 is a simplified diagram of an E-value sharing EF16 tensor addition operation of the present invention;
FIG. 7 is a schematic diagram of an E-value sharing EF16 tensor multiply-accumulate operation according to the present invention;
FIG. 8 is a simplified diagram of an E-value sharing EF16 tensor multiply-accumulate operation in accordance with the present invention;
FIG. 9 is a schematic diagram of a split-channel E-value sharing EF16 tensor multiply-accumulate operation in accordance with the present invention;
FIG. 10 is a simplified diagram of a split-channel E-value sharing EF16 tensor multiply-accumulate operation in accordance with the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the invention. For better illustration of the following embodiments, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the actual product dimensions; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
Example 1
The invention provides a half-precision floating point expression, called EF16, for neural network computation data and tensor data.
EF16 basic expression pattern:
the bit width of EF16 data is 21 bits, and the floating point number is expressed as follows:
(-1)^signbit × 2^(-exponent) × fraction
wherein signbit is the sign value, represented by S; exponent is the exponent value, represented by E; and fraction is the fraction value, represented by F. EF16 data can therefore be expressed as S+E+F.
According to the above formula, the minimum positive number expressible by EF16 is:
sign 0, exponent 11111₂, fraction 000000000000001₂: 2^(-31) × 1 ≈ 0.00000000046566
and the maximum number is:
sign 0, exponent 00000₂, fraction 111111111111111₂: 2^0 × 32767 = 32767
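The extreme positive magnitudes follow directly from the 5-bit subtracted exponent and the 15-bit fraction (note that fifteen one-bits give 2^15 - 1 = 32767):

```python
# Smallest positive EF16 value: exponent field 11111 (31), fraction 1.
MIN_EF16 = 2.0 ** -31 * 1
# Largest EF16 value: exponent field 00000, fraction all ones.
MAX_EF16 = 2.0 ** 0 * (2 ** 15 - 1)
```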
the FP16 and EF16 data expression ranges are compared as follows, and it can be seen that EF16 has a better small expression range and the maximum value indicates that the range is substantially the same as FP 16:
EF16 compares with other floating point number expressions as follows:
multiplication of EF16 data:
the two input EF16 data are as follows:
(-1)^s1 × 2^(-e1) × f1
(-1)^s2 × 2^(-e2) × f2
the multiplication between them can be expressed as:
(-1)^(s1+s2) × 2^(-(e1+e2)) × (f1 × f2)
in this embodiment, EF16 data is represented as two parts, S+F and E. The first input floating point data is represented as the two parts S1+F1 and E1; the other input floating point data as S2+F2 and E2; and the output floating point data as S3+F3 and E3. (s1 and S1, s2 and S2, e1 and E1, e2 and E2, f1 and F1, f2 and F2 carry the same meanings; for convenience, lower-case letters are used in formulas and upper-case letters in the running text.)
the multiplication of EF16 data can be expressed as:
when E3 is not specified: S3 = S1 + S2; F3 = F1 × F2; E3 = E1 + E2; or
when E3 is specified: S3 = S1 + S2; F3 = (F1 × F2) >> E, where E = E3 - E1 - E2 and >> denotes a right-shift operation.
Addition of EF16 data:
in this example, the precondition for the addition operation of EF16 data is e1=e2;
the addition of EF16 data is expressed as:
(S3+F3)=(S1+F1)+(S2+F2)。
according to the above established half-precision floating point expression form EF16 for the neural network tensor data and the rule of multiplication and addition of the EF16 floating point data, in this embodiment of the present invention, an arithmetic unit for floating point data is constructed, and the arithmetic unit supports multiplication and addition of the EF16 floating point data.
The arithmetic unit for EF16 floating point data comprises two input EF16 floating point data and one output EF16 floating point data, wherein during operation each EF16 floating point datum is split into an S+F value and an E value, which are computed on separately; S represents the sign bit of the EF16 data, F the fraction bits, and E the exponent bits.
Specifically, in the arithmetic unit,
the first input floating point data is expressed as two parts of S1+F1 and E1;
the other input floating point data is expressed as two parts of S2+F2 and E2;
the output floating point data is expressed as S3+F3 and E3;
wherein S1, S2 and S3 are symbol values; e1, E2 and E3 are index values; f1, F2, F3 are fractional values.
The multiplication operation of the multiplier is expressed as:
when E3 is not specified: S3 = S1 + S2; F3 = F1 × F2; E3 = E1 + E2; or
when E3 is specified: S3 = S1 + S2; F3 = (F1 × F2) >> E, where E = E3 - E1 - E2 and >> denotes a right-shift operation.
The operation method can be implemented in software, in hardware, or in a combination of the two, forming the basic operation units of a neural network computation module, such as multipliers and adders, to execute multiplication and addition operations between two floating point data.
The arithmetic unit for floating point data operation can realize the following technical effects:
the input and output of the arithmetic unit for floating point data operation are expressed by EF16 data format. EF16 data has a better small expression range, and the maximum expression range is basically the same as that of FP16 data; the exponents of the floating point data directly represent the exponent values of the semi-precision floating point data, and the operation of the exponents-15 is not needed, so that the mathematical expression of the two EP16 data operations is simplified; the fraction value of the fraction value does not need to be a decimal valueUnder the condition of meeting the precondition of operation, the arithmetic operation can directly carry out addition or multiplication processing on the decimal values of two EF16 data when carrying out addition or multiplication operation, thereby effectively simplifying the operation of floating point data and improving floating pointThe computational efficiency of the data.
Example 2
EF16 tensor data:
the EF16 is a half-precision floating point format specifically proposed for tensor data, and when the data type is tensor data, all tensor data share the same exponent value (E value), hereinafter referred to as E value.
For example, one size (h, w, c) of the EF16 tensor data is shown in fig. 1.
Each number in the tensor data is expressed using only the 16-bit S+F part of the EF16 data (specifically, as a signed integer); all the data share the same E value, which is transmitted as a parameter of the tensor data. We call tensor data in this EF16 format E-value-sharing EF16 tensor data.
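A container for E-value-sharing EF16 tensor data might look like this (hypothetical sketch; the field names are ours, not the patent's):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class EF16Tensor:
    """E-value-sharing EF16 tensor: a block of 16-bit signed S+F
    integers plus one shared exponent carried as a parameter."""
    shape: Tuple[int, int, int]   # (h, w, c)
    sf: List[int]                 # flat signed S+F values, len == h*w*c
    shared_e: int                 # exponent shared by every element

    def value(self, i: int) -> float:
        # decode one element: signed S+F integer scaled by 2^(-E)
        return self.sf[i] * 2.0 ** -self.shared_e
```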
In addition, EF16 tensor data with c channels can instead carry c E values, where each E value is shared only among the h×w data within its channel. We call such EF16 tensor data split-channel E-value-sharing tensor data; a split-channel E-value-sharing EF16 tensor of size (h, w, c) is shown in fig. 2.
E-value-sharing EF16 tensor data supports multiplication, addition and multiply-accumulate operations. The multiplication operation takes two tensor data of size (h, w, c) and multiplies each number with its counterpart to obtain new tensor data of size (h, w, c); the addition operation takes two tensor data of size (h, w, c) and adds each number to its counterpart to obtain new tensor data of size (h, w, c); the multiply-accumulate operation takes two tensor data of size (h, w, c), multiplies each number with its counterpart and accumulates the products to obtain new tensor data of size (1, 1, 1).
Split-channel E-value-sharing EF16 tensor data supports multiplication, addition and split-channel multiply-accumulate operations. The multiplication operation takes two tensor data of size (h, w, c) and multiplies each number with its counterpart to obtain new tensor data of size (h, w, c); the addition operation takes two tensor data of size (h, w, c) and adds each number to its counterpart to obtain new tensor data of size (h, w, c); the split-channel multiply-accumulate operation takes two tensor data of size (h, w, c) and, within each channel, multiplies each number with its counterpart and accumulates the products to obtain new tensor data of size (1, 1, c).
(1) E-value sharing EF16 tensor multiplication operation
As shown in fig. 3, the three tensors have the same size (h, w, c), and their shared E values E1, E2 and E3 are determined in advance by the system according to the characteristics of the neural network data, so the floating-point multiplication of each number in the floating-point tensor data reduces to the multiplication and shift operations shown in fig. 4.
The split-channel E-value-sharing EF16 tensor multiplication operation is quite similar to the E-value-sharing EF16 tensor multiplication operation described above, except that the tensor data is replaced with the split-channel E-value-sharing EF16 tensor data described above.
(2) E-value sharing EF16 tensor addition operation
As shown in fig. 5, the three tensors have the same size (h, w, c), and the three tensors must have the same shared E value, at which time the floating point addition of each number in the floating point tensor data is reduced to the addition operation shown in fig. 6.
In addition, the split-channel E-value-sharing EF16 tensor addition operation is very similar to the E-value-sharing EF16 tensor addition operation described above, except that the tensor data is replaced by split-channel E-value-sharing EF16 tensor data and all the E values in the split-channel E-value-sharing EF16 tensor data are equal.
(3) E-value sharing EF16 tensor multiply-accumulate operation
As shown in fig. 7, the two tensors have the same size (h, w, c), and their shared E values E1 and E2 are determined in advance by the system according to the characteristics of the neural network data. The E value of the multiply-accumulate result equals E1+E2, and the accumulated data can be shifted correspondingly according to the requirements of subsequent operations. The floating-point multiply-accumulate of each number in the floating-point tensor data thus reduces to the multiply-accumulate operation shown in fig. 8.
(4) Split-channel E-value-sharing EF16 tensor multiply-accumulate operation
As shown in fig. 9, in the split-channel multiply-accumulate operation the split-channel E-value-sharing tensor data is processed with its c channels completely separated: multiply-accumulation occurs only among the numbers within the h and w extent of each channel, as shown in fig. 10.
It should be noted that, for ease of understanding, the examples use three-dimensional tensor data that can be depicted as a physical structure to illustrate the differences between exponent-value-sharing tensor data and split-channel exponent-value-sharing tensor data; in actual data calculation, the dimensionality of the tensor data is not limited.
According to the rule of multiplication, addition and multiply-accumulate of EF16 tensor data, an arithmetic unit for tensor data operation is also constructed in the embodiment of the invention.
The arithmetic unit comprises two input tensor data and their shared E values, and one output tensor data and its shared E value, wherein each number of the tensor data is represented by S+F in the EF16 data format, S being the sign value of the EF16 data, F the fraction value, and the shared E value being the exponent value of the EF16 data; according to the number of shared E values, the tensor data is expressed either as E-value-sharing tensor data or as split-channel E-value-sharing tensor data;
the E-value-sharing tensor data means: all numbers in the tensor data share one shared E value;
the split-channel E-value-sharing tensor data means: the tensor data has c channels, each channel having its own shared E value, each shared E value being shared only among the data within its channel.
According to the classification of the tensor data, the operators for tensor data operation specifically include: a multiplier, an adder and a multiply accumulator for E-value-sharing tensor data operations; and a multiplier, an adder and a multiply accumulator for split-channel E-value-sharing tensor data operations.
At the output of the operator, the shared E value of the output tensor data may be specified according to the requirements of the neural network application. When it is specified, each new tensor data generated in the operation process is right-shifted by the difference between the specified shared E value of the output tensor data and the shared E value of the new tensor data, to generate the output tensor data.
The operation method can be implemented in software, in hardware, or in a combination of the two, forming the basic operation units of a neural network computation module, such as multipliers, adders and multiply accumulators, to execute multiplication, addition and multiply-accumulate operations between two tensor data.
The arithmetic unit for tensor data operations achieves the following technical effects:
By extracting the shared E value (exponent value), tensor data is split into an integer part and a shared exponent part, which greatly simplifies the tensor data format. Floating-point operations between tensor data are thereby reduced to integer multiplication, addition and multiply-accumulate operations plus exponent additions, which can greatly increase the operation speed of the neural network.
In neural network computation, the shared E value of each tensor data may be specified in advance, so that the tensor computation becomes independent of the exponent value and the computation process is further simplified.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (2)

1. An arithmetic unit for floating-point data operations, implemented in hardware or in a combination of hardware and software, forming a basic arithmetic unit of a neural network computing module comprising a multiplier and an adder, characterized in that: the arithmetic unit performs a multiplication or addition operation and comprises two input floating-point data with their exponent values and one output floating-point data with its exponent value, the input and output floating-point data being expressed in the EF16 data format,
the numerical value of EF16 data being given by:
(-1)^signbit × 2^(-exponent) × fraction
where signbit is the sign value, exponent is the exponent value, and fraction is the fraction value;
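A minimal sketch of this value formula in Python (the identifier names are illustrative, not from the claims):

```python
def ef16_value(signbit, exponent, fraction):
    # Numeric value of an EF16 triple per the formula
    # (-1)^signbit * 2^(-exponent) * fraction.
    return (-1) ** signbit * fraction * 2.0 ** (-exponent)
```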
in the case of the arithmetic unit,
one input floating point data is expressed as two parts of S1+F1 and E1;
the other input floating point data is expressed as two parts of S2+F2 and E2;
the output floating point data is expressed as S3+F3 and E3;
wherein S1, S2 and S3 are symbol values; e1, E2 and E3 are index values; f1, F2, F3 are fractional values;
the arithmetic unit comprises a multiplier for performing the multiplication operation;
the multiplication operation of the multiplier is expressed as:
when E3 is not specified: S3 = S1 + S2; F3 = F1 × F2; E3 = E1 + E2; or
when E3 is specified: S3 = S1 + S2; F3 = (F1 × F2) >> E, where E = E3 − E1 − E2 and >> denotes a right-shift operation;
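The two multiplication cases can be sketched as follows. This is an illustrative model: the sign rule S3 = S1 + S2 is read as a modulo-2 sum (XOR) of the sign bits, and a non-negative shift amount is assumed when E3 is specified:

```python
def ef16_mul(s1, f1, e1, s2, f2, e2, e3=None):
    s3 = s1 ^ s2                       # S3 = S1 + S2 (mod 2), i.e. XOR of signs
    if e3 is None:
        return s3, f1 * f2, e1 + e2    # F3 = F1*F2, E3 = E1+E2
    shift = e3 - e1 - e2               # E = E3 - E1 - E2
    return s3, (f1 * f2) >> shift, e3  # F3 = (F1*F2) >> E
```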
or the arithmetic unit comprises an adder, the precondition for the adder to perform the addition operation being E1 = E2;
the addition operation of the adder is expressed as:
(S3+F3)=(S1+F1)+(S2+F2);
wherein the EF16 data has a data bit width of 21 bits, comprising a 1-bit sign field, a 5-bit exponent field and a 15-bit fraction field.
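The 1+5+15-bit layout can be sketched as pack/unpack helpers. The field order (sign, exponent, fraction from most to least significant) is an assumption, since the claim only fixes the field widths:

```python
SIGN_BITS, EXP_BITS, FRAC_BITS = 1, 5, 15  # 21 bits in total

def ef16_pack(signbit, exponent, fraction):
    # Pack an EF16 triple into a 21-bit word: [sign | exponent | fraction].
    return (signbit << (EXP_BITS + FRAC_BITS)) | (exponent << FRAC_BITS) | fraction

def ef16_unpack(word):
    # Recover the three fields from the 21-bit word.
    fraction = word & ((1 << FRAC_BITS) - 1)
    exponent = (word >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    signbit = word >> (EXP_BITS + FRAC_BITS)
    return signbit, exponent, fraction
```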
2. An arithmetic unit for tensor data operations, implemented in hardware or in a combination of hardware and software, forming a basic arithmetic unit of a neural network computing module comprising a multiplier, an adder and a multiply-accumulator, characterized in that it comprises two input tensor data of size (h, w, c) with their shared E values and one output tensor data of size (h, w, c) with its shared E value, wherein each number of the tensor data is represented as S+F in the EF16 data format, S being the sign value and F the fraction value of the EF16 data; the shared E value of the tensor data is the exponent value of the EF16 data; according to the number of shared E values, the tensor data are classified as E-value-sharing tensor data and split-channel E-value-sharing tensor data;
the numerical value of EF16 data being given by:
(-1)^signbit × 2^(-exponent) × fraction
where signbit is the sign value, exponent is the exponent value, and fraction is the fraction value;
the E-value-sharing tensor data means: all numbers in tensor data of size (h, w, c) share a single shared E value;
the split-channel E-value-sharing tensor data means: tensor data of size (h, w, c) has c channels, each channel has a shared E value, and each shared E value is shared only among the h × w data within that channel;
the arithmetic unit is a multiplier for performing tensor data multiplication, whose input and output are E-value-sharing tensor data, the multiplication operation of the multiplier being expressed as:
the two input tensor data, which have the same size, are multiplied element-wise to obtain process tensor data of the same size;
the shared E values of the two input tensor data are added to obtain the shared E value of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E value are assigned to the output tensor data and its shared E value; when the shared E value of the output tensor data is specified, each number of the process tensor data is shifted right by the difference between the specified shared E value and the shared E value of the process tensor data to generate the output tensor data;
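A sketch of this multiplier, with the (h, w, c) tensors flattened to plain lists of signed integer fractions. The sign is folded into F for brevity, which departs from the separate S+F representation in the claim, and a non-negative shift amount is assumed:

```python
def tensor_mul_shared_e(f1, e1, f2, e2, e_out=None):
    # Multiplier for E-value-sharing tensor data.
    proc = [a * b for a, b in zip(f1, f2)]   # element-wise product -> process tensor
    proc_e = e1 + e2                         # shared E values add
    if e_out is None:
        return proc, proc_e                  # output = process tensor as-is
    # Specified output E: shift each number right by the E difference.
    return [v >> (e_out - proc_e) for v in proc], e_out
```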
or the arithmetic unit is a multiplier for performing tensor data multiplication, whose input and output are split-channel E-value-sharing tensor data, the multiplication operation of the multiplier being expressed as:
the two input tensor data, which have the same size and c channels, are multiplied element-wise to obtain process tensor data of the same size;
for each of the c channels, the shared E values of the corresponding channels of the two input tensor data are added to obtain the c per-channel shared E values of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E values are assigned to the output tensor data and its shared E values; when the shared E value of the output tensor data is specified, each number of the process tensor data is shifted right by the difference between the corresponding specified shared E value and the shared E value of the process tensor data to generate the output tensor data;
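The split-channel variant differs only in that the E values are per-channel lists. A sketch under the same simplifications, with each tensor represented as a list of channels and each channel a flat list of h×w fractions:

```python
def tensor_mul_split_channel_e(ch1, e1, ch2, e2, e_out=None):
    # Multiplier for split-channel E-value-sharing tensor data:
    # e1, e2, e_out are lists with one shared E per channel.
    proc = [[a * b for a, b in zip(c1, c2)] for c1, c2 in zip(ch1, ch2)]
    proc_e = [x + y for x, y in zip(e1, e2)]   # per-channel E values add
    if e_out is None:
        return proc, proc_e
    out = [[v >> (eo - ep) for v in ch]        # per-channel right shift
           for ch, ep, eo in zip(proc, proc_e, e_out)]
    return out, list(e_out)
```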
or the arithmetic unit is an adder for performing tensor data addition, whose input and output are E-value-sharing tensor data, the addition operation of the adder being expressed as:
the precondition is satisfied that the shared E values of the two input tensor data of the same size are identical;
the two input tensor data, which have the same size, are added element-wise to obtain process tensor data of the same size;
the shared E value of the input tensor data is taken as the shared E value of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E value are assigned to the output tensor data and its shared E value; when the shared E value of the output tensor data is specified, each number of the process tensor data is shifted right by the difference between the specified shared E value and the shared E value of the process tensor data to generate the output tensor data;
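A sketch of this adder under the same flattened representation; the E1 = E2 precondition is checked explicitly:

```python
def tensor_add_shared_e(f1, e1, f2, e2, e_out=None):
    # Adder for E-value-sharing tensor data; precondition: E1 == E2.
    assert e1 == e2, "operands must already share the same E value"
    proc = [a + b for a, b in zip(f1, f2)]   # element-wise sum -> process tensor
    if e_out is None:
        return proc, e1                      # shared E of the inputs carries over
    return [v >> (e_out - e1) for v in proc], e_out
```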
or the arithmetic unit is an adder for performing tensor data addition, whose input and output are split-channel E-value-sharing tensor data, the addition operation of the adder being expressed as:
the precondition is satisfied that, for each channel, the shared E values of the two input tensor data of the same size are identical;
the two input tensor data, which have the same size, are added element-wise to obtain process tensor data of the same size;
the per-channel shared E values of the input tensor data are taken as the per-channel shared E values of the process tensor data;
when the shared E value of the output tensor data is not specified, the process tensor data and its shared E values are assigned to the output tensor data and its shared E values; when the shared E value of the output tensor data is specified, each number of the process tensor data is shifted right by the difference between the corresponding specified shared E value and the shared E value of the process tensor data to generate the output tensor data;
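The split-channel adder is analogous, with one shared E value (and one shift) per channel. A sketch, again with each tensor as a list of channels:

```python
def tensor_add_split_channel_e(ch1, ch2, e_ch, e_out=None):
    # Per-channel adder; precondition: both inputs already use the same
    # per-channel shared E values e_ch.
    proc = [[a + b for a, b in zip(c1, c2)] for c1, c2 in zip(ch1, ch2)]
    if e_out is None:
        return proc, list(e_ch)
    out = [[v >> (eo - ep) for v in ch]   # per-channel right shift
           for ch, ep, eo in zip(proc, e_ch, e_out)]
    return out, list(e_out)
```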
or the arithmetic unit is a multiply-accumulator for performing a tensor data multiply-accumulate operation, whose input and output are E-value-sharing tensor data, the multiply-accumulate operation being expressed as:
the two input tensor data, which have the same size, are multiplied element-wise to obtain first process tensor data of the same size;
every number of the first process tensor data is accumulated to form second process tensor data whose size is 1 in all dimensions;
the shared E values of the two input tensor data are added to obtain the shared E value of the second process tensor data;
when the shared E value of the output tensor data is not specified, the second process tensor data and its shared E value are assigned to the output tensor data and its shared E value; when the shared E value of the output tensor data is specified, each number of the second process tensor data is shifted right by the difference between the specified shared E value and the shared E value of the second process tensor data to generate the output tensor data;
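A sketch of this multiply-accumulator: the element-wise products are summed into a single fraction, and the shared E values add:

```python
def tensor_mac_shared_e(f1, e1, f2, e2, e_out=None):
    # Multiply-accumulator for E-value-sharing tensor data.
    acc = sum(a * b for a, b in zip(f1, f2))  # accumulate the products
    acc_e = e1 + e2                           # shared E values add
    if e_out is None:
        return acc, acc_e
    return acc >> (e_out - acc_e), e_out      # right-shift to the specified E
```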
or the arithmetic unit is a multiply-accumulator for performing a tensor data multiply-accumulate operation, whose input and output are split-channel E-value-sharing tensor data, the multiply-accumulate operation being expressed as:
the two input tensor data, which have the same size and c channels, are multiplied element-wise to obtain first process tensor data of the same size;
every number of the first process tensor data is accumulated within each channel to form second process tensor data whose channel dimension is c and whose other dimensions are 1;
for each of the c channels, the shared E values of the corresponding channels of the two input tensor data are added to obtain the c per-channel shared E values of the second process tensor data;
when the shared E value of the output tensor data is not specified, the second process tensor data and its shared E values are assigned to the output tensor data and its shared E values; when the shared E value of the output tensor data is specified, each number of the second process tensor data is shifted right by the difference between the corresponding specified shared E value and the shared E value of the second process tensor data to generate the output tensor data;
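The split-channel multiply-accumulator accumulates within each channel, producing one fraction and one shared E value per channel. A sketch:

```python
def tensor_mac_split_channel_e(ch1, e1, ch2, e2, e_out=None):
    # Per-channel multiply-accumulator: one accumulated number per channel.
    acc = [sum(a * b for a, b in zip(c1, c2)) for c1, c2 in zip(ch1, ch2)]
    acc_e = [x + y for x, y in zip(e1, e2)]   # per-channel E values add
    if e_out is None:
        return acc, acc_e
    return [v >> (eo - ep) for v, ep, eo in zip(acc, acc_e, e_out)], list(e_out)
```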
wherein the shared E value is transmitted as a parameter of the tensor data.
CN202011427161.0A 2020-12-09 2020-12-09 Arithmetic unit for floating point data and tensor data operation Active CN112416295B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011427161.0A CN112416295B (en) 2020-12-09 2020-12-09 Arithmetic unit for floating point data and tensor data operation

Publications (2)

Publication Number Publication Date
CN112416295A CN112416295A (en) 2021-02-26
CN112416295B true CN112416295B (en) 2024-02-02

Family

ID=74775456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011427161.0A Active CN112416295B (en) 2020-12-09 2020-12-09 Arithmetic unit for floating point data and tensor data operation

Country Status (1)

Country Link
CN (1) CN112416295B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107526709A * 2016-06-15 2017-12-29 NVIDIA Corporation Tensor processing using low-precision formats
CN109445440A * 2018-12-13 2019-03-08 Chongqing University of Posts and Telecommunications Dynamic obstacle avoidance method based on sensor fusion and an improved Q-learning algorithm
CN111813371A * 2020-07-28 2020-10-23 Shanghai StarFive Technology Co., Ltd. Floating-point division method, system and readable medium for digital signal processing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10853067B2 (en) * 2018-09-27 2020-12-01 Intel Corporation Computer processor for higher precision computations using a mixed-precision decomposition of operations



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant