CN117270813A - Arithmetic unit, processor, and electronic apparatus - Google Patents

Arithmetic unit, processor, and electronic apparatus

Publication number
CN117270813A
Authority
CN
China
Prior art keywords
point number
decimal
result
multiplication
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311080931.2A
Other languages
Chinese (zh)
Inventor
王伟达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei Xingji Meizu Technology Co ltd
Original Assignee
Hubei Xingji Meizu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei Xingji Meizu Technology Co ltd
Priority to CN202311080931.2A
Publication of CN117270813A
Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F7/487 Multiplying; Dividing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses an arithmetic unit, a processor, and an electronic device, and relates to the field of computer technology. The arithmetic unit includes: a parameter input unit for acquiring a first floating point number, a second floating point number, and a preset scaling value; a fixed-point conversion unit for performing fixed-point conversion on the first floating point number and the second floating point number based on the preset scaling value, to obtain a first fixed point number corresponding to the first floating point number and a second fixed point number corresponding to the second floating point number; a multiplication unit for performing integer multiplication on the integer part and the decimal part corresponding to the first fixed point number and the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication, to obtain an integer operation result and a decimal operation result; and a result output unit for merging the integer operation result and the decimal operation result into a target fixed point number, and performing floating point conversion on the target fixed point number based on the preset scaling value, to obtain the multiplication result corresponding to the first floating point number and the second floating point number.

Description

Arithmetic unit, processor, and electronic apparatus
Technical Field
The present application relates to the field of computer technology, and in particular, to an arithmetic unit, a processor, and an electronic device.
Background
For reasons of low power consumption, low cost, and simple chip design, the arithmetic units in many processors do not support floating point operations. In such cases, floating point numbers must be converted into fixed point numbers and then processed with integer arithmetic, which improves the operation speed, gives direct control over the data range and precision of the operation, and reduces the power consumption of the processor.
Disclosure of Invention
In a first aspect, the present application provides an arithmetic unit, including a parameter input unit, a fixed-point conversion unit, a multiplication unit, and a result output unit, which are sequentially connected;
the parameter input unit is used for acquiring a first floating point number, a second floating point number, and a preset scaling value;
the fixed-point conversion unit is used for performing fixed-point conversion on the first floating point number and the second floating point number based on the preset scaling value, to obtain a first fixed point number corresponding to the first floating point number and a second fixed point number corresponding to the second floating point number;
the multiplication unit is used for performing integer multiplication on the integer part and the decimal part corresponding to the first fixed point number and the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication, to obtain an integer operation result and a decimal operation result;
the result output unit is used for merging the integer operation result and the decimal operation result into a target fixed point number, and performing floating point conversion on the target fixed point number based on the preset scaling value, to obtain the multiplication result corresponding to the first floating point number and the second floating point number.
In some embodiments, the multiplication unit comprises:
a first operation subunit, used for cross-multiplying the integer part and the decimal part corresponding to the first fixed point number with the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication, to obtain an integer multiplication operation result, a mixed multiplication operation result, and a decimal multiplication operation result;
a second operation subunit, configured to determine the integer operation result based on the integer multiplication operation result and the integer part of the mixed multiplication operation result;
and a third operation subunit, configured to determine the decimal operation result based on the decimal multiplication operation result and the decimal part of the mixed multiplication operation result.
In some embodiments, when calculating the decimal multiplication operation result, the first operation subunit is configured to:
determine that the preset scaling value is greater than one half of the total bit width of the arithmetic unit;
shift the decimal part corresponding to the first fixed point number and the decimal part corresponding to the second fixed point number to the right by a first preset number of bits, the first preset number of bits being the difference between the preset scaling value and one half of the total bit width of the arithmetic unit;
and perform integer multiplication on the right shift result of the decimal part corresponding to the first fixed point number and the right shift result of the decimal part corresponding to the second fixed point number, to obtain the decimal multiplication operation result.
In some embodiments, the second operation subunit is configured to:
shift the mixed multiplication operation result to the right by a second preset number of bits, the second preset number of bits being the preset scaling value;
and align the right shift result of the mixed multiplication operation result with the integer multiplication operation result from right to left, and add the aligned values to obtain the integer operation result.
In some embodiments, the third operation subunit is configured to:
determine that the preset scaling value is greater than one half of the total bit width of the arithmetic unit;
shift the decimal multiplication operation result to the right by a third preset number of bits, the third preset number of bits being the difference between the total bit width of the arithmetic unit and the preset scaling value;
and add the right shift result of the decimal multiplication operation result to the decimal part of the mixed multiplication operation result, to obtain the decimal operation result.
In some embodiments, the third operation subunit is configured to:
determine that the preset scaling value is less than or equal to one half of the total bit width of the arithmetic unit;
shift the decimal multiplication operation result to the right by a fourth preset number of bits, the fourth preset number of bits being the preset scaling value;
and add the right shift result of the decimal multiplication operation result to the decimal part of the mixed multiplication operation result, to obtain the decimal operation result.
In some embodiments, the arithmetic unit further comprises an overflow control unit coupled to the multiplication unit and the result output unit;
the overflow control unit is used for:
shifting the decimal operation result to the right by a fifth preset number of bits, the fifth preset number of bits being the preset scaling value;
determining that the right shift result of the decimal operation result is greater than zero;
determining that the decimal operation result has overflowed;
and adding one to the integer operation result.
In some embodiments, the result output unit is configured to:
shift the integer operation result to the left by a sixth preset number of bits to obtain the integer part of the target fixed point number, the sixth preset number of bits being the preset scaling value;
perform a bitwise AND operation on the decimal operation result and a preset mask to obtain the decimal part of the target fixed point number, the number of bits of the preset mask being the preset scaling value;
and perform a bitwise OR operation on the integer part of the target fixed point number and the decimal part of the target fixed point number to obtain the target fixed point number.
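The overflow control and result output steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the circuit of the application: the helper name `assemble_fixed` is hypothetical, operands are assumed to be non-negative, and Python integers stand in for hardware registers.

```python
def assemble_fixed(product_hi: int, product_lo: int, q: int) -> int:
    """Merge an integer result and a decimal result into one fixed point word.

    Hypothetical helper; q is the preset scaling value (bits of decimal part).
    """
    mask = (1 << q) - 1            # preset mask: q low-order ones
    if (product_lo >> q) > 0:      # decimal result spilled past q bits
        product_hi += 1            # carry one into the integer result
    integer_part = product_hi << q     # move integer result above the decimal field
    decimal_part = product_lo & mask   # keep only the q decimal bits
    return integer_part | decimal_part


# producthi and productlo from the worked example in the detailed description (Q = 15).
x_fixed = assemble_fixed(0b1111, 0b011_1111_0111_0100, 15)
print(x_fixed, x_fixed / (1 << 15))
```

The carry branch relies on the decimal result being the sum of two values that are each below 2^Q, so at most one carry can occur.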
In a second aspect, the present application provides a processor comprising at least one of said operators.
In a third aspect, the present application provides an electronic device comprising at least one of the processors.
Drawings
In order to more clearly illustrate the technical solutions of the present application or the prior art, the following description will briefly introduce the drawings used in the embodiments or the description of the prior art, and it is obvious that, in the following description, the drawings are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an arithmetic unit according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a fixed-point multiplication provided in one embodiment of the present application;
FIG. 3 is a schematic diagram of bit alignment provided by one embodiment of the present application;
FIG. 4 is a schematic diagram of a multiplication operation provided in one embodiment of the present application;
FIG. 5 is a schematic diagram of a processor according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the hardware structure of a terminal for implementing an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "first," "second," and the like in this application are used for distinguishing between similar objects and not for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the application are capable of operation in sequences other than those illustrated or otherwise described herein, and that the terms "first" and "second" are generally intended to be used in a generic sense and not to limit the number of objects, for example, the first object may be one or more. In addition, "and/or" in this application means at least one of the connected objects, and the character "/" generally means a relationship in which the associated objects are one kind of "or".
Some electronic devices (e.g., smart glasses) cannot be equipped with a large battery because of device size constraints, so the power consumption of the processor must be kept low to extend battery life. One approach is to convert floating point operations into fixed-point operations within the processor, because fixed-point integer operations are faster or more energy efficient than floating point operations.
Fixed-point conversion multiplies the data by a scaling factor equal to 1 shifted to the left by a given number of bits, performs the required operations (such as addition, subtraction, multiplication, and division) on the scaled values, and then divides the result by the original scaling factor to restore the original order of magnitude. The formulas for converting between a floating point number and a fixed point number are:
fixed point number = round(floating point number × (1 << left shift number))
floating point number = fixed point number ÷ (1 << left shift number)
The left shift number above determines the balance between data range and precision, and << is the left shift operator. The left shift number is the number of bits used to store the decimal part and is commonly referred to in the industry as the scaling value, denoted by Q. When the scaling value is large, the precision is high but the data range is small and the data overflows easily; when the scaling value is small, the data range is large but the precision is low and errors accumulate easily. Therefore, in fixed-point processing, an appropriate scaling value must be selected that satisfies the precision requirement of the operation while preventing overflow of the operation result. Only then can precision and data range be balanced and the accuracy and performance of the computing system be ensured.
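The two conversion formulas can be tried out directly. The sketch below is a minimal illustration; the function names `to_fixed` and `to_float` are hypothetical, and Python's built-in `round` (which rounds ties to even) is assumed as the rounding function, since the text does not specify tie handling.

```python
def to_fixed(value: float, q: int) -> int:
    """Floating point -> fixed point: scale by (1 << q), then round."""
    return round(value * (1 << q))


def to_float(fixed: int, q: int) -> float:
    """Fixed point -> floating point: divide by the same scaling factor."""
    return fixed / (1 << q)


# A larger Q keeps more decimal bits, so the round-trip error shrinks,
# at the cost of leaving fewer bits for the integer part.
for q in (8, 15, 24):
    fixed = to_fixed(5.567, q)
    print(q, fixed, to_float(fixed, q))
```

With round-to-nearest, the round-trip error is at most half of the smallest representable step, 2^-(Q+1).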
The existing fixed-point multiplication method is very simple: (fixed point number × fixed point number) >> right shift number, where >> is the right shift operator. This implementation has the following drawback: the number of bits used for storing the decimal part cannot be set freely; in other words, the scaling value cannot be chosen freely. As a result, the multiplication cannot balance the requirements of data range and data precision, and the decimal part overflows easily.
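One way to see the overflow risk of the simple method is to emulate a register of limited width. The sketch below is an illustration only: the 32-bit width, the unsigned operands, and the name `naive_fixed_mul` are assumptions, and the operand values are the Q = 15 fixed point numbers used in the worked example later in the description.

```python
WIDTH = 32  # assumed register width of the arithmetic unit


def naive_fixed_mul(m_fixed: int, n_fixed: int, q: int, width: int = WIDTH) -> int:
    """(M * N) >> Q, with the intermediate product truncated to `width` bits."""
    product = (m_fixed * n_fixed) & ((1 << width) - 1)  # emulate register wrap-around
    return product >> q


m_fixed, n_fixed, q = 182419, 91210, 15      # 5.567 and 2.7835 at Q = 15
reference = (m_fixed * n_fixed) >> q         # unlimited-precision reference
truncated = naive_fixed_mul(m_fixed, n_fixed, q)
print(reference / (1 << q), truncated / (1 << q))
```

The full product of two 32-bit fixed point numbers can need up to 64 bits, so the truncated result differs from the reference whenever high-order bits are lost.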
The application provides an arithmetic unit, a processor, and an electronic device. The two factors to be multiplied are each split into an integer part and a decimal part according to a specified scaling value, and the parts are multiplied according to the distributive law of multiplication. The binary values of the partial products are aligned by shifting, the integer part and the decimal part of the final result are calculated separately, and the two parts are finally spliced together. The shift amounts depend on whether the scaling value is greater than half of the total bit width of the processor. The fixed-point multiplication executed by the arithmetic unit can equivalently replace floating point multiplication and runs faster than it. If the arithmetic unit is 32 bits wide, the method can be implemented with 32-bit operations while achieving precision comparable to 64 bits. In addition, the scaling value can be set flexibly, so that the data range and the data precision can be balanced.
Fig. 1 is a schematic structural diagram of an arithmetic unit according to an embodiment of the present application, and as shown in fig. 1, the arithmetic unit 100 may include a parameter input unit 110, a fixed-point conversion unit 120, a multiplication unit 130, and a result output unit 140, which are sequentially connected.
The parameter input unit 110 is configured to obtain a first floating point number, a second floating point number, and a preset scaling value;
the fixed-point conversion unit 120 is configured to perform fixed-point conversion on the first floating point number and the second floating point number based on the preset scaling value, so as to obtain a first fixed point number corresponding to the first floating point number and a second fixed point number corresponding to the second floating point number;
the multiplication unit 130 is configured to perform integer multiplication on the integer part and the decimal part corresponding to the first fixed point number and the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication, to obtain an integer operation result and a decimal operation result;
the result output unit 140 is configured to combine the integer operation result and the decimal operation result into a target fixed point number, and perform floating point conversion on the target fixed point number based on the preset scaling value, so as to obtain the multiplication result corresponding to the first floating point number and the second floating point number.
Specifically, an arithmetic unit is the component of a computer that performs arithmetic and logical operations. Its basic operations include the four arithmetic operations of addition, subtraction, multiplication, and division; logical operations such as OR, NOR, and exclusive OR; and operations such as shift, compare, and transfer. It is also known as an arithmetic logic unit (ALU).
Integers are represented in a computer by integer types (int). Decimals are represented by floating point numbers or fixed point numbers. A floating point number is a number whose decimal point position is not fixed; it includes both a decimal part and an integer part, and examples include the single-precision floating point type (float) and the double-precision floating point type (double). A fixed point number is a number whose decimal point position is fixed, and that position is determined by setting a scaling value. A floating point calculation must process both the exponent part and the mantissa part, and its result must be normalized, so it involves more steps than a fixed point calculation and is slower.
The arithmetic unit in the embodiment of the application is used for executing fixed-point multiplication of floating point numbers. Structurally, the arithmetic unit includes a parameter input unit, a fixed-point conversion unit, a multiplication unit, and a result output unit.
The parameter input unit is used for acquiring the first floating point number, the second floating point number, and the preset scaling value. The preset scaling value is a configured scaling value, denoted by Q, and is the bit width of the decimal part. The bit width is the data width, i.e., the number of bits that the binary data occupies. The preset scaling value can be determined according to the calculation requirement and the total bit width of the arithmetic unit. For example, if the total bit width of the arithmetic unit is 32, the preset scaling value may be selected between 1 and 31. The first floating point number and the second floating point number are the floating point numbers to be multiplied. The first floating point number may be represented by m and the second floating point number by n.
The fixed-point conversion unit is used for performing fixed-point conversion on the first floating point number m and the second floating point number n according to the preset scaling value, to obtain a first fixed point number M corresponding to the first floating point number and a second fixed point number N corresponding to the second floating point number. Fixed-point conversion multiplies the floating point number by 2 to the power Q. The formula can be expressed as:
fixed point number = round(floating point number × 2^Q)
where round represents the rounding function.
Taking m = 5.567, n = 2.7835, and Q = 15 as an example, M = 182419 and N = 91210 can be calculated. Representing each fixed point number as a 32-bit binary number gives:
M=0000 0000 0000 0010 1100 1000 1001 0011
N=0000 0000 0000 0001 0110 0100 0100 1010
From Q = 15, it can be determined that the rightmost 15 bits of each binary number represent the decimal part. The preset mask MASK(Q) for Q = 15 is:
0000 0000 0000 0000 0111 1111 1111 1111
In the preset mask, every bit of the integer part is set to 0 and every bit of the decimal part is set to 1.
The binary numbers of M and N are respectively split into an integer part and a decimal part, and are recorded as follows:
M=A.B;N=C.D
wherein A is an integer portion of M and B is a fractional portion of M; c is an integer portion of N and D is a fractional portion of N.
The integer parts A and C are obtained by shifting the binary numbers of M and N to the right by Q bits, respectively. The results are shown below:
A=0000 0000 0000 0010 1
C=0000 0000 0000 0001 0
The decimal parts B and D are obtained by a bitwise AND of the binary numbers of M and N, respectively, with the preset mask. The results are shown below:
B=100 1000 1001 0011
D=110 0100 0100 1010
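The split into integer and decimal parts can be reproduced with a shift and a mask. The following is a minimal sketch; the name `split_fixed` is hypothetical and the operands are assumed to be non-negative.

```python
def split_fixed(value: int, q: int) -> tuple[int, int]:
    """Split a fixed point number into (integer part, decimal part)."""
    mask = (1 << q) - 1              # MASK(Q): q low-order ones
    return value >> q, value & mask  # right shift by Q, bitwise AND with mask


Q = 15
A, B = split_fixed(182419, Q)  # M from the example above
C, D = split_fixed(91210, Q)   # N from the example above
print(bin(A), bin(B))
print(bin(C), bin(D))
```

Shifting and masking never lose information here: `(A << Q) | B` gives back M, and likewise for N.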
The multiplication unit is used for performing integer multiplication on the integer part A and the decimal part B corresponding to the first fixed point number M and the integer part C and the decimal part D corresponding to the second fixed point number N according to the distributive law of multiplication, to obtain an integer operation result and a decimal operation result. The integer operation result may be represented by producthi and the decimal operation result by productlo.
For example, it is possible to obtain:
producthi=1111
productlo=011 1111 0111 0100
The multiplication result corresponding to the first floating point number and the second floating point number is itself a floating point number, and the fixed point number corresponding to that floating point number is the target fixed point number.
The result output unit is used for merging the integer operation result producthi and the decimal operation result productlo into a target fixed point number X.
For example, it is possible to obtain:
X=1111 011 1111 0111 0100
Floating point conversion is then performed on the target fixed point number X according to the preset scaling value Q, to obtain the multiplication result x corresponding to the first floating point number m and the second floating point number n.
This can be expressed as:
floating point number = fixed point number / 2^Q
For example, x = 15.495728 can be obtained. This is the result of the fixed-point multiplication performed by the arithmetic unit. The exact product is 5.567 × 2.7835 = 15.4957445, so the error is only 0.0000165.
In the arithmetic unit provided by the embodiment of the application, the parameter input unit acquires the first floating point number, the second floating point number, and the preset scaling value; the fixed-point conversion unit performs fixed-point conversion on the first floating point number and the second floating point number based on the preset scaling value to obtain the first fixed point number and the second fixed point number; the multiplication unit performs integer multiplication on the integer part and the decimal part corresponding to the first fixed point number and the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication to obtain the integer operation result and the decimal operation result; and the result output unit merges the integer operation result and the decimal operation result into the target fixed point number and performs floating point conversion on it to obtain the multiplication result corresponding to the first floating point number and the second floating point number. Because integer multiplication replaces floating point multiplication, the multiplication runs faster, and because integer multiplication consumes less electric energy than floating point multiplication, the power consumption of the arithmetic unit is reduced. In addition, the preset scaling value can be set as required, so that the precision requirement can be met while overflow of the operation result is prevented, which improves the accuracy and performance of the arithmetic unit.
In some embodiments, the multiplication unit comprises:
a first operation subunit, used for cross-multiplying the integer part and the decimal part corresponding to the first fixed point number with the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication, to obtain an integer multiplication operation result, a mixed multiplication operation result, and a decimal multiplication operation result;
a second operation subunit, used for determining the integer operation result based on the integer multiplication operation result and the integer part of the mixed multiplication operation result;
and a third operation subunit, used for determining the decimal operation result based on the decimal multiplication operation result and the decimal part of the mixed multiplication operation result.
Specifically, functionally, the multiplication operation unit may be divided into a first operation subunit, a second operation subunit, and a third operation subunit.
The first operation subunit is mainly configured to cross-multiply the integer part A and the decimal part B corresponding to the first fixed point number M with the integer part C and the decimal part D corresponding to the second fixed point number N according to the distributive law of multiplication, to obtain an integer multiplication operation result A×C, a mixed multiplication operation result A×D+B×C, and a decimal multiplication operation result B×D.
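The three partial results follow from expanding the fixed-point product with the distributive law. The expansion below is a sketch added for clarity, assuming non-negative operands and exact (untruncated) arithmetic; it is not quoted from the application.

```latex
\begin{aligned}
M &= A \cdot 2^{Q} + B, \qquad N = C \cdot 2^{Q} + D, \\
\frac{M \cdot N}{2^{Q}}
  &= (A \times C) \cdot 2^{Q} \;+\; (A \times D + B \times C) \;+\; \frac{B \times D}{2^{Q}}.
\end{aligned}
```

Read against the Q-bit decimal field, A×C contributes only to the integer part, the mixed term A×D+B×C straddles both parts, and B×D contributes to the decimal part once it is shifted right by Q bits.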
The second operation subunit determines the integer operation result producthi from the integer multiplication operation result A×C and the integer part of the mixed multiplication operation result A×D+B×C.
The third operation subunit determines the decimal operation result productlo from the decimal multiplication operation result B×D and the decimal part of the mixed multiplication operation result A×D+B×C.
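The work of the three subunits can be sketched as follows for the case where the preset scaling value is at most half of the total bit width, using the shift rules given for the second and third operation subunits in the disclosure above. This is an illustration under assumptions, not the application's circuit: the name `cross_multiply` is hypothetical, operands are non-negative, no intermediate value is assumed to overflow, and the overflow control step (carry from the decimal result) is left out.

```python
def cross_multiply(m_fixed: int, n_fixed: int, q: int) -> tuple[int, int]:
    """Return (producthi, productlo) for two fixed point numbers, q <= width / 2."""
    mask = (1 << q) - 1
    a, b = m_fixed >> q, m_fixed & mask   # integer and decimal part of M
    c, d = n_fixed >> q, n_fixed & mask   # integer and decimal part of N

    # First operation subunit: cross multiplication.
    integer_product = a * c
    mixed_product = a * d + b * c
    decimal_product = b * d

    # Second operation subunit: integer product plus integer part of the mixed term.
    product_hi = integer_product + (mixed_product >> q)
    # Third operation subunit: decimal part of the mixed term plus shifted decimal product.
    product_lo = (mixed_product & mask) + (decimal_product >> q)
    return product_hi, product_lo


hi, lo = cross_multiply(182419, 91210, 15)  # M and N from the worked example
print(bin(hi), bin(lo))
```

For the example values this reproduces the producthi and productlo shown earlier in the description.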
According to the arithmetic unit provided by the embodiment of the application, the multiplication operation unit divides fixed point number multiplication into the steps of cross multiplication, integer fusion, decimal fusion and the like, and the steps are executed by different operation sub-units respectively, so that the integer multiplication operation replaces floating point number multiplication operation, and the operation speed of floating point number multiplication is improved.
In some embodiments, when calculating the decimal multiplication operation result, the first operation subunit is configured to:
determine that the preset scaling value is greater than one half of the total bit width of the arithmetic unit;
shift the decimal part corresponding to the first fixed point number and the decimal part corresponding to the second fixed point number to the right by a first preset number of bits, the first preset number of bits being the difference between the preset scaling value and one half of the total bit width of the arithmetic unit;
and perform integer multiplication on the right shift result of the decimal part corresponding to the first fixed point number and the right shift result of the decimal part corresponding to the second fixed point number, to obtain the decimal multiplication operation result.
Specifically, the preset scaling value is used to determine the bit width representing the fractional part in the operator. Since the total bit width of the operator is determined, the bit width representing the fractional part can also be determined. For example, if the total bit width of the operator is 32 and the predetermined index value is 15, 15 bits of the bit width representing the operator are used to represent the fractional part, and 17 bits are used to represent the integer part.
From this, it can be seen that: the preset standard value is set to be larger, the bit width used for the decimal part is larger (the number of operation bits is larger), the range of the calculated value is smaller, and the precision is higher; the preset scale value is set smaller, the bit width for the integer part is larger (the number of operation bits is larger), the range of the calculated value is larger, and the precision is lower.
If the preset scaling value is large, the fractional part occupies many bits. When two fractional parts are multiplied, the result can easily overflow the high-order bits. The fractional parts must then be shifted right to reduce the bit width they occupy and avoid high-order overflow.
One half of the total bit width of the arithmetic unit can serve as the threshold.
If the preset scaling value is greater than one half of the total bit width of the arithmetic unit, the fractional multiplication result is at risk of overflow. For example, if the total bit width is 32, one half of it is 16. If the preset scaling value Q is 20, the fractional multiplication result needs up to 40 binary bits, which exceeds the total bit width by 8 bits, so up to 8 high-order bits overflow.
The fractional part of the first fixed-point number and the fractional part of the second fixed-point number are therefore each shifted right by the first preset number of bits, which is the difference between the preset scaling value Q and one half of the total bit width of the arithmetic unit. In this example the first preset number of bits is Q-16 = 4. After the shift, each fractional part is 16 bits wide, so the fractional multiplication result needs at most 32 binary bits, exactly the total bit width, and no high-order overflow occurs.
If the preset scaling value is less than or equal to one half of the total bit width of the arithmetic unit, the two fractional parts can be multiplied directly. The fractional multiplication result needs at most 32 binary bits and cannot overflow, so no right shift is required.
In the arithmetic unit provided by this embodiment of the application, the fractional parts of the fixed-point numbers are shifted according to the preset scaling value and the total bit width of the arithmetic unit, which prevents high-order overflow in the fractional multiplication result. A modest loss of precision keeps the result within the representable range and improves the accuracy of the arithmetic unit.
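The pre-shift performed by the first operation subunit can be sketched as follows. The function name `fractional_multiply` and the constant `WORD_BITS` are illustrative names, not taken from the application, and Python integers are unbounded, so the 32-bit limit is checked explicitly rather than enforced by hardware.

```python
WORD_BITS = 32  # assumed total bit width, as in the Q = 20 example


def fractional_multiply(b, d, q, word_bits=WORD_BITS):
    """Multiply two q-bit fractional parts so the product fits in word_bits."""
    half = word_bits // 2
    if q > half:
        shift = q - half      # first preset number of bits
        b >>= shift           # drop the lowest bits of each operand
        d >>= shift
    product = b * d
    # With operands of at most `half` bits, the product needs at most word_bits.
    assert product < (1 << word_bits)
    return product


# Worst case for Q = 20: both fractional parts have all 20 bits set.
worst = (1 << 20) - 1
print(hex(fractional_multiply(worst, worst, 20)))
```

Without the pre-shift, the same worst-case operands would produce a 40-bit product.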
In some embodiments, the second operation subunit is configured to:
shift the mixed multiplication result to the right by a second preset number of bits, where the second preset number of bits is the preset scaling value; and
align the right-shifted mixed multiplication result with the integer multiplication result at the least significant bit, and add the two to obtain the integer operation result.
Specifically, the mixed multiplication result is the sum A×D+B×C, where A×D is the product of the integer part A of the first fixed-point number and the fractional part D of the second fixed-point number, and B×C is the product of the fractional part B of the first fixed-point number and the integer part C of the second fixed-point number.
The binary mixed multiplication result contains an integer part and a fractional part. The integer part is obtained by shifting the mixed multiplication result to the right by the second preset number of bits, which is the preset scaling value. For example, let A×D+B×C = 10 1000 0110 1001 1000. With the scaling value Q=15, the low 15 bits, 000 0110 1001 1000, are the fractional part, and shifting right by 15 bits leaves the integer part 101.
The right-shifted mixed multiplication result (A×D+B×C)>>Q and the integer multiplication result A×C are aligned at the least significant bit and added to obtain the integer operation result product_hi, which can be expressed as:
product_hi = A×C + ((A×D+B×C) >> Q)
= 1010 + 101 = 1111
In the arithmetic unit provided by this embodiment of the application, the right-shifted mixed multiplication result and the integer multiplication result are aligned at the least significant bit and added, which fuses the integer parts of the result and improves the accuracy of the arithmetic unit.
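A minimal sketch of the integer fusion step, using the values of the worked example (A=5, B=18579, C=2, D=25674, Q=15). The function name `integer_fusion` is an illustrative choice. The parentheses around the shift matter: in Python, as in C, `+` binds more tightly than `>>`.

```python
def integer_fusion(a, b, c, d, q):
    """Return A*C plus the integer part of the mixed product A*D + B*C."""
    mixed = a * d + b * c          # mixed multiplication result
    return a * c + (mixed >> q)    # shift first, then add at the low end


product_hi = integer_fusion(5, 18579, 2, 25674, 15)
print(bin(product_hi))
```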
In some embodiments, the third operation subunit is configured to:
determine that the preset scaling value is greater than one half of the total bit width of the arithmetic unit;
shift the fractional multiplication result to the right by a third preset number of bits, where the third preset number of bits is the difference between the total bit width of the arithmetic unit and the preset scaling value; and
add the right-shifted fractional multiplication result to the fractional part of the mixed multiplication result to obtain the fractional operation result.
Specifically, when the fractional part of the mixed multiplication result is fused with the fractional multiplication result, the size of the preset scaling value must be taken into account.
If the preset scaling value is greater than one half of the total bit width of the arithmetic unit, the fractional part of the fixed-point number occupies many bits. As described in the above embodiment, the fractional parts are shifted right before the fractional multiplication to avoid high-order overflow. No such shift is applied to the fractional parts for the mixed multiplication. The fractional parts of the two results therefore have different bit widths and cannot be aligned directly during fractional fusion.
The fractional multiplication result must therefore be shifted right by a third preset number of bits, which is the difference between the total bit width of the arithmetic unit and the preset scaling value. For example, suppose the total bit width is 32, so one half of it is 16, and the preset scaling value Q is 20. To avoid high-order overflow in the fractional multiplication result, the fractional parts of the first and second fixed-point numbers are each shifted right by 4 bits (Q-16). Each shifted fractional part is 16 bits wide, so the fractional multiplication result needs at most 32 binary bits, exactly the total bit width, and no high-order overflow occurs. The mixed multiplication involves no shift, however, so 20 bits of the mixed multiplication result represent the fractional part. To fuse the fractional parts, the fractional multiplication result must be shifted right by 12 bits (32-Q).
The right-shifted fractional multiplication result is added to the fractional part of the mixed multiplication result to obtain the fractional operation result product_lo, which can be expressed as:
product_lo = ((B×D) >> (32-Q)) + ((A×D+B×C) & MASK(Q))
Here B×D is the product of the fractional parts after their right shift by Q-16 bits, and the fractional part of the mixed multiplication result is obtained by a bitwise AND of the mixed multiplication result with the preset mask MASK(Q).
In the arithmetic unit provided by this embodiment of the application, when the preset scaling value is greater than one half of the total bit width of the arithmetic unit, the fractional multiplication result is shifted right by the difference between the total bit width and the preset scaling value, which fuses the fractional parts of the result and improves the accuracy of the arithmetic unit.
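The choice of 32-Q for the third preset number of bits can be checked with a short derivation. This is a sketch that assumes a total bit width of 32 and writes B' and D' (symbols introduced here) for the fractional parts after the right shift by Q-16 bits; the approximations ignore the bits lost to truncation.

```latex
\begin{align*}
B' &= \left\lfloor B / 2^{\,Q-16} \right\rfloor, \qquad
D' = \left\lfloor D / 2^{\,Q-16} \right\rfloor \\
B' \times D' &\approx \frac{B \times D}{2^{\,2(Q-16)}} = \frac{B \times D}{2^{\,2Q-32}} \\
\frac{B' \times D'}{2^{\,32-Q}} &\approx \frac{B \times D}{2^{\,2Q-32} \cdot 2^{\,32-Q}} = \frac{B \times D}{2^{\,Q}}
\end{align*}
```

The pre-shifted product, shifted right by a further 32-Q bits, therefore has the same scale as B×D shifted right by Q bits, which is the scale of the Q-bit fractional part of the mixed multiplication result. With Q=20 the pre-shift is 4 bits and the post-shift is 12 bits, as in the example.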
In some embodiments, the third operation subunit is configured to:
determine that the preset scaling value is less than or equal to one half of the total bit width of the arithmetic unit;
shift the fractional multiplication result to the right by a fourth preset number of bits, where the fourth preset number of bits is the preset scaling value; and
add the right-shifted fractional multiplication result to the fractional part of the mixed multiplication result to obtain the fractional operation result.
Specifically, if the preset scaling value is less than or equal to one half of the total bit width of the arithmetic unit, the fractional part of the fixed-point number occupies a moderate number of bits. The fractional multiplication result does not exceed the total bit width of the arithmetic unit, and no high-order overflow occurs.
In this case, the fractional multiplication result can be shifted right directly by the fourth preset number of bits, which is the preset scaling value Q. The right-shifted fractional multiplication result is then added to the fractional part of the mixed multiplication result to obtain the fractional operation result product_lo, which can be expressed as:
product_lo = ((B×D) >> Q) + ((A×D+B×C) & MASK(Q))
The fractional part of the mixed multiplication result is obtained by a bitwise AND of the mixed multiplication result with the preset mask MASK(Q).
The fourth preset number of bits is the preset scaling value because the fractional multiplication result B×D is 2Q bits wide, whereas the fractional part of the mixed multiplication result is Q bits wide; shifting B×D right by Q bits aligns the two.
In the arithmetic unit provided by this embodiment of the application, when the preset scaling value is less than or equal to one half of the total bit width of the arithmetic unit, the fractional multiplication result is shifted right by the preset scaling value, which fuses the fractional parts of the result and improves the accuracy of the arithmetic unit.
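The two fractional fusion cases differ only in the shift applied to the fractional multiplication result, which the following sketch makes explicit. The function name `fractional_fusion` and the constant `WORD_BITS` are illustrative, and `bd` is assumed to be the fractional product as delivered by the first operation subunit (that is, computed from pre-shifted operands when Q exceeds half the word).

```python
WORD_BITS = 32  # assumed total bit width of the arithmetic unit


def fractional_fusion(bd, mixed, q, word_bits=WORD_BITS):
    """Add the aligned fractional product to the fractional part of `mixed`."""
    mask = (1 << q) - 1                       # MASK(Q): low q bits set
    if q > word_bits // 2:
        shift = word_bits - q                 # third preset number of bits
    else:
        shift = q                             # fourth preset number of bits
    return (bd >> shift) + (mixed & mask)


# Values from the worked example with Q = 15.
product_lo = fractional_fusion(18579 * 25674, 5 * 25674 + 18579 * 2, 15)
print(bin(product_lo))
```

At Q equal to half the word (16 for a 32-bit unit) both branches yield the same shift, so the boundary case is consistent whichever branch handles it.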
In some embodiments, the arithmetic unit further comprises an overflow control unit connected to the multiplication unit and the result output unit;
the overflow control unit is configured to:
shift the fractional operation result to the right by a fifth preset number of bits, where the fifth preset number of bits is the preset scaling value;
determine that the right-shifted fractional operation result is greater than zero;
determine that the fractional operation result has overflowed; and
add one to the integer operation result.
Specifically, the fractional part and the integer part are computed separately during the operation.
When the third operation subunit in the multiplication unit obtains the fractional operation result by adding the fractional multiplication result to the fractional part of the mixed multiplication result, the sum may overflow, and this must be handled.
An overflow control unit may be provided between the multiplication unit and the result output unit. The overflow control unit shifts the fractional operation result to the right by a fifth preset number of bits, which is the preset scaling value. The shift removes the fractional bits and leaves only the overflow, which is a possible carry into the integer part.
If the right-shifted fractional operation result is greater than zero, an overflow has occurred: adding the fractional parts has produced an integer, which must be carried into the integer operation result. If it equals zero, no overflow has occurred, the fractional addition has produced no integer, and no carry is needed.
For example, if the right-shifted fractional multiplication result is 011 1000 1101 1100 and the fractional part of the mixed multiplication result is 000 0110 1001 1000, their sum is 011 1111 0111 0100. With the preset scaling value Q=15, the low 15 bits (counting from right to left) form the fractional operation result. The sum is exactly 15 bits wide, so there is no overflow and it is the correct fractional operation result.
If instead the right-shifted fractional multiplication result is 111 1000 1101 1100 and the fractional part of the mixed multiplication result is 100 0110 1001 1000, their sum is 1 011 1111 0111 0100. With Q=15, the low 15 bits form the fractional operation result, but the sum is 16 bits wide, so the high-order bit has overflowed. The low 15 bits are kept as the fractional operation result, and the overflowed bit is carried into the integer operation result; that is, the integer operation result is increased by 1.
In the arithmetic unit provided by this embodiment of the application, the overflow control unit determines whether the fractional operation result has overflowed and, if so, adds one to the integer operation result, which improves the accuracy of the arithmetic unit.
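The carry check can be sketched with the two sums from the examples above. The function name `carry_adjust` is illustrative. Because both addends are smaller than 2 to the power Q, their sum can overflow by at most one bit, which is why adding one is sufficient.

```python
def carry_adjust(product_hi, product_lo, q):
    """Return the integer result after propagating a carry from product_lo."""
    overflow = product_lo >> q     # bits above the q-bit fractional field
    if overflow > 0:
        product_hi += 1            # carry into the integer operation result
    return product_hi


Q = 15
no_carry = 0b011100011011100 + 0b000011010011000    # 15-bit sum
with_carry = 0b111100011011100 + 0b100011010011000  # 16-bit sum
print(carry_adjust(15, no_carry, Q), carry_adjust(15, with_carry, Q))
```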
In some embodiments, the result output unit is configured to:
shift the integer operation result to the left by a sixth preset number of bits to obtain the integer part of the target fixed-point number, where the sixth preset number of bits is the preset scaling value;
perform a bitwise AND of the fractional operation result with a preset mask to obtain the fractional part of the target fixed-point number, where the number of set bits in the preset mask is the preset scaling value; and
perform a bitwise OR of the integer part and the fractional part of the target fixed-point number to obtain the target fixed-point number.
Specifically, the integer operation result and the fractional operation result are spliced as binary numbers, and the integer operation result carries a positional weight. The integer operation result belongs in the high-order bits of the arithmetic unit (its number of significant bits is the total bit width minus the preset scaling value), and the fractional operation result belongs in the low-order bits (its number of significant bits is the preset scaling value). The preset scaling value marks the boundary between the high-order and low-order bits.
The integer part must be shifted left to restore its weight before it is spliced with the fractional part. Specifically, the integer operation result is shifted left by a sixth preset number of bits, which is the preset scaling value, to obtain the integer part of the target fixed-point number. For example, shifting the integer operation result product_hi = 1111 left by 15 bits gives 1111 000 0000 0000 0000, which is the integer part of the target fixed-point number.
The fractional part is extracted with a preset mask. The fractional part of the target fixed-point number is obtained by a bitwise AND of the fractional operation result with the preset mask, whose number of set bits is the preset scaling value. For Q=15, for example, the preset mask MASK(Q) is 0000 0000 0000 0000 0111 1111 1111 1111: every integer-part bit is 0 and every fractional-part bit is 1. With the fractional operation result product_lo = 011 1111 0111 0100, the bitwise AND with the preset mask gives the fractional part of the target fixed-point number, 011 1111 0111 0100.
A bitwise OR of the integer part and the fractional part of the target fixed-point number yields the target fixed-point number. With the results of the above example, the target fixed-point number is 1111 011 1111 0111 0100.
In the arithmetic unit provided by this embodiment of the application, the weight of the integer operation result is restored before it is spliced with the fractional operation result to obtain the target fixed-point number, which improves the accuracy of the arithmetic unit.
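The splicing performed by the result output unit amounts to a shift, a mask, and an OR. The sketch below uses the illustrative name `pack_fixed_point`; the final AND with a 32-bit word mask is an assumption added here to imitate a fixed-width register, since Python integers do not wrap.

```python
def pack_fixed_point(product_hi, product_lo, q, word_bits=32):
    """Splice integer and fractional results into one fixed-point word."""
    mask = (1 << q) - 1                    # MASK(Q)
    integer_part = product_hi << q         # restore the weight of the integer
    fractional_part = product_lo & mask    # keep only the low q bits
    word = integer_part | fractional_part
    return word & ((1 << word_bits) - 1)   # assumed register width


target = pack_fixed_point(0b1111, 0b011111101110100, 15)
print(bin(target), target)
```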
Fig. 2 is a schematic flowchart of fixed-point multiplication according to an embodiment of the present application. As shown in Fig. 2, the method includes steps S100 to S1100, which are executed by an arithmetic unit that stores and computes data using 32 bits.
S100: input the two floating-point numbers m and n to be multiplied and the number of bits Q (the scaling value) to be used for the fractional part.
S200: convert the input floating-point numbers into fixed-point numbers. For example, with m=5.567 and Q=15, the fixed-point number after conversion is M=182419, whose 32-bit binary representation is 0000 0000 0000 0010 1100 1000 1001 0011. The preset mask determined from the scaling value is 0000 0000 0000 0000 0111 1111 1111 1111. Likewise, n=2.7835 gives the fixed-point number N=91210, whose 32-bit binary representation is 0000 0000 0000 0001 0110 0100 0100 1010.
S300: split the binary numbers M and N into integer and fractional parts, denoted M=A.B and N=C.D. The integer parts A and C are obtained by shifting M and N to the right by Q bits; the fractional parts B and D are obtained by a bitwise AND of M and N with the preset mask, whose integer-part bits are all 0 and whose fractional-part bits are all 1. Continuing the example, the preset mask is 0000 0000 0000 0000 0111 1111 1111 1111, A = 0000 0000 0000 0010 1, B = 100 1000 1001 0011, C = 0000 0000 0000 0001 0, and D = 110 0100 0100 1010.
S400: calculate A×C and denote the result AC; calculate A×D+C×B and denote the result AD_CB. These multiplications use the processor's native integer (int32) multiplication. Continuing the example, AC = 1010 and AD_CB = 10 1000 0110 1001 1000.
S510: if Q is greater than 16, shift both B and D to the right by (Q-16) bits. Continuing the example, Q=15 is less than 16, so no right shift is needed.
S520: calculate B×D and denote the result BD. Continuing the example, BD = 01 1100 0110 1110 0110 0110 0111 1110.
S600: calculate the integer part of the final result by adding AC and the integer part of AD_CB, and denote the result product_hi. The integer part of AD_CB is obtained by shifting AD_CB to the right by Q bits. Continuing the example, the integer part of AD_CB is 101, so product_hi = AC + (AD_CB >> Q) = 1010 + 101 = 1111. Note that the two operands are aligned at the least significant bit before addition.
S710: if Q is greater than 16, shift BD to the right by (32-Q) bits; if Q is less than or equal to 16, shift BD to the right by Q bits. Continuing the example, Q=15 is less than 16, so BD is shifted right by 15 bits, giving BD >> Q = 011 1000 1101 1100.
S720: calculate the fractional part of the final result by adding the right-shifted BD from step S710 to the fractional part of AD_CB, and denote the result product_lo. The fractional part of AD_CB is obtained by a bitwise AND of AD_CB with the preset mask. Continuing the example, the fractional part of AD_CB is 000 0110 1001 1000, so product_lo = (BD >> Q) + (AD_CB & MASK(Q)) = 011 1000 1101 1100 + 000 0110 1001 1000 = 011 1111 0111 0100.
S800: check whether product_lo has overflowed (whether a carry is needed). If so, add one to product_hi (the carry); otherwise skip this step. To check, shift product_lo to the right by Q bits; a result greater than 0 indicates overflow. Continuing the example, product_lo shifted right by 15 bits is zero, so there is no overflow.
S900: merge the high-order part from product_hi with the low-order part from product_lo. The high-order part has (32-Q) significant bits and the low-order part has Q significant bits. The high-order part is obtained by shifting product_hi to the left by Q bits; the low-order part is obtained by a bitwise AND of product_lo with the preset mask. The two parts are merged by bitwise OR. Continuing the example, the high-order part is 1111 000 0000 0000 0000, the low-order part is 011 1111 0111 0100, and the merged result is 1111 011 1111 0111 0100.
S1000: convert the merged fixed-point number into a floating-point number by dividing the decimal integer that the fixed-point number represents by 2 to the power Q. Continuing the example, the decimal integer of 1111 011 1111 0111 0100 is 507764; dividing it by 2 to the power 15 gives 15.495728, which is close to 5.567×2.7835 = 15.4957445. The error is only about 0.0000165, which is very small.
S1100: output the floating-point result.
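Steps S200 to S1000 can be traced end to end with the sketch below. It is one possible software rendering, not the claimed hardware: the function names are illustrative, the operands are assumed non-negative, and rounding to nearest in `to_fixed` is an assumption chosen because it reproduces N = 91210 from n = 2.7835.

```python
WORD_BITS = 32  # assumed total bit width of the arithmetic unit


def to_fixed(x, q):
    """S200: convert a non-negative float to fixed point (rounding assumed)."""
    return int(round(x * (1 << q)))


def fixed_multiply(m_fixed, n_fixed, q, word_bits=WORD_BITS):
    """S300 to S900: multiply two fixed-point numbers with integer operations."""
    mask = (1 << q) - 1
    half = word_bits // 2
    a, b = m_fixed >> q, m_fixed & mask        # S300: M = A.B
    c, d = n_fixed >> q, n_fixed & mask        #       N = C.D
    ac = a * c                                 # S400
    ad_cb = a * d + c * b
    if q > half:                               # S510: pre-shift B and D
        b >>= q - half
        d >>= q - half
    bd = b * d                                 # S520
    product_hi = ac + (ad_cb >> q)             # S600
    bd >>= (word_bits - q) if q > half else q  # S710
    product_lo = bd + (ad_cb & mask)           # S720
    if (product_lo >> q) > 0:                  # S800: carry
        product_hi += 1
    return (product_hi << q) | (product_lo & mask)   # S900


def multiply(m, n, q):
    """S200 and S1000 wrapped around the fixed-point core."""
    return fixed_multiply(to_fixed(m, q), to_fixed(n, q), q) / (1 << q)


print(fixed_multiply(182419, 91210, 15), multiply(5.567, 2.7835, 15))
```

Values that are exactly representable, such as 1.5 and 2.25 with Q=20, exercise the pre-shift branch and multiply without any error.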
Fig. 3 is a schematic diagram of bit alignment provided in an embodiment of the present application. As shown in Fig. 3, the arithmetic unit stores and computes data using 32 bits and the preset scaling value is Q. The fixed-point number M has integer part A and fractional part B, and the fixed-point number N has integer part C and fractional part D. The fractional part occupies Q bits and the integer part occupies 32-Q bits.
The integer multiplication result A×C is 2×(32-Q) bits wide, of which only the low 32-Q bits are significant. The excess high-order bits are 0; limited by the total bit width of the arithmetic unit, they cannot appear in the final result and are discarded.
The fractional multiplication result B×D is 2×Q bits wide, of which the low Q bits do belong to the computed result. Limited by the total bit width of the arithmetic unit, however, they cannot appear in the final result. Dropping them affects the precision of the result, but only to a limited degree, so they are discarded.
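The size of the loss from discarding the low Q bits of B×D can be estimated with a short sketch. The function name `truncation_error` is illustrative, and the sketch assumes Q is at most half the word so that no pre-shift is involved. Since B×D carries a weight of 2 to the power -2Q, the discarded bits are always worth less than 2 to the power -Q, that is, less than one step of the result.

```python
def truncation_error(b, d, q):
    """Real-valued size of the low q bits dropped from the product b * d."""
    bd = b * d
    discarded = bd & ((1 << q) - 1)     # low q bits that cannot be kept
    return discarded / (1 << (2 * q))   # each unit of bd weighs 2**(-2q)


error = truncation_error(18579, 25674, 15)
print(error, error < 2 ** -15)
```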
Fig. 4 is a schematic diagram of the multiplication process provided in one embodiment of the present application. As shown in Fig. 4, the diagram corresponds to the above embodiment. The floating-point numbers to be multiplied are m=5.567 and n=2.7835 with Q=15; the 32-bit binary representation of the fixed-point number for m is (0000 0000 0000 00)10 1100 1000 1001 0011, and that for n is (0000 0000 0000 000)1 0110 0100 0100 1010. The zeros in brackets are high-order bits that do not affect the result and are not shown in the figure.
Fig. 5 is a schematic structural diagram of a processor according to an embodiment of the present application, and as shown in fig. 5, a processor 500 includes at least one arithmetic unit 100.
In the processor provided by this embodiment of the application, integer multiplication replaces floating-point multiplication, which increases the speed of floating-point multiplication. The arithmetic unit in the processor consumes less power for integer multiplication than for floating-point multiplication, which reduces the power consumption of the processor. In addition, the preset scaling value can be set as required to balance the precision requirement against the risk of overflow, which improves the performance of the processor.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application, and as shown in fig. 6, the electronic device includes at least one Processor (Processor) 500. The electronic device may further include: communication interface (Communications Interface) 620, memory (Memory) 630 and communication bus (Communications Bus) 640, wherein processor 500, communication interface 620, memory 630 complete communication with each other via communication bus 640.
Alternatively, the processor 500 may also invoke logic commands in the memory 630 to perform the methods described in the above embodiments, such as:
acquiring a first floating-point number, a second floating-point number, and a preset scaling value; performing fixed-point conversion on the first floating-point number and the second floating-point number based on the preset scaling value to obtain a first fixed-point number corresponding to the first floating-point number and a second fixed-point number corresponding to the second floating-point number; performing integer multiplication, based on the distributive law of multiplication, on the integer part and the fractional part of the first fixed-point number and the integer part and the fractional part of the second fixed-point number to obtain an integer operation result and a fractional operation result; and combining the integer operation result and the fractional operation result into a target fixed-point number, and converting the target fixed-point number into a floating-point number based on the preset scaling value to obtain the multiplication result of the first floating-point number and the second floating-point number.
In addition, the logic commands in the memory described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, or the part of it that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several commands for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
Fig. 7 is a schematic hardware structure of a terminal implementing an embodiment of the present application, as shown in fig. 7, where the terminal 700 includes, but is not limited to: at least some of the components of the radio frequency unit 701, the network module 702, the audio output unit 703, the input unit 704, the sensor 705, the display unit 706, the user input unit 707, the interface unit 708, the memory 709, and the processor 500, etc.
Those skilled in the art will appreciate that the terminal 700 may further include a power source (e.g., a battery) for powering the various components, and that the power source may be logically coupled to the processor 500 by a power management system for performing functions such as managing charging, discharging, and power consumption by the power management system. The terminal structure shown in fig. 7 does not constitute a limitation of the terminal, and the terminal may include more or less components than shown, or may combine certain components, or may be arranged in different components, which will not be described in detail herein.
It should be appreciated that in embodiments of the present application, the input unit 704 may include a graphics processing unit (GPU) 7041 and a microphone 7042, with the graphics processing unit 7041 processing image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 706 may include a display panel 7061, and the display panel 7061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 707 includes at least one of a touch panel 7071 and other input devices 7072. The touch panel 7071 is also referred to as a touch screen. The touch panel 7071 may include two parts, a touch detection device and a touch controller. Other input devices 7072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
In this embodiment, after receiving downlink data from the network side device, the radio frequency unit 701 may transmit the downlink data to the processor 500 for processing; in addition, the radio frequency unit 701 may send uplink data to the network side device. Typically, the radio frequency unit 701 includes, but is not limited to, an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
The memory 709 may be used to store software programs or instructions and various data. The memory 709 may mainly include a first storage area storing programs or instructions and a second storage area storing data, wherein the first storage area may store an operating system, application programs or instructions required for at least one function (such as a sound playing function, an image playing function, etc.), and the like. Further, the memory 709 may include volatile memory or nonvolatile memory, or the memory 709 may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synchlink DRAM (SLDRAM), or a direct Rambus RAM (DRRAM). Memory 709 in embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
The processor 500 may include one or more processing units; optionally, the processor 500 integrates an application processor that primarily processes operations involving an operating system, user interface, application programs, etc., and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 500.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (10)

1. An arithmetic unit, characterized by comprising a parameter input unit, a fixed-point conversion unit, a multiplication unit, and a result output unit which are connected in sequence;
the parameter input unit is used for acquiring a first floating point number, a second floating point number, and a preset scaling value;
the fixed-point conversion unit is used for performing fixed-point conversion on the first floating point number and the second floating point number based on the preset scaling value to obtain a first fixed point number corresponding to the first floating point number and a second fixed point number corresponding to the second floating point number;
the multiplication unit is used for carrying out integer multiplication operations on the integer part and the decimal part corresponding to the first fixed point number and the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication to obtain an integer operation result and a decimal operation result;
the result output unit is configured to combine the integer operation result and the decimal operation result into a target fixed point number, and perform floating point number conversion on the target fixed point number based on the preset scaling value, so as to obtain a multiplication operation result corresponding to the first floating point number and the second floating point number.
2. The arithmetic unit according to claim 1, wherein the multiplication unit includes:
a first operation subunit configured to carry out cross multiplication on the integer part and the decimal part corresponding to the first fixed point number and the integer part and the decimal part corresponding to the second fixed point number based on the distributive law of multiplication to obtain an integer multiplication operation result, a mixed multiplication operation result, and a decimal multiplication operation result;
a second operation subunit configured to determine the integer operation result based on the integer multiplication operation result and the integer part of the mixed multiplication operation result;
and a third operation subunit configured to determine the decimal operation result based on the decimal multiplication operation result and the decimal part of the mixed multiplication operation result.
3. The arithmetic unit according to claim 2, wherein the first operation subunit is configured to, when calculating the decimal multiplication operation result:
determine that the preset scaling value is greater than one half of the total bit width of the arithmetic unit;
shift the decimal part corresponding to the first fixed point number and the decimal part corresponding to the second fixed point number to the right by a first preset number of bits; the first preset number of bits is the difference between the preset scaling value and one half of the total bit width of the arithmetic unit;
and carry out integer multiplication on the right shift result of the decimal part corresponding to the first fixed point number and the right shift result of the decimal part corresponding to the second fixed point number to obtain the decimal multiplication operation result.
4. The arithmetic unit according to claim 2, wherein the second operation subunit is configured to:
shift the mixed multiplication operation result to the right by a second preset number of bits; the second preset number of bits is the preset scaling value;
and align the right shift result of the mixed multiplication operation result with the integer multiplication operation result in order from right to left, and add the aligned results to obtain the integer operation result.
5. The arithmetic unit according to claim 2, wherein the third operation subunit is configured to:
determine that the preset scaling value is greater than one half of the total bit width of the arithmetic unit;
shift the decimal multiplication operation result to the right by a third preset number of bits; the third preset number of bits is the difference between the total bit width of the arithmetic unit and the preset scaling value;
and add the right shift result of the decimal multiplication operation result to the decimal part of the mixed multiplication operation result to obtain the decimal operation result.
6. The arithmetic unit according to claim 2, wherein the third operation subunit is configured to:
determine that the preset scaling value is less than or equal to one half of the total bit width of the arithmetic unit;
shift the decimal multiplication operation result to the right by a fourth preset number of bits; the fourth preset number of bits is the preset scaling value;
and add the right shift result of the decimal multiplication operation result to the decimal part of the mixed multiplication operation result to obtain the decimal operation result.
7. The arithmetic unit according to any one of claims 1 to 6, further comprising an overflow control unit connected to the multiplication unit and the result output unit;
the overflow control unit is used for:
shifting the decimal operation result to the right by a fifth preset number of bits; the fifth preset number of bits is the preset scaling value;
determining that the right shift result of the decimal operation result is greater than zero;
determining that the decimal operation result has overflowed;
and adding one to the integer operation result.
8. The arithmetic unit according to any one of claims 1 to 6, wherein the result output unit is configured to:
shift the integer operation result to the left by a sixth preset number of bits to obtain an integer part of the target fixed point number; the sixth preset number of bits is the preset scaling value;
perform a bitwise AND operation on the decimal operation result and a preset mask to obtain a decimal part of the target fixed point number; the number of bits of the preset mask is the preset scaling value;
and perform a bitwise OR operation on the integer part of the target fixed point number and the decimal part of the target fixed point number to obtain the target fixed point number.
9. A processor comprising at least one arithmetic unit according to any one of claims 1 to 8.
10. An electronic device comprising at least one processor according to claim 9.
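
The data path recited in claims 1 to 8 can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the claimed hardware: the function name `fixed_point_multiply`, the parameter names `q` (the preset scaling value, read here as the number of fractional bits) and `total_bits` (the total bit width of the arithmetic unit), and their default values are hypothetical. The sketch further assumes non-negative inputs, truncation during fixed-point conversion, and a direct multiplication of the decimal parts when the scaling value does not exceed half the bit width; none of these details are fixed by the claims. Python integers do not wrap, so register overflow is not modeled.

```python
def fixed_point_multiply(a: float, b: float, q: int = 16, total_bits: int = 32) -> float:
    """Multiply two non-negative floats using the split fixed-point scheme.

    q          -- preset scaling value, i.e. number of fractional bits (assumed meaning)
    total_bits -- total bit width of the arithmetic unit (assumed default: 32)
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative inputs")

    mask = (1 << q) - 1          # preset mask with q one-bits (claim 8)
    half = total_bits // 2

    # Fixed-point conversion; truncation toward zero is an assumption.
    fa = int(a * (1 << q))
    fb = int(b * (1 << q))

    # Split each fixed point number into integer and decimal parts.
    ai, af = fa >> q, fa & mask
    bi, bf = fb >> q, fb & mask

    # Claim 2: cross multiplication following the distributive law.
    int_mul = ai * bi                # integer multiplication result
    mixed_mul = ai * bf + af * bi    # mixed multiplication result (scale 2**q)

    if q > half:
        # Claim 3: pre-shift both decimal parts so the product fits the bit width.
        shift = q - half
        frac_mul = (af >> shift) * (bf >> shift)     # scale 2**total_bits
        # Claim 5: bring the product back to scale 2**q.
        frac_scaled = frac_mul >> (total_bits - q)
    else:
        frac_mul = af * bf                           # scale 2**(2*q)
        # Claim 6: bring the product back to scale 2**q.
        frac_scaled = frac_mul >> q

    # Claim 4: integer result = integer product + integer part of mixed product.
    integer_result = int_mul + (mixed_mul >> q)
    # Claims 5 and 6: decimal result = scaled decimal product + decimal part of mixed product.
    decimal_result = frac_scaled + (mixed_mul & mask)

    # Claim 7: a non-zero value above bit q means the decimal result overflowed.
    if (decimal_result >> q) > 0:
        integer_result += 1

    # Claim 8: recombine and convert back to floating point.
    target = (integer_result << q) | (decimal_result & mask)
    return target / (1 << q)
```

Under these assumptions, `fixed_point_multiply(1.5, 2.25)` returns 3.375 on both branches (for example with `q=16` and with `q=20`), and `fixed_point_multiply(1.75, 1.75)` returns 3.0625 while exercising the carry path of claim 7. When `q` exceeds half the bit width, the pre-shift of claim 3 discards low-order bits of the decimal parts, so results for inputs that are not exactly representable may differ slightly from the exact product.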
CN202311080931.2A 2023-08-24 2023-08-24 Arithmetic unit, processor, and electronic apparatus Pending CN117270813A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311080931.2A CN117270813A (en) 2023-08-24 2023-08-24 Arithmetic unit, processor, and electronic apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311080931.2A CN117270813A (en) 2023-08-24 2023-08-24 Arithmetic unit, processor, and electronic apparatus

Publications (1)

Publication Number Publication Date
CN117270813A true CN117270813A (en) 2023-12-22

Family

ID=89213296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311080931.2A Pending CN117270813A (en) 2023-08-24 2023-08-24 Arithmetic unit, processor, and electronic apparatus

Country Status (1)

Country Link
CN (1) CN117270813A (en)

Similar Documents

Publication Publication Date Title
CN107273090A (en) Towards the approximate floating-point multiplier and floating number multiplication of neural network processor
US10489153B2 (en) Stochastic rounding floating-point add instruction using entropy from a register
US20070266072A1 (en) Method and apparatus for decimal number multiplication using hardware for binary number operations
CN106575214B (en) Merge the simulation of multiply-add operation
CN114341892A (en) Machine learning hardware with reduced precision parameter components for efficient parameter updating
US20120011185A1 (en) Rounding unit for decimal floating-point division
US10416962B2 (en) Decimal and binary floating point arithmetic calculations
CN112241291A (en) Floating point unit for exponential function implementation
CN111290732B (en) Floating-point number multiplication circuit based on posit data format
US10489115B2 (en) Shift amount correction for multiply-add
CN115390790A (en) Floating point multiply-add unit with fusion precision conversion function and application method thereof
US7499962B2 (en) Enhanced fused multiply-add operation
US20070266073A1 (en) Method and apparatus for decimal number addition using hardware for binary number operations
CN117270813A (en) Arithmetic unit, processor, and electronic apparatus
CN113126954B (en) Method, device and arithmetic logic unit for floating point number multiplication calculation
CN109558109B (en) Data operation device and related product
EP3647939A1 (en) Arithmetic processing apparatus and controlling method therefor
US20160041947A1 (en) Implementing a square root operation in a computer system
KR101084581B1 (en) Method and Apparatus for Operating of Fixed-point Exponential Function, and Recording Medium thereof
CN111313906A (en) Conversion circuit of floating point number
JP2003084969A (en) Floating point remainder computing element, information processing device, and computer program
KR100974190B1 (en) Complex number multiplying method using floating point
CN117648959A (en) Multi-precision operand operation device supporting neural network operation
JP2020166661A (en) Division device, division method and program
CN117787297A (en) Floating point multiplication and addition unit and operation method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination