CN116795324A - Mixed precision floating-point multiplication device and mixed precision floating-point number processing method

Mixed precision floating-point multiplication device and mixed precision floating-point number processing method

Info

Publication number
CN116795324A
Authority
CN
China
Prior art keywords
precision
floating point
operand
target
floating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310810589.0A
Other languages
Chinese (zh)
Inventor
范文杰
孙红江
曾令仿
陈�光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Zhejiang Lab
Priority to CN202310810589.0A
Publication of CN116795324A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F 7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F 7/48: Methods or arrangements for performing computations using exclusively denominational number representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F 7/483: Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F 7/487: Multiplying; Dividing
    • G06F 7/4876: Multiplying
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F 7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F 7/48: Methods or arrangements for performing computations using exclusively denominational number representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F 7/483: Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F 7/4833: Logarithmic number system
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F 7/38: Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F 7/48: Methods or arrangements for performing computations using exclusively denominational number representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F 7/483: Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06F 7/485: Adding; Subtracting

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Nonlinear Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The application relates to a mixed-precision floating-point multiplication device and a mixed-precision floating-point number processing method. The device determines, through an exponent bias module, the bias of at least two floating-point operands fetched from a register and feeds each operand's bias to a corresponding first adder. Each first adder offsets the exponent of its floating-point operand by that operand's bias to determine an offset exponent value, and a second adder adds the offset exponent values of the operands to obtain an intermediate exponent value. A precision conversion unit determines the target precision of a target floating-point operand according to the floating-point precision marking bit output by the multiplication unit, and converts the intermediate mantissa output by the multiplication unit together with the intermediate exponent value based on that target precision, completing the precision conversion of the target operand. With this device, mixed-precision computing efficiency can be improved.

Description

Mixed precision floating-point multiplication device and mixed precision floating-point number processing method
Technical Field
The application relates to the technical field of data processing, in particular to a mixed precision floating point multiplication device and a mixed precision floating point number processing method.
Background
With the development of artificial-intelligence technology, deep learning models such as convolutional neural networks (Convolutional Neural Network, CNN), recurrent neural networks (Recurrent Neural Network, RNN), Transformers, graph neural networks (Graph Neural Network, GNN), and the extended network classes derived from them are all computationally intensive and involve a large number of floating-point calculations. However, in different neural networks, the numerical ranges of parameter, input, and gradient data vary. A high-precision variable type (e.g., FP32) can satisfy the requirements of nearly all neural networks, whereas low-precision floating-point variable types (e.g., Floating Point 16, FP16; Floating Point 8, FP8) may cause training to fail because of data overflow. In general, high-precision floating-point calculation consumes more computing resources than low-precision calculation, but low-precision calculation usually produces less accurate results. How to apply low-precision floating point to deep learning and neural-network computation while minimizing the loss in results has therefore become an important research subject.
Currently, to apply low-precision floating point in deep learning and neural-network calculation, a mixed-precision processing scheme of automatic precision mapping and variable scaling is adopted. However, when implementing variable type conversion, scaling, and similar processes, current mixed-precision processing imposes a great deal of extra computing-resource consumption on general-purpose computing hardware, resulting in low mixed-precision computing efficiency.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a mixed-precision floating-point multiplication device, a mixed-precision floating-point number processing method, a computer device, a computer-readable storage medium, and a computer program product that can improve mixed-precision computing efficiency.
In a first aspect, the present application provides a mixed-precision floating-point multiplication device. The device comprises an exponent bias module, at least two first adders, a second adder, a multiplication unit, and a precision conversion unit; the exponent bias module is connected to the at least two first adders, the at least two first adders are connected to the second adder, and the second adder and the multiplication unit are both connected to the precision conversion unit; wherein:
the exponent bias module is configured to determine the bias of at least two floating-point operands fetched from a register, where each floating-point operand includes an exponent and a mantissa;
the at least two first adders each offset the exponent of a corresponding floating-point operand using that operand's bias, determining an offset exponent value for each floating-point operand;
the second adder is configured to add the offset exponent values of the floating-point operands to obtain an intermediate exponent value;
the multiplication unit is configured to multiply the mantissas of the floating-point operands to obtain a floating-point precision marking bit and an intermediate mantissa;
the precision conversion unit is configured to determine the target precision of a target floating-point operand according to the floating-point precision marking bit and to convert the intermediate mantissa and the intermediate exponent value based on the target precision, completing the precision conversion of the target operand; the target floating-point operand is determined from the intermediate mantissa and the intermediate exponent value.
In one embodiment, the precision conversion unit is further configured to determine a first floating-point precision marking bit when the intermediate exponent value is greater than or equal to an overflow threshold or less than or equal to an underflow threshold; the first floating-point precision marking bit indicates that the target floating-point operand requires precision-raising processing.
In one embodiment, the precision conversion unit is further configured to determine a second floating-point precision marking bit when the intermediate exponent value is greater than or equal to a precision-lowering negative threshold and less than or equal to a precision-lowering positive threshold; the second floating-point precision marking bit indicates that the target floating-point operand requires precision-lowering processing. The precision-lowering negative threshold and the precision-lowering positive threshold have equal absolute values and are negatives of each other.
In one embodiment, the precision conversion unit is further configured to determine a third floating-point precision marking bit when the intermediate exponent value is greater than the precision-lowering positive threshold and less than the overflow threshold, or greater than the underflow threshold and less than the precision-lowering negative threshold; the third floating-point precision marking bit indicates that no precision processing of the target floating-point operand is required.
In one embodiment, the precision conversion unit is further configured, when it determines from the floating-point precision marking bit that the target floating-point operand requires precision-raising processing, to determine a first target precision greater than the initial precision of the target floating-point operand and to convert the intermediate mantissa and the intermediate exponent value based on the first target precision, completing the precision conversion of the target operand.
In one embodiment, the precision conversion unit is further configured, when it determines from the floating-point precision marking bit that the target floating-point operand requires precision-lowering processing, to determine a second target precision smaller than the initial precision of the target floating-point operand and to convert the intermediate mantissa and the intermediate exponent value based on the second target precision, completing the precision conversion of the target operand.
In one embodiment, the mixed-precision floating-point multiplication device further includes a register allocation module. When an idle allocable register set exists, the register allocation module determines, from the idle allocable register set, a register matching each floating-point operand and a register matching each operand's scaling factor, based on the register allocation priorities of at least two floating-point operands to be allocated and of their corresponding scaling factors, together with the initial precision of each floating-point operand and of each scaling factor, thereby completing register allocation for the at least two floating-point operands and their scaling factors.
In one embodiment, the mixed-precision floating-point multiplication device further includes a third adder connected to the exponent bias module; the third adder is configured to add the biases of the at least two floating-point operands to determine the bias of the target floating-point operand.
In a second aspect, the present application further provides a mixed-precision floating-point number processing method. The method is applied to the above mixed-precision floating-point multiplication device and comprises the following steps:
determining the bias of at least two floating-point operands fetched from a register, where each floating-point operand includes an exponent and a mantissa;
offsetting the exponent of each floating-point operand by that operand's bias to determine each operand's offset exponent value;
adding the offset exponent values of the floating-point operands to obtain an intermediate exponent value;
multiplying the mantissas of the floating-point operands to obtain a floating-point precision marking bit and an intermediate mantissa;
determining the target precision of a target floating-point operand according to the floating-point precision marking bit, and converting the intermediate mantissa and the intermediate exponent value based on the target precision to complete the precision conversion of the target operand, where the target floating-point operand is determined from the intermediate mantissa and the intermediate exponent value.
In a third aspect, the present application further provides a computer device. The computer device comprises a memory storing a computer program and a processor that, when executing the computer program, implements the following steps:
determining the bias of at least two floating-point operands fetched from a register, where each floating-point operand includes an exponent and a mantissa;
offsetting the exponent of each floating-point operand by that operand's bias to determine each operand's offset exponent value;
adding the offset exponent values of the floating-point operands to obtain an intermediate exponent value;
multiplying the mantissas of the floating-point operands to obtain a floating-point precision marking bit and an intermediate mantissa;
determining the target precision of a target floating-point operand according to the floating-point precision marking bit, and converting the intermediate mantissa and the intermediate exponent value based on the target precision to complete the precision conversion of the target operand, where the target floating-point operand is determined from the intermediate mantissa and the intermediate exponent value.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium stores a computer program that, when executed by a processor, implements the following steps:
determining the bias of at least two floating-point operands fetched from a register, where each floating-point operand includes an exponent and a mantissa;
offsetting the exponent of each floating-point operand by that operand's bias to determine each operand's offset exponent value;
adding the offset exponent values of the floating-point operands to obtain an intermediate exponent value;
multiplying the mantissas of the floating-point operands to obtain a floating-point precision marking bit and an intermediate mantissa;
determining the target precision of a target floating-point operand according to the floating-point precision marking bit, and converting the intermediate mantissa and the intermediate exponent value based on the target precision to complete the precision conversion of the target operand, where the target floating-point operand is determined from the intermediate mantissa and the intermediate exponent value.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprises a computer program that, when executed by a processor, implements the following steps:
determining the bias of at least two floating-point operands fetched from a register, where each floating-point operand includes an exponent and a mantissa;
offsetting the exponent of each floating-point operand by that operand's bias to determine each operand's offset exponent value;
adding the offset exponent values of the floating-point operands to obtain an intermediate exponent value;
multiplying the mantissas of the floating-point operands to obtain a floating-point precision marking bit and an intermediate mantissa;
determining the target precision of a target floating-point operand according to the floating-point precision marking bit, and converting the intermediate mantissa and the intermediate exponent value based on the target precision to complete the precision conversion of the target operand, where the target floating-point operand is determined from the intermediate mantissa and the intermediate exponent value.
With the above mixed-precision floating-point multiplication device, mixed-precision floating-point number processing method, computer device, storage medium, and computer program product, the exponent bias module determines the bias of at least two floating-point operands; the exponents of the floating-point operands, which may have different precisions, are each fed to a corresponding first adder; each first adder offsets its exponent by the corresponding bias to determine an offset exponent value; and the second adder adds the offset exponent values of the operands to obtain an intermediate exponent value. The mantissas of the floating-point operands are fed to the multiplication unit, which multiplies them to obtain the floating-point precision marking bit and the intermediate mantissa; based on the marking bit, the device decides whether precision conversion is required, determines the target precision, and completes the conversion automatically. Automatic conversion between different precisions is thus achieved with two additions and one multiplication combined with the floating-point precision marking bit. This reduces the number of multiplications and removes the need for type conversion, scaling, and similar processing of the floating-point operands; by reducing the amount of data processed, it lowers the consumption of hardware computing resources and further improves mixed-precision computing efficiency.
Drawings
FIG. 1 is a block diagram of a mixed-precision floating-point multiplication device in one embodiment;
FIG. 2 is a schematic diagram of conversion between different precisions in one embodiment;
FIG. 3 is a block diagram of a mixed-precision floating-point multiplication device in another embodiment;
FIG. 4 is a diagram illustrating register usage in one embodiment;
FIG. 5 is a block diagram of a mixed-precision floating-point multiplication device in one embodiment;
FIG. 6 illustrates an application of the mixed-precision floating-point multiplication device in one embodiment;
FIG. 7 is a flow diagram of a mixed-precision floating-point number processing method in one embodiment;
FIG. 8 is a flow diagram of a mixed-precision floating-point number processing method in another embodiment;
FIG. 9 is an internal structure diagram of a computer device in one embodiment.
Detailed Description
Mixed-precision floating-point arithmetic is widely used in artificial-intelligence fields such as deep learning and neural-network training. Its advantages include accelerated computation and reduced memory footprint: it exploits the high computational efficiency and low memory occupation of low-precision floating-point numbers to improve the training efficiency of neural network models, while using high-precision floating-point numbers where needed to preserve training accuracy.
Taking the artificial-intelligence field as an example: in different neural networks, the numerical ranges of parameter, input, and gradient data differ. High-precision variable types can satisfy almost all neural networks, but low-precision floating-point types may cause training to fail because of data overflow, and floating-point operations introduce precision errors during numerical calculation. How to apply low-precision floating point to deep learning and neural-network computation while minimizing the loss in results has therefore long been an active research area in academia and industry. At present, in mixed-precision neural-network training, existing mixed-precision floating-point schemes rely on software to introduce a large number of type conversions, scalings, and similar operations; this consumes substantial extra computing resources on general-purpose hardware and makes mixed-precision computation inefficient.
To solve this technical problem, a mixed-precision floating-point multiplication device is provided. To make the objects, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are for illustration only and are not intended to limit the scope of the application.
In one embodiment, as shown in FIG. 1, a mixed-precision floating-point multiplication device is provided. The device includes an exponent bias module 102, at least two first adders 104, a second adder 106, a multiplication unit 108, and a precision conversion unit 110; the exponent bias module 102 is connected to the at least two first adders 104, the at least two first adders 104 are connected to the second adder 106, and the second adder 106 and the multiplication unit 108 are both connected to the precision conversion unit 110; wherein:
The exponent bias module 102 is configured to determine the bias of at least two floating-point operands fetched from registers, where each floating-point operand includes an exponent and a mantissa.
A floating-point operand consists of an exponent part and a mantissa part; for ease of description these are referred to simply as the exponent and the mantissa. The at least two floating-point operands may have the same precision or different precisions; this embodiment takes operands of different precisions as an example. The at least two floating-point operands can be understood as the data to be operated on.
Floating-point operands of different precisions include, for example, FP8, FP32, BF16, and FP8-M2E5 operands; the number of operands and their precision types are not limited here and can be determined by the actual application scenario. The bias of a floating-point operand is determined by a mapping between the scaling factor (scaler) and the bias. The mapping may be a preset logarithm, for example bias = log2(scaler); that is, the exponent part of the scaler is the value used by the exponent bias module. Different floating-point operands may have different scaling factors. Scaling factors are used for gradient scaling during model training and can prevent data overflow: all floating-point operands participating in an operation are scaled up or down by the same ratio at the same time, so the relative proportions among the operands are unchanged.
It will be appreciated that, because the scaler is an FP16 floating-point number, scaling based on the existing software implementation adds one multiplication each time the scaler is applied; the added floating-point multiplications increase the number of kernel launches and the number of variable transfers from memory to registers, which is difficult to optimize in the current processing approach. The bias, by contrast, is an 8-bit variable that acts directly on the exponent part of the floating-point calculation to achieve the scaling effect, and the bias adjustment can be carried out within a single floating-point multiplication, avoiding the extra multiplication that scaler-based scaling would otherwise require. That is, floating-point data can be scaled quickly and efficiently by the exponent bias module 102, which determines the bias of the at least two floating-point operands currently fetched from registers: the bias and the exponent of a floating-point operand are fed into an adder, and the overall floating-point value is rapidly amplified (positive bias) or reduced (negative bias) by the addition or subtraction.
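For illustration, the following is a minimal Python sketch (a behavioral model, not the patent's hardware) of how adding bias = log2(scaler) to the FP32 exponent field scales a value without a floating-point multiplication; all function names are illustrative:

```python
# Minimal sketch: scaling by adding an exponent bias instead of multiplying.
# Behavioral model only; denormals and exponent-field overflow are asserted away.
import math
import struct

def float_to_bits(x: float) -> int:
    """Reinterpret an FP32 value as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    return struct.unpack("<f", struct.pack("<I", b))[0]

def scale_by_bias(x: float, scaler: float) -> float:
    """Scale x by a power-of-two scaler by adding bias = log2(scaler)
    directly to the 8-bit exponent field."""
    bias = int(math.log2(scaler))         # the exponent part of the scaler
    b = float_to_bits(x)
    exponent = ((b >> 23) & 0xFF) + bias  # add (positive) / subtract (negative)
    assert 0 < exponent < 0xFF, "exponent field would overflow/underflow"
    return bits_to_float((b & ~(0xFF << 23)) | (exponent << 23))

assert scale_by_bias(1.5, 1024.0) == 1536.0  # rapid amplification
assert scale_by_bias(6.0, 0.25) == 1.5       # rapid reduction
```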
Specifically, when at least two floating-point operands are fetched from registers, the operands of different precisions are unpacked to obtain each operand's exponent and mantissa; the exponent bias module 102 determines the bias to apply to the exponent part of each operand, and the operands of different precisions can then be scaled based on those biases.
The at least two first adders 104 each offset the exponent of a corresponding floating-point operand using that operand's bias, determining an offset exponent value for each floating-point operand.
It will be appreciated that each input floating-point operand has one corresponding first adder; that is, each first adder processes one floating-point operand per operation. A first adder adds the bias of its floating-point operand to that operand's exponent to determine the operand's offset exponent value; this can be implemented in a conventional manner and is not described further here.
The second adder 106 is configured to add the offset exponent values of the floating-point operands to obtain an intermediate exponent value.
The multiplication unit 108 is configured to multiply the mantissas of the floating-point operands to obtain a floating-point precision marking bit and an intermediate mantissa.
It will be appreciated that when the mantissas of two floating-point operands are input to the multiplication unit, a floating-point multiplication is performed: the mantissas of the two operands are multiplied, their exponents are added, and the result is normalized to obtain a proper mantissa and exponent. Overflow, underflow, or rounding errors may occur during this process. The intermediate mantissa, obtained by multiplying the operands' mantissas, can be understood as the mantissa before precision conversion.
The floating-point precision marking bit indicates whether the multiplication result of the mixed-precision floating-point operands requires precision-raising or precision-lowering processing. That is, when the multiplication result satisfies the precision-raising condition or the precision-lowering condition, the floating-point precision marking bit PR-flag is correspondingly assigned a value, for example positive, negative, or zero: a positive marking bit indicates that the output multiplication result requires precision raising; a negative marking bit indicates that it requires precision lowering; and a zero marking bit indicates that no precision conversion is required.
The precision conversion unit 110 is configured to determine the target precision of the target floating-point operand according to the floating-point precision marking bit and to convert the intermediate mantissa and the intermediate exponent value based on the target precision, completing the precision conversion of the target operand; the target floating-point operand is determined from the intermediate mantissa and the intermediate exponent value.
Precision conversion includes adjacent-precision conversion and non-adjacent-precision conversion. In adjacent-precision conversion, the source and target types have adjacent precisions, for example FP32 to BF16; this can be implemented by truncating bits of the mantissa and exponent parts. In non-adjacent-precision conversion, the source and target precisions are not adjacent, for example FP8 to FP32; here the floating-point number is raised to the target precision by unsigned extension of the mantissa and exponent respectively. This embodiment describes precision conversion using adjacent-precision conversion as an example.
The target precision of the target floating-point operand determined from the floating-point precision marking bit may be the precision adjacent to the operand's initial precision. For example, if the initial precision of the target floating-point operand is BF16 and the marking bit indicates that precision-raising processing is required, the corresponding adjacent precision is FP32.
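The two conversion styles can be sketched as follows, assuming the common layout in which BF16 is the upper 16 bits of an FP32 word; the helper names are illustrative, and rounding and special values are ignored:

```python
# Sketch of adjacent-precision conversion by truncation / zero extension.
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Adjacent down-conversion FP32 -> BF16: truncate the low 16 mantissa bits."""
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

def bf16_bits_to_fp32(b16: int) -> float:
    """Adjacent up-conversion BF16 -> FP32: extend the mantissa with zeros."""
    return struct.unpack("<f", struct.pack("<I", b16 << 16))[0]

bf = fp32_to_bf16_bits(3.140625)       # 0x4049
print(hex(bf), bf16_bits_to_fp32(bf))  # value survives; excess mantissa is dropped
```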
It can be appreciated that the exponent bias module determines the bias of the current floating-point number's exponent part, and the addition (positive bias) or subtraction (negative bias) rapidly amplifies or reduces the overall floating-point value. This scaling step can therefore run concurrently with the multiplication of the operands' mantissas, saving computation time and avoiding the extra multiplications that scaling floating-point data in software would incur. Furthermore, pipeline parallelism between the current floating-point multiplication and the next can be achieved: since the mantissa multiplication is the most time-consuming step of the whole calculation, the other steps, such as the scaling within a floating-point multiplication and the automatic precision conversion of the previous floating-point operation, can overlap with the mantissa multiplication step. The time spent on floating-point scaling and precision-adaptation conversion is thereby hidden, further accelerating mixed-precision floating-point multiplication.
With the above mixed-precision floating-point multiplication device, the exponent bias module determines the bias of at least two floating-point operands; the exponent of each operand is fed to a corresponding first adder, which offsets it by that operand's bias to determine the operand's offset exponent value; and the second adder adds the offset exponent values to obtain an intermediate exponent value. The mantissas of the floating-point operands, which may have different precisions, are output to the multiplication unit, which multiplies them to obtain the floating-point precision marking bit and the intermediate mantissa; based on the marking bit, the device decides whether precision conversion is required and determines the target precision to convert to, completing the conversion automatically without the target precision being fixed in advance. Automatic mixed-precision conversion is thus achieved with two additions and one multiplication combined with the floating-point precision marking bit. This reduces the number of multiplications and removes the need for software-level type conversion, scaling, and similar processing of the floating-point operands; by reducing the amount of data processed, it lowers the consumption of hardware computing resources and further improves mixed-precision computing efficiency.
When floating-point operands must be scaled in mixed-precision training, the exponent bias module determines the bias of the current floating-point exponent part, so operands are scaled quickly and efficiently; on that basis, whether the precision of a multiplication result needs to be dynamically adapted can be determined from the floating-point precision marking bit. To avoid frequent conversion of floating-point operands between adjacent precisions, the precision-raising and precision-lowering conditions are determined from preset precision-lowering and precision-raising thresholds, and the floating-point precision marking bit is then determined.
Between adjacent precisions, the precision-lowering threshold of the higher precision and the precision-raising threshold of the lower precision cannot be equal: because the precision-lowering threshold governs conversion from higher to lower precision, the threshold for converting from a higher precision (e.g., FP32) to the adjacent lower precision (e.g., BF16) cannot equal the precision-raising threshold of that lower precision (BF16). The precision-raising condition has two judgment thresholds, an overflow threshold and an underflow threshold; the precision-lowering condition likewise has two, a precision-lowering negative threshold and a precision-lowering positive threshold, whose absolute values are equal and which are negatives of each other. That is, the precision-raising and precision-lowering conditions are determined from the intermediate exponent value, the overflow threshold, the underflow threshold, and the precision-lowering negative and positive thresholds, all of which may be predetermined.
To avoid frequent conversion of floating-point operands between adjacent precisions, the floating-point precision marking bit is determined from the preset precision-lowering and precision-raising thresholds, covering the following cases:
In one embodiment, the precision conversion unit 110 is further configured to determine a first floating-point precision marking bit when the intermediate exponent value is greater than or equal to the overflow threshold or less than or equal to the underflow threshold; the first floating-point precision marking bit indicates that the target floating-point operand requires precision-raising processing.
Correspondingly, when the first floating-point precision marking bit is determined: in one embodiment, the precision conversion unit 110 is further configured, upon determining from the first marking bit that precision-raising processing is required, to determine a first target precision greater than the initial precision of the target floating-point operand and to convert the intermediate mantissa and the intermediate exponent value based on the first target precision, completing the precision conversion of the target operand. For example, when multiplying two FP8 floating-point operands yields an intermediate exponent value greater than or equal to the overflow threshold or less than or equal to the underflow threshold, the calculation result overflows; the first floating-point precision marking bit then indicates that precision raising is required, and the output target floating-point operand is converted to the adjacent precision BF16.
In one embodiment, the precision conversion unit 110 is further configured to determine a second floating-point precision marking bit when the intermediate exponent value is greater than or equal to the precision-lowering negative threshold and less than or equal to the precision-lowering positive threshold; the second floating-point precision marking bit indicates that the target floating-point operand requires precision-lowering processing.
Correspondingly, when the second floating-point precision marking bit is determined: in one embodiment, the precision conversion unit 110 is further configured, upon determining from the second marking bit that precision-lowering processing is required, to determine a second target precision smaller than the initial precision of the target floating-point operand and to convert the intermediate mantissa and the intermediate exponent value based on the second target precision, completing the precision conversion of the target operand.
In one embodiment, the precision conversion unit 110 is further configured to determine a third floating-point precision marking bit when the intermediate exponent value is greater than the precision-lowering positive threshold and less than the overflow threshold, or greater than the underflow threshold and less than the precision-lowering negative threshold; the third floating-point precision marking bit indicates that no precision processing of the target floating-point operand is required.
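The three labeling cases can be condensed into the following sketch; the threshold constants here are illustrative assumptions, not values fixed by the patent:

```python
# Condensed sketch of the three precision marking cases.
RAISE, LOWER, KEEP = 1, -1, 0  # first / second / third precision marking bit

def precision_flag(intermediate_exp: int,
                   overflow: int = 127, underflow: int = -126,
                   lower_pos: int = 15) -> int:
    lower_neg = -lower_pos  # the two precision-lowering thresholds are negatives of each other
    if intermediate_exp >= overflow or intermediate_exp <= underflow:
        return RAISE        # result would overflow/underflow: raise precision
    if lower_neg <= intermediate_exp <= lower_pos:
        return LOWER        # result fits comfortably: lower precision
    return KEEP             # in-between band: no precision processing needed

assert precision_flag(130) == RAISE
assert precision_flag(3) == LOWER
assert precision_flag(60) == KEEP
```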
The above mixed-precision floating-point multiplication device can thus perform precision conversion of floating-point operands in hardware, without the software-driven variable conversion and re-issue commonly used in conventional approaches.
Optionally, in one embodiment, the above mixed-precision floating-point multiplication device can quickly convert between different precisions as required. As shown in FIG. 2, a schematic diagram of conversion between different precisions, conversions among the single-precision floating-point number FP32, BF16 (Brain Float 16), and the 8-bit floating-point number FP8 can be performed in response to a data type conversion instruction CVT. For example: CVT FP8 FP32 converts an 8-bit floating-point number into a 32-bit floating-point number; CVT FP32 FP8 converts a 32-bit floating-point number into an 8-bit floating-point number; CVT FP32 BF16 converts a 32-bit floating-point number into a 16-bit floating-point number; CVT BF16 FP32 converts a 16-bit floating-point number into a 32-bit floating-point number; CVT BF16 FP8 converts a 16-bit floating-point number into an 8-bit floating-point number; and CVT FP8 BF16 converts an 8-bit floating-point number into a 16-bit floating-point number. Conversion between non-adjacent precisions can be used for precision matching during calculation.
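As an example of a non-adjacent up-conversion such as CVT FP8 FP32, the following sketch assumes the FP8-M2E5 layout named earlier (1 sign bit, 5 exponent bits, 2 mantissa bits, exponent bias 15) and ignores denormals and special values:

```python
# Sketch of non-adjacent up-conversion FP8 (E5M2 layout assumed) -> FP32.
import struct

def fp8_e5m2_to_fp32(b8: int) -> float:
    sign = (b8 >> 7) & 1
    exp = (b8 >> 2) & 0x1F  # 5-bit exponent, bias 15
    mant = b8 & 0x3         # 2 mantissa bits
    # Unsigned extension: rebias the exponent to FP32's 127 and pad the
    # mantissa with zeros into FP32's 23-bit field.
    b32 = (sign << 31) | ((exp - 15 + 127) << 23) | (mant << 21)
    return struct.unpack("<f", struct.pack("<I", b32))[0]

print(fp8_e5m2_to_fp32(0b0_10000_10))  # 1.5 * 2**1 = 3.0
```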
In another embodiment, as shown in FIG. 3, a mixed-precision floating-point multiplication device is provided that includes, in addition to the exponent bias module 102, the at least two first adders 104, the second adder 106, the multiplication unit 108, and the precision conversion unit 110, a register allocation module 100.
The register allocation module 100 is configured, when an idle allocable register set exists, to determine from that set a register matching each floating-point operand and a register matching each operand's scaling factor, based on the register allocation priorities of the at least two floating-point operands to be allocated and of their corresponding scaling factors, together with the initial precision of each operand and each scaling factor, thereby completing register allocation for the at least two floating-point operands and their scaling factors. A floating-point operand has the same register allocation priority as its corresponding scaling factor.
To meet the register demands of different precisions, the registers of the mixed-precision floating-point multiplication device may be multi-precision non-aligned registers. Each multiplication unit uses at most 6 x 32 bits; in practice, several floating-point units may share a group of registers, depending on the hardware design, to improve utilization. In this example, each register group consists of four 8-bit registers; when at least two floating-point operands are calculated, registers of different precisions are used repeatedly according to the initial precision of each operand. Registers therefore need to be allocated to the at least two floating-point operands first. In addition, scalers, unlike the floating-point operands themselves, are allocated registers on a 16-bit basis.
Specifically, an idle allocable register set is determined. When allocating registers for at least two floating-point operands, the register allocation priorities and initial precisions of the operands are obtained; for a given operand, taken in priority order, if the current register group contains a contiguous register space that can hold the operand at its initial precision, a matching register is selected from the current group. If no such contiguous space exists, a matching register is selected from the next group, and so on until the register allocation of the at least two floating-point operands is complete. Register allocation for the operands' scaling factors (scalers) is completed in the same way.
For example, the function Function(FP32 a1, BF16 a2, FP8 a3, BF16 a4, FP32 a5, BF16 a6, FP8 a7) requires registers to complete its calculation. A register number consists of a digit (1-4) and a letter (A-D), denoting four register groups, each composed of four 8-bit registers numbered A-D; during calculation, an operand is stored preferentially in the foremost registers of sufficient capacity. The calculation of Function uses registers of different precisions repeatedly, and the register usage of the whole function is shown in FIG. 4. When a register is requested for variable a1, the register manager allocates the 32-bit space formed by the four 8-bit registers 1A, 1B, 1C, 1D to satisfy the FP32 requirement, and the occupancy record changes accordingly; the register allocation module updates the foremost available position for each precision, so after a1 is allocated, the foremost available positions of FP32, BF16, and FP8 are all updated to 2A. When a2 needs a register, the module allocates the first recorded register group that can satisfy the BF16 storage requirement, i.e., a2 is assigned the two 8-bit registers 2A and 2B; the foremost available positions of FP32, BF16, and FP8 then become 3A, 2C, and 2C respectively. Similarly, a3 is assigned 2C, and the foremost available positions of FP32, BF16, and FP8 become 3A, 3A, and 2D. Since 2D is a single 8-bit register that can carry only 8 bits, while FP32 and BF16 data occupy 32 and 16 bits respectively and the second group has no remaining contiguous space for either type, the foremost available positions of those two precisions automatically skip the unusable 2D and update to 3A. Likewise, a4 is assigned the two 8-bit registers 3A and 3B; a5 the four 8-bit registers 4A, 4B, 4C, and 4D; a6 the two 8-bit registers 3C and 3D; and a7 the register 2D. All variables of the function are thus successfully assigned registers.
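The walkthrough above corresponds to a first-fit policy over four groups of four 8-bit registers; the following is a minimal behavioral sketch of that policy (an illustrative model, not the patent's register manager):

```python
# First-fit sketch of the register allocation walked through above:
# 4 groups (1-4) of four 8-bit registers (A-D); a variable takes the
# first contiguous span of free bytes that can hold its precision.
WIDTH = {"FP32": 4, "BF16": 2, "FP8": 1}  # bytes per precision
groups = [[None] * 4 for _ in range(4)]   # occupancy of 4 x 4 registers

def allocate(name: str, precision: str) -> str:
    need = WIDTH[precision]
    for g, regs in enumerate(groups):
        for start in range(0, 4 - need + 1):
            if all(regs[start + k] is None for k in range(need)):
                for k in range(need):
                    regs[start + k] = name  # mark the span occupied
                span = "ABCD"[start:start + need]
                return (f"{g + 1}{span[0]}-{g + 1}{span[-1]}"
                        if need > 1 else f"{g + 1}{span}")
    raise RuntimeError("no free span; request registers while computing")

# Function(FP32 a1, BF16 a2, FP8 a3, BF16 a4, FP32 a5, BF16 a6, FP8 a7)
for var, prec in [("a1", "FP32"), ("a2", "BF16"), ("a3", "FP8"), ("a4", "BF16"),
                  ("a5", "FP32"), ("a6", "BF16"), ("a7", "FP8")]:
    print(var, "->", allocate(var, prec))
# a1 -> 1A-1D, a2 -> 2A-2B, a3 -> 2C, a4 -> 3A-3B,
# a5 -> 4A-4D, a6 -> 3C-3D, a7 -> 2D (matching the FIG. 4 walkthrough)
```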
It can be understood that determining, from the idle allocable register set, the registers matching each floating-point operand and each scaling factor (scaler), based on the allocation priorities and initial precisions of the at least two operands and their scalers, completes the register allocation and improves register utilization in mixed-precision computing scenarios. Because registers are pre-allocated, the output produced during floating-point multiplication needs no separately allocated registers; instead, the registers occupied by the function's input variables are reused, with the output assigned by data-flow analysis to the first available register occupied by an input variable, so the register group is shared.
Further, when allocating registers to floating-point operands, some operands may fail to obtain registers. Considering that the computing unit can process only a limited amount of data at a time and releases a batch of registers after processing each batch of data, registers can be requested for the remaining variables while computation proceeds, until all floating-point operands have been allocated.
In another embodiment, as shown in FIG. 5, the mixed-precision floating-point multiplication device includes the exponent bias module 102, at least two first adders 104, the second adder 106, the multiplication unit 108, the precision conversion unit 110, the register allocation module 100, and a third adder 112. This embodiment describes the case of two first adders. The third adder 112 is configured to add the biases of the at least two floating-point operands; the sum serves as the bias of the target operand. For the other units and modules of the device, refer to the descriptions above, which are not repeated here.
As shown in fig. 6, for the application of the mixed precision floating-point multiplication device, a floating-point operand X and a floating-point operand Y are obtained from a register, and in response to a multiplication instruction for the floating-point operand X and the floating-point operand Y, the floating-point operand X and the floating-point operand Y are unpacked to obtain respective corresponding signs, exponents, and mantissas. The exponent bias module 102 determines a first bias for the floating point operand X and a second bias for the floating point operand Y based on the scaling factors scaler for each of the floating point operand X and the floating point operand Y, and determines the corresponding scale. And carrying out offset processing on the exponents of the floating-point operands X by using the offsets of the floating-point operands X through the first adder to obtain offset exponent values of the floating-point operands X, and carrying out offset processing on the exponents of the floating-point operands Y by using the offsets of the floating-point operands Y through the first adder to obtain offset exponent values of the floating-point operands Y. And adding the offset exponent value of the floating point operand X and the offset exponent value of the floating point operand Y through a second adder to obtain an intermediate exponent value.
The mantissa of the floating point operand X and the mantissa of the floating point operand Y are multiplied by the multiplication unit 108 to obtain a floating point number precision marking bit and an intermediate mantissa, a target floating point operand of the multiplication operation of the floating point operand X and the floating point operand Y can be determined according to the intermediate mantissa and the intermediate mantissa, the target precision of the target floating point operand is determined according to the floating point number precision marking bit, the intermediate mantissa and the intermediate mantissa are converted based on the target precision to obtain a target mantissa and a target mantissa value of the target operand, the precision conversion of the target operand is completed, and the target mantissa value of the target operand are packed to obtain the target operand after the precision conversion.
And under the condition that the intermediate exponent value is larger than or equal to an overflow threshold value or smaller than or equal to an underflow threshold value, determining a first floating point number precision marking bit, wherein the first floating point number precision marking bit indicates that the lifting precision of the target floating point operand is required. And under the condition that the intermediate exponent value is larger than or equal to the reduced precision negative threshold value and smaller than or equal to the reduced precision positive threshold value, determining a second floating point number precision marking bit, wherein the second floating point number precision marking bit indicates that the target floating point operand needs to be subjected to reduced precision processing. The absolute value of the negative threshold of decreasing precision is the same as that of the positive threshold of decreasing precision, and the negative threshold of decreasing precision and the positive threshold of decreasing precision are opposite numbers. And determining a third floating point number precision marking bit under the condition that the intermediate exponent value is larger than the reduced precision positive threshold and smaller than the overflow threshold or the intermediate exponent value is larger than the underflow threshold and smaller than the reduced precision negative threshold, wherein the third floating point number precision marking bit is used for representing that the target floating point operand is not required to be processed.
In the above embodiment, because the registers are pre-allocated, no separate registers need to be allocated for the output results generated during floating point multiplication; instead, the registers occupied by the input variables of the function are multiplexed, with each output allocated, by data flow analysis, to the first available register occupied by an input variable. This shares the register group and improves register utilization. Furthermore, automatic mixed-precision conversion is achieved with two addition operations and one multiplication operation combined with the floating point number precision marking bit, which reduces the number of multiplications and removes the need for type conversion, scaling and similar processing of the floating point operands; that is, the computing resource consumption of the hardware is reduced by reducing the amount of data to be processed, which in turn improves the efficiency of mixed-precision computation.
The modules in the above mixed precision floating-point multiplication device may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in hardware within, or independent of, a processor in a computer device, or may be stored as software in a memory of the computer device, so that the processor can invoke it and execute the operations corresponding to that module.
Based on the same inventive concept, an embodiment of the application further provides a mixed precision floating point number processing method implemented with the mixed precision floating-point multiplication device. The solution provided by the method is similar to that of the device described above, so for the specific limitations in one or more embodiments of the mixed precision floating point number processing method below, reference may be made to the limitations of the mixed precision floating-point multiplication device above; they are not repeated here.
In one embodiment, as shown in fig. 7, a mixed precision floating point number processing method is provided. Taking application of the method to the above mixed precision floating-point multiplication device as an example, the method includes the following steps:
Step 702, determining the offsets of at least two floating point operands obtained from registers; wherein each floating point operand includes an exponent and a mantissa.
Step 704, performing offset processing on the exponent corresponding to each floating point operand according to the offset of that floating point operand, to determine the offset exponent value of each floating point operand.
Step 706, performing an addition operation on the offset exponent values of the floating point operands to obtain an intermediate exponent value.
Step 708, performing multiplication processing on the mantissas of the floating point operands to obtain a floating point number precision marking bit and an intermediate mantissa.
Step 710, determining the target precision of a target floating point operand according to the floating point number precision marking bit, and converting the intermediate mantissa and the intermediate exponent value based on the target precision to complete the precision conversion of the target operand; wherein the target floating point operand is determined from the intermediate mantissa and the intermediate exponent value.
Specifically, in response to a multiplication instruction for floating point operands, at least two floating point operands obtained from registers are unpacked to obtain the sign, exponent and mantissa of each operand. The exponent bias module determines the offsets of the at least two floating point operands; the offset of each floating point operand and the corresponding exponent are input to the corresponding first adder, which performs offset processing on the exponent using the offset to obtain the offset exponent value of that operand. The offset exponent values of the at least two floating point operands are then input to the second adder and added to obtain an intermediate exponent value. The mantissa of each floating point operand is input to the multiplication unit, which outputs a floating point number precision marking bit and an intermediate mantissa; the precision conversion unit determines the target precision of the target floating point operand according to the floating point number precision marking bit and converts the intermediate mantissa and the intermediate exponent value based on the target precision, completing the precision conversion of the target operand.
According to the mixed precision floating point number processing method above, the exponent bias module determines the offsets of the at least two floating point operands; floating point operands of different precision are each routed to a corresponding first adder, which offsets the exponent of its operand using that operand's offset to determine the offset exponent value; the offset exponent values of all floating point operands are output to the second adder and added to obtain the intermediate exponent value. The mantissas of the floating point operands of different precision are output to the multiplication unit, which multiplies them to obtain the floating point number precision marking bit and the intermediate mantissa; based on the marking bit, whether precision conversion is required is judged and the target precision is determined, completing the automatic conversion of precision. Automatic mixed-precision conversion is thus achieved with two addition operations and one multiplication operation combined with the floating point number precision marking bit, which reduces the number of multiplications and removes the need for type conversion, scaling and similar processing of the floating point operands; that is, the computing resource consumption of the hardware is reduced by reducing the amount of data to be processed, which in turn improves the efficiency of mixed-precision computation.
Optionally, in one embodiment, before the offsets of the at least two floating point operands obtained from the registers are determined, registers are allocated in advance to the at least two floating point operands and their corresponding scaling factors scaler. When a free allocatable register group exists, registers matching each floating point operand and each scaling factor are determined from the free allocatable register group based on the register allocation priorities of the at least two floating point operands to be allocated, the register allocation priorities of the corresponding scaling factors, and the initial precision of each floating point operand and of its corresponding scaling factor scaler, thereby completing the register allocation of the at least two floating point operands and their scaling factors.
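The following sketch illustrates one plausible reading of this allocation step: values are served in descending priority order, and each takes the first free register wide enough for its initial precision. The Value and Register classes and the first-fit policy are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Value:
    name: str
    priority: int            # higher priority allocates first
    width: int               # initial precision in bits (e.g. 16 or 32)

@dataclass
class Register:
    name: str
    width: int
    holder: Optional[str] = None

def allocate(values, free_regs):
    """First-fit allocation from a free allocatable register group."""
    for v in sorted(values, key=lambda v: -v.priority):
        reg = next((r for r in free_regs
                    if r.holder is None and r.width >= v.width), None)
        if reg is None:
            raise RuntimeError(f"no free register wide enough for {v.name}")
        reg.holder = v.name
    return {r.holder: r.name for r in free_regs if r.holder}

regs = [Register("r0", 32), Register("r1", 16), Register("r2", 32), Register("r3", 16)]
vals = [Value("X", 3, 16), Value("Y", 2, 16),
        Value("scaler_X", 1, 16), Value("scaler_Y", 0, 16)]
print(allocate(vals, regs))   # {'X': 'r0', 'Y': 'r1', 'scaler_X': 'r2', 'scaler_Y': 'r3'}
```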
Multiplying the mantissas of the floating point operands to obtain the floating point number precision marking bit includes the following cases:
Optionally, in one embodiment, when the intermediate exponent value is greater than or equal to an overflow threshold, or less than or equal to an underflow threshold, a first floating point number precision marking bit is determined; the first floating point number precision marking bit indicates that precision-raising processing of the target floating point operand is required.
Optionally, in one embodiment, when the intermediate exponent value is greater than or equal to a reduced-precision negative threshold and less than or equal to a reduced-precision positive threshold, a second floating point number precision marking bit is determined; the second floating point number precision marking bit indicates that precision-reducing processing of the target floating point operand is required. The reduced-precision negative threshold and the reduced-precision positive threshold are opposite numbers, so their absolute values are equal.
Optionally, in one embodiment, when the intermediate exponent value is greater than the reduced-precision positive threshold and less than the overflow threshold, or greater than the underflow threshold and less than the reduced-precision negative threshold, a third floating point number precision marking bit is determined; the third floating point number precision marking bit indicates that no processing of the target floating point operand is required.
Determining the target precision of the target floating point operand according to the floating point number precision marking bit and converting the intermediate mantissa and the intermediate exponent value based on the target precision, to complete the precision conversion of the target operand, includes the following cases:
Optionally, in one embodiment, when it is determined according to the first floating point number precision marking bit that precision-raising processing of the target floating point operand is required, a first target precision greater than the initial precision of the target floating point operand is determined, and the intermediate mantissa and the intermediate exponent value are converted based on the first target precision, completing the precision conversion of the target operand.
Optionally, in one embodiment, when it is determined according to the second floating point number precision marking bit that precision-reducing processing of the target floating point operand is required, a second target precision smaller than the initial precision of the target floating point operand is determined, and the intermediate mantissa and the intermediate exponent value are converted based on the second target precision, completing the precision conversion of the target operand.
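For the conversion itself, one simple reading is that raising or reducing precision re-expresses the intermediate mantissa at the target format's mantissa width while keeping the exponent value. The fp16/fp32 widths and the truncating shift below are assumptions; the application does not specify a rounding mode.

```python
FP16_MANT, FP32_MANT = 10, 23      # IEEE 754 half and single mantissa widths

def convert_precision(intermediate_mantissa, intermediate_exp, raise_precision):
    """Re-express the intermediate result at the target precision's width."""
    shift = FP32_MANT - FP16_MANT
    if raise_precision:            # first marking bit: widen the mantissa field
        target_mantissa = intermediate_mantissa << shift
    else:                          # second marking bit: narrow it (truncating)
        target_mantissa = intermediate_mantissa >> shift
    return target_mantissa, intermediate_exp   # exponent value is unchanged

print(convert_precision(0x200, 1, raise_precision=True))    # (0x400000, 1)
```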
Optionally, in one embodiment, when mixed precision floating point number processing is performed on two floating point operands and the corresponding target floating point operand is determined, the offsets of the at least two floating point operands may further be added to determine the offset of the target floating point operand.
In another embodiment, as shown in fig. 8, a mixed precision floating point number processing method is provided. Taking application of the method to the above mixed precision floating-point multiplication device as an example, the method includes the following steps, which are tied together in the sketch following step 820:
step 802, determining a register matched with each floating point operand and a register matched with the scaling factor corresponding to each floating point operand from the idle allocable register group based on the register allocation priority of at least two floating point operands to be allocated and the register allocation priority of the scaling factor corresponding to each floating point operand, and the initial precision of each floating point operand and the initial precision of the scaling factor corresponding to each floating point operand, and completing the register allocation of the scaling factor corresponding to at least two floating point operands and each floating point operand.
Step 804, determining the offset of at least two floating point operands obtained from registers; wherein the floating point operand includes an exponent and a mantissa.
Step 806, performing offset processing on the exponent of the corresponding floating point operand according to the offset of each floating point operand, and determining the offset exponent value of each floating point operand.
Step 808, adding the offset exponent values of each floating point operand to obtain an intermediate exponent value.
Step 810, performing multiplication processing on the mantissas of the floating point operands to obtain a floating point number precision marking bit and an intermediate mantissa.
Step 812, if the floating point number precision marking bit is the first floating point number precision marking bit, indicating that precision-raising processing of the target floating point operand is required, step 818 is performed.
Step 814, if the floating point number precision marking bit is the second floating point number precision marking bit, indicating that precision-reducing processing of the target floating point operand is required, step 820 is performed.
Step 816, if the floating point number precision marking bit is the third floating point number precision marking bit, indicating that no processing of the target floating point operand is required, no precision conversion is performed.
Step 818, determining a first target precision that is greater than the initial precision of the target floating point operand, converting the intermediate mantissa and intermediate exponent value based on the first target precision, and completing the precision conversion of the target operand.
Step 820, determining a second target precision that is less than the initial precision of the target floating point operand, converting the intermediate mantissa and intermediate exponent value based on the second target precision, and completing the precision conversion of the target operand.
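Putting steps 802 through 820 together, the fragment below chains the exponent, mantissa and marking-bit stages from the earlier sketches into one end-to-end pass over two fp16 bit patterns; as before, all field widths, thresholds and shift amounts are illustrative assumptions rather than values fixed by this application.

```python
def mixed_precision_multiply(x_bits, y_bits, offset_x=0, offset_y=0):
    BIAS, MANT = 15, 10                        # assumed fp16-style fields
    # steps 804-808: offset each exponent and sum the offset exponent values
    exp = (((x_bits >> 10) & 0x1F) + offset_x
           + ((y_bits >> 10) & 0x1F) + offset_y - 2 * BIAS)
    # step 810: multiply mantissas (hidden bit restored) and renormalize
    prod = ((1 << MANT) | (x_bits & 0x3FF)) * ((1 << MANT) | (y_bits & 0x3FF))
    if prod >> (2 * MANT + 1):
        exp += 1
        prod >>= 1
    mant = (prod >> MANT) & ((1 << MANT) - 1)
    # steps 812-820: choose the target precision from the exponent's range
    if exp >= 15 or exp <= -14:
        return "raised", mant << 13, exp       # step 818: widen toward fp32
    if -7 <= exp <= 7:
        return "reduced", mant >> 5, exp       # step 820: narrow the mantissa
    return "unchanged", mant, exp              # step 816: no conversion needed

print(mixed_precision_multiply(0x4200, 0x4200))   # 3.0 * 3.0 -> ('reduced', 4, 3)
```

The "reduced" shift of 5 stands in for whatever narrower target format the device selects; only the dispatch structure mirrors the method's steps.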
In the above embodiment, because the registers are pre-allocated, no registers need to be separately allocated for the output results generated during floating point multiplication; instead, the registers occupied by the allocated floating point operands are multiplexed, with each output allocated, by data flow analysis, to the first available register occupied by an input variable. This shares the register group, improves register utilization and avoids wasting buffer units. Automatic mixed-precision conversion is achieved with two addition operations and one multiplication operation combined with the floating point number precision marking bit, which reduces the number of multiplications, removes the need for type conversion, scaling and similar processing of the floating point operands, and improves the efficiency of precision conversion; that is, the computing resource consumption of the hardware is reduced by reducing the amount of data to be processed, which in turn improves the efficiency of mixed-precision computation.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in those flowcharts may include several sub-steps or stages, which are not necessarily executed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with at least some of the other steps, sub-steps or stages.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in fig. 9. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit and an input means. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface, the display unit and the input means are connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with an external terminal, the wireless mode being realizable through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program, when executed by the processor, implements a mixed precision floating point number processing method. The display unit of the computer device is used to form a visible picture and may be a display screen, a projection device or a virtual reality imaging device; the display screen may be a liquid crystal display or an electronic ink display. The input means of the computer device may be a touch layer covering the display screen, a key, a trackball or a touchpad provided on the housing of the computer device, or an external keyboard, touchpad or mouse.
Those skilled in the art will appreciate that the structure shown in fig. 9 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.
Those skilled in the art will appreciate that all or part of the flows in the above method embodiments may be implemented by a computer program instructing relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the flows of the above method embodiments. Any reference to memory, database or other medium used in the embodiments provided by the present application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory and the like. Volatile memory may include random access memory (RAM), external cache memory and the like. By way of illustration and not limitation, RAM is available in a variety of forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided by the present application may include at least one of relational and non-relational databases; non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided by the present application may be, but are not limited to, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, or data processing logic units based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features have been described; nevertheless, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this specification.
The above examples express only a few embodiments of the application; their description is relatively specific and detailed, but should not be construed as limiting the scope of the application. It should be noted that several variations and improvements can be made by those of ordinary skill in the art without departing from the concept of the application, and these all fall within the protection scope of the application. Therefore, the protection scope of the application shall be subject to the appended claims.

Claims (10)

1. A mixed precision floating-point multiplication device, characterized by comprising an exponent bias module, at least two first adders, a second adder, a multiplication unit and a precision conversion unit, wherein the exponent bias module is connected to the at least two first adders, the at least two first adders are connected to the second adder, and the second adder and the multiplication unit are both connected to the precision conversion unit; wherein:
The exponent bias module is used for determining the offsets of at least two floating point operands acquired from the register; wherein each floating point operand includes an exponent and a mantissa;
the at least two first adders are used for respectively performing offset processing on the exponent corresponding to each floating point operand using the offset of that floating point operand, to determine the offset exponent value of each floating point operand;
the second adder is configured to perform an addition operation on offset exponent values of the floating point operands to obtain an intermediate exponent value;
the multiplication unit is used for performing multiplication processing on the mantissas of the floating point operands to obtain a floating point number precision marking bit and an intermediate mantissa;
the precision conversion unit is used for determining the target precision of a target floating point operand according to the floating point number precision marking bit, and converting the intermediate mantissa and the intermediate exponent value based on the target precision to complete the precision conversion of the target operand; wherein the target floating point operand is determined from the intermediate mantissa and the intermediate exponent value.
2. The mixed precision floating-point multiplication device according to claim 1, wherein the precision conversion unit is further configured to determine a first floating point number precision marking bit when the intermediate exponent value is greater than or equal to an overflow threshold or less than or equal to an underflow threshold, the first floating point number precision marking bit indicating that precision-raising processing of the target floating point operand is required.
3. The mixed precision floating-point multiplication device according to claim 1, wherein the precision conversion unit is further configured to determine a second floating point number precision marking bit when the intermediate exponent value is greater than or equal to a reduced-precision negative threshold and less than or equal to a reduced-precision positive threshold, the second floating point number precision marking bit indicating that precision-reducing processing of the target floating point operand is required; the reduced-precision negative threshold and the reduced-precision positive threshold are opposite numbers, so their absolute values are equal.
4. The mixed precision floating-point multiplication device according to claim 1, wherein the precision conversion unit is further configured to determine a third floating point number precision marking bit when the intermediate exponent value is greater than the reduced-precision positive threshold and less than the overflow threshold, or greater than the underflow threshold and less than the reduced-precision negative threshold, the third floating point number precision marking bit indicating that no processing of the target floating point operand is required.
5. The mixed precision floating-point multiplication device according to claim 2, wherein the precision conversion unit is further configured to, when it is determined according to the first floating point number precision marking bit that precision-raising processing of the target floating point operand is required, determine a first target precision greater than the initial precision of the target floating point operand, and convert the intermediate mantissa and the intermediate exponent value based on the first target precision to complete the precision conversion of the target operand.
6. The mixed precision floating-point multiplication device according to claim 3, wherein the precision conversion unit is further configured to, when it is determined according to the second floating point number precision marking bit that precision-reducing processing of the target floating point operand is required, determine a second target precision smaller than the initial precision of the target floating point operand, and convert the intermediate mantissa and the intermediate exponent value based on the second target precision to complete the precision conversion of the target operand.
7. The mixed precision floating-point multiplication device according to claim 1, further comprising a register allocation module configured to, when a free allocatable register group exists, determine from the free allocatable register group a register matching each floating point operand and a register matching the scaling factor corresponding to each floating point operand, based on the register allocation priorities of at least two floating point operands to be allocated, the register allocation priorities of the scaling factors corresponding to the floating point operands, and the initial precision of each floating point operand and of each scaling factor, completing the register allocation of the at least two floating point operands and the scaling factor corresponding to each floating point operand.
8. The mixed precision floating-point multiplication device according to claim 1, further comprising a third adder connected to the exponent bias module, the third adder being configured to perform addition processing according to the offsets of the at least two floating point operands to determine the offset of the target floating point operand.
9. A mixed precision floating point number processing method, applied to the mixed precision floating-point multiplication device according to any one of claims 1 to 8, the method comprising:
determining the offsets of at least two floating point operands obtained from a register; wherein each floating point operand includes an exponent and a mantissa;
respectively performing offset processing on the exponent corresponding to each floating point operand according to the offset of each floating point operand, to determine the offset exponent value of each floating point operand;
performing an addition operation on the offset exponent values of the floating point operands to obtain an intermediate exponent value;
performing multiplication processing on the mantissas of the floating point operands to obtain a floating point number precision marking bit and an intermediate mantissa;
determining the target precision of a target floating point operand according to the floating point number precision marking bit, and converting the intermediate mantissa and the intermediate exponent value based on the target precision to complete the precision conversion of the target operand; wherein the target floating point operand is determined from the intermediate mantissa and the intermediate exponent value.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of claim 9 when executing the computer program.