CN112732223B - Semi-precision floating point divider data processing method and system - Google Patents

Info

Publication number
CN112732223B
Authority
CN
China
Prior art keywords
data
bit separation
bit
module
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011641150.2A
Other languages
Chinese (zh)
Other versions
CN112732223A (en)
Inventor
马向华
边立剑
叶梦琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Anlu Information Technology Co ltd
Original Assignee
Shanghai Anlu Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Anlu Information Technology Co ltd
Priority to CN202011641150.2A
Publication of CN112732223A
Application granted
Publication of CN112732223B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52Multiplying; Dividing
    • G06F7/535Dividing only

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention provides a half-precision floating point divider data processing method, which comprises: performing bit separation processing on first data and second data to obtain first bit separation data and second bit separation data respectively; performing data adjustment on the second bit separation data to obtain adjustment data; performing iterative calculation processing on the adjustment data to obtain iterative data; performing a mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data; and performing data reset adjustment on the mixed bit separation data to obtain calculation result data. Through the iterative calculation processing, the method reduces the operation period and improves the operation efficiency. The invention also provides a half-precision floating point divider data processing system for implementing the method.

Description

Half-precision floating point divider data processing method and system
Technical Field
The invention relates to the field of half-precision floating point dividers, and in particular to a half-precision floating point divider data processing method and system.
Background
In the design of convolutional neural networks (Convolutional Neural Networks, CNN), the number of layers of the network model is gradually increasing as the number of image pixels to be processed increases and the discrimination types increase.
For ease of design, most convolutional neural network designs process data in fixed-point format: the arithmetic is fast, the circuit designs are mature, and fixed-point division circuits have been applied successfully in CNN designs. However, out-of-range errors keep growing as the number of layers increases, which readily affects the final classification of images. Floating-point format has a large representable range and a low likelihood of out-of-range errors, so the industry has already adopted floating-point computation.
At present, floating-point arithmetic circuits are mature, but dividers still need too many operation periods and too much operation time. For example, an FP16 division in the Xilinx FPGA divider needs 17 clock periods, which is a long latency and gives low operation efficiency.
Therefore, it is necessary to provide a new method and system for processing data of a half-precision floating point divider to solve the above-mentioned problems in the prior art.
Disclosure of Invention
The invention aims to provide a half-precision floating point divider data processing method and system that reduce the operation period and improve the operation efficiency.
In order to achieve the above object, the method for processing data of the half-precision floating point divider of the present invention comprises the following steps:
S1: performing bit separation processing on the first data and the second data to obtain first bit separation data and second bit separation data respectively;
S2: performing data adjustment on the second bit separation data to obtain adjustment data;
S3: performing iterative calculation processing on the adjustment data to obtain iterative data;
S4: performing a mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data;
S5: performing data reset adjustment on the mixed bit separation data to obtain calculation result data.
The half-precision floating point divider data processing method has the following beneficial effects: iterative calculation processing is performed on the adjustment data to obtain iterative data, a mixed operation is performed on the first bit separation data and the iterative data to obtain mixed bit separation data, and data reset adjustment is performed on the mixed bit separation data to obtain calculation result data; through the iterative calculation processing, the operation period is reduced and the operation efficiency is improved.
Preferably, the second bit-separated data includes first bit-separated sub-data, second bit-separated sub-data, and third bit-separated sub-data, the adjustment data includes the first bit-separated sub-data, second bit-separated shift data, and third bit-separated operation data, and the data adjustment includes the steps of:
S21: performing restoration processing on the third bit separation sub-data to obtain restored data;
S22: performing shift processing on the restored data to obtain the second bit separation shift data, wherein the second bit separation shift data is located in a threshold interval;
S23: performing addition processing or subtraction processing on the second bit separation sub-data and the number of bits of the shift processing, according to the type of the shift processing, to obtain the third bit separation operation data.
Further preferably, the threshold interval includes a first end value and a second end value, an absolute value of a difference between the first end value and the restored data is smaller than an absolute value of a difference between the second end value and the restored data, and the second bit-separated shift data is shift data closest to the first end value.
Further preferably, the iterative data includes the first bit-separated sub-data, the second bit-separated shift data, and iterative operation result data, and the iterative calculation process includes the steps of:
S31: performing an iterative operation on the third bit separation operation data according to a first iteration threshold, a second iteration threshold, a third iteration threshold and a fourth iteration threshold to obtain operation result data;
S32: judging whether the difference between the first iteration threshold and the operation result data is greater than 1;
S33: if the difference is greater than 1, using the operation result data as a new first iteration threshold and re-executing step S31 and step S32 until the difference is less than or equal to 1, and then outputting the operation result data obtained by the last execution of step S31 as the iterative operation result data. The beneficial effects are that the number of iterations is easy to control and the operation period is further shortened.
Further preferably, the first bit-separated data includes fourth bit-separated sub-data, fifth bit-separated sub-data, and sixth bit-separated sub-data, the mixed bit-separated data includes first bit-separated integrated data, second bit-separated integrated data, and third bit-separated integrated data, and the mixed operation includes the steps of:
Performing power operation on the fourth bit separation sub-data and the first bit separation sub-data to obtain the first bit separation integrated data;
Adding and subtracting the fifth bit separation sub data, the second bit separation shift data and an addition and subtraction threshold value to obtain the second bit separation integrated data;
And performing a multiply-divide operation on the sixth bit separation sub-data, the iterative operation result data and the multiply-divide threshold to obtain the third bit separation integrated data. The beneficial effect is that the mixed bit separation data is conveniently obtained through calculation.
Further preferably, the calculation result data includes first bit separation integrated data, second bit separation integrated data, and third bit data, and the data reset adjustment includes the steps of:
S51: judging whether the third bit separation integrated data is smaller than a first reset adjustment threshold;
S52: if the third bit separation integrated data is smaller than the first reset adjustment threshold, shifting the third bit separation integrated data to the left by the least number of bits such that the third bit separation integrated data is greater than or equal to the first reset adjustment threshold;
S53: subtracting, from the third bit separation integrated data, the least number of bits by which the third bit separation integrated data was shifted to the left, to obtain the third bit data. The beneficial effect is that a half-precision floating point number is conveniently obtained.
Further preferably, the data reset adjustment further comprises the steps of:
S51a: judging whether the third bit separation integrated data is greater than a second reset adjustment threshold;
S52a: if the third bit separation integrated data is greater than the second reset adjustment threshold, shifting the third bit separation integrated data to the right by the least number of bits such that the third bit separation integrated data is less than or equal to the second reset adjustment threshold;
S53a: adding the third bit separation integrated data and the least number of bits by which the third bit separation integrated data was shifted to the right, to obtain the third bit data. The beneficial effect is that a half-precision floating point number is conveniently obtained.
The invention also provides a half-precision floating point divider data processing system, which comprises a first bit separation module, a second bit separation module, an adjustment module, an iteration module, a mixed operation module and a reset adjustment module. The first bit separation module is connected with the mixed operation module, the second bit separation module is connected with the adjustment module and the mixed operation module, the adjustment module is connected with the mixed operation module, and the mixed operation module is connected with the reset adjustment module. The first bit separation module is used for performing bit separation processing on first data to obtain first bit separation data, the second bit separation module is used for performing bit separation processing on second data to obtain second bit separation data, the adjustment module is used for performing data adjustment on the second bit separation data to obtain adjustment data, the iteration module is used for performing iterative calculation processing on the adjustment data to obtain iterative data, the mixed operation module is used for performing a mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data, and the reset adjustment module is used for performing data reset adjustment on the mixed bit separation data to obtain calculation result data.
The half-precision floating point number divider data processing system has the beneficial effects that: the iterative module is used for carrying out iterative computation on the adjustment data to obtain iterative data, the mixed operation module is used for carrying out mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data, the reset adjustment module is used for carrying out data reset adjustment on the mixed bit separation data to obtain calculation result data, and the operation period is reduced and the operation efficiency is improved through the iterative computation.
Preferably, the adjusting module includes a data restoring unit, a data shifting unit and a first data operation unit, wherein an input end of the data restoring unit is connected with a second output end of the second bit separation module, an output end of the data restoring unit is connected with the data shifting unit, a first output end of the data shifting unit is connected with a first input end of the first data operation unit, and a second input end of the first data operation unit is connected with a third output end of the second bit separation module.
Further preferably, the iteration module includes an iterative operation unit, a judging unit and an assignment unit, wherein a first input end of the iterative operation unit is connected with an output end of the first data operation unit, an output end of the iterative operation unit is connected with an input end of the judging unit, a first output end of the judging unit is connected with an input end of the assignment unit, and an output end of the assignment unit is connected with a second input end of the iterative operation unit.
Further preferably, the hybrid operation module includes a second data operation unit, a third data operation unit, and a fourth data operation unit, where a first input end of the second data operation unit is connected to the first output end of the first bit separation module, a second input end of the second data operation unit is connected to the first output end of the second bit separation module, a first input end of the third data operation unit is connected to the second output end of the first bit separation module, a second input end of the third data operation unit is connected to the second output end of the data shift unit, a first input end of the fourth data operation unit is connected to the third output end of the first bit separation module, and a second input end of the fourth data operation unit is connected to the second output end of the judgment unit.
Preferably, the half-precision floating point number divider data processing system further comprises a first buffer module, one end of the first buffer module is connected with the first bit separation module, and the other end of the first buffer module is connected with the hybrid operation module.
Further preferably, the half-precision floating point number divider data processing system further comprises a second buffer module, one end of the second buffer module is connected with the second bit separation module, and the other end of the second buffer module is connected with the hybrid operation module.
Further preferably, the first buffer module includes at least 1 first buffer unit, when the number of the first buffer units is greater than 1, the first buffer units are connected in series, and the second buffer module includes at least 1 second buffer unit, when the number of the second buffer units is greater than 1, the second buffer units are connected in series.
Further preferably, the number of the first buffer units is the same as the number of the second buffer units.
Further preferably, the iteration module includes at least 2 iterative operation units and at least 1 assignment unit, the number of assignment units is 1 less than the number of iterative operation units, and each pair of successive iterative operation units is connected through one assignment unit.
Drawings
FIG. 1 is a flow chart of a half-precision floating point divider data processing method of the present invention;
FIG. 2 is a flow chart of data adjustment according to the present invention;
FIG. 3 is a flow chart of an iterative computation process of the present invention;
FIG. 4 is a flow chart of data reset adjustment in some embodiments of the invention;
FIG. 5 is a flow chart of data reset adjustment in further embodiments of the present invention;
FIG. 6 is a block diagram of a half-precision floating point divider data processing system according to the present invention;
FIG. 7 is a block diagram illustrating a configuration of an adjustment module according to the present invention;
FIG. 8 is a block diagram of an iterative module of the present invention;
FIG. 9 is a block diagram of a half-precision floating point divider data processing system in accordance with still other embodiments of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention. All other embodiments that those skilled in the art can obtain from the embodiments of the invention without inventive effort fall within the scope of the invention. Unless otherwise defined, technical or scientific terms used herein have the ordinary meaning understood by one of ordinary skill in the art to which this invention belongs. As used herein, the word "comprising" and the like means that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items.
Aiming at the problems existing in the prior art, the embodiment of the invention provides a half-precision floating point divider data processing method, and referring to fig. 1, the half-precision floating point divider data processing method comprises the following steps:
S1: performing bit separation processing on the first data and the second data to obtain first bit separation data and second bit separation data respectively;
S2: performing data adjustment on the second bit separation data to obtain adjustment data;
S3: performing iterative calculation processing on the adjustment data to obtain iterative data;
S4: performing a mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data;
S5: performing data reset adjustment on the mixed bit separation data to obtain calculation result data.
The first data is used as the dividend and the second data is used as the divisor.
In some embodiments, the second bit-separated data includes first bit-separated sub-data, second bit-separated sub-data, and third bit-separated sub-data, the adjustment data includes the first bit-separated sub-data, second bit-separated shift data, and third bit-separated operation data, and in step S2, the first bit-separated sub-data is not processed, and the second bit-separated sub-data and the third bit-separated sub-data are respectively processed to obtain the second bit-separated shift data and the third bit-separated operation data, respectively. Specifically, referring to fig. 2, the data adjustment includes the steps of:
S21: performing restoration processing on the third bit separation sub-data to obtain restored data;
S22: performing shift processing on the restored data to obtain the second bit separation shift data, wherein the second bit separation shift data is located in a threshold interval;
S23: performing addition processing or subtraction processing on the second bit separation sub-data and the number of bits of the shift processing, according to the type of the shift processing, to obtain the third bit separation operation data. Specifically, step S23 performs the subtraction processing if the shift processing is a left shift, and the addition processing if the shift processing is a right shift.
In some embodiments, the second data is a half-precision floating point number, the first bit separation sub-data is the sign bit of the half-precision floating point number, the second bit separation sub-data is the exponent of the half-precision floating point number, and the third bit separation sub-data is the mantissa of the half-precision floating point number.
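The bit separation into sign, exponent and mantissa can be illustrated with a minimal Python sketch. It assumes the standard IEEE 754 binary16 layout (1 sign bit, 5 exponent bits with a bias of 15, 10 mantissa bits); the function name `split_fp16` is hypothetical and does not come from the text.

```python
import struct

def split_fp16(value):
    """Split a value, encoded as IEEE 754 binary16, into (sign, exponent, fraction)."""
    # The struct format 'e' packs a Python float as a 16-bit half-precision number.
    bits = int.from_bytes(struct.pack("<e", value), "little")
    sign = bits >> 15                # 1 bit
    exponent = (bits >> 10) & 0x1F   # 5 bits, stored with a bias of 15
    fraction = bits & 0x3FF          # 10 bits, without the hidden leading 1
    return sign, exponent, fraction

print(split_fp16(1.0))   # sign 0, biased exponent 15, fraction 0
print(split_fp16(-2.5))  # sign 1, biased exponent 16, fraction 256
```

The three returned fields correspond to the roles the text assigns to the first, second and third bit separation sub-data.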
In some embodiments, the threshold interval includes a first end value and a second end value, the absolute value of the difference between the first end value and the restored data is smaller than the absolute value of the difference between the second end value and the restored data, and the second bit separation shift data is the shift data closest to the first end value. Specifically, if the first end value is 512, the second end value is 1024; if the first end value is 1024, the second end value is 512.
Specifically, for example, the restored data is 129. 129 is shifted left by 1 bit to obtain 258, and 258 is shifted left by 1 bit to obtain 516. 516 is located between 512 and 1024 and is the shift data closest to 512 after shifting, so the number of bits of the shift processing is 2. Because the shift processing is a left shift, subtraction processing is performed on the second bit separation sub-data and the number of bits of the shift processing to obtain the third bit separation operation data.
Specifically, for example, the restored data is 8176. 8176 is shifted right by 1 bit to obtain 4088, 4088 is shifted right by 1 bit to obtain 2044, and 2044 is shifted right by 1 bit to obtain 1022. 1022 is located between 512 and 1024 and is the shift data closest to 1024 after shifting, so the number of bits of the shift processing is 3. Because the shift processing is a right shift, addition processing is performed on the second bit separation sub-data and the number of bits of the shift processing to obtain the third bit separation operation data.
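The two worked examples can be reproduced with a short Python sketch. It is an illustration under assumptions, not the circuit itself: the threshold interval is taken as the closed range 512 to 1024, the value is shifted by the fewest bits that bring it into that range, and the names `shift_into_interval` and `adjust_exponent` are hypothetical.

```python
def shift_into_interval(restored, low=512, high=1024):
    """Shift `restored` by the fewest bits so that low <= value <= high.

    Returns (shifted value, number of bits shifted, shift type).
    """
    if restored <= 0:
        raise ValueError("restored data must be positive")
    value, count = restored, 0
    while value < low:       # too small: shift left
        value <<= 1
        count += 1
    if count:
        return value, count, "left"
    while value > high:      # too large: shift right
        value >>= 1
        count += 1
    kind = "right" if count else "none"
    return value, count, kind

def adjust_exponent(exponent, count, kind):
    """Subtract the shift count for a left shift, add it for a right shift."""
    if kind == "left":
        return exponent - count
    if kind == "right":
        return exponent + count
    return exponent

print(shift_into_interval(129))   # two left shifts: 129 -> 258 -> 516
print(shift_into_interval(8176))  # three right shifts: 8176 -> 4088 -> 2044 -> 1022
```

Compensating the exponent in the opposite direction of the shift keeps the represented value of the divisor unchanged while its mantissa is confined to a narrow range.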
In some embodiments, the iterative data includes the first bit-split sub-data, the second bit-split shift data, and iterative operation result data. Specifically, referring to fig. 3, the iterative calculation process includes the steps of:
S31: performing an iterative operation on the third bit separation operation data according to a first iteration threshold, a second iteration threshold, a third iteration threshold and a fourth iteration threshold to obtain operation result data;
S32: judging whether the difference between the first iteration threshold and the operation result data is greater than 1;
S33: if the difference is greater than 1, using the operation result data as a new first iteration threshold and re-executing step S31 and step S32 until the difference is less than or equal to 1, and then outputting the operation result data obtained by the last execution of step S31 as the iterative operation result data. Specifically, in step S32, the operation result data is subtracted from the first iteration threshold.
In some embodiments, when the step S31 and the step S32 are performed for the first time, the first iteration threshold is 1, the second iteration threshold is 2048, the third iteration threshold is 1024, the fourth iteration threshold is 1024, and the iterative operation specifically includes:
The first iteration threshold is multiplied by the third bit separation operation data to obtain a first intermediate value; the first intermediate value is divided by the third iteration threshold to obtain a second intermediate value; the second intermediate value is subtracted from the second iteration threshold to obtain a third intermediate value; the third intermediate value is multiplied by the first iteration threshold to obtain a fourth intermediate value; and the fourth intermediate value is divided by the fourth iteration threshold to obtain the operation result data.
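The sequence of multiply, divide and subtract operations has the shape of a Newton-Raphson reciprocal step, x_new = x * (2 - d * x), written in fixed point with a scale of 1024 (so that 2048 stands for 2.0). The Python sketch below illustrates this reading under assumptions that the text does not state: the initial first iteration threshold of 1 is interpreted as 1.0 in the same scale (integer 1024), the divisions are integer divisions, the stopping test uses the absolute difference, and the iteration cap `max_iterations` is added only as a safeguard. The function names are hypothetical.

```python
def iterate_once(x, d, second=2048, third=1024, fourth=1024):
    """One pass of the iterative operation: x is the first iteration threshold."""
    first_value = x * d                   # multiply by the divisor mantissa
    second_value = first_value // third   # rescale by 1024
    third_value = second - second_value   # 2048 - value, i.e. (2 - d*x) in fixed point
    fourth_value = third_value * x        # multiply by the current estimate
    return fourth_value // fourth         # rescale by 1024

def iterate_reciprocal(d, x0=1024, max_iterations=16):
    """Repeat the pass until successive results differ by at most 1."""
    x = x0
    for _ in range(max_iterations):
        result = iterate_once(x, d)
        if abs(x - result) <= 1:
            return result
        x = result                        # result becomes the new first threshold
    return x

# For d = 516 (about 0.504 in units of 1/1024) the estimates are
# 1532, 1910, 2025, 2032, 2033, close to 1024 * 1024 / 516.
print(iterate_reciprocal(516))
```

Because the divisor mantissa has been confined to the range 512 to 1024 beforehand, the iteration starts near the answer and stops after a handful of passes in this sketch.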
In some embodiments, the first bit separation data comprises fourth bit separation sub-data, fifth bit separation sub-data and sixth bit separation sub-data, and the mixed bit separation data comprises first bit separation integrated data, second bit separation integrated data and third bit separation integrated data. The mixed operation comprises: performing a power operation on the fourth bit separation sub-data and the first bit separation sub-data to obtain the first bit separation integrated data; performing an addition and subtraction operation on the fifth bit separation sub-data, the second bit separation shift data and an addition and subtraction threshold to obtain the second bit separation integrated data; and performing a multiply-divide operation on the sixth bit separation sub-data, the iterative operation result data and a multiply-divide threshold to obtain the third bit separation integrated data. Specifically, the first bit separation sub-data serves as the exponent of the fourth bit separation sub-data in the power operation. The addition and subtraction threshold is 15: the second bit separation shift data is subtracted from the fifth bit separation sub-data, and the addition and subtraction threshold 15 is then added to obtain the second bit separation integrated data. The multiply-divide threshold is 1024: the sixth bit separation sub-data is multiplied by the iterative operation result data, and the product is then divided by the multiply-divide threshold 1024 to obtain the third bit separation integrated data.
In some embodiments, the first data is a half-precision floating point number, the fourth bit separation sub-data is the sign bit of the half-precision floating point number, the fifth bit separation sub-data is the exponent of the half-precision floating point number, and the sixth bit separation sub-data is the mantissa of the half-precision floating point number.
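The exponent and mantissa parts of the mixed operation can be sketched in Python as follows. Several points are assumptions rather than statements of the text: the dividend mantissa is taken with its hidden leading 1 restored (1024 plus the 10 stored bits), the divisor exponent passed in is the one already compensated for the shift, and the sign is combined with XOR, the usual rule for the sign of a quotient, which may differ from the power operation the text describes. The name `mixed_operation` is hypothetical.

```python
def mixed_operation(dividend, divisor_sign, divisor_exponent, reciprocal,
                    add_sub_threshold=15, mul_div_threshold=1024):
    """Combine dividend fields with the divisor sign, exponent and reciprocal.

    dividend is (sign, biased exponent, mantissa with hidden bit restored).
    """
    sign, exponent, mantissa = dividend
    out_sign = sign ^ divisor_sign                                # assumption: XOR of signs
    out_exponent = exponent - divisor_exponent + add_sub_threshold
    out_mantissa = mantissa * reciprocal // mul_div_threshold
    return out_sign, out_exponent, out_mantissa

# 3.0 / 2.0: dividend (0, 16, 1536); divisor exponent 16, reciprocal 1024.
print(mixed_operation((0, 16, 1536), 0, 16, 1024))
# 1.0 / 3.0: divisor mantissa 1536 shifted right once (exponent 17), reciprocal 1366.
print(mixed_operation((0, 15, 1024), 0, 17, 1366))
```

In the first call the result fields (0, 15, 1536) decode to 1536/1024 x 2^(15-15) = 1.5; in the second, (0, 13, 1366) decodes to about 0.3335.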
In some embodiments, the calculation result data includes the first bit separation integrated data, the second bit separation integrated data and third bit data, wherein the first bit separation integrated data is the sign bit of the calculation result data, the second bit separation integrated data is the exponent of the calculation result data, and the third bit data is the mantissa of the calculation result data. Referring to fig. 4, the data reset adjustment includes the steps of:
S51: judging whether the third bit separation integrated data is smaller than a first reset adjustment threshold;
S52: if the third bit separation integrated data is smaller than the first reset adjustment threshold, shifting the third bit separation integrated data to the left by the least number of bits such that the third bit separation integrated data is greater than or equal to the first reset adjustment threshold;
S53: subtracting, from the third bit separation integrated data, the least number of bits by which the third bit separation integrated data was shifted to the left, to obtain the third bit data.
Specifically, the first reset adjustment threshold is 1024. For example, the third bit separation integrated data is 500. 500 is shifted left by 1 bit to obtain 1000, and 1000 is shifted left by 1 bit to obtain 2000. 2000 is greater than 1024, that is, the third bit separation integrated data exceeds the first reset adjustment threshold after being shifted left by 2 bits, so the least number of bits by which the third bit separation integrated data is shifted left is 2.
In some embodiments, referring to fig. 5, the data reset adjustment further comprises the steps of:
S51a: judging whether the third bit separation integrated data is greater than a second reset adjustment threshold;
S52a: if the third bit separation integrated data is greater than the second reset adjustment threshold, shifting the third bit separation integrated data to the right by the least number of bits such that the third bit separation integrated data is less than or equal to the second reset adjustment threshold;
S53a: adding the third bit separation integrated data and the least number of bits by which the third bit separation integrated data was shifted to the right, to obtain the third bit data.
Specifically, the second reset adjustment threshold is 2048. For example, the third bit separation integrated data is 10000. 10000 is shifted right by 1 bit to obtain 5000, 5000 is shifted right by 1 bit to obtain 2500, and 2500 is shifted right by 1 bit to obtain 1250. 1250 is smaller than 2048, that is, the third bit separation integrated data falls below the second reset adjustment threshold after being shifted right by 3 bits, so the least number of bits by which the third bit separation integrated data is shifted right is 3.
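The shift search in the two reset adjustment examples can be sketched in Python. The sketch only finds the shifted value and the least number of shifts in each direction; how the shift count is then combined in steps S53 and S53a is not modeled. The function name `reset_adjust` is hypothetical, and the thresholds 1024 and 2048 are the ones given in the examples.

```python
def reset_adjust(value, first_threshold=1024, second_threshold=2048):
    """Return (shifted value, left shifts, right shifts).

    Shifts left while the value is below the first threshold, or right
    while it is above the second threshold, using the fewest shifts.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    left = 0
    while value < first_threshold:
        value <<= 1
        left += 1
    right = 0
    while value > second_threshold:
        value >>= 1
        right += 1
    return value, left, right

print(reset_adjust(500))    # 500 -> 1000 -> 2000, two left shifts
print(reset_adjust(10000))  # 10000 -> 5000 -> 2500 -> 1250, three right shifts
```

A value that is already between the two thresholds is returned unchanged with both shift counts equal to zero.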
FIG. 6 is a schematic diagram of a half-precision floating point divider data processing system according to some embodiments of the present invention. Referring to fig. 6, the half-precision floating point divider data processing system 100 includes a first bit separation module 10, a second bit separation module 20, an adjustment module 30, an iteration module 40, a hybrid operation module 50 and a reset adjustment module 60. The first bit separation module 10 is connected to the hybrid operation module 50, the second bit separation module 20 is connected to the adjustment module 30 and the hybrid operation module 50, the adjustment module 30 is connected to the hybrid operation module 50, and the hybrid operation module 50 is connected to the reset adjustment module 60. The first bit separation module 10 is configured to perform bit separation processing on the first data to obtain first bit separation data, the second bit separation module 20 is configured to perform bit separation processing on the second data to obtain second bit separation data, the adjustment module 30 is configured to perform data adjustment on the second bit separation data to obtain adjustment data, the iteration module 40 is configured to perform iterative calculation processing on the adjustment data to obtain iterative data, the hybrid operation module 50 is configured to perform a mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data, and the reset adjustment module 60 is configured to perform data reset adjustment on the mixed bit separation data to obtain calculation result data.
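To show how the modules fit together, the following Python sketch chains the six stages in software for normal (non-zero, finite) half-precision inputs. It is an illustration under stated assumptions, not the patented circuit: mantissas carry a restored hidden bit, the initial reciprocal estimate is 1024 (1.0 in units of 1/1024), the sign is combined with XOR, the final shift count is applied to the exponent as in ordinary floating-point normalization, and the result is returned as a Python float instead of being packed back into 16 bits. All function names are hypothetical.

```python
import struct

def split_fp16(value):
    """Bit separation: (sign, biased exponent, 10-bit fraction) of a binary16 value."""
    bits = int.from_bytes(struct.pack("<e", value), "little")
    return bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF

def divide_fp16_sketch(a, b):
    sa, ea, fa = split_fp16(a)          # first bit separation module
    sb, eb, fb = split_fp16(b)          # second bit separation module
    if not (0 < ea < 31 and 0 < eb < 31):
        raise ValueError("sketch handles normal numbers only")
    ma, mb = 1024 + fa, 1024 + fb       # restore the hidden leading 1

    # Adjustment module: bring the divisor mantissa into 512..1024.
    while mb > 1024:
        mb >>= 1
        eb += 1

    # Iteration module: fixed-point reciprocal of the divisor mantissa.
    x = 1024
    for _ in range(16):                 # safety cap, an assumption
        y = (2048 - x * mb // 1024) * x // 1024
        converged = abs(x - y) <= 1
        x = y
        if converged:
            break

    # Mixed operation module.
    sign = sa ^ sb
    exponent = ea - eb + 15
    mantissa = ma * x // 1024

    # Reset adjustment module: renormalize the mantissa into 1024..2047.
    while mantissa < 1024:
        mantissa <<= 1
        exponent -= 1
    while mantissa >= 2048:
        mantissa >>= 1
        exponent += 1

    magnitude = (mantissa / 1024) * 2.0 ** (exponent - 15)
    return -magnitude if sign else magnitude

for a, b in [(3.0, 2.0), (1.0, 3.0), (-7.5, 2.5)]:
    print(a, "/", b, "~", divide_fp16_sketch(a, b))
```

The sketch truncates at every integer division, so its quotients agree with exact division only to roughly three decimal digits, which matches the 10-bit mantissa of the format.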
In some embodiments, referring to fig. 7, the adjustment module 30 includes a data reduction unit 301, a data shift unit 302, and a first data operation unit 303, where an input terminal of the data reduction unit 301 is connected to a second output terminal of the second bit separation module (not shown in the figure), an output terminal of the data reduction unit 301 is connected to the data shift unit 302, a first output terminal of the data shift unit 302 is connected to a first input terminal of the first data operation unit 303, and a second input terminal of the first data operation unit 303 is connected to a third output terminal of the second bit separation module. The data reduction unit 301 restores the third bit separation sub data to obtain restored data. The data shift unit 302 receives the restored data and shifts it to obtain the second bit separation shift data, which is located in a threshold interval. The first data operation unit 303 receives the second bit separation sub data and the number of bits of the shift processing, and performs addition processing or subtraction processing on the second bit separation sub data and the number of bits of the shift processing according to the type of the shift processing to obtain the third bit separation operation data.
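The restore, shift and add-or-subtract sequence described above can be sketched as follows. Several points are assumptions rather than details taken from this passage: "restoring" is read as reattaching the implicit leading 1 of a normalized half-precision mantissa, the interval bounds `low` and `high` are hypothetical parameters, and the exponent is adjusted so that the represented value stays the same (a left shift subtracts from the exponent, a right shift adds to it). The sketch also ignores any rule for choosing between several candidate shift results.

```python
def restore_mantissa(mantissa: int) -> int:
    """Reattach the implicit leading 1 of a normalized 10-bit mantissa (assumed meaning)."""
    return mantissa | 0x400


def shift_into_interval(value: int, exponent: int, low: int, high: int) -> tuple[int, int, int]:
    """Shift value until low <= value < high and adjust the exponent to match.

    Returns (shifted value, adjusted exponent, signed shift count);
    a positive count means left shifts, a negative count means right shifts.
    """
    if value <= 0 or high < 2 * low:
        raise ValueError("need value > 0 and high >= 2 * low")
    shift = 0
    while value < low:     # left shift doubles the value, so the exponent drops by 1
        value <<= 1
        exponent -= 1
        shift += 1
    while value >= high:   # right shift halves the value, so the exponent rises by 1
        value >>= 1
        exponent += 1
        shift -= 1
    return value, exponent, shift


restored = restore_mantissa(512)                       # 1536
print(shift_into_interval(restored, 15, 2048, 4096))   # (3072, 14, 1)
```

Right shifts discard low-order bits, so a real design would need to decide how much precision to keep; that choice is outside this sketch.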
In some embodiments, referring to fig. 8, the iteration module 40 includes an iteration unit 401, a judging unit 402, and an assigning unit 403, where a first input end of the iteration unit 401 is connected to an output end of the first data operation unit (not shown in the figure), an output end of the iteration unit 401 is connected to an input end of the judging unit 402, a first output end of the judging unit 402 is connected to an input end of the assigning unit 403, and an output end of the assigning unit 403 is connected to a second input end of the iteration unit 401. The iteration unit 401 receives the third bit separation operation data and performs an iterative operation on it according to a first iteration threshold, a second iteration threshold, a third iteration threshold and a fourth iteration threshold to obtain operation result data. The judging unit 402 receives the operation result data and judges whether the difference between the first iteration threshold and the operation result data is greater than 1. If the difference is greater than 1, the judging unit 402 transmits the operation result data to the assigning unit 403, the assigning unit 403 assigns the operation result data to the first iteration threshold, and the assigned first iteration threshold is transmitted to the iteration unit 401. If the difference is less than or equal to 1, the judging unit 402 outputs the operation result data as the iterative operation result data.
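The feedback loop formed by the iteration unit, the judging unit and the assigning unit can be sketched in Python as below. The stopping rule (repeat while the difference is greater than 1) follows the description above, but the iteration formula itself is an assumption: a fixed-point Newton-Raphson reciprocal step stands in for the iterative operation, whose actual use of the second to fourth iteration thresholds is not reproduced here. The names `iterate_until_stable`, `make_reciprocal_step`, `frac_bits` and `max_rounds` are hypothetical.

```python
from typing import Callable


def iterate_until_stable(step: Callable[[int], int], first_threshold: int, max_rounds: int = 32) -> int:
    """Repeat step until the result differs from the current first threshold by at most 1."""
    for _ in range(max_rounds):
        result = step(first_threshold)             # iteration unit
        if abs(first_threshold - result) <= 1:     # judging unit
            return result
        first_threshold = result                   # assigning unit
    raise RuntimeError("iteration did not settle within max_rounds")


def make_reciprocal_step(divisor: int, frac_bits: int = 20) -> Callable[[int], int]:
    """Stand-in step: one Newton-Raphson update toward 2**frac_bits / divisor."""
    two = 2 << frac_bits  # the constant 2 in the same fixed-point scale

    def step(estimate: int) -> int:
        return (estimate * (two - divisor * estimate)) >> frac_bits

    return step


result = iterate_until_stable(make_reciprocal_step(1536), first_threshold=512)
print(result)  # 682, close to 2**20 / 1536 (about 682.67)
```

The starting estimate of 512 is arbitrary; a Newton-Raphson step of this form only converges when the starting estimate is positive and below twice the true reciprocal.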
In some embodiments, the hybrid operation module includes a second data operation unit, a third data operation unit, and a fourth data operation unit, where a first input end of the second data operation unit is connected to the first output end of the first bit separation module, a second input end of the second data operation unit is connected to the first output end of the second bit separation module, a first input end of the third data operation unit is connected to the second output end of the first bit separation module, a second input end of the third data operation unit is connected to the second output end of the data shift unit, a first input end of the fourth data operation unit is connected to the third output end of the first bit separation module, and a second input end of the fourth data operation unit is connected to the second output end of the judging unit. The second data operation unit receives the fourth bit separation sub data and the first bit separation sub data, and performs a power operation on them to obtain the first bit separation integrated data. The third data operation unit receives the fifth bit separation sub data and the second bit separation shift data, and performs addition and subtraction processing on the fifth bit separation sub data, the second bit separation shift data and an addition and subtraction threshold to obtain the second bit separation integrated data. The fourth data operation unit receives the sixth bit separation sub data and the iterative operation result data, and performs a multiply-divide operation on the sixth bit separation sub data, the iterative operation result data and a multiply-divide threshold to obtain the third bit separation integrated data. The reset adjustment module receives the first bit separation integrated data, the second bit separation integrated data and the third bit separation integrated data, and performs reset adjustment on them to obtain the calculation result data.
FIG. 9 is a schematic diagram of a half-precision floating point divider data processing system according to still other embodiments of the present invention. Referring to fig. 9, the half-precision floating point divider data processing system 100 further includes a first buffer module 70 and a second buffer module 80, wherein one end of the first buffer module 70 is connected to the first bit separation module 10, the other end of the first buffer module 70 is connected to the hybrid operation module 50, one end of the second buffer module 80 is connected to the second bit separation module 20, and the other end of the second buffer module 80 is connected to the hybrid operation module 50.
In some embodiments, the first buffer module includes at least 1 first buffer unit, and when the number of first buffer units is greater than 1, the first buffer units are connected in series. The second buffer module includes at least 1 second buffer unit, and when the number of second buffer units is greater than 1, the second buffer units are connected in series. The iteration module includes at least 2 iterative calculation units and at least 1 assignment calculation unit, the number of assignment calculation units is 1 less than the number of iterative calculation units, and adjacent iterative calculation units are connected through one assignment calculation unit. Preferably, the number of first buffer units, the number of second buffer units and the number of iterative calculation units are the same. Preferably, the first buffer module includes 5 first buffer units, the second buffer module includes 5 second buffer units, and the iteration module includes at least 5 iterative calculation units and at least 4 assignment calculation units.
In some embodiments, taking 2 iterative calculation units and 1 assignment calculation unit as an example, the two iterative calculation units are a first iterative calculation unit and a second iterative calculation unit. A first input end of the first iterative calculation unit is connected with an output end of the first data operation unit, a second input end of the first iterative calculation unit is used for inputting a first iteration threshold, a first output end of the first iterative calculation unit is connected with a first input end of the assignment calculation unit, a second output end of the first iterative calculation unit is connected with a second input end of the assignment calculation unit, a first output end of the assignment calculation unit is connected with a first input end of the second iterative calculation unit, a second output end of the assignment calculation unit is connected with a second input end of the second iterative calculation unit, and a first output end of the second iterative calculation unit is connected with the hybrid operation module.
The first iterative calculation unit and the second iterative calculation unit are used for performing the iterative operation. The first output end of the first iterative calculation unit and the first output end of the second iterative calculation unit are used for outputting operation result data, and the second output end of the first iterative calculation unit is used for outputting the third bit separation operation data. The assignment calculation unit is used for performing assignment so that the second iterative calculation unit can perform the iterative operation by taking the operation result data output by the first iterative calculation unit as the first iteration threshold. The first output end of the assignment calculation unit is used for outputting the operation result data as the first iteration threshold, the second output end of the assignment calculation unit is used for outputting the third bit separation operation data received by the first iterative calculation unit, and the second iterative calculation unit is used for outputting operation result data to serve as the iterative operation result data.
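The chained arrangement described above replaces the feedback loop with a fixed number of stages, which can be sketched as an unrolled loop. As a labeled assumption, the same fixed-point Newton-Raphson reciprocal step is used as a stand-in for the iterative operation; `reciprocal_step`, `unrolled_iteration`, `frac_bits` and `stages` are hypothetical names, and the default of 5 stages simply mirrors the preferred unit count mentioned in the text.

```python
def reciprocal_step(operand: int, estimate: int, frac_bits: int = 20) -> int:
    """Stand-in iterative operation: one Newton-Raphson update toward 2**frac_bits / operand."""
    return (estimate * ((2 << frac_bits) - operand * estimate)) >> frac_bits


def unrolled_iteration(operand: int, first_threshold: int, stages: int = 5) -> int:
    """Run a fixed chain of iterative calculation units joined by assignment calculation units."""
    if stages < 1:
        raise ValueError("at least one iterative calculation unit is required")
    estimate = first_threshold
    for stage in range(stages):
        result = reciprocal_step(operand, estimate)   # iterative calculation unit
        if stage < stages - 1:
            # Assignment calculation unit: the result becomes the next stage's
            # first iteration threshold, and the operand is forwarded unchanged.
            estimate = result
    return result


print(unrolled_iteration(1536, 512))            # 682
print(unrolled_iteration(1536, 512, stages=2))  # 680
```

With a fixed stage count there is no judging step, so the accuracy of the output depends on how many units are chained; the two-stage result above has not yet settled, while the five-stage result has.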
While embodiments of the present invention have been described in detail hereinabove, it will be apparent to those skilled in the art that various modifications and variations can be made to these embodiments. It is to be understood that such modifications and variations are within the scope and spirit of the present invention as set forth in the following claims. Moreover, the invention described herein is capable of other embodiments and of being practiced or of being carried out in various ways.

Claims (11)

1. A semi-precision floating point number divider data processing method is characterized by comprising the following steps:
S1: respectively performing bit separation processing on the first data and the second data to respectively obtain first bit separation data and second bit separation data;
S2: performing data adjustment on the second bit separation data to obtain adjustment data;
S3: performing iterative computation processing on the adjustment data to obtain iterative data;
S4: performing mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data;
S5: performing data reset adjustment on the mixed bit separation data to obtain calculation result data;
The second bit separation data comprises first bit separation sub data, second bit separation sub data and third bit separation sub data, the adjustment data comprises the first bit separation sub data, second bit separation shift data and third bit separation operation data, and the data adjustment comprises the following steps:
S21: carrying out reduction processing on the third bit separation sub data to obtain restored data;
S22: performing shift processing on the restored data to obtain the second bit separation shift data, wherein the second bit separation shift data is located in a threshold interval;
S23: performing addition processing or subtraction processing on the second bit separation sub data and the bit number of the shift processing according to the type of the shift processing to obtain the third bit separation operation data;
The threshold interval comprises a first end value and a second end value, the absolute value of the difference value between the first end value and the restored data is smaller than the absolute value of the difference value between the second end value and the restored data, and the second bit separation shift data is the shift data closest to the first end value;
the iterative data comprises the first bit separation sub data, the second bit separation shift data and iterative operation result data, and the iterative calculation processing comprises the following steps:
S31: performing iterative operation on the third bit separation operation data according to the first iteration threshold, the second iteration threshold, the third iteration threshold and the fourth iteration threshold to obtain operation result data;
S32: judging whether the difference value between the first iteration threshold and the operation result data is larger than 1;
S33: if the difference is greater than 1, the operation result data is used as a new first iteration threshold, the step S31 and the step S32 are re-executed until the difference is less than or equal to 1, and then the operation result data obtained by executing the step S31 for the last time is used as the iteration operation result data to be output;
The first bit separation data includes a fourth bit separation sub-data, a fifth bit separation sub-data, and a sixth bit separation sub-data, the mixed bit separation data includes a first bit separation integration data, a second bit separation integration data, and a third bit separation integration data, and the mixed operation includes the steps of:
Performing power operation on the fourth bit separation sub-data and the first bit separation sub-data to obtain the first bit separation integrated data;
Adding and subtracting the fifth bit separation sub data, the second bit separation shift data and an addition and subtraction threshold value to obtain the second bit separation integrated data;
performing multiply-divide operation on the sixth bit separation sub-data, the iterative operation result data and a multiply-divide threshold value to obtain third bit separation integrated data;
the calculation result data comprises the first bit separation integrated data, the second bit separation integrated data and third bit data, and the data reset adjustment comprises the following steps:
S51: judging whether the third bit separation integrated data is smaller than a first reset adjustment threshold;
S52: if the third bit separation integrated data is smaller than the first reset adjustment threshold, shifting the third bit separation integrated data to the left by the least number of bits such that the third bit separation integrated data is greater than or equal to the first reset adjustment threshold;
S53: performing subtraction on the third bit separation integrated data and the least number of bits by which the third bit separation integrated data is shifted to the left to obtain the third bit data.
2. The method of claim 1, wherein the data reset adjustment further comprises the steps of:
S51a: judging whether the third bit separation integrated data is larger than a second reset adjustment threshold;
S52a: if the third bit separation integrated data is larger than the second reset adjustment threshold, shifting the third bit separation integrated data to the right by the least number of bits such that the third bit separation integrated data is smaller than or equal to the second reset adjustment threshold;
S53a: performing addition on the third bit separation integrated data and the least number of bits by which the third bit separation integrated data is shifted to the right to obtain the third bit data.
3. A half-precision floating point divider data processing system for implementing the half-precision floating point divider data processing method according to any one of claims 1 to 2, characterized by comprising a first bit separation module, a second bit separation module, an adjustment module, an iteration module, a hybrid operation module and a reset adjustment module, wherein the first bit separation module is connected with the hybrid operation module, the second bit separation module is connected with the adjustment module and the hybrid operation module, the adjustment module is connected with the hybrid operation module, and the hybrid operation module is connected with the reset adjustment module, wherein the first bit separation module is used for performing bit separation processing on first data to obtain first bit separation data, the second bit separation module is used for performing bit separation processing on second data to obtain second bit separation data, the adjustment module is used for performing data adjustment on the second bit separation data to obtain adjustment data, the iteration module is used for performing iterative calculation processing on the adjustment data to obtain iterative data, the hybrid operation module is used for performing mixed operation on the first bit separation data and the iterative data to obtain mixed bit separation data, and the reset adjustment module is used for performing data reset adjustment on the mixed bit separation data to obtain calculation result data.
4. A half-precision floating point divider data processing system according to claim 3, wherein the adjustment module comprises a data reduction unit, a data shift unit and a first data operation unit, the input end of the data reduction unit is connected with the second output end of the second bit separation module, the output end of the data reduction unit is connected with the data shift unit, the first output end of the data shift unit is connected with the first input end of the first data operation unit, and the second input end of the first data operation unit is connected with the third output end of the second bit separation module.
5. The system of claim 4, wherein the iteration module comprises an iteration unit, a judgment unit and an assignment unit, wherein a first input end of the iteration unit is connected with an output end of the first data operation unit, an output end of the iteration unit is connected with an input end of the judgment unit, a first output end of the judgment unit is connected with an input end of the assignment unit, and an output end of the assignment unit is connected with a second input end of the iteration unit.
6. The system of claim 5, wherein the hybrid operation module comprises a second data operation unit, a third data operation unit, and a fourth data operation unit, wherein a first input terminal of the second data operation unit is connected to the first output terminal of the first bit separation module, a second input terminal of the second data operation unit is connected to the first output terminal of the second bit separation module, a first input terminal of the third data operation unit is connected to the second output terminal of the first bit separation module, a second input terminal of the third data operation unit is connected to the second output terminal of the data shift unit, a first input terminal of the fourth data operation unit is connected to the third output terminal of the first bit separation module, and a second input terminal of the fourth data operation unit is connected to the second output terminal of the judgment unit.
7. The system of claim 3, further comprising a first buffer module, wherein one end of the first buffer module is connected to the first bit separation module, and the other end of the first buffer module is connected to the hybrid operation module.
8. The system of claim 7, further comprising a second buffer module, wherein one end of the second buffer module is connected to the second bit separation module, and the other end of the second buffer module is connected to the hybrid operation module.
9. The half-precision floating point divider data processing system of claim 8, wherein the first buffer module comprises at least 1 first buffer unit, the first buffer units are connected in series when the number of the first buffer units is greater than 1, and the second buffer module comprises at least 1 second buffer unit, the second buffer units are connected in series when the number of the second buffer units is greater than 1.
10. The half precision floating point number divider data processing system of claim 9, wherein the number of first buffer units is the same as the number of second buffer units.
11. The system of claim 4, wherein the iteration module comprises at least 2 iterative computation units and at least 1 assignment computation unit, the number of assignment computation units is 1 less than the number of iterative computation units, and adjacent iterative computation units are connected through one assignment computation unit.
CN202011641150.2A 2020-12-31 2020-12-31 Semi-precision floating point divider data processing method and system Active CN112732223B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011641150.2A CN112732223B (en) 2020-12-31 2020-12-31 Semi-precision floating point divider data processing method and system

Publications (2)

Publication Number Publication Date
CN112732223A CN112732223A (en) 2021-04-30
CN112732223B true CN112732223B (en) 2024-04-30

Family

ID=75609104

Country Status (1)

Country Link
CN (1) CN112732223B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5341321A (en) * 1993-05-05 1994-08-23 Hewlett-Packard Company Floating point arithmetic unit using modified Newton-Raphson technique for division and square root
CN103092561A (en) * 2013-01-18 2013-05-08 北京理工大学 Goldschmidt division implementation method based on divisor mapping
CN105389157A (en) * 2015-10-29 2016-03-09 中国人民解放军国防科学技术大学 Goldschmidt algorithm-based floating-point divider
CN107992284A (en) * 2017-11-27 2018-05-04 中国航空无线电电子研究所 A kind of division function implementation method of programming device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4858794B2 (en) * 2009-12-02 2012-01-18 日本電気株式会社 Floating point divider and information processing apparatus using the same
GB2528497B (en) * 2014-07-24 2021-06-16 Advanced Risc Mach Ltd Apparatus And Method For Performing Floating-Point Square Root Operation
US10983756B2 (en) * 2014-10-17 2021-04-20 Imagination Technologies Limited Small multiplier after initial approximation for operations with increasing precision
US9916130B2 (en) * 2014-11-03 2018-03-13 Arm Limited Apparatus and method for vector processing
GB2576536B (en) * 2018-08-22 2021-05-05 Imagination Tech Ltd Float division by constant integer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
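Hardware implementation of single-precision floating-point reciprocal square root operation; Jiao Yong; Computer Knowledge and Technology (电脑知识与技术) (Issue 09); 2243-2263 *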

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant