WO2020059074A1 - Data structure, information processing device, method, and program - Google Patents

Data structure, information processing device, method, and program

Info

Publication number
WO2020059074A1
Authority
WO
WIPO (PCT)
Prior art keywords
exponent
value
floating
mantissa
bit
Prior art date
Application number
PCT/JP2018/034779
Other languages
English (en)
Japanese (ja)
Inventor
恭啓 山崎
Original Assignee
PFU Limited (株式会社PFU)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PFU Limited (株式会社PFU)
Priority to PCT/JP2018/034779 priority Critical patent/WO2020059074A1/fr
Publication of WO2020059074A1 publication Critical patent/WO2020059074A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers

Definitions

  • the present disclosure relates to information processing technology.
  • Conventionally, as information processing using floating-point numbers, a technology has been proposed for compressing audio data by combining a fixed-point method and a floating-point method, using a flag indicating whether each value is expressed in fixed-point or floating-point representation (see Patent Document 1).
  • In addition, a circuit for changing the bit length of data expressed in the IEEE 754 floating-point format (see Patent Document 2) and a technology for reducing the scale of a circuit that handles denormalized numbers in the IEEE 754 floating-point format (see Patent Document 3) have been proposed.
  • Further, in information processing such as convolutional neural network (CNN) operations, an 8-bit fixed-point number representation is sometimes used. For example, a calculation method has been proposed in which inference is performed on a calibration data set using a 32-bit floating-point representation, and the distribution of the data of each layer obtained from it, together with the distribution obtained by quantizing that data, is used to determine a scale factor that minimizes the information loss (see Non-Patent Document 1).
  • Moreover, as information processing using floating-point numbers with a relatively small bit width, a technique using a unique floating-point representation (ms-fp8) (see Non-Patent Document 2) and a technique for improving the inference accuracy of CNN operations by using low-precision floating-point representations defined by IEEE (FP8/FP7/FP6) (see Non-Patent Document 3) have been proposed.
  • As described above, floating-point numbers have long been used in information processing, and the precision of the numerical values that can be represented by a floating-point number depends on the bit width of the floating-point type used, and in particular on the bit width of the mantissa part.
  • On the other hand, among information processing using floating-point numbers, there are methods that benefit from expanding the dynamic range that the data can represent, even if the data precision (sampling width) is reduced. For example, in CNN operations, floating-point numbers are used for data representation, but the inference accuracy of the CNN depends more on the dynamic range of the data than on the precision of the data.
  • The present disclosure has been made in view of the above-described problems, and its object is to provide a data structure for floating-point numbers that is suitable for such specific information processing, in which an advantage is obtained by expanding the representable dynamic range even if the data precision is reduced.
  • An example of the present disclosure is a data structure for recording a floating-point number with a predetermined bit width in a storage device of an information processing device, the data structure comprising an exponent part in which the exponent of the floating-point number is recorded with a predetermined bit width and a mantissa part in which the mantissa of the floating-point number is recorded with a predetermined bit width, wherein, when the value of the exponent part becomes a predetermined value, a part of the mantissa part becomes an extended exponent part expressing a part of the exponent, and the extended exponent part and the exponent part are combined to represent the exponent of the floating-point number.
  • Another example of the present disclosure is an information processing apparatus comprising: receiving means for receiving a first exponent part and a first mantissa part, a part of the first mantissa part being used as an extended exponent part, whose bit width is expanded in order from the lower bit, when the value of the first exponent part becomes a predetermined value; exponent output means for outputting the exponent calculated with reference to the first exponent part and the extended exponent part as a value recorded in a second exponent part having a wider bit width than the first exponent part; and mantissa output means for outputting, of the first mantissa part, the values of the bits not used as the extended exponent part as they are, and outputting the values of the bits used as the extended exponent part as 0, as values recorded in a second mantissa part.
  • the present disclosure can be understood as an information processing device, a system, a method executed by a computer, or a program executed by a computer.
  • the present disclosure can be understood as such a program recorded on a recording medium readable by a computer, another device, a machine, or the like.
  • Here, a recording medium readable by a computer or the like is a recording medium that accumulates information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action, and from which the information can be read by a computer or the like.
  • FIG. 1 is a schematic diagram illustrating the hardware configuration of a CNN processing system according to the embodiment.
  • FIG. 2 is a diagram illustrating an outline of the functional configuration of the CNN processing system according to the embodiment.
  • FIG. 3 is a diagram illustrating an outline of a connection circuit (Rotate circuit) according to the embodiment.
  • FIG. 4 is a flowchart (A) illustrating an outline of the flow of the control process of the Rotate circuit according to the embodiment.
  • FIG. 5 is a flowchart (B) illustrating an outline of the flow of the control process of the Rotate circuit according to the embodiment.
  • FIG. 6 is a diagram illustrating the relationship among the remainder x_mod6, the control signal SEL, and the read signal RD in the embodiment.
  • FIG. 7 is a time chart of the signals used in the input buffer and the connection circuit when the control process according to the embodiment is executed.
  • FIG. 8 is a diagram illustrating the data structure of the unique 9-bit floating-point type PFU-FP9 used in the embodiment.
  • FIG. 9 is a diagram illustrating the data structure of the unique 8-bit floating-point type PFU-FP8 used in the embodiment.
  • FIG. 10 is a diagram schematically illustrating the functional configuration of a conversion circuit from the floating-point type PFU-FP8 to the floating-point type PFU-FP9 in the embodiment.
  • FIG. 11 is a diagram illustrating a conversion circuit from the floating-point type PFU-FP8 to the floating-point type PFU-FP9 in the embodiment.
  • FIG. 12 is a diagram illustrating a conversion circuit from the floating-point type FP8 (IEEE) to the floating-point type FP9 (IEEE) in the prior art.
  • FIG. 13 is a diagram illustrating an outline of the functional configuration of a variation of the CNN processing system according to the embodiment.
  • FIG. 14 is a diagram illustrating the data structure of the unique 7-bit floating-point type PFU-FP7 used in the embodiment.
  • FIG. 15 is a diagram illustrating the data structure of the unique 6-bit floating-point type PFU-FP6 used in the embodiment.
  • FIG. 1 is a schematic diagram illustrating a hardware configuration of a convolutional neural network (CNN) processing system 1 according to the present embodiment.
  • As shown in FIG. 1, the CNN processing system 1 includes a CPU (Central Processing Unit) 11, a host-side RAM (Random Access Memory) 12a, an FPGA-side RAM 12b, a storage device 14 such as a ROM (Read Only Memory), an EEPROM (Electrically Erasable Programmable Read-Only Memory), or a hard disk drive (HDD), a communication unit such as a network interface card (NIC) 15, a field-programmable gate array (FPGA) 16, and the like.
  • In the present embodiment, image data that is captured by an external camera and includes a plurality of pixels arranged in a predetermined order is used as the input data.
  • the type of the input data is not limited to the image data, and the elements constituting the input data are not limited to the pixels.
  • the CNN processing system 1 can handle various data such as natural language data, game data, and time-series data as learning / inference targets.
  • The CNN processing system 1 is a system that uses the FPGA 16 as an accelerator for a host machine on which the CPU 11 is mounted.
  • the image data obtained from the external camera is read into the FPGA-side RAM 12b via the host-side RAM 12a, and under the control of the CPU 11, an inference operation or the like is performed in the FPGA.
  • the output data as the calculation result is transmitted to the outside by the CPU 11 using the NIC 15, and is utilized.
  • FIG. 2 is a diagram schematically illustrating a functional configuration of the CNN processing system 1 according to the present embodiment.
  • In the CNN processing system 1, the programs recorded in the storage device 14 are read out to the RAMs 12a and 12b and executed by the CPU 11 and/or the FPGA 16, which control the hardware, whereby the system functions as an information processing apparatus including the input data reading unit 21, the input buffer 22, the product-sum operation module 23, the output buffer 24, the accumulation addition pipeline 25, the weight data reading unit 26, and the weight buffer 27 shown in FIG. 2.
  • In the present embodiment, each function of the CNN processing system 1 is executed by the CPU 11 and/or the FPGA 16, which are general-purpose processors; alternatively, part or all of the functions may be executed by one or more dedicated processors.
  • the input data reading unit 21 reads, from the FPGA-side RAM 12b, input data (image data in the present embodiment) including a plurality of elements (pixels in the present embodiment) arranged in a predetermined order, and stores the read data in the input buffer 22. Write.
  • The input buffer 22 has n memories, memory 0 to memory n-1.
  • The input data reading unit 21 stores the plurality of elements in the input data one by one in the memories 0 to n-1 in a predetermined order, starting from the first memory 0; when the last memory n-1 is reached, it returns to the top memory 0 and continues storing the elements in the predetermined order.
  • The product-sum operation module 23 receives input data from the input buffer 22 with an even input width n, receives weight data from the weight data reading unit 26 with an odd number of taps r (the width of the weight data; this corresponds to the kernel width kw in the CNN), performs a Winograd conversion process, a weighting process, and a Winograd inverse conversion process, and outputs output data with an even output width m.
  • Specifically, the product-sum operation module 23 performs the Winograd conversion process on the input data and on the weight data.
  • the result of the Winograd conversion processing of the weight data is recorded in the weight buffer 27 because it is used a plurality of times.
  • Then, the product-sum operation module 23 performs a product-sum operation using the input data and the weight data to which the Winograd conversion process has been applied, and performs the Winograd inverse conversion process on the result to obtain the output data.
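
As an illustration of the three Winograd stages just described, the following sketch computes the standard one-dimensional Winograd convolution F(2, 3) (output width m = 2, taps r = 3) in Python. This is the textbook Lavin-Gray formulation, given here only to make the conversion / weighting / inverse-conversion pipeline concrete; the function name and test values are illustrative, and the patent's circuit-level implementation is not shown.

```python
# Minimal 1-D Winograd convolution F(m=2, r=3): 4 input pixels -> 2 outputs.
# Standard Lavin-Gray formulation; illustrative only, not the patent's circuit.
def winograd_f23(d, g):
    """d: 4 input pixels, g: 3 filter taps; returns 2 convolution outputs."""
    # Input transform (corresponds to the Winograd conversion of the input data).
    t0 = d[0] - d[2]
    t1 = d[1] + d[2]
    t2 = d[2] - d[1]
    t3 = d[1] - d[3]
    # Weight transform (done once and reused, like the weight buffer 27).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Element-wise products (the product-sum stage).
    m1, m2, m3, m4 = t0 * u0, t1 * u1, t2 * u2, t3 * u3
    # Inverse transform (the Winograd inverse conversion).
    return [m1 + m2 + m3, m2 - m3 - m4]

# Cross-check against direct convolution.
d, g = [1.0, 2.0, 3.0, 4.0], [0.5, 0.25, 0.125]
direct = [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]
assert winograd_f23(d, g) == direct
```
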
  • the obtained output data may be subjected to a normalization process or a bias process, an activation process using a so-called ReLU (Rectified Linear Unit) function, or the like.
  • The output data is rearranged in the output buffer 24 so that the writing order is sequential, and is written to the FPGA-side RAM 12b via the accumulation addition pipeline 25.
  • In the product-sum operation module 23, the input width n is preferably set to a fixed even value so that the maximum performance is always obtained by keeping all the multipliers operating.
  • an odd value such as 1, 3, 5, 7, 11 is often used for the kernel width kw corresponding to the number of taps r.
  • In the convolution operation, padding pixels are inserted at the left and right ends of the input image, (kw - 1) / 2 pixels at each end, so that the size of the input image and the size of the output image can be made the same.
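
The size-preserving effect of this padding can be checked with one line of arithmetic; the following snippet is ordinary "same" padding bookkeeping, shown only to make the (kw - 1) / 2 relationship concrete (the input width 224 is an arbitrary example).

```python
# "Same" padding: inserting (kw - 1) / 2 pixels at each end of a row keeps
# the output width equal to the input width for any odd kernel width kw.
for kw in (1, 3, 5, 7, 11):
    in_w = 224                       # arbitrary example input width
    pad = (kw - 1) // 2              # padding pixels per side
    out_w = in_w + 2 * pad - kw + 1  # width after a stride-1 convolution
    assert out_w == in_w
```
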
  • That is, the input width n is often even, the output width m is even, and the number of taps r is odd, so that the technology according to the present disclosure can be used.
  • FIG. 3 is a diagram showing an outline of a connection circuit (Rotate circuit) according to the present embodiment.
  • The connection circuit is arranged between the memories 0 to n-1 of the input buffer 22 and the input terminals 0 to n-1 of the product-sum operation module 23 described with reference to FIG. 2, and connects the memories 0 to n-1 to the input terminals 0 to n-1, which receive input data with the input width n in the product-sum operation module 23.
  • the connection circuit connects the odd-numbered input terminal to the odd-numbered memory, and connects the even-numbered input terminal to the even-numbered memory.
  • The connections between the other memories and input terminals (specifically, the connections between odd-numbered input terminals and even-numbered memories, and between even-numbered input terminals and odd-numbered memories) are omitted; in the present embodiment, the input terminals 0 to n-1 and the memories 0 to n-1 are connected only odd-to-odd and even-to-even.
  • FIG. 3 shows an example of the connection lines when the input width n is 6; RAMs 0 to 5 in the figure correspond to the memories 0 to 5.
  • In the figure, the memory 0 is connected only to the input terminals 0, 2, and 4, and the memory 1 is connected only to the input terminals 1, 3, and 5; it can be seen that the connections between the memory 0 and the input terminals 1, 3, and 5, and between the memory 1 and the input terminals 0, 2, and 4, have been omitted. The same applies to the memories 2 to 5.
  • the connection circuit may be a physical circuit or a logic circuit in a programmable device.
  • When data from the input buffer 22 is input to the product-sum operation module 23, the control unit (selector) controls which of the memories 0 to n-1 supplies data to which of the input terminals 0 to n-1. Specifically, the number of the memory input to a given input terminal in the i-th process is the remainder when "input terminal number + (output width m * (i - 1))" is divided by the input width n. In actual control, the content of the control signal generated by the control unit may be determined by a calculation formula, or may be determined by referring to a map, a table, or the like. A specific flow of the processing by the control unit will be described later using a flowchart, and the selection rule is sketched in code below.
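
A minimal sketch of the selector rule above, assuming the example values used in FIGS. 3 and 6 (input width n = 6, output width m = 4 for kernel width 3); the function name and the printed table are illustrative, not part of the patent.

```python
# Memory number feeding input terminal t in the i-th process (i = 1, 2, ...):
# the remainder of (terminal number + output width m * (i - 1)) by input width n.
def memory_for_terminal(t: int, i: int, m: int = 4, n: int = 6) -> int:
    return (t + m * (i - 1)) % n

# Because m and n are both even, a terminal and its memory always share
# parity, which is why the odd-to-even and even-to-odd lines can be omitted.
for i in (1, 2, 3):
    row = [memory_for_terminal(t, i) for t in range(6)]
    assert all((t - mem) % 2 == 0 for t, mem in enumerate(row))
    print(f"process {i}: terminals 0-5 read memories {row}")
```
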
  • In the example described below, the memories 0 to 5 are storage areas in the input buffer 22 when the input width n is 6.
  • The pixel data D0 to D23 of the image data are stored in the memories 0 to 5.
  • As described above, the input data reading unit 21 stores the plurality of elements in the input data one by one in the memories 0 to 5 in a predetermined order, starting from the first memory 0; when the last memory 5 is reached, it returns to the first memory 0 and continues storing the elements in the predetermined order. Therefore, the memory 0 stores the pixel data D0, D6, D12, and D18 in order from the top, and the memory 1 stores D1, D7, D13, and D19 in order from the top.
  • The pixel data stored in each of the memories 0 to 5 is specified by a variable indicating an address in the memory; the variables indicating the addresses in the memories 0 to 5 are the addresses RA0 to RA5.
  • The control unit transmits the control signals SEL0 to SEL5, which select the data for the input terminals 0 to 5, thereby controlling which of the memories 0 to 5 supplies pixel data to which of the input terminals 0 to 5.
  • The data sent to the input terminals 0 to 5 are the data ROT0 to ROT5 obtained by rotating the pixel data so as to be left-justified (so that the pixel data with the smallest number comes to the input terminal I[0]).
  • FIGS. 4 and 5 are flowcharts showing the outline of the flow of control processing of the connection circuit (Rotate circuit) between the input buffer 22 and the product-sum operation module 23 according to the present embodiment.
  • the processing shown in this flowchart is repeatedly executed during the inference calculation in the CNN processing system 1.
  • In step S101, the parameters are initialized.
  • The control unit sets the increment x_inc of the coordinate x to a value corresponding to the kernel width kw used when executing the convolution operation: 6 when the kernel width kw is 1, 4 when it is 3, and 2 when it is 5. The coordinate x is advanced by this increment x_inc in each iteration. Thereafter, the process proceeds to step S102.
  • In step S102, the addresses RA0 to RA5 are set.
  • The control unit sets the current value of the quotient x_div6 to each of the addresses RA0 to RA5, which are the variables indicating the addresses in the memories 0 to 5 (step S102).
  • Then, the control unit compares the current value of the remainder x_mod6 with predetermined values, and updates the values of the addresses RA0 to RA5 according to the results of the comparison (steps S103 to S112).
  • In step S113, the content of the control signals is determined.
  • The control unit determines the read signals RD selected by the control signals SEL0 to SEL5 according to the value of the remainder x_mod6, as shown in FIG. 6.
  • FIG. 6 is a diagram showing the relationship between the remainder x_mod6, the control signal SEL, and the read signal RD in the present embodiment.
  • As described above, the number of the memory input to a given input terminal in the i-th process is the remainder when "input terminal number + (output width m * (i - 1))" is divided by the input width n. Thereafter, the process proceeds to step S114.
  • step S114 pixel data is read from the input buffer 22 and input to the corresponding input terminal.
  • the control unit outputs the values of the addresses RA0 to RA5 to the corresponding memories 0 to 5, and outputs the values of the control signals SEL0 to SEL5 to the connection circuit (Rotate circuit).
  • the pixel data at the addresses indicated by the addresses RA0 to RA5 is read from the memories 0 to 5, and is input to the input terminals of the numbers specified by the control signals SEL0 to SEL5. Thereafter, the process proceeds to step S115.
  • In steps S115 to S117, the parameters are updated.
  • The control unit updates the coordinate x to the value of "the value of the coordinate x before the update + the increment x_inc", and further updates the remainder x_mod6 to the value of "the value of the remainder x_mod6 before the update + the increment x_inc" (step S115).
  • When the updated remainder x_mod6 is equal to or larger than the input width n (step S116), the control unit subtracts the input width n from the remainder x_mod6 so that x_mod6 becomes a value smaller than the input width n, and adds 1 to the quotient x_div6 (step S117). Thereafter, the process proceeds to step S118.
  • In step S118, it is determined whether or not the process should be terminated.
  • The control unit determines whether or not the coordinate x updated in step S115 is equal to or larger than the width of the image data in the X-axis direction. If the coordinate x is smaller than the width of the image data in the X-axis direction, unprocessed pixels remain in the X-axis direction, so the process returns to step S102. On the other hand, when the coordinate x is equal to or larger than the width of the image data in the X-axis direction, the processing of this flowchart ends. The whole flow is condensed into the code sketch below.
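
The flowchart can be condensed into the following software model. It is a reconstruction under two stated assumptions: that the increment x_inc equals the output width (6, 4, 2 for kw = 1, 3, 5, i.e. n - kw + 1), and that memories whose number is smaller than the remainder x_mod6 already hold the pixel of the next address; FIGS. 4 to 6 remain the authoritative description.

```python
def rotate_control(image_width: int, kw: int, n: int = 6):
    """Software model of the Rotate-circuit control flow (steps S101-S118).

    Assumes pixel Dx is stored in memory (x mod n) at address (x div n), and
    that x_inc equals the output width m = n - kw + 1.
    """
    x_inc = n - kw + 1                            # S101: 6, 4 or 2 for kw = 1, 3, 5
    x = x_div6 = x_mod6 = 0
    while True:
        ra = [x_div6 + (1 if j < x_mod6 else 0)   # S102-S112: memories below the
              for j in range(n)]                  # remainder have wrapped ahead
        sel = [(t + x_mod6) % n for t in range(n)]  # S113: SEL as in FIG. 6
        yield ra, sel                             # S114: drive RA0-5 and SEL0-5
        x += x_inc                                # S115: advance the coordinate
        x_mod6 += x_inc
        if x_mod6 >= n:                           # S116-S117: keep the remainder
            x_mod6 -= n                           # below n, carry into the quotient
            x_div6 += 1
        if x >= image_width:                      # S118: end of the row
            return

for ra, sel in rotate_control(image_width=24, kw=3):
    print("RA:", ra, "SEL:", sel)
```
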
  • FIG. 7 is a time chart of the signals used in the input buffer 22 and the connection circuit when the control process according to the present embodiment is executed. The time chart shows that, as long as the even-numbered memories are connected to the even-numbered input terminals and the odd-numbered memories are connected to the odd-numbered input terminals, the input data can be passed to the product-sum operation module 23 without any problem.
  • a plurality of different data types of floating-point numbers are used to represent data.
  • Specifically, the host-side RAM 12a, the FPGA-side RAM 12b, the input buffer 22, and the output buffer 24 use PFU-FP8, which is a unique 8-bit floating-point type.
  • In addition, PFU-FP9, which is a unique 9-bit floating-point type, and FP32, which is the single-precision floating-point type of the IEEE 754 standard, are used (see FIG. 2).
  • FIG. 8 is a diagram showing a data structure of the unique 9-bit floating point type PFU-FP9 used in the present embodiment.
  • As illustrated, the floating-point type PFU-FP9 has a data structure for recording a floating-point number with a 9-bit width: the first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the four bits from the second bit to the fifth bit are the exponent part, and the four bits from the sixth bit to the ninth bit are the mantissa part.
  • FIG. 9 is a diagram showing a data structure of the unique 8-bit floating point type PFU-FP8 used in the present embodiment.
  • As illustrated, the floating-point type PFU-FP8 has a data structure for recording a floating-point number with an 8-bit width: the first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the three bits from the second bit to the fourth bit are the exponent part, and the four bits from the fifth bit to the eighth bit are the mantissa part.
  • For the range of exponents from 2^-1 to 2^-7 (exponent-part bits from "111" to "001"), the exponent is determined in the same way as in common floating-point data.
  • For the exponents from 2^-8 to 2^-11, the bits of the exponent part are "000", and a part of the mantissa part is used as an extended exponent part expressing a part of the exponent; the extended exponent part and the exponent part are combined to represent the actual exponent.
  • Specifically, the bit width used as the extended exponent part is expanded in order from the lower bit (the eighth bit) in accordance with the value of the exponent to be expressed, and the exponent of the floating-point number is represented by which bit of the extended exponent part holds the flag.
  • That is, when the value of the exponent part is "000" and the flag ("1") is in the eighth bit, the exponent is 2^-8; when the flag is in the seventh bit, the exponent is 2^-9; when the flag is in the sixth bit, the exponent is 2^-10; and when the flag is in the fifth bit, the exponent is 2^-11. According to such a unique floating-point type, the precision of the data is coarse, but the representable dynamic range is widened, so that the inference accuracy of the CNN can be improved.
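
To make the encoding concrete, here is a hedged Python decoder for PFU-FP8 as read from FIG. 9 and the bullets above. It assumes that the flag is the lowest set mantissa bit, that the extended-exponent bits below it are zero, that the mantissa bits above it remain fraction bits, and that an implicit leading 1 applies in both modes; these readings are consistent with the conversion rule described later, but they are an interpretation, not a normative definition.

```python
def decode_pfu_fp8(byte: int) -> float:
    """Decode one PFU-FP8 value: 1 sign bit, 3 exponent bits, 4 mantissa bits."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp_bits = (byte >> 4) & 0b111
    frac = byte & 0b1111
    if exp_bits != 0:
        # Ordinary case: exponent bits "111".."001" give 2^-1..2^-7.
        return sign * (1 + frac / 16) * 2.0 ** (exp_bits - 8)
    if frac == 0:
        return sign * 0.0
    # Extended-exponent case: flag = lowest set mantissa bit.
    p = (frac & -frac).bit_length() - 1   # 0 = 8th bit ... 3 = 5th bit
    exponent = -(8 + p)                   # flag position selects 2^-8..2^-11
    upper = frac >> (p + 1)               # surviving fraction bits
    return sign * (1 + upper / (1 << (3 - p))) * 2.0 ** exponent

# 0|000|1000: flag in the 5th bit -> 2^-11 (the smallest magnitude).
print(decode_pfu_fp8(0b0_000_1000))   # 0.00048828125 == 2**-11
# 0|000|0110: flag in the 7th bit -> (1 + 1/4) * 2^-9.
print(decode_pfu_fp8(0b0_000_0110))
```
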
  • FIG. 10 is a diagram schematically illustrating a functional configuration of a conversion circuit from the floating-point PFU-FP8 to the floating-point PFU-FP9 in the present embodiment.
  • the conversion circuit includes a receiving unit 31, an exponent output unit 32, and a mantissa output unit 33.
  • the receiving unit 31 receives the sign of the floating-point PFU-FP8, the exponent of the floating-point PFU-FP8, and the mantissa of the floating-point PFU-FP8.
  • The exponent output unit 32 outputs the exponent calculated with reference to the exponent part and the extended exponent part of the floating-point type PFU-FP8 as the value recorded in the exponent part of the floating-point type PFU-FP9, which has a wider bit width than the exponent part of the floating-point type PFU-FP8.
  • The mantissa output unit 33 outputs, as they are, the values of the bits of the mantissa part of the floating-point type PFU-FP8 that are not used as the extended exponent part, and outputs the values of the bits that are used as the extended exponent part as 0, thereby producing the value recorded in the mantissa part of the floating-point type PFU-FP9.
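
A bit-level sketch of the two output means under the same assumed reading as the decoder above. The PFU-FP9 bias is an assumption (the FP9 exponent code is taken as exponent + 12 so that 2^-1..2^-11 map to nonzero codes); FIGS. 10 and 11 are the authoritative description.

```python
def pfu_fp8_to_pfu_fp9(byte: int) -> int:
    """Convert a PFU-FP8 bit pattern (1/3/4) to a PFU-FP9 bit pattern (1/4/4)."""
    sign = (byte >> 7) & 1
    exp_bits = (byte >> 4) & 0b111
    frac = byte & 0b1111
    if exp_bits != 0:
        exponent = exp_bits - 8                 # 2^-1 .. 2^-7
        frac9 = frac                            # no bit is used as extended
                                                # exponent: pass all through
    elif frac == 0:
        return sign << 8                        # zero stays zero
    else:
        p = (frac & -frac).bit_length() - 1     # extended-exponent flag position
        exponent = -(8 + p)                     # 2^-8 .. 2^-11
        frac9 = frac & ~((1 << (p + 1)) - 1)    # bits used as extended exponent
                                                # are output as 0; the rest pass
                                                # through in place
    exp9 = exponent + 12                        # assumed PFU-FP9 bias
    return (sign << 8) | (exp9 << 4) | frac9
```

In this reading, no bit shift of the mantissa is needed, because the surviving fraction bits already sit in their final positions; this contrasts with the bit shift performed by the circuit (4) in the IEEE conversion described below.
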
  • FIG. 11 is a diagram showing a conversion circuit from the floating-point PFU-FP8 to the floating-point PFU-FP9 in the present embodiment.
  • In FIG. 11, FP8_f0 to 3 indicate the inputs of the mantissa of the floating-point type PFU-FP8, FP8_exp0 to 2 indicate the inputs of the exponent of the floating-point type PFU-FP8, and FP8_sign indicates the input of the sign of the floating-point type PFU-FP8.
  • Similarly, FP9_f0 to 3 indicate the outputs of the mantissa of the floating-point type PFU-FP9, FP9_exp0 to 3 indicate the outputs of the exponent of the floating-point type PFU-FP9, and FP9_sign indicates the output of the sign of the floating-point type PFU-FP9.
  • FIG. 12 is a diagram showing a conversion circuit from a floating-point type FP8 (IEEE) to a floating-point type FP9 (IEEE) in the prior art.
  • In FIG. 12, FP8_f0 to 3 indicate the inputs of the mantissa of the floating-point type FP8 (IEEE), FP8_exp0 to 2 indicate the inputs of the exponent of the floating-point type FP8 (IEEE), and FP8_sign indicates the input of the sign of the floating-point type FP8 (IEEE).
  • Similarly, FP9_f0 to 3 indicate the outputs of the mantissa of the floating-point type FP9 (IEEE), FP9_exp0 to 3 indicate the outputs of the exponent of the floating-point type FP9 (IEEE), and FP9_sign indicates the output of the sign of the floating-point type FP9 (IEEE).
  • The circuit (1) and the circuit (5) indicated by broken lines in FIGS. 11 and 12 are circuits for determining whether or not the bits of the exponent part are all 0.
  • The circuit (2) and the circuit (6) are circuits that calculate the exponent indicated by the mantissa part when the bits of the exponent part are all 0 (that is, in the case of a denormalized number).
  • The circuit (3) and the circuit (7) are circuits for obtaining the value of the exponent part when the denormalized number is expressed as a normalized number in FP9.
  • The circuit (4) is a circuit for converting the mantissa part of a denormalized number in the floating-point type FP8 (IEEE) into the mantissa part of a normalized number in the floating-point type FP9 (IEEE), and performs a bit shift.
  • The circuit (8) is a circuit for converting the mantissa part used when expressing a denormalized number in the floating-point type PFU-FP8 into the mantissa part of a normalized number in the floating-point type PFU-FP9.
  • the floating-point type PFU-FP8 and the floating-point type PFU-FP9 are employed as the unique floating-point types, but other floating-point types may be employed.
  • FIG. 13 is a diagram schematically illustrating a functional configuration of the CNN processing system 1b according to the present embodiment.
  • The functional configuration of the CNN processing system 1b is substantially the same as that of the CNN processing system 1 described with reference to FIG. 2, except that the input buffer 22, the Winograd conversion process, the accumulation addition pipeline 25, and the like are omitted, and that the floating-point types employed are different.
  • Specifically, the host-side RAM 12a, the FPGA-side RAM 12b, the input buffer 22, and the output buffer 24 use PFU-FP6, which is a unique 6-bit floating-point type.
  • In addition, PFU-FP7, which is a unique 7-bit floating-point type, and FP32, which is the single-precision floating-point type of the IEEE 754 standard, are used.
  • FIG. 14 is a diagram showing a data structure of a unique 7-bit floating point type PFU-FP7 used in the present embodiment.
  • As illustrated, the floating-point type PFU-FP7 has a data structure for recording a floating-point number with a 7-bit width: the first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the four bits from the second bit to the fifth bit are the exponent part, and the two bits from the sixth bit to the seventh bit are the mantissa part.
  • FIG. 15 is a diagram showing a data structure of a unique 6-bit floating point type PFU-FP6 used in the present embodiment.
  • As illustrated, the floating-point type PFU-FP6 has a data structure for recording a floating-point number with a 6-bit width: the first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the three bits from the second bit to the fourth bit are the exponent part, and the two bits from the fifth bit to the sixth bit are the mantissa part.
  • In the floating-point type PFU-FP6, when the value of the exponent part becomes a predetermined value, one bit of the mantissa part serves as an extended exponent part that expresses a part of the exponent (see the part surrounded by a broken line in the figure).
  • The exponent of the floating-point number is represented by the combination of the state of the flag in the extended exponent part and the value of the exponent part.
  • Specifically, the exponents 2^-6 to 2^-11 are represented by the 4-bit combinations of the exponent part and the extended exponent part, from "0101" to "0000". According to such a unique floating-point type, the precision of the data is coarse, but the representable dynamic range is widened, so that the inference accuracy of the CNN can be improved.
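
Read this way, the combination of the exponent part and the extended exponent part behaves as a single 4-bit code; the linear mapping below is inferred from the two endpoints given above ("0101" -> 2^-6, "0000" -> 2^-11) and is shown only as a sketch.

```python
# 4-bit combination c of exponent part and extended exponent part:
# "0101" -> 2^-6 down to "0000" -> 2^-11, i.e. exponent = c - 11.
for c in range(0b0101, -1, -1):
    print(f"{c:04b} -> 2^{c - 11}")
```
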
  • As described above, according to the CNN processing system 1 of the present embodiment, the number of connection lines between the memories 0 to n-1 of the input buffer 22 and the input terminals 0 to n-1 of the product-sum operation module 23 is reduced, so that the circuit scale (the scale of the physical circuit, or of the logic circuit in a programmable device) can be reduced.
  • In the present embodiment, the circuit scale for inputting data to the arithmetic module that performs the convolution operation is reduced to 1/4.
  • Further, according to the unique floating-point types of the present embodiment, when the value of the exponent part becomes a predetermined value, a part of the mantissa part becomes an extended exponent part expressing a part of the exponent, and the extended exponent part and the exponent part are combined to represent the exponent of the floating-point number. As a result, the precision of the data is coarse, but the representable dynamic range is widened, and the inference accuracy of the CNN can be improved.
  • Moreover, the cost of converting between these floating-point types is small, whether the conversion circuit is implemented as a physical circuit or as a logic circuit. This makes it possible to improve situations in which resources such as memory and logic circuits are scarce, and productivity and customizability can be improved, for example when employing a dedicated device such as an ASIC.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Nonlinear Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to a data structure for recording a floating-point number, the data structure comprising an exponent part in which the exponent of the floating-point number is recorded with a prescribed bit width, and a mantissa part in which the mantissa of the floating-point number is recorded with a prescribed bit width, wherein a portion of the mantissa part becomes an extended exponent part expressing a portion of the exponent when the value of the exponent part reaches a prescribed value, and the extended exponent part and the exponent part in combination represent the exponent of the floating-point number.
PCT/JP2018/034779 2018-09-20 2018-09-20 Data structure, information processing device, method, and program WO2020059074A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2018/034779 WO2020059074A1 (fr) 2018-09-20 2018-09-20 Data structure, information processing device, method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2018/034779 WO2020059074A1 (fr) 2018-09-20 2018-09-20 Data structure, information processing device, method, and program

Publications (1)

Publication Number Publication Date
WO2020059074A1 true WO2020059074A1 (fr) 2020-03-26

Family

ID=69888561

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/034779 WO2020059074A1 (fr) 2018-09-20 2018-09-20 Data structure, information processing device, method, and program

Country Status (1)

Country Link
WO (1) WO2020059074A1 (fr)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63259720A (ja) * 1988-03-25 1988-10-26 Hitachi Ltd Floating-point multiplication circuit
JP2010027049A (ja) * 2008-07-22 2010-02-04 Internatl Business Mach Corp <Ibm> Circuit arrangement, integrated circuit device, program product, and method using a floating-point execution unit (dynamic range-adjusted floating-point execution unit)

Similar Documents

Publication Publication Date Title
CN111652368B (zh) Data processing method and related products
CN109685198B (zh) Method and apparatus for quantizing parameters of a neural network
CN108701250B (zh) Data fixed-point method and device
CN108337000B (zh) Automatic methods for conversion to a lower-precision data format
US20210263995A1 (en) Reduced dot product computation circuit
CN110413255B (zh) Artificial neural network adjustment method and device
CN108139885B (zh) Floating-point number rounding
JP2018156451A (ja) Network learning device, network learning system, network learning method, and program
CN112771547A (zh) End-to-end learning in communication systems
WO2020059073A1 (fr) Information processing device, method, and program
US11288597B2 (en) Computer-readable recording medium having stored therein training program, training method, and information processing apparatus
JP2022512211A (ja) Image processing method and device, vehicle-mounted computing platform, electronic apparatus, and system
CN110337636A (zh) Data conversion method and device
US20230133337A1 (en) Quantization calibration method, computing device and computer readable storage medium
JP5619326B2 (ja) Encoding device, decoding device, encoding method, encoding program, decoding method, and decoding program
CN112561050B (zh) Neural network model training method and device
WO2020059074A1 (fr) Data structure, information processing device, method, and program
US20230161555A1 (en) System and method performing floating-point operations
CN112308226B (zh) Quantization of neural network models, and method and device for outputting information
JP7506276B2 (ja) Implementation and method for processing a neural network in semiconductor hardware
KR20220018199A (ko) Arithmetic device using sparse data and operating method thereof
US20240202501A1 (en) System and method for mathematical modeling of hardware quantization process
JP6749530B1 (ja) Structure conversion device, structure conversion method, and structure conversion program
JP2024517707A (ja) Implementation and method for processing a neural network in semiconductor hardware
TWI819627B (zh) Optimization method, computing device, and computer-readable medium for deep learning networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18933935

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18933935

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP