WO2020059074A1

WO2020059074A1 - Data construct, information processing device, method, and program

Info

Publication number: WO2020059074A1
Application number: PCT/JP2018/034779
Authority: WO
Inventors: 恭啓山崎
Original assignee: 株式会社Pfu
Priority date: 2018-09-20
Filing date: 2018-09-20
Publication date: 2020-03-26

Abstract

A data construct for recording a floating-point number, said data construct comprising an exponent part in which the exponent of the floating-point number is recorded among a prescribed bit width, and a significand part in which the significand of the floating-point number is recorded among a prescribed bit width, wherein some of the significand part becomes an extended exponent part that expresses some of the exponent if the value of the exponent part reaches a prescribed value, and the extended exponent part and the exponent part combined represent the exponent of the floating-point number.

Description

Data structure, information processing device, method and program

The present disclosure relates to information processing technology.

Conventionally, the following have been proposed as information processing using floating-point numbers: a technology for compressing audio data by combining a fixed-point method and a floating-point method, using a flag indicating whether a value is expressed in fixed-point or floating-point form (see Patent Document 1); a circuit for changing the bit length of data expressed in the IEEE 754 floating-point format (see Patent Document 2); and a technology for reducing the scale of a circuit that handles denormalized numbers in the IEEE 754 floating-point format (see Patent Document 3). In addition, in order to represent the input, output, and weight data of each layer of a convolutional neural network (Convolutional Neural Network; hereinafter referred to as “CNN”) with 8-bit fixed-point numbers while degrading recognition accuracy as little as possible, a method has been proposed in which a calibration data set is first inferred using a floating-point representation, and the distribution of the data of each layer obtained therefrom and the distribution obtained by quantizing that data are used to calculate a scale factor that minimizes information loss (see Non-Patent Document 1). Further, as information processing using floating-point numbers having a relatively small bit width, techniques have been proposed for improving the inference accuracy of CNN operations by using a unique floating-point number representation (ms-fp8) (see Non-Patent Document 2) or low-precision floating-point number representations defined by IEEE (FP8 / FP7 / FP6) (see Non-Patent Document 3).

Patent Document 1: JP 2002-271207 A
Patent Document 2: JP 2012-205005 A
Patent Document 3: JP 2006-318382 A

Conventionally, floating point numbers have been used in information processing, and the precision of numerical values that can be represented by floating point numbers depends on the bit width of the floating point type used, particularly the bit width of the mantissa. On the other hand, among information processing using floating-point numbers, there is an information processing method that can obtain an advantage by expanding a dynamic range that can be represented by data even if data accuracy (sampling width) is reduced. For example, in the CNN operation, a floating-point number is used for data representation, but the inference accuracy of CNN is more dependent on the dynamic range of data than the accuracy of data.

The present disclosure has been made in view of the above-described problems, and an object thereof is to provide a data structure for a floating-point number suitable for specific information processing in which an advantage is obtained by expanding the dynamic range that can be represented by data, even if data accuracy is reduced.

An example of the present disclosure is a data structure for recording a floating-point number with a predetermined bit width in a storage device of an information processing device, the data structure including: an exponent part, within the predetermined bit width, in which an exponent of the floating-point number is recorded; and a mantissa part, within the predetermined bit width, in which a mantissa of the floating-point number is recorded, wherein, when the value of the exponent part becomes a predetermined value, a part of the mantissa part becomes an extended exponent part that expresses a part of the exponent, and the extended exponent part and the exponent part are combined to represent the exponent of the floating-point number.

Furthermore, an example of the present disclosure is an information processing apparatus including: receiving means for receiving an input of a floating-point number recorded in a data structure having a first exponent part, a first mantissa part, and an extended exponent part that uses a part of the first mantissa part to express a part of an exponent when the value of the first exponent part becomes a predetermined value, the bit width used as the extended exponent part being extended in order from the lower bit according to the value of the exponent to be expressed; exponent output means for outputting the exponent calculated with reference to the first exponent part and the extended exponent part as a value to be recorded in a second exponent part having a wider bit width than the first exponent part; and mantissa output means for outputting a value to be recorded in a second mantissa part by outputting, as they are, the values of the bits of the first mantissa part that are not used as the extended exponent part and outputting 0 as the values of the bits that are used as the extended exponent part.

The present disclosure can be understood as an information processing device, a system, a method executed by a computer, or a program executed by a computer. In addition, the present disclosure can be understood as such a program recorded on a recording medium readable by a computer, another device, a machine, or the like. Here, a computer-readable recording medium refers to a recording medium that stores information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action and that can be read by a computer or the like.

According to the present disclosure, it is possible to provide a data structure for a floating-point number suitable for specific information processing in which an advantage is obtained by expanding the dynamic range that can be represented by data, even if data accuracy is reduced.

FIG. 1 is a schematic diagram illustrating a hardware configuration of a CNN processing system according to the embodiment.
FIG. 2 is a diagram illustrating an outline of a functional configuration of the CNN processing system according to the embodiment.
FIG. 3 is a diagram illustrating an outline of a connection circuit (Rotate circuit) according to the embodiment.
FIG. 4 is a flowchart (A) illustrating an outline of a flow of a control process of the Rotate circuit according to the embodiment.
FIG. 5 is a flowchart (B) illustrating an outline of a flow of a control process of the Rotate circuit according to the embodiment.
FIG. 6 is a diagram illustrating a relationship among a remainder x_mod6, a control signal SEL, and a read signal RD in the embodiment.
FIG. 7 is a time chart of signals used in the input buffer and the connection circuit when the control process according to the embodiment is executed.
FIG. 8 is a diagram illustrating a data structure of a unique 9-bit floating-point type PFU-FP9 used in the embodiment.
FIG. 9 is a diagram illustrating a data structure of a unique 8-bit floating-point type PFU-FP8 used in the embodiment.
FIG. 10 is a diagram schematically illustrating a functional configuration of a conversion circuit from the floating-point type PFU-FP8 to the floating-point type PFU-FP9 in the embodiment.
FIG. 11 is a diagram illustrating a conversion circuit from the floating-point type PFU-FP8 to the floating-point type PFU-FP9 in the embodiment.
FIG. 12 is a diagram illustrating a conversion circuit from a floating-point type FP8 (IEEE) to a floating-point type FP9 (IEEE) in the prior art.
FIG. 13 is a diagram illustrating an outline of a functional configuration of a variation of the CNN processing system according to the embodiment.
FIG. 14 is a diagram illustrating a data structure of a unique 7-bit floating-point type PFU-FP7 used in the embodiment.
FIG. 15 is a diagram illustrating a data structure of a unique 6-bit floating-point type PFU-FP6 used in the embodiment.

Hereinafter, embodiments of a data structure, an information processing device, a method, and a program according to the present disclosure will be described with reference to the drawings. However, the embodiments described below exemplify the embodiments, and do not limit the data structure, the information processing device, the method, and the program according to the present disclosure to the specific configurations described below. In the implementation, a specific configuration according to the embodiment is appropriately adopted, and various improvements and modifications may be made.

In the description of the embodiment, an embodiment will be described in which the data structure, the information processing device, the method, and the program according to the present disclosure are implemented in a system for performing a convolutional neural network operation. The data structure, the information processing device, the method, and the program according to the present disclosure can be widely used in information processing, and the application target of the present disclosure is not limited to the examples described in the embodiments.

<System configuration>
FIG. 1 is a schematic diagram illustrating a hardware configuration of a convolutional neural network (CNN) processing system 1 according to the present embodiment. The CNN processing system 1 according to the present embodiment includes a CPU (Central Processing Unit) 11, a host-side RAM (Random Access Memory) 12a, an FPGA-side RAM 12b, a ROM (Read Only Memory), a storage device 14 such as an EEPROM (Electrically Erasable and Programmable Read Only Memory) or a hard disk drive (HDD), a communication unit such as a network interface card (NIC) 15, a field-programmable gate array (FPGA) 16, and the like.

In the present embodiment, an example of processing an image input from an external camera connected to the CNN processing system 1 will be described. That is, in the present embodiment, image data that is obtained by being imaged by an external camera and includes a plurality of pixels arranged in a predetermined order is used as input data. However, the type of the input data is not limited to the image data, and the elements constituting the input data are not limited to the pixels. The CNN processing system 1 can handle various data such as natural language data, game data, and time-series data as learning / inference targets.

The CNN processing system 1 according to the present embodiment is a system in which a host machine equipped with the CPU 11 uses an FPGA as an accelerator. The image data obtained from the external camera is read into the FPGA-side RAM 12b via the host-side RAM 12a, and an inference operation or the like is performed in the FPGA under the control of the CPU 11. The output data obtained as the calculation result is transmitted to the outside by the CPU 11 using the NIC 15 and put to use.

FIG. 2 is a diagram schematically illustrating a functional configuration of the CNN processing system 1 according to the present embodiment. In the CNN processing system 1, the programs recorded in the storage device 14 are read out to the RAMs 12a and 12b and executed by the CPU 11 and / or the FPGA 16, and the hardware provided in the server 50 is controlled, whereby the system functions as an information processing apparatus including the input data reading unit 21, the input buffer 22, the product-sum operation module 23, the output buffer 24, the accumulation addition pipeline 25, the weight data reading unit 26, and the weight buffer 27 shown in FIG. 2. In the present embodiment and other embodiments described later, each function of the CNN processing system 1 is executed by the CPU 11 and / or the FPGA 16, which are general-purpose processors. Alternatively, the functions may be executed by a plurality of dedicated processors.

The input data reading unit 21 reads, from the FPGA-side RAM 12b, input data (image data in the present embodiment) including a plurality of elements (pixels in the present embodiment) arranged in a predetermined order, and writes the read data to the input buffer 22.

The input buffer 22 has α memories 0 to α−1. The input data reading unit 21 stores the elements of the input data one at a time, in the predetermined order, in the memories 0 to α−1 starting from the first memory 0; when the last memory α−1 is reached, it returns to the first memory 0 and continues storing the elements in the predetermined order. In the present embodiment, for simplicity, only processing in the X-axis direction is described; in an actual system, however, the input data may be two-dimensional, having a width in both the X-axis direction and the Y-axis direction. In that case, the size of the input data is α * α (that is, 6 * 6 = 36 if α = 6), and the number of memories in the input buffer 22 is also α * α.

The product-sum operation module 23 receives input data from the input buffer 22 with an even input width α, receives weight data from the weight data reading unit 26 with an odd number of taps r (the weight data width, which corresponds to the kernel width kw in the CNN), performs a Winograd conversion process, a weighting process, and a Winograd inverse conversion process, and outputs output data with an even output width m.

More specifically, the product-sum operation module 23 performs the Winograd conversion process for input data on the input data, and performs the Winograd conversion process for weight data on the weight data. Because the result of the Winograd conversion process of the weight data is used a plurality of times, it is recorded in the weight buffer 27. Then, the product-sum operation module 23 performs a product-sum operation using the input data and the weight data to which the Winograd conversion process has been applied, performs the Winograd inverse conversion process on the result, and obtains output data. The obtained output data may be subjected to a normalization process or a bias process, an activation process using a so-called ReLU (Rectified Linear Unit) function, or the like. The output data is rearranged in the output buffer 24 so that the writing order is sequential, and is written to the FPGA-side RAM 12b via the accumulation addition pipeline 25.

Here, due to the nature of the Winograd conversion process, the relationship α = m + r−1 holds between the input width α, the output width m, and the number of taps r. Even if the output width m and the number of taps r change, the input width α is preferably set to an even fixed value so that all the multipliers operate and the maximum performance is always obtained. In the CNN, an odd value such as 1, 3, 5, 7, or 11 is often used for the kernel width kw corresponding to the number of taps r. Although it is possible to design a CNN having an even kernel width kw, odd values are used because, when the convolution operation is performed with the same number of padding pixels ((kw−1) ÷ 2) inserted at the left and right ends of the input image, the size of the input image and the size of the output image can be made the same. As a result, in the Winograd conversion process, the input width α is often even, the output width m even, and the number of taps r odd, so that the technology according to the present disclosure can be used.
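As a small sketch of the relationship above, the output width m implied by α = m + r−1 can be tabulated for the fixed input width α = 6 of the present embodiment. The function name `output_width` is hypothetical and is introduced only for illustration.

```python
def output_width(alpha: int, r: int) -> int:
    """Output width m implied by alpha = m + r - 1 (hypothetical helper)."""
    m = alpha - r + 1
    if m <= 0:
        raise ValueError("number of taps r is too large for input width alpha")
    return m


ALPHA = 6  # even, fixed input width used in the embodiment

for r in (1, 3, 5):  # odd kernel widths kw (= number of taps r)
    m = output_width(ALPHA, r)
    pad = (r - 1) // 2  # padding pixels per side that keep input and output sizes equal
    print(f"r={r}: output width m={m}, padding per side={pad}")
```

With α fixed at 6, every odd r in this range yields an even m, which is the property the connection circuit described later relies on.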

FIG. 3 is a diagram showing an outline of a connection circuit (Rotate circuit) according to the present embodiment. The connection circuit is arranged between the memories 0 to α-1 of the input buffer 22 and the input terminals 0 to α-1 of the product-sum operation module 23 described above.

The connection circuit connects the memories 0 to α-1 to the input terminals 0 to α-1 through which the product-sum operation module 23 receives input data with the input width α. In the present embodiment, the connection circuit connects the odd-numbered input terminals to the odd-numbered memories and the even-numbered input terminals to the even-numbered memories. The other connections between memories and input terminals (specifically, connections between odd-numbered input terminals and even-numbered memories, and between even-numbered input terminals and odd-numbered memories) may be omitted; in the present embodiment, the input terminals 0 to α-1 and the memories 0 to α-1 are connected only odd to odd and even to even.

FIG. 3 shows an example of connection lines when the input width α is 6, and RAMs 0 to 5 in the figure correspond to memories 0 to 5. It can be seen that the memory 0 is connected only to the input terminals 0, 2, and 4, that the memory 1 is connected only to the input terminals 1, 3, and 5, and that the connections between the memory 0 and the input terminals 1, 3, and 5 and between the memory 1 and the input terminals 0, 2, and 4 have been omitted. The same applies to the memories 2 to 5. Note that the connection circuit may be a physical circuit or a logic circuit in a programmable device.

When data from the input buffer 22 is input to the product-sum operation module 23, the control unit (selector) controls which of the memories 0 to α-1 supplies data to which of the input terminals 0 to α-1. Specifically, the control unit performs control so that the number of the memory whose data is input to an input terminal in the i-th process is the remainder obtained by dividing “input terminal number + (output width m * (i−1))” by the input width α. In actual control, the content of the control signal from the control unit may be determined by a calculation formula, or may be determined by referring to a map, a table, or the like. A specific flow of processing by the control unit will be described later using a flowchart.
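The selection rule stated above can be checked with a short sketch. It enumerates the (memory, input terminal) pairs produced by the formula for α = 6 and the even output widths 2, 4, and 6, and confirms that a memory and its terminal always share the same parity, which is why the cross-parity connections can be omitted. The function names and the number of steps examined are illustrative assumptions.

```python
def memory_for_terminal(terminal: int, i: int, m: int, alpha: int = 6) -> int:
    """Memory number feeding `terminal` in the i-th process (i starts at 1)."""
    return (terminal + m * (i - 1)) % alpha


def needed_connections(alpha: int = 6, outputs=(2, 4, 6), steps: int = 12):
    """Collect every (memory, terminal) pair the formula ever selects."""
    pairs = set()
    for m in outputs:
        for i in range(1, steps + 1):
            for t in range(alpha):
                pairs.add((memory_for_terminal(t, i, m, alpha), t))
    return pairs


pairs = needed_connections()
# Because alpha and m are both even, memory and terminal numbers share parity.
assert all(mem % 2 == t % 2 for mem, t in pairs)
print(len(pairs), "of", 6 * 6, "possible connections are needed")
```

Under these assumptions only half of the 36 possible connection lines are ever selected.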

In FIG. 3, memories 0 to 5 are storage areas in the input buffer 22 when the input width α is 6. Pixel data D0 to D23 in the image data are stored in the memories 0 to 5. As described above, the input data reading unit 21 stores the elements of the input data one at a time, in the predetermined order, in the memories 0 to α−1 starting from the first memory 0; when the last memory α−1 is reached, it returns to the first memory 0 and continues storing the elements in the predetermined order. Therefore, the memory 0 stores the pixel data D0, D6, D12, and D18 in order from the top, and the memory 1 stores D1, D7, D13, and D19 in order from the top. The same applies to the memories 2 to 5 (see FIG. 3). Pixel data stored in each of the memories 0 to 5 is specified by a variable indicating an address in the memory. In the present embodiment, the variables indicating the addresses in the memories 0 to 5 are the addresses RA0 to RA5.
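The storage order described above amounts to a round-robin distribution: element n goes to memory n mod α at address n div α. A minimal sketch follows, in which the function name `fill_input_buffer` and the use of Python lists as memories are illustrative assumptions.

```python
def fill_input_buffer(elements, alpha: int = 6):
    """Distribute elements round-robin over alpha memories (lists)."""
    memories = [[] for _ in range(alpha)]
    for index, element in enumerate(elements):
        # Element n lands in memory n % alpha, at address n // alpha.
        memories[index % alpha].append(element)
    return memories


pixels = [f"D{n}" for n in range(24)]  # pixel data D0 to D23
memories = fill_input_buffer(pixels)
for k, mem in enumerate(memories):
    print(f"memory {k}: {mem}")
```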

The control unit transmits control signals SEL0 to SEL5 for selecting data to the respective input terminals 0 to 5, thereby controlling which of the memories 0 to 5 sends pixel data to which of the input terminals 0 to 5. Here, the data sent to the input terminals 0 to 5 are the data ROT0 to ROT5, obtained by rotating the image data so that it is left-justified (the lowest-numbered image data comes to I[0]).

<Process flow>
Next, a flow of processing executed by the CNN processing system 1 according to the present embodiment will be described. The specific contents and the processing order of the processing described below are an example for implementing the present disclosure. The specific processing content and processing order may be appropriately selected according to the embodiment of the present disclosure.

FIGS. 4 and 5 are flowcharts showing the outline of the flow of control processing of the connection circuit (Rotate circuit) between the input buffer 22 and the product-sum operation module 23 according to the present embodiment. The processing shown in this flowchart is repeatedly executed during the inference calculation in the CNN processing system 1.

In step S101, parameters are initialized. The control unit initializes to 0 the coordinate x indicating the position in the X-axis direction of an element (pixel data in the present embodiment) in the input data, the remainder x_mod6 obtained when the coordinate x is integer-divided by α (α = 6 in the present embodiment), and the quotient x_div6 obtained when the coordinate x is integer-divided by α. Further, the control unit sets, in the increment x_inc of the coordinate x, a value corresponding to the kernel width kw used when executing the convolution operation. Specifically, in the present embodiment, the control unit sets the increment x_inc to 6 when the kernel width kw is 1, to 4 when the kernel width kw is 3, and to 2 when the kernel width kw is 5. Here, as described above, the relationship “α = m + r−1” holds between the input width α, the output width m, and the number of taps r, with “kernel width kw = number of taps r” and “increment x_inc = output width m”, so the increment x_inc is determined from the input width α and the kernel width kw. Thereafter, the process proceeds to step S102.

In steps S102 to S112, the addresses RA0 to RA5 are set. The control unit sets the current value of the quotient x_div6 in each of the addresses RA0 to RA5, which are the variables indicating the addresses in the memories 0 to 5 (step S102). Then, the control unit compares the current value of the remainder x_mod6 with predetermined values and updates the values of the addresses RA0 to RA5 according to the result of the comparison (steps S103 to S112). Specifically, the control unit sets the value of “x_div6 + 1” in the address RA0 when “x_mod6 >= 1”, in the address RA1 when “x_mod6 >= 2”, in the address RA2 when “x_mod6 >= 3”, in the address RA3 when “x_mod6 >= 4”, and in the address RA4 when “x_mod6 >= 5”. Thereafter, the process proceeds to step S113.

In step S113, the content of the control signal is determined. The control unit determines, according to the value of the remainder x_mod6, which read signal RD each of the control signals SEL0 to SEL5 selects.

FIG. 6 is a diagram showing the relationship between the remainder x_mod6, the control signal SEL, and the read signal RD in the present embodiment. As described above, the relationship “α = m + r−1” holds between the input width α, the output width m, and the number of taps r, with “kernel width kw = number of taps r” and “increment x_inc = output width m”. Therefore, in the Winograd algorithm with the input width α = 6, the increment x_inc is always an even number when the kernel size is any of 1 * 1, 3 * 3, and 5 * 5, and the remainder x_mod6 obtained when the coordinate x is integer-divided by 6 is consequently also an even number.

Here, the number of the read signal RD (= memory number) shown in FIG. 6 is the value obtained by adding the input terminal number to the value of the remainder x_mod6 and taking the remainder of dividing the sum by the input width α (6 in the present embodiment). In other words, the number of the memory whose data is input to an input terminal in the i-th process is the remainder obtained by dividing “input terminal number + (output width m * (i−1))” by the input width α. Thereafter, the process proceeds to step S114.

In step S114, pixel data is read from the input buffer 22 and input to the corresponding input terminal. The control unit outputs the values of the addresses RA0 to RA5 to the corresponding memories 0 to 5, and outputs the values of the control signals SEL0 to SEL5 to the connection circuit (Rotate circuit). As a result, the pixel data at the addresses indicated by the addresses RA0 to RA5 is read from the memories 0 to 5, and is input to the input terminals of the numbers specified by the control signals SEL0 to SEL5. Thereafter, the process proceeds to step S115.

In steps S115 to S117, the parameters are updated. The control unit updates the coordinate x to “the value of the coordinate x before the update + the increment x_inc”, and likewise updates the remainder x_mod6 to “the value of the remainder x_mod6 before the update + the increment x_inc” (step S115). If the updated remainder x_mod6 is equal to or larger than the input width α (6 in the example shown in this flowchart) (step S116), the control unit subtracts the input width α from the remainder x_mod6 so that it becomes smaller than the input width α, and adds 1 to the quotient x_div6 (step S117). Thereafter, the process proceeds to step S118.

In step S118, it is determined whether or not the processing is to end. The control unit determines whether or not the coordinate x updated in step S115 is equal to or larger than the width of the image data in the X-axis direction. If the coordinate x is smaller than the width of the image data in the X-axis direction, unprocessed pixels remain in the X-axis direction, so the process returns to step S102. On the other hand, when the coordinate x is equal to or larger than the width of the image data in the X-axis direction, the processing of this flowchart ends.
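The flow of steps S101 to S118 can be simulated in software as follows. This is only a sketch under stated assumptions: Python lists stand in for the memories 0 to 5, a read past the end of a memory returns `None`, and the generator name `rotate_control` is hypothetical. It is not the circuit of the embodiment.

```python
ALPHA = 6  # input width of the embodiment


def rotate_control(pixels, kw):
    """Yield the values delivered to input terminals 0..5 at each step."""
    # Input buffer: element n is stored in memory n % 6 at address n // 6.
    memories = [pixels[k::ALPHA] for k in range(ALPHA)]
    width = len(pixels)

    # S101: initialise parameters; x_inc equals the output width m.
    x = x_mod6 = x_div6 = 0
    x_inc = {1: 6, 3: 4, 5: 2}[kw]

    while True:
        # S102-S112: memories numbered below x_mod6 are read one address further.
        ra = [x_div6 + 1 if x_mod6 >= k + 1 else x_div6 for k in range(ALPHA)]
        # S113: input terminal t selects memory (x_mod6 + t) mod 6.
        sel = [(x_mod6 + t) % ALPHA for t in range(ALPHA)]
        # S114: read each selected memory at its address (None past the end).
        rot = []
        for t in range(ALPHA):
            mem, addr = memories[sel[t]], ra[sel[t]]
            rot.append(mem[addr] if addr < len(mem) else None)
        yield rot
        # S115-S117: advance x while maintaining x_mod6 and x_div6 without division.
        x += x_inc
        x_mod6 += x_inc
        if x_mod6 >= ALPHA:
            x_mod6 -= ALPHA
            x_div6 += 1
        # S118: finish once x reaches the width of the image data.
        if x >= width:
            break


pixels = [f"D{n}" for n in range(24)]
for window in rotate_control(pixels, kw=3):
    print(window)
```

Each printed row is the left-justified window starting at the coordinate x, so with kw = 3 the second row begins at D4 and runs to D9.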

FIG. 7 is a time chart of signals used in the input buffer 22 and the connection circuit when the control processing according to the present embodiment is executed. From this time chart, it can be seen that input data can be passed to the product-sum operation module 23 without any problem as long as the even-numbered memories are connected to the even-numbered input terminals and the odd-numbered memories are connected to the odd-numbered input terminals.

<Operation using floating point number>
In the CNN processing system 1 according to the present embodiment, a plurality of different floating-point data types are used to represent data. Specifically, the host-side RAM 12a, the FPGA-side RAM 12b, the input buffer 22, and the output buffer 24 use PFU-FP8, which is a unique 8-bit floating-point type. In addition, PFU-FP9, which is a unique 9-bit floating-point type, and FP32, which is the single-precision floating-point type of the IEEE 754 standard, are used (see FIG. 2).

FIG. 8 is a diagram showing a data structure of the unique 9-bit floating-point type PFU-FP9 used in the present embodiment. The floating-point type PFU-FP9 has a data structure for recording a floating-point number with a 9-bit width. The first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the four bits from the second bit to the fifth bit are the exponent part, and the four bits from the sixth bit to the ninth bit are the mantissa part.

FIG. 9 is a diagram showing a data structure of the unique 8-bit floating-point type PFU-FP8 used in the present embodiment. The floating-point type PFU-FP8 has a data structure for recording a floating-point number with an 8-bit width. The first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the three bits from the second bit to the fourth bit are the exponent part, and the four bits from the fifth bit to the eighth bit are the mantissa part. However, while the exponent is determined in the same way as in common floating-point types in the range from 2^-1 to 2^-7 (exponent part bits from “111” to “001”), in the range from 2^-8 to 2^-11 (exponent part bits “000”), a part of the mantissa part is used as an extended exponent part expressing a part of the exponent, and the extended exponent part and the exponent part are combined to represent the actual exponent.

Specifically, in the floating-point type PFU-FP8, when the value of the exponent part is “000”, a part of the mantissa part becomes an extended exponent part expressing a part of the exponent (see the part surrounded by the broken line in the figure). In the extended exponent part, the bit width used as the extended exponent part is extended in order from the lower bit (eighth bit) according to the value of the exponent to be expressed, and the exponent of the floating-point number is represented depending on which bit of the extended exponent part has a flag set.

More specifically, if the value of the exponent part is “000” and the flag (“1”) is in the eighth bit, the exponent is 2^-8. If the value of the exponent part is “000” and the flag (“1”) is in the seventh bit, the exponent is 2^-9. If the value of the exponent part is “000” and the flag (“1”) is in the sixth bit, the exponent is 2^-10. If the value of the exponent part is “000” and the flag (“1”) is in the fifth bit, the exponent is 2^-11. With such a unique floating-point type, the accuracy of data is coarse, but the expressible dynamic range is widened, so that the inference accuracy of the CNN can be improved.
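The exponent rules above can be illustrated with a decoding sketch that returns the sign, the power of two, and the remaining mantissa bits of a PFU-FP8 byte. Two points are assumptions not stated in the text: the flag is taken to be the lowest bit that is set (consistent with the extended exponent part growing from the eighth bit upward), and an all-zero mantissa with exponent part “000” is reported with no exponent. The function name `decode_pfu_fp8` is hypothetical.

```python
def decode_pfu_fp8(byte: int):
    """Return (sign, power_of_two, mantissa_bits) for a PFU-FP8 byte.

    Bit 1 (MSB) is the sign, bits 2-4 the exponent part, bits 5-8 the mantissa part.
    """
    sign = (byte >> 7) & 1
    exp_bits = (byte >> 4) & 0b111
    man_bits = byte & 0b1111

    if exp_bits != 0:
        # "111" -> 2^-1 ... "001" -> 2^-7, as in a common floating-point type.
        return sign, exp_bits - 8, man_bits

    if man_bits == 0:
        # ASSUMPTION: no flag is set, so no exponent is encoded here.
        return sign, None, 0

    # ASSUMPTION: the flag is the lowest set bit of the mantissa part.
    flag = man_bits & -man_bits
    position = flag.bit_length() - 1  # 0 for the 8th bit ... 3 for the 5th bit
    power = -8 - position             # 2^-8 ... 2^-11
    # Bits at or below the flag belong to the extended exponent part.
    return sign, power, man_bits & ~flag


for value in (0b0_111_0000, 0b0_001_1010, 0b0_000_1011, 0b0_000_1110, 0b1_000_1000):
    print(f"{value:08b} ->", decode_pfu_fp8(value))
```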

Furthermore, according to the floating-point type PFU-FP8, conversion to another floating-point type is easy. Hereinafter, it will be described that when the floating-point type according to the present disclosure is adopted, the cost of conversion to another floating-point type is reduced.

FIG. 10 is a diagram schematically illustrating a functional configuration of a conversion circuit from the floating-point PFU-FP8 to the floating-point PFU-FP9 in the present embodiment. The conversion circuit includes a receiving unit 31, an exponent output unit 32, and a mantissa output unit 33.

The receiving unit 31 receives the sign of the floating-point PFU-FP8, the exponent of the floating-point PFU-FP8, and the mantissa of the floating-point PFU-FP8.

The exponent output unit 32 outputs the exponent calculated with reference to the exponent part and the extended exponent part of the floating-point type PFU-FP8 as the value to be recorded in the exponent part of the floating-point type PFU-FP9, which has a wider bit width than the exponent part of the floating-point type PFU-FP8.

The mantissa output unit 33 outputs, as they are, the values of the bits of the mantissa part of the floating-point type PFU-FP8 that are not used as the extended exponent part, and outputs 0 as the values of the bits that are used as the extended exponent part, thereby outputting the value to be recorded in the mantissa part of the floating-point type PFU-FP9.
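The roles of the exponent output unit 32 and the mantissa output unit 33 can be sketched in software as follows. The text does not state how the 4-bit exponent part of PFU-FP9 encodes the exponent, so this sketch assumes, by analogy with PFU-FP8, that an exponent field E represents 2^(E − 16); it also assumes that an all-zero input maps to an all-zero output and that the flag is the lowest set bit of the mantissa part. The function name `pfu_fp8_to_fp9` and the constant `FP9_EXP_OFFSET` are hypothetical.

```python
FP9_EXP_OFFSET = 16  # ASSUMPTION: FP9 exponent field E encodes 2**(E - 16)


def pfu_fp8_to_fp9(fp8: int) -> int:
    """Convert an 8-bit PFU-FP8 pattern to a 9-bit PFU-FP9 pattern (sketch)."""
    sign = (fp8 >> 7) & 1
    exp3 = (fp8 >> 4) & 0b111
    man4 = fp8 & 0b1111

    if exp3 != 0:
        # Exponent output: same power of two, written into the wider field.
        exp4 = (exp3 - 8) + FP9_EXP_OFFSET
        # Mantissa output: no bit is used as extended exponent, copy as is.
        man_out = man4
    elif man4 == 0:
        # ASSUMPTION: the all-zero pattern stays all-zero.
        exp4, man_out = 0, 0
    else:
        flag = man4 & -man4               # ASSUMPTION: lowest set bit is the flag
        position = flag.bit_length() - 1  # 0 -> 2^-8 ... 3 -> 2^-11
        exp4 = (-8 - position) + FP9_EXP_OFFSET
        # Bits below the flag are already 0, so clearing the flag zeroes the
        # whole extended exponent part. No bit shift is involved.
        man_out = man4 & ~flag

    return (sign << 8) | (exp4 << 4) | man_out


for value in (0b0_111_1010, 0b1_000_1011, 0b0_000_1000):
    print(f"{value:08b} -> {pfu_fp8_to_fp9(value):09b}")
```

The point of the sketch is that the mantissa path needs only masking; the remaining mantissa bits stay in place.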

FIG. 11 is a diagram showing a conversion circuit from the floating-point type PFU-FP8 to the floating-point type PFU-FP9 in the present embodiment. In FIG. 11, FP8_f0 to 3 indicate the inputs of the mantissa part of the floating-point type PFU-FP8, FP8_exp0 to 2 indicate the inputs of the exponent part of the floating-point type PFU-FP8, and FP8_sign indicates the input of the sign of the floating-point type PFU-FP8. FP9_f0 to 3 indicate the outputs of the mantissa part of the floating-point type PFU-FP9, FP9_exp0 to 3 indicate the outputs of the exponent part of the floating-point type PFU-FP9, and FP9_sign indicates the output of the sign of the floating-point type PFU-FP9.

FIG. 12 is a diagram showing a conversion circuit from a floating-point type FP8 (IEEE) to a floating-point type FP9 (IEEE) in the prior art. In FIG. 12, FP8_f0 to 3 indicate the inputs of the mantissa part of the floating-point type FP8 (IEEE), FP8_exp0 to 2 indicate the inputs of the exponent part of the floating-point type FP8 (IEEE), and FP8_sign indicates the input of the sign of the floating-point type FP8 (IEEE). FP9_f0 to 3 indicate the outputs of the mantissa part of the floating-point type FP9 (IEEE), FP9_exp0 to 3 indicate the outputs of the exponent part of the floating-point type FP9 (IEEE), and FP9_sign indicates the output of the sign of the floating-point type FP9 (IEEE).

The circuits (1) and (5) indicated by broken lines in FIGS. 11 and 12 are circuits for determining whether all bits of the exponent part are 0. The circuits (2) and (6) are circuits that calculate the exponent indicated by the mantissa part when all bits of the exponent part are 0 (that is, when the number is a denormalized number). The circuits (3) and (7) are circuits for obtaining the value of the exponent part when the denormalized number is expressed as a normalized number in FP9. The circuit (4) is a circuit that converts the mantissa part of a denormalized number in the floating-point type FP8 (IEEE) into the mantissa part of a normalized number in the floating-point type FP9 (IEEE) by performing a bit shift. The circuit (8) is a circuit that converts the mantissa part of a denormalized number in the floating-point type PFU-FP8 into the mantissa part of a normalized number in the floating-point type PFU-FP9.

Here, comparing FIG. 11 with FIG. 12, it can be seen that the circuit scale of the circuit (8) in FIG. 11 is smaller than that of the circuit (4) in FIG. 12. This is because, although the floating-point type PFU-FP8 employs a representation in which, for denormalized numbers, a part of the mantissa part serves as the extended exponent part, the extended exponent part is extended in order from the lowest bit (the eighth bit), so that conversion to the floating-point type PFU-FP9 requires only setting the extended exponent part to "0", without processing such as a bit shift. That is, according to the floating-point type PFU-FP8 described in the present embodiment, a bit-shift circuit such as that of the related art is unnecessary when converting to another floating-point type, and the conversion cost (circuit scale, program size, memory usage, etc.) can be reduced.
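For comparison, the bit shift that the prior-art conversion requires can be sketched as follows in Python. This is an illustration under stated assumptions, not the circuit of FIG. 12: it assumes an IEEE 754-style layout with a 4-bit mantissa and exponent biases of 3 (3-bit exponent) and 7 (4-bit exponent), and the function and constant names are hypothetical.

```python
FP8_BIAS = 3   # assumed bias for a 3-bit exponent: 2**(3 - 1) - 1
FP9_BIAS = 7   # assumed bias for a 4-bit exponent: 2**(4 - 1) - 1
MANT_BITS = 4  # mantissa width of both types


def ieee_fp8_denormal_to_fp9(mantissa: int) -> tuple[int, int]:
    """Convert a nonzero FP8 denormal mantissa to (FP9 exponent field, FP9 mantissa).

    A denormal has an all-zero exponent field and the value
    mantissa * 2**(1 - FP8_BIAS - MANT_BITS).
    """
    if not 0 < mantissa < (1 << MANT_BITS):
        raise ValueError("expected a nonzero 4-bit denormal mantissa")
    lead = mantissa.bit_length() - 1  # position of the leading 1 bit
    # Exponent implied by the leading 1, re-biased for the wider exponent field.
    exp9 = lead + 1 - FP8_BIAS - MANT_BITS + FP9_BIAS
    # Shift the leading 1 out of the field; it becomes the implicit bit.
    man9 = (mantissa << (MANT_BITS - lead)) & ((1 << MANT_BITS) - 1)
    return exp9, man9
```

The shift distance depends on where the leading 1 sits, which is what makes this path costlier than simply forcing the extended exponent bits to 0.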

<Variation>
In the embodiment described above, the floating-point type PFU-FP8 and the floating-point type PFU-FP9 are employed as the unique floating-point types, but other floating-point types may be employed.

FIG. 13 is a diagram schematically illustrating a functional configuration of the CNN processing system 1b according to the present embodiment. The functional configuration of the CNN processing system 1b is substantially the same as that of the CNN processing system 1 described with reference to FIG. 2, except that the input buffer 22, the Winograd conversion process, the cumulative addition pipeline 25, and the like are omitted; however, the floating-point types employed are different. In the CNN processing system 1b, the host-side RAM 12a, the FPGA-side RAM 12b, the input buffer 22, and the output buffer 24 use the unique 6-bit floating-point type PFU-FP6. PFU-FP7, a unique 7-bit floating-point type, and FP32, the single-precision floating-point type of the IEEE 754 standard, are also used.

FIG. 14 is a diagram showing the data structure of the unique 7-bit floating-point type PFU-FP7 used in the present embodiment. The floating-point type PFU-FP7 is a data structure for recording a floating-point number with a 7-bit width, in which the first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the four bits from the second bit to the fifth bit are the exponent part, and the two bits from the sixth bit to the seventh bit are the mantissa part.
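As a minimal sketch of this layout, the three fields of a PFU-FP7 word can be separated with shifts and masks. The Python below is illustrative only: it assumes the first bit is the most significant bit of a 7-bit integer, and the names `PfuFp7Fields` and `unpack_pfu_fp7` are hypothetical. It extracts the raw fields and does not interpret them as a numerical value.

```python
from typing import NamedTuple


class PfuFp7Fields(NamedTuple):
    sign: int      # 1 bit  (first bit)
    exponent: int  # 4 bits (second to fifth bits)
    mantissa: int  # 2 bits (sixth and seventh bits)


def unpack_pfu_fp7(word: int) -> PfuFp7Fields:
    """Split a 7-bit PFU-FP7 word into its raw sign, exponent, and mantissa fields."""
    if not 0 <= word < (1 << 7):
        raise ValueError("word must fit in 7 bits")
    sign = (word >> 6) & 0b1
    exponent = (word >> 2) & 0b1111
    mantissa = word & 0b11
    return PfuFp7Fields(sign, exponent, mantissa)
```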

FIG. 15 is a diagram showing the data structure of the unique 6-bit floating-point type PFU-FP6 used in the present embodiment. The floating-point type PFU-FP6 is a data structure for recording a floating-point number with a 6-bit width, in which the first bit is a sign bit indicating the sign (positive or negative) of the numerical value, the three bits from the second bit to the fourth bit are the exponent part, and the two bits from the fifth bit to the sixth bit are the mantissa part. However, while the exponent is determined in the same manner as in a common floating-point data type for exponents in the range from 2^-1 to 2^-5 (exponent part bits from "111" to "011"), for exponents in the range from 2^-6 to 2^-11 a part of the mantissa part is used as an extended exponent part that expresses a part of the exponent, and the extended exponent part and the exponent part combined represent the actual exponent.

Specifically, in the floating-point type PFU-FP6, when the value of the exponent part is "010", "001", or "000", the second bit from the lowest in the mantissa part (the fifth bit) becomes an extended exponent part that expresses a part of the exponent (see the part surrounded by a broken line in the figure). The exponent of the floating-point number is represented by a combination of the state of the flag in the fifth bit, which is the extended exponent part, and the value of the exponent part.

More specifically, the exponent values 2^-6 to 2^-11 are represented by the 4-bit combinations of the exponent part and the extended exponent part, "0101" to "0000". According to such a unique floating-point type, although the precision of the data is coarse, the expressible dynamic range is widened, so that the inference accuracy of the CNN can be improved.
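The exponent mapping described for PFU-FP6 can be sketched as follows in Python. This is an illustration under assumptions rather than a definitive decoder: it assumes the first bit is the most significant bit, that the extended exponent bit is the upper of the two mantissa bits (the second bit from the lowest), and that both ranges map linearly between the endpoints given above. The function name is hypothetical, and the sketch returns only the power of two, not the full numerical value.

```python
def pfu_fp6_exponent(word: int) -> int:
    """Return n such that the exponent of the 6-bit PFU-FP6 word is 2**n."""
    if not 0 <= word < (1 << 6):
        raise ValueError("word must fit in 6 bits")
    exp_part = (word >> 2) & 0b111  # second to fourth bits
    if exp_part >= 0b011:
        # Ordinary range: "111" -> 2^-1 down to "011" -> 2^-5.
        return exp_part - 8
    # Extended range: the upper mantissa bit joins the exponent part.
    ext_bit = (word >> 1) & 0b1
    combined = (exp_part << 1) | ext_bit  # "0101" down to "0000"
    return combined - 11                  # 2^-6 down to 2^-11
```

Under these assumptions the 6-bit type covers eleven distinct exponents, 2^-1 to 2^-11, compared with the five exponents that the values "111" to "011" of the 3-bit exponent part cover on their own.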

<Effect>
According to the embodiment described above, the number of connection circuits between the memories 0 to α of the input buffer 22 and the input terminals 0 to α of the product-sum operation module 23 is reduced, and the circuit scale (whether of a physical circuit or a logic circuit) can be reduced. For example, when performing the two-dimensional Winograd transform with α = 6, the connection circuits, of which 6 * 6 = 36 were conventionally required, are reduced to 3 * 3 = 9 by the technology according to the present embodiment, and the circuit scale for inputting data to the arithmetic module that performs the Winograd transform is reduced to 1/4. As a result, in the case of a programmable device such as an FPGA, the situation in which resources such as memory and logic circuits are scarce can be improved, and in the case of a dedicated device such as an ASIC, productivity and customizability can be improved.

Further, according to the above-described embodiment, the number of operations can easily be reduced by using the Winograd transform in a CNN implemented on a programmable device such as an FPGA or a dedicated device such as an ASIC, so that CNN operations can be made more efficient and faster.

According to the above-described unique floating-point type, when the value of the exponent part becomes a predetermined value, a part of the mantissa part becomes an extended exponent part expressing a part of the exponent, and the extended exponent part and the exponent part combined represent the exponent. As a result, although the precision of the data is coarse, the expressible dynamic range is widened, and the inference accuracy of the CNN can be improved.

Further, according to the above-described unique floating-point type, the conversion cost of the floating-point type is small, so that the scale of the conversion circuit, whether a physical circuit or a logic circuit, can be reduced. As a result, in the case of a programmable device such as an FPGA, the situation in which resources such as memory and logic circuits are scarce can be improved, and in the case of a dedicated device such as an ASIC, productivity and customizability can be improved.

1 CNN processing system

Claims

A data structure for recording a floating-point number with a predetermined bit width in a storage device of an information processing device, the data structure comprising:
an exponent part in which the exponent of the floating-point number is recorded within the predetermined bit width; and
a mantissa part in which the mantissa of the floating-point number is recorded within the predetermined bit width,
wherein a part of the mantissa part becomes an extended exponent part expressing a part of the exponent when the value of the exponent part becomes a predetermined value, and
the extended exponent part and the exponent part combined represent the exponent of the floating-point number.
The data structure according to claim 1, wherein a part of the mantissa part becomes the extended exponent part expressing a part of the exponent when the value of the exponent part becomes zero.
The data structure according to claim 1, wherein the bit width of the extended exponent part changes according to the value of the exponent to be represented, and the exponent of the floating-point number is represented according to which bit is flagged.
The data structure according to any one of claims 1 to 3, wherein the bit width used as the extended exponent part is extended in order from the lower bit in accordance with the value of the exponent to be represented.
The data structure according to claim 1, wherein the exponent of the floating-point number is represented by a combination of the state of a flag in a predetermined bit serving as the extended exponent part and the value of the exponent part.
An information processing device comprising:
receiving means for receiving an input of a floating-point number recorded in a data structure having a first exponent part, a first mantissa part, and an extended exponent part that expresses a part of the exponent by using a part of the first mantissa part when the value of the first exponent part becomes a predetermined value;
exponent output means for outputting an exponent, calculated with reference to the first exponent part and the extended exponent part, as a value recorded in a second exponent part having a bit width wider than the first exponent part, the bit width used as the extended exponent part being extended in order from the lower bit according to the value of the exponent to be expressed; and
mantissa output means for outputting a value recorded in a second mantissa part by outputting, as they are, the values of the bits of the first mantissa part not used as the extended exponent part and outputting 0 as the values of the bits used as the extended exponent part.
The information processing device according to claim 6, wherein, in the extended exponent part, the exponent of the floating-point number is represented according to which bit is flagged.
The information processing device according to claim 6, wherein the receiving means, the exponent output means, and the mantissa output means are configured as physical circuits.
The information processing device according to claim 6, wherein the receiving means, the exponent output means, and the mantissa output means are configured as logic circuits in a programmable device.
The information processing device according to claim 6, wherein the information processing device performs operations in a convolutional neural network.
A method in which a computer executes:
a receiving step of receiving an input of a floating-point number recorded in a data structure having a first exponent part, a first mantissa part, and an extended exponent part that expresses a part of the exponent by using a part of the first mantissa part when the value of the first exponent part becomes a predetermined value;
an exponent output step of outputting an exponent, calculated with reference to the first exponent part and the extended exponent part, as a value recorded in a second exponent part having a bit width wider than the first exponent part, the bit width used as the extended exponent part being extended in order from the lower bit according to the value of the exponent to be expressed; and
a mantissa output step of outputting a value recorded in a second mantissa part by outputting, as they are, the values of the bits of the first mantissa part not used as the extended exponent part and outputting 0 as the values of the bits used as the extended exponent part.
A program for causing a computer to execute:
a receiving step of receiving an input of a floating-point number recorded in a data structure having a first exponent part, a first mantissa part, and an extended exponent part that expresses a part of the exponent by using a part of the first mantissa part when the value of the first exponent part becomes a predetermined value;
an exponent output step of outputting an exponent, calculated with reference to the first exponent part and the extended exponent part, as a value recorded in a second exponent part having a bit width wider than the first exponent part, the bit width used as the extended exponent part being extended in order from the lower bit according to the value of the exponent to be expressed; and
a mantissa output step of outputting a value recorded in a second mantissa part by outputting, as they are, the values of the bits of the first mantissa part not used as the extended exponent part and outputting 0 as the values of the bits used as the extended exponent part.