WO2020039493A1 - Computation optimization device, method, and program

Info

Publication number: WO2020039493A1
Application number: PCT/JP2018/030769
Authority: WO
Inventors: 芙美代鷹野; 竹中　崇; 誠也柴田; 浩明井上
Original assignee: 日本電気株式会社
Priority date: 2018-08-21
Filing date: 2018-08-21
Publication date: 2020-02-27
Also published as: JPWO2020039493A1; JP6973651B2

Abstract

An explanatory variable value acquisition means 73 performs computation using a discrimination model obtained by combining multiple layers each composed of one or more units, to acquire a certain explanatory variable value for each of application patterns which are information specifying a layer to which a first computation circuit for performing computation with a first computation accuracy is to be applied, and a layer to which a second computation circuit for performing computation with a second computation accuracy higher than the first computation accuracy is to be applied. An objective function calculation means 75 calculates, for the application pattern, the value of an objective function expressed with the certain explanatory variable. An application pattern determination means 77 determines such an application pattern that the value of the objective function becomes minimum among the application patterns.

Description

Arithmetic optimization device, method and program

The present invention relates to, for example, an operation optimization device, an operation optimization method, and an operation optimization program for optimizing an operation using a discriminant model such as a neural network.

Inferences about given data may be made using a model. Such a model is called a discriminant model. For example, image data may be given, and the object represented by the image data (the object shown in the image) may be inferred from the image data and the discrimination model.

A neural network is known as an example of the discrimination model. The neural network is a model in which a plurality of layers are connected, and each layer includes one or more units (neurons). When inference processing is performed using a neural network, input data is input to the input layer, and an operation is performed in the forward direction from the input layer side to the output layer side to obtain an inference result regarding the input data.

Neural networks are trained by deep learning.

A tool for performing an operation using a neural network is described in Non-Patent Document 1, for example.

Some existing tools for performing operations using a neural network can uniformly change the operation accuracy of all layers of the neural network. For example, there is a tool that can set the operation of all layers of the neural network to a floating-point operation or an integer operation. Such a setting is made by a user (human).

As described above, some existing tools for performing operations using a neural network can uniformly change the operation accuracy of all layers of the neural network. However, with such a tool, the calculation accuracy of each layer cannot be set individually. Therefore, it has been difficult to optimize the calculation using the neural network.

For example, when the calculations of all the layers of the neural network are set to floating point calculations, the calculations can be performed with high accuracy, but the efficiency with respect to power consumption and the like is reduced. Conversely, when the calculations of all layers of the neural network are set to integer calculations, the efficiency with respect to power consumption and the like is improved, but the calculation accuracy is reduced.

In addition, with existing tools, the user had to decide whether to use floating-point arithmetic or integer arithmetic as the operation accuracy of the entire neural network.

Therefore, an object of the present invention is to provide an operation optimization device, an operation optimization method, and an operation optimization program that can automatically determine the operation accuracy in each layer of the discrimination model so that the operation using the discrimination model can be optimized.

An operation optimization device according to the present invention includes: explanatory variable value acquiring means for acquiring the value of a predetermined explanatory variable for each application pattern, where an application pattern is information that specifies, for an operation using a discriminant model in which a plurality of layers each composed of one or more units are combined, to which layer a first arithmetic circuit that performs operations with a first operation accuracy is applied and to which layer a second arithmetic circuit that performs operations with a second operation accuracy higher than the first operation accuracy is applied; objective function calculating means for calculating, for each application pattern, the value of an objective function expressed by the predetermined explanatory variable; and application pattern determining means for determining the application pattern for which the value of the objective function is minimized.

In addition, the operation optimization method according to the present invention includes: acquiring the value of a predetermined explanatory variable for each application pattern, where an application pattern is information that specifies, for an operation using a discriminant model in which a plurality of layers each composed of one or more units are combined, to which layer a first arithmetic circuit that performs operations with a first operation accuracy is applied and to which layer a second arithmetic circuit that performs operations with a second operation accuracy higher than the first operation accuracy is applied; calculating, for each application pattern, the value of an objective function expressed by the predetermined explanatory variable; and determining the application pattern for which the value of the objective function is minimized.

In addition, the operation optimization program according to the present invention causes a computer to execute: explanatory variable value acquisition processing for acquiring the value of a predetermined explanatory variable for each application pattern, where an application pattern is information that specifies, for an operation using a discriminant model in which a plurality of layers each composed of one or more units are combined, to which layer a first arithmetic circuit that performs operations with a first operation accuracy is applied and to which layer a second arithmetic circuit that performs operations with a second operation accuracy higher than the first operation accuracy is applied; objective function calculation processing for calculating, for each application pattern, the value of an objective function expressed by the predetermined explanatory variable; and application pattern determination processing for determining the application pattern for which the value of the objective function is minimized.

According to the present invention, the calculation accuracy in each layer of the discrimination model can be automatically determined so that the calculation using the discrimination model can be optimized.

FIG. 1 is a schematic diagram showing inference processing using a neural network.
FIG. 2 is an explanatory diagram showing an example of the input and output of a unit and its connection with other units, focusing on one unit.
FIG. 3 is a schematic diagram showing an example of a processing device that performs the operations of some layers of a neural network with low precision and the operations of the remaining layers with high precision.
FIG. 4 is a schematic configuration diagram showing an example of a low-precision arithmetic circuit.
FIG. 5 is a block diagram showing a configuration example of a MAC.
FIG. 6 is a flowchart showing an example of the progress of inference processing using a neural network.
FIG. 7 is a block diagram showing a configuration example of the operation optimization device of the present invention.
FIG. 8 is a schematic diagram showing examples of application patterns.
FIG. 9 is a block diagram showing a configuration example of an operation optimization device that includes a processing device.
FIG. 10 is a block diagram showing a configuration example of an operation optimization device that includes a design information storage unit.
FIG. 11 is a flowchart showing an example of the processing progress of the operation optimization device according to the embodiment of the present invention.
FIG. 12 is a schematic block diagram showing a configuration example of a computer according to the embodiment of the present invention or a modification thereof.
FIG. 13 is a block diagram showing an outline of the operation optimization device of the present invention.

The operation optimization device of the present invention determines the operation accuracy of each layer in a discrimination model in which a plurality of layers each composed of one or more units are combined. An example of such a discrimination model is a neural network. In the following description, a case where the discrimination model is a neural network will be described as an example. However, the discrimination model is not limited to the neural network.

In the following description, a process of inferring the content indicated by given input data will be described as an example of a process using a neural network. For example, a process will be described in which image data is given and the object represented by the image data (the object in the image) is inferred from the image data and the neural network.

However, the processing using the neural network is not limited to the above-described inference processing, and includes, for example, parameter updating processing for each layer of the neural network. In the embodiment described later, a case will be described as an example where the operation optimization device of the present invention determines the operation accuracy of each layer of the neural network when performing the inference process, but the present invention is also applicable to other processes such as the above-described parameter update process.

FIG. 1 is a schematic diagram showing an inference process using a neural network. In FIG. 1, a unit 51 corresponding to a neuron in the neural network is represented by an ellipse. Each layer has one or more units. A line segment 52 (a line connecting the units in the figure) represents the connection between the units. An arrow 53 (bold arrow pointing right in the figure) schematically represents the inference processing. Although FIG. 1 shows an example of a feedforward neural network in which an input to each unit 51 is an output of a unit in a preceding layer, the input to each unit 51 is not limited to this. For example, when time-series information is held, the input to each unit 51 may include the output of the unit of the preceding layer at the previous time, as in a recurrent neural network. In such a case as well, the direction of the inference processing is considered to be the direction (forward direction) from the input layer to the output layer. Such inference processing performed in a predetermined order from the input layer is also called “forward propagation”. In the following description, the input layer is referred to as a 0th layer, and the output layer is referred to as an nth layer.

FIG. 2 is an explanatory diagram showing an example of the input and output of the unit 51 and its connection with other units, focusing on one unit 51. FIG. 2A shows an example of the input and output of one unit 51, and FIG. 2B shows an example of the coupling between the units 51 arranged in two layers. As shown in FIG. 2A, when one unit 51 has four inputs (x₁ to x₄) and one output (z), the operation of the unit 51 is expressed by, for example, equation (1A). Here, f() represents an activation function.

$$z = f(u) \qquad \text{(1A)}$$
where
$$u = a + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 \qquad \text{(1B)}$$

In equation (1B), a represents an intercept, and w₁ to w₄ represent parameters such as weights corresponding to the respective inputs (x₁ to x₄).

On the other hand, as shown in FIG. 2B, when the units 51 arranged in two layers are connected between the layers, and attention is paid to the latter of the two layers, the output z_i of each unit 51 in that layer with respect to the inputs to the unit (x₁ to x₄) is expressed, for example, as follows. Note that i is an identifier of a unit in the same layer (i = 1 to 3 in this example).

$$z_i = f(u_i) \qquad \text{(2A)}$$
where
$$u_i = a + w_{i,1} x_1 + w_{i,2} x_2 + w_{i,3} x_3 + w_{i,4} x_4 \qquad \text{(2B)}$$

In the following, equation (2B) may be simplified and written as $u_i = \sum_k w_{i,k} x_k$, with the intercept a omitted. The intercept a can also be regarded as the coefficient (one of the parameters) of a constant term whose value is 1. Here, k is an identifier of an input to each unit 51 in the layer. More specifically, k identifies the other unit that supplies the input. If the inputs to each unit 51 in the layer are only the outputs of the units in the preceding layer, the simplified expression can also be written as $u_i^{(L)} = \sum_k w_{i,k}^{(L)} x_k^{(L-1)}$, where L is a layer identifier. In these equations, $w_{i,k}$ is a parameter of unit i in the layer (the L-th layer). More specifically, this parameter corresponds to the weight of the connection between unit i and another unit k (the inter-unit connection). In the following, the function (activation function) for determining the output value of a unit may be simplified as $z = \sum w x$ without distinguishing the unit.

In the above example, for each unit 51 in a certain layer, the operation of obtaining the output z from the input x corresponds to the inference processing in that layer.
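The per-unit operation of equations (2A) and (2B) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `unit_output` and `layer_forward` are hypothetical, the default activation function (`math.tanh`) is an arbitrary choice, and each unit is given its own intercept, whereas the text writes a single symbol a.

```python
import math

def unit_output(a, weights, inputs, f=math.tanh):
    """Equations (2A)/(2B): z_i = f(a + sum_k w_{i,k} * x_k)."""
    u = a + sum(w * x for w, x in zip(weights, inputs))
    return f(u)

def layer_forward(intercepts, weight_rows, inputs, f=math.tanh):
    """Outputs of all units i in one layer, given the inputs x_k to that layer."""
    return [unit_output(a, row, inputs, f)
            for a, row in zip(intercepts, weight_rows)]

# Three units (i = 1 to 3), each receiving four inputs (x_1 to x_4), as in FIG. 2B.
x = [1.0, 2.0, 3.0, 4.0]
W = [[0.1, 0.0, 0.0, 0.0],
     [0.0, 0.1, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.1]]
a = [0.0, 0.0, 0.0]          # assumed: one intercept per unit
identity = lambda u: u       # assumed activation, chosen to keep the numbers readable
z = layer_forward(a, W, x, f=identity)
```

With the identity activation, each output is simply the weighted sum of the inputs, which makes the correspondence with equation (2B) easy to check by hand.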

Before describing the embodiments of the present invention, an example of a processing apparatus that performs low-precision calculations on some of the layers of the neural network and performs high-precision calculations on the remaining layers will be described. FIG. 3 is a schematic diagram showing an example of the above processing apparatus. The processing device 18 includes, for example, a low-precision arithmetic circuit 5, a high-precision arithmetic circuit 6, a first memory 7, a second memory 8, and a third memory 9. The low-precision arithmetic circuit 5, the high-precision arithmetic circuit 6, the first memory 7, the second memory 8, and the third memory 9 are connected via, for example, a bus 10.

In the inference processing, the low-precision arithmetic circuit 5 executes the operations of some of the layers of the neural network with the first operation accuracy.

The first memory 7 is a memory used when the low-precision arithmetic circuit 5 executes an operation. The low-precision arithmetic circuit 5 executes an operation while appropriately accessing the first memory 7.

In the inference processing, the high-precision arithmetic circuit 6 executes the operations of the remaining layers of the neural network with a second operation accuracy higher than the first operation accuracy.

The second memory 8 is a memory used when the high-precision operation circuit 6 executes an operation. The high-precision operation circuit 6 executes an operation while appropriately accessing the second memory 8.

The first memory 7 and the second memory 8 may be realized by different memories, or may be realized by a single memory. When the first memory 7 and the second memory 8 are realized by a single memory, the single memory only needs to be divided into an access area for the low-precision arithmetic circuit 5 and an access area for the high-precision arithmetic circuit 6.

The third memory 9 is a data exchange memory used when the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 exchange data. Note that the third memory 9 need not be provided. That is, the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 may exchange data by communication without passing through the third memory 9 (the memory for data exchange).

The operation accuracy (second operation accuracy) of the high-precision arithmetic circuit 6 is higher than the operation accuracy (first operation accuracy) of the low-precision arithmetic circuit 5. Here, the extent of the range and the fineness of the numerical data used for an operation (more specifically, the extent of the range and the fineness of the numerical data as determined by the bit width and the handling of the decimal point in the arithmetic circuit) is referred to as "accuracy" or "operation accuracy".

Hereinafter, a case will be described as an example where the low-precision arithmetic circuit 5 performs 8-bit integer operations and the high-precision arithmetic circuit 6 performs 32-bit floating-point operations. However, the operation accuracy of the low-precision arithmetic circuit 5 and that of the high-precision arithmetic circuit 6 are not limited to this example, as long as the operation accuracy of the high-precision arithmetic circuit 6 is higher than that of the low-precision arithmetic circuit 5.

The low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 are implemented in, for example, a GPU (Graphics Processing Unit).

FIG. 4 is a schematic configuration diagram showing an example of the low-precision arithmetic circuit 5. As illustrated in FIG. 4, the low-precision arithmetic circuit 5 may have, for example, a configuration in which a plurality of MACs (Multiplier-Accumulators) 221 are connected in parallel.

Similarly, the high-precision arithmetic circuit 6 may have a configuration in which a plurality of MACs are connected in parallel as illustrated in FIG. However, the operation accuracy of the MAC provided in the high-precision operation circuit 6 is higher than the operation accuracy of the MAC 221 provided in the low-accuracy operation circuit 5.

MAC is an example of an arithmetic unit provided in the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6.

FIG. 5 is a block diagram showing a configuration example of the MAC 221. The MAC 221 may include a multiplier 234, an adder 235, storage elements 231 to 233 holding three inputs, and a storage element 236 holding one output. The MAC 221 illustrated in FIG. 5 is an arithmetic circuit that calculates one output variable z = a + w * x when receiving three variables a, w, and x. In this example, z corresponds to the output of the unit, a and w correspond to the parameters, and x corresponds to the input of the unit. The MAC 221 receives the three variables w, x, and a via the storage elements 231, 232, and 233, respectively. The calculated z is sent to the outside via the storage element 236. In such a configuration, the operation accuracy of the MAC 221 is determined by the bit width of the multiplier 234 and the adder 235 and by the handling of the decimal point (floating point, fixed point, or the like). For example, in the MAC 221 provided in the low-precision arithmetic circuit 5, the operations by the multiplier 234 and the adder 235 only need to correspond to the operation accuracy of the low-precision arithmetic circuit 5 (for example, 8-bit integer operations).
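As a rough software analogy of how the bit width and number format determine the result of z = a + w * x, the following sketch models one multiply-accumulate step at two precisions. It is not a description of the circuit in FIG. 5: the function names are hypothetical, and the saturating behaviour of the 8-bit variant is an assumption, since the text does not say how overflow is handled.

```python
import struct

def to_float32(v):
    """Round a Python float to the nearest 32-bit floating-point value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def saturate_int8(v):
    """Clamp an integer to the signed 8-bit range [-128, 127] (assumed overflow rule)."""
    return max(-128, min(127, int(v)))

def mac_float32(a, w, x):
    """z = a + w * x with the product and the sum each rounded to float32."""
    product = to_float32(to_float32(w) * to_float32(x))
    return to_float32(to_float32(a) + product)

def mac_int8(a, w, x):
    """z = a + w * x on 8-bit integer inputs with a saturated 8-bit result."""
    product = saturate_int8(w) * saturate_int8(x)
    return saturate_int8(saturate_int8(a) + product)

small = mac_int8(1, 2, 3)        # fits in 8 bits
clipped = mac_int8(0, 100, 2)    # 200 does not fit and is clamped to 127
exact = mac_float32(0.5, 0.25, 2.0)
```

The clamped result shows the narrower range of the 8-bit integer format, which is the kind of difference the text attributes to bit width and decimal-point handling.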

The MAC provided in the high-precision arithmetic circuit 6 can be represented in the same manner as the configuration shown in FIG. However, in the MAC provided in the high-precision arithmetic circuit 6, the operations by the multiplier 234 and the adder 235 correspond to the operation accuracy of the high-precision arithmetic circuit 6 (for example, 32-bit floating-point operations).

The configurations of the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 are not limited to the configuration illustrated in FIG. The low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 may be realized by a configuration different from the configuration illustrated in FIG. For example, the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 may be configured to include an arithmetic unit other than the MAC.

FIG. 6 is a flowchart showing an example of the progress of inference processing using a neural network.

When input data is given to the low-precision arithmetic circuit 5 (step S111), the low-precision arithmetic circuit 5 performs forward propagation from the first layer to the (k-1)-th layer of the neural network with the first operation accuracy (step S112). That is, the low-precision arithmetic circuit 5 executes the inference operation for calculating the output of each unit included in each of the first to (k-1)-th layers with the first operation accuracy.

Next, the low-precision arithmetic circuit 5 stores the arithmetic result of step S112 in the third memory 9 (step S113). Specifically, the low-precision arithmetic circuit 5 stores the output from each unit of the (k−1) th layer in the third memory 9.

Next, the high-precision arithmetic circuit 6 reads out the operation result of step S112 (the output from each unit of the (k-1)-th layer) from the third memory 9 (step S114).

In steps S113 and S114, the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 exchange data (the operation result of step S112, specifically, the output from each unit of the (k-1)-th layer) via the third memory 9.

The low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 may directly exchange data by communication without passing through the third memory 9.

After step S114, the high-precision arithmetic circuit 6 performs forward propagation from the k-th layer to the n-th layer of the neural network with the second operation accuracy (step S115). That is, the high-precision arithmetic circuit 6 executes the inference operation for calculating the output of each unit included in each of the layers from the k-th layer to the n-th layer with the second operation accuracy.

In the process shown in FIG. 6, it is assumed that the input layer of the neural network is the 0th layer and the n-th layer is the output layer. The (k-1)-th layer is an intermediate layer that is downstream of the input layer (0th layer) and upstream of the output layer (n-th layer). That is, k is an integer satisfying 0 < k-1 < n.

It can be said that the output of the units of the n-th layer obtained in step S115 represents the inference result.
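The flow of steps S111 to S115 can be sketched as follows. The function `run_inference` and its arguments are hypothetical names introduced for illustration: the two circuits are stood in for by plain callables, and the third memory 9 by a Python list.

```python
def run_inference(layers, k, input_data, low_precision_op, high_precision_op):
    """Forward propagation split between two circuits, following FIG. 6.

    layers[0] stands for the 1st layer and layers[n-1] for the n-th layer.
    """
    n = len(layers)
    if not (0 < k - 1 < n):
        raise ValueError("k must satisfy 0 < k-1 < n")
    data = input_data                        # step S111: input data is given
    for layer in layers[:k - 1]:             # step S112: layers 1 .. k-1, low precision
        data = low_precision_op(layer, data)
    exchange_memory = list(data)             # step S113: store result in the third memory
    data = list(exchange_memory)             # step S114: high-precision circuit reads it
    for layer in layers[k - 1:]:             # step S115: layers k .. n, high precision
        data = high_precision_op(layer, data)
    return data

# Toy operations that only record which circuit handled which layer.
trace_low = lambda layer, data: data + [("low", layer)]
trace_high = lambda layer, data: data + [("high", layer)]
result = run_inference([1, 2, 3, 4], k=3, input_data=[],
                       low_precision_op=trace_low, high_precision_op=trace_high)
```

In this toy run with n = 4 and k = 3, the trace shows layers 1 and 2 handled by the low-precision stand-in and layers 3 and 4 by the high-precision one.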

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

In the present embodiment, a case will be described as an example where the operation optimization device of the present invention determines the operation accuracy of each layer in a neural network. In addition, as described above, a process of inferring the content indicated by given input data will be described as an example of the process using the neural network. For example, a process will be described in which image data is given and the object represented by the image data (the object in the image) is inferred from the image data and the neural network. However, the present invention is also applicable to other processes using a neural network.

FIG. 7 is a block diagram showing a configuration example of the operation optimization device of the present invention. The operation optimization device of the present invention includes a discriminant model storage unit 21, a data storage unit 22, an explanatory variable value acquisition unit 23, an objective function storage unit 24, an objective function calculation unit 25, a calculation result storage unit 26, and an application pattern determination unit 27.

The discrimination model storage unit 21 is a storage device that stores a neural network as a discrimination model.

The data storage unit 22 is a storage device that stores data to be subjected to inference processing using a neural network (for example, image data in which an object in an image is to be inferred). The data storage unit 22 stores a plurality (N) of data to be inferred, and also stores the correct answer data of the inference result corresponding to each data. For example, the data storage unit 22 stores N pieces of image data and correct answer data (data indicating an object actually shown in an image) corresponding to each piece of image data.

The objective function storage unit 24 stores an objective function represented by a predetermined explanatory variable (hereinafter simply referred to as an explanatory variable). An expression representing the objective function is predetermined. In the present embodiment, at least "inference accuracy" and "processing speed" in the inference processing using the neural network are used as the explanatory variables. In the following description, for the sake of simplicity, a case in which "inference accuracy" and "processing speed" are used as the explanatory variables will be described first. The objective function may be represented by other explanatory variables in addition to "inference accuracy" and "processing speed"; this case will be described later.

Here, the "inference accuracy" is the accuracy of the operation result of the inference processing (in other words, of the inference result).

The objective function storage unit 24 stores, for example, a function represented by the following equation (3) as the objective function.

$$\text{objective function} = \text{``inference accuracy''} \times \alpha + \text{``processing speed''} \times \beta \qquad \text{(3)}$$

"Inference accuracy" and "processing speed" are explanatory variables. α is a coefficient of “inference accuracy”, and β is a coefficient of “processing speed”. The values of α and β are determined in advance. In the present embodiment, an example will be described in which α and β are both defined as positive values.

The explanatory variable value acquiring unit 23 acquires the values of the explanatory variables used in the objective function stored in the objective function storage unit 24. In this example, the values of "inference accuracy" and "processing speed" in the inference processing using the neural network are acquired.

The explanatory variable value acquisition unit 23 stores a plurality of types of application patterns in advance. An application pattern is information that specifies, for an operation using a discriminant model (in this embodiment, a neural network), to which layers of the neural network the low-precision arithmetic circuit 5 (see FIG. 3) is applied and to which layers of the neural network the high-precision arithmetic circuit 6 (see FIG. 3) is applied. In this embodiment, the low-precision arithmetic circuit 5 is applied from the first layer onward, and the circuit applied to the layers may be switched from the low-precision arithmetic circuit 5 to the high-precision arithmetic circuit 6 between any two layers. However, for simplicity of explanation, an example in which the switching is performed at most once will be described. Further, the high-precision arithmetic circuit 6 may be applied to all layers from the first layer onward.

Accordingly, in this embodiment, a case in which the low-precision arithmetic circuit 5 is applied to the first to p-th layers, the high-precision arithmetic circuit 6 is applied to the (p+1)-th to q-th layers, and the low-precision arithmetic circuit 5 is applied again to the (q+1)-th to n-th (output) layers is excluded from the application patterns. However, in the present invention, such a case may be included in the application patterns.
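Equation (3), together with the selection of the pattern whose objective value is smallest (as stated in the summary of the invention), can be sketched as below. The function names and the numbers are placeholders chosen only to exercise the code; they are not measured values, and this passage does not specify how the value of "inference accuracy" is scaled, so the sketch simply uses whatever values it is given.

```python
def objective_value(inference_accuracy, processing_speed, alpha, beta):
    """Equation (3): "inference accuracy" x alpha + "processing speed" x beta."""
    return inference_accuracy * alpha + processing_speed * beta

def select_pattern(measurements, alpha, beta):
    """Return the name of the application pattern with the smallest objective value.

    measurements maps a pattern name to (inference_accuracy, processing_speed).
    """
    return min(measurements,
               key=lambda name: objective_value(*measurements[name], alpha, beta))

# Placeholder explanatory-variable values (hypothetical, for illustration only).
example = {
    "pattern 1": (3.0, 1.0),
    "pattern 2": (2.0, 1.5),
    "pattern X": (1.0, 4.0),
}
best = select_pattern(example, alpha=1.0, beta=1.0)
```

Changing the predetermined coefficients alpha and beta shifts the balance between the two explanatory variables and can change which pattern is selected.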

FIG. 8 is a schematic diagram showing an example of an application pattern. Each rectangle shown in FIG. 8 represents each layer of the neural network.

Application pattern 1 shown in FIG. 8 specifies that the low-precision arithmetic circuit 5 is applied to all layers from the first layer to the n-th layer. In other words, application pattern 1 specifies that the low-precision arithmetic circuit 5 executes the operations of all the layers from the first layer to the n-th layer.

Application pattern 2 shown in FIG. 8 specifies that the low-precision arithmetic circuit 5 is applied to each of the first to (n-1)-th layers and the high-precision arithmetic circuit 6 is applied to the n-th layer. In other words, application pattern 2 specifies that the low-precision arithmetic circuit 5 executes the operations of each layer from the first layer to the (n-1)-th layer, and the high-precision arithmetic circuit 6 executes the operation of the n-th layer.

Application pattern 3 shown in FIG. 8 specifies that the low-precision arithmetic circuit 5 is applied to each of the first to (n-2)-th layers and the high-precision arithmetic circuit 6 is applied to the (n-1)-th and n-th layers.

Application pattern X-1 shown in FIG. 8 specifies that the low-precision arithmetic circuit 5 is applied to the first layer and the high-precision arithmetic circuit 6 is applied to each of the second to n-th layers. In other words, application pattern X-1 specifies that the low-precision arithmetic circuit 5 executes the operation of the first layer and the high-precision arithmetic circuit 6 executes the operations of each layer from the second layer to the n-th layer.

Application pattern X shown in FIG. 8 specifies that the high-precision arithmetic circuit 6 is applied to all layers from the first layer to the n-th layer. In other words, application pattern X specifies that the high-precision arithmetic circuit 6 executes the operations of all layers from the first layer to the n-th layer.

Various application patterns as illustrated in FIG. 8 are determined in advance, and the explanatory variable value acquisition unit 23 stores the individual application patterns in advance. Then, the explanatory variable value acquiring unit 23 acquires the value of the explanatory variable "inference accuracy" and the value of the explanatory variable "processing speed" for each application pattern.

Different application patterns yield different values of the explanatory variables (in this example, "inference accuracy" and "processing speed").
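Under the simplification that the circuit is switched from low precision to high precision at most once, the application patterns of FIG. 8 can be enumerated mechanically. The sketch below assumes a list-of-labels representation and the names `LOW`, `HIGH`, and `application_patterns`, none of which come from the text.

```python
LOW, HIGH = "low", "high"

def application_patterns(n):
    """Enumerate patterns for layers 1..n with at most one low-to-high switch.

    The first pattern applies the low-precision circuit to every layer; each
    following pattern moves the switching point one layer toward the input,
    and the last pattern applies the high-precision circuit to every layer.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [[LOW] * (n - h) + [HIGH] * h for h in range(n + 1)]

patterns = application_patterns(4)
# patterns[0]  corresponds to application pattern 1   (all low precision)
# patterns[1]  corresponds to application pattern 2   (only the n-th layer high)
# patterns[-2] corresponds to application pattern X-1 (only the 1st layer low)
# patterns[-1] corresponds to application pattern X   (all high precision)
```

With this enumeration there are n + 1 patterns for n layers; that count follows from the at-most-one-switch simplification and would grow if patterns that switch back to low precision were also included.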

There are two modes in which the explanatory variable value acquiring unit 23 acquires the values of the explanatory variables (in this example, "inference accuracy" and "processing speed"). In the first mode, the explanatory variable value acquiring unit 23 causes an actually existing processing device 18 (see FIG. 3) to execute the inference process and obtains the values of "inference accuracy" and "processing speed" by actual measurement. In the second mode, the explanatory variable value acquiring unit 23 obtains the values of "inference accuracy" and "processing speed" by simulation. That is, the first mode obtains the values of the explanatory variables by actual measurement, and the second mode obtains them by simulation.

In the case where the explanatory variable value acquisition unit 23 acquires the value of an explanatory variable by actual measurement, the arithmetic optimization device may include a processing device 18 as shown in FIG. Since the configuration and operation of the processing device 18 have already been described with reference to FIG. 3 and the like, the description is omitted here.

There is also a case where the processing device 18 is still in the design stage and does not actually exist yet. In that case, as shown in FIG. 10, the operation optimization device may include a design information storage unit 19. The design information storage unit 19 is a storage device that stores design information of the processing device 18. Examples of the design information include the number of arithmetic units (for example, MACs) provided in the low-precision arithmetic circuit 5 in the processing device 18 and the number of arithmetic units (for example, MACs) provided in the high-precision arithmetic circuit 6 in the processing device 18. However, the design information is not limited to these examples. The explanatory variable value acquisition unit 23 may acquire the values of the explanatory variables by simulation based on the design information stored in the design information storage unit 19.

First, the operation when the explanatory variable value acquiring unit 23 acquires the value of the explanatory variable by actual measurement will be described. Here, a case will be described as an example where the operation optimization device includes a processing device 18 as shown in FIG.

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “processing speed” by actual measurement will be described. The explanatory variable value acquisition unit 23 specifies an application pattern to the processing device 18. The explanatory variable value acquisition unit 23 then inputs the neural network stored in the discrimination model storage unit 21 and the data stored in the data storage unit 22 to the processing device 18, causes the processing device 18 to execute the inference process, and measures the processing speed when the processing device 18 performs the inference process on the data. As a result, the explanatory variable value acquisition unit 23 acquires the value of the processing speed. At this time, the processing device 18 performs the inference process by an operation according to the specified application pattern.

The processing speed is, for example, the inference processing time for one piece of data (in other words, the reciprocal of the number of pieces of data that can be processed per second). Alternatively, the explanatory variable value acquisition unit 23 may acquire, for example, a value of latency or throughput as a value of the processing speed. The same applies to the case where the value of the processing speed is obtained by simulation.

The explanatory variable value acquisition unit 23 can acquire the value of the processing speed by causing the processing device 18 to execute the inference process for a single piece of data.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be specified, and acquires the value of the processing speed by actual measurement for each application pattern.
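The measurement described above can be sketched as follows. This is a minimal illustration, not the device's actual interface: `measure_processing_speed`, `run_inference`, and `fake_device` are hypothetical names, and the stand-in device only sleeps briefly instead of running a neural network. The returned value is the inference time per piece of data, so a smaller value means faster processing.

```python
import time
from typing import Callable, Sequence


def measure_processing_speed(run_inference: Callable[[object], object],
                             samples: Sequence[object]) -> float:
    """Return the mean inference time per sample, in seconds.

    `run_inference` is a hypothetical callable standing in for the
    processing device operating under one specified application pattern.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    start = time.perf_counter()
    for sample in samples:
        run_inference(sample)
    elapsed = time.perf_counter() - start
    return elapsed / len(samples)


def fake_device(sample: object) -> int:
    """Hypothetical stand-in for the device: sleeps instead of inferring."""
    time.sleep(0.001)
    return 0


seconds_per_sample = measure_processing_speed(fake_device, [None] * 5)
```

In practice the same call would be repeated once per application pattern, with the device reconfigured between calls.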

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “inference accuracy” by actual measurement will be described. When acquiring the value of the inference accuracy by actual measurement, the explanatory variable value acquisition unit 23 may operate as follows, for example. The explanatory variable value acquisition unit 23 specifies an application pattern to the processing device 18. Then, the explanatory variable value acquisition unit 23 inputs the neural network stored in the discrimination model storage unit 21 to the processing device 18. Further, the explanatory variable value acquisition unit 23 inputs a plurality of pieces of data (assumed to be N pieces) stored in the data storage unit 22 to the processing device 18 and causes the processing device 18 to derive an inference result for each piece of data. That is, the explanatory variable value acquisition unit 23 causes the processing device 18 to execute the inference process N times. At this time, the processing device 18 executes the inference process by an operation according to the specified application pattern. As a result, N inference results are obtained. The explanatory variable value acquisition unit 23 compares the correct answer data stored in the data storage unit 22 with each inference result, calculates the ratio of the number of inference processes in which the correct answer was obtained to the number N of inference processes, and then calculates the reciprocal of that ratio. The reciprocal of the ratio corresponds to the value of the inference accuracy. The operation in which the explanatory variable value acquisition unit 23 acquires the value of the inference accuracy by actual measurement is not limited to the above example.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be specified, and acquires the value of the inference accuracy by actual measurement for each application pattern.
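The inference accuracy value described above, the reciprocal of the ratio of correct inferences, can be sketched as follows. The function name and the sample labels are hypothetical. Returning infinity when no inference is correct is an assumption of this sketch; the text does not say how that case is handled.

```python
from typing import Sequence


def inference_accuracy_value(inference_results: Sequence[object],
                             correct_answers: Sequence[object]) -> float:
    """Return N / (number of correct inferences); smaller is better."""
    n = len(inference_results)
    if n == 0 or n != len(correct_answers):
        raise ValueError("need N > 0 results and exactly N correct answers")
    n_correct = sum(r == c for r, c in zip(inference_results, correct_answers))
    if n_correct == 0:
        # Assumption of this sketch: treat "never correct" as infinitely bad.
        return float("inf")
    return n / n_correct


# Hypothetical data: 3 of the 4 inference results match the correct answers.
results = ["cat", "dog", "cat", "bird"]
answers = ["cat", "dog", "dog", "bird"]
value = inference_accuracy_value(results, answers)  # 4 / 3
```

A perfect classifier yields 1.0, and the value grows as the ratio of correct answers falls, which is why a smaller value is preferable.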

Next, the operation when the explanatory variable value acquisition unit 23 acquires the value of the explanatory variable by simulation will be described. Here, a case will be described as an example where the operation optimization device includes a design information storage unit 19 as shown in FIG. In this example, it is assumed that the number of arithmetic units (for example, MACs) provided in the low-precision arithmetic circuit 5 (see FIG. 3) of the processing device 18 and the number of arithmetic units (for example, MACs) provided in the high-precision arithmetic circuit 6 (see FIG. 3) of the processing device 18 are stored in the design information storage unit 19 as design information.

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “processing speed” by simulation will be described. In this example, the explanatory variable value acquisition unit 23 holds in advance a function for obtaining the value of “processing speed” (hereinafter referred to as a processing speed function). The processing speed function is predetermined. The processing speed function uses as variables, for example, the number of arithmetic units provided in the low-precision arithmetic circuit 5, the number of arithmetic units provided in the high-precision arithmetic circuit 6, the memory access amount (the number of memory accesses) when the low-precision arithmetic circuit 5 accesses the first memory 7, the memory access amount (the number of memory accesses) when the high-precision arithmetic circuit 6 accesses the second memory 8, and the amount of data transmitted and received between the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 (hereinafter sometimes referred to as the data transfer amount). Hereinafter, a case where the processing speed function is expressed with these variables will be described as an example. However, the variables used in the processing speed function are not limited to the above example.

The data transfer amount is represented, for example, by the product of the number of data to be transferred and the number of bytes per data. The unit in this case is, for example, bytes.

The explanatory variable value acquisition unit 23 may calculate the value of the processing speed by substituting the value of each of the above variables into the processing speed function. Among the variables, the number of arithmetic units provided in the low-precision arithmetic circuit 5 and the number of arithmetic units provided in the high-precision arithmetic circuit 6 may use the values determined in the design information. The memory access amount (the number of memory accesses) when the low-precision arithmetic circuit 5 accesses the first memory 7, the memory access amount (the number of memory accesses) when the high-precision arithmetic circuit 6 accesses the second memory 8, and the data transfer amount between the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 may be derived as follows: the explanatory variable value acquisition unit 23 selects an application pattern and simulates the operation of the processing device 18, as determined from the design information stored in the design information storage unit 19, under the selected application pattern. The explanatory variable value acquisition unit 23 may then calculate the value of the processing speed by substituting the numbers of arithmetic units, the memory access amounts derived for the selected application pattern, and the data transfer amount between the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 into the processing speed function. As a result, the explanatory variable value acquisition unit 23 can acquire the value of the processing speed based on the simulation.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be selected, and calculates a processing speed value based on a simulation for each application pattern.
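The text leaves the form of the processing speed function open, so the following is only an illustrative sketch of how such a function might take the five listed variables. The linear cost model and every per-access and per-byte cost coefficient below are assumptions of this example, not values from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedInputs:
    n_low_units: int        # arithmetic units in the low-precision circuit (design info)
    n_high_units: int       # arithmetic units in the high-precision circuit (design info)
    mem_accesses_low: int   # accesses to the first memory (from simulation)
    mem_accesses_high: int  # accesses to the second memory (from simulation)
    n_transferred: int      # data items exchanged between the two circuits
    bytes_per_item: int     # size of one transferred item

    @property
    def transfer_bytes(self) -> int:
        # Data transfer amount = number of data items x bytes per item.
        return self.n_transferred * self.bytes_per_item


def processing_speed_function(x: SpeedInputs,
                              sec_per_access_low: float = 1e-9,
                              sec_per_access_high: float = 4e-9,
                              sec_per_byte: float = 1e-9) -> float:
    """Hypothetical model: estimated seconds per inference (smaller is faster).

    Assumes memory accesses are spread evenly over the arithmetic units of
    each circuit and that inter-circuit transfers are serial.
    """
    low = x.mem_accesses_low * sec_per_access_low / x.n_low_units
    high = x.mem_accesses_high * sec_per_access_high / x.n_high_units
    transfer = x.transfer_bytes * sec_per_byte
    return low + high + transfer


inputs = SpeedInputs(n_low_units=64, n_high_units=16,
                     mem_accesses_low=6400, mem_accesses_high=1600,
                     n_transferred=1000, bytes_per_item=4)
estimated_seconds = processing_speed_function(inputs)
```

Only the three simulation-derived fields change from one application pattern to the next; the unit counts come from the design information and stay fixed.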

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “inference accuracy” by simulation will be described. When acquiring the value of the inference accuracy by simulation, the explanatory variable value acquisition unit 23 may operate as follows, for example. The explanatory variable value acquisition unit 23 selects an application pattern. Then, for each of a plurality of pieces of data (N pieces in this example) stored in the data storage unit 22, the explanatory variable value acquisition unit 23 simulates the operation of the processing device 18, as determined from the design information, under the selected application pattern, and thereby derives the inference result for that data. As a result, N inference results are obtained. The explanatory variable value acquisition unit 23 calculates the ratio of the number of inference results matching the correct answer data to the number of inference results (N), and further calculates the reciprocal of the ratio. The reciprocal of the ratio corresponds to the value of the inference accuracy. The operation in which the explanatory variable value acquisition unit 23 acquires the value of the inference accuracy by simulation is not limited to the above example.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be specified, and acquires the value of the inference accuracy by simulation for each application pattern.

In the present invention, the explanatory variable value acquiring unit 23 may acquire the value of the explanatory variable by actual measurement or by simulation. In any case, the explanatory variable value acquiring unit 23 acquires the values of the explanatory variables (in this example, “inference accuracy” and “processing speed”) for each application pattern.

When the processing speed is represented by the inference processing time for one piece of data (in other words, the reciprocal of the number of pieces of data that can be processed per second) as in the above example, a smaller value of the processing speed is preferable. Similarly, when the inference accuracy is represented by the reciprocal of the ratio of the number of inference processes in which the correct answer was obtained to the number N of inference processes (in other words, the reciprocal of the ratio of the number of inference results matching the correct answer data to the number N of inference results), a smaller value of the inference accuracy is preferable.

The objective function calculation unit 25 calculates the value of the objective function by substituting the values of the explanatory variables (in this example, “inference accuracy” and “processing speed”) acquired by the explanatory variable value acquisition unit 23 for each application pattern into the expression representing the objective function (in this example, Expression (3)). The objective function calculation unit 25 performs this calculation for each application pattern.

The calculation result storage unit 26 is a storage device that stores the value of the objective function calculated for each application pattern. The objective function calculation unit 25 calculates the value of the objective function for each application pattern, and stores the value of the objective function for each application pattern in the calculation result storage unit 26.

As described above, in this example, it is preferable that the value indicating the processing speed is smaller, and similarly, the smaller the value indicating the inference accuracy, the more preferable. Therefore, it is preferable that the value of the objective function represented by the expression (3) is small. Therefore, it can be said that the application pattern with the minimum value of the objective function is the most preferable application pattern (that is, the optimal application pattern).

The application pattern determination unit 27 refers to the value of the objective function for each application pattern stored in the calculation result storage unit 26, and determines the application pattern that minimizes the value of the objective function. As described above, the application pattern that minimizes the value of the objective function is the optimal application pattern.
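The selection of the minimizing application pattern can be sketched as follows. The sketch assumes that Expression (3) is the weighted sum “inference accuracy” × α + “processing speed” × β, which is the form that Expressions (4), (5), and (8) extend. The function names, the pattern labels, and all numeric values are hypothetical.

```python
from typing import Dict, Tuple

Pattern = Tuple[str, ...]  # one label per layer: "low" or "high"


def objective_value(accuracy_value: float, speed_value: float,
                    alpha: float, beta: float) -> float:
    """Assumed form of Expression (3): a weighted sum, smaller is better."""
    return accuracy_value * alpha + speed_value * beta


def determine_application_pattern(stored: Dict[Pattern, float]) -> Pattern:
    """Return the pattern whose stored objective value is smallest."""
    return min(stored, key=lambda pattern: stored[pattern])


# Hypothetical contents of the calculation result storage unit.
stored_results = {
    ("low", "low", "high"): objective_value(1.25, 0.010, alpha=1.0, beta=10.0),
    ("low", "high", "high"): objective_value(1.10, 0.020, alpha=1.0, beta=10.0),
    ("high", "high", "high"): objective_value(1.05, 0.040, alpha=1.0, beta=10.0),
}
best_pattern = determine_application_pattern(stored_results)
```

With these illustrative numbers the objective values are about 1.35, 1.30, and 1.45, so the middle pattern is selected even though it is neither the fastest nor the most accurate.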

Here, the application pattern is information that specifies, in an operation using a discriminant model (a neural network in the present embodiment), to which layers of the neural network the low-precision arithmetic circuit 5 (see FIG. 3) is applied and to which layers of the neural network the high-precision arithmetic circuit 6 (see FIG. 3) is applied. Therefore, the operation using the neural network can be optimized by determining the application pattern. Moreover, since it is determined which of the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 is applied to each layer of the neural network, it is possible to determine, for each layer of the neural network, whether the operation accuracy should be the first operation accuracy (for example, 8-bit integer operation by the low-precision arithmetic circuit 5) or the second operation accuracy (for example, 32-bit floating-point operation by the high-precision arithmetic circuit 6).
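One simple way to represent an application pattern is a tuple with one circuit label per layer. The sketch below enumerates every such assignment. Exhaustive enumeration is an assumption of this example: the description only requires that some set of candidate application patterns be prepared, and the labels `LOW` and `HIGH` are hypothetical.

```python
from itertools import product
from typing import Iterator, Tuple

LOW = "low"    # layer computed by the low-precision circuit (e.g. 8-bit integer)
HIGH = "high"  # layer computed by the high-precision circuit (e.g. 32-bit float)


def enumerate_application_patterns(n_layers: int) -> Iterator[Tuple[str, ...]]:
    """Yield every assignment of LOW or HIGH to each of n_layers layers."""
    return product((LOW, HIGH), repeat=n_layers)


patterns = list(enumerate_application_patterns(3))
```

The number of candidates doubles with each added layer (2^L patterns for L layers), so for deep networks a restricted candidate set may be more practical than full enumeration.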

The explanatory variable value acquisition unit 23, the objective function calculation unit 25, and the application pattern determination unit 27 are realized by, for example, a CPU (Central Processing Unit) of a computer that operates according to an operation optimization program. In this case, the CPU reads the operation optimization program from a program recording medium such as a program storage device. Then, the CPU may operate as the explanatory variable value acquisition unit 23, the objective function calculation unit 25, and the application pattern determination unit 27 according to the operation optimization program.

Next, an example of the processing progress of the embodiment of the present invention will be described. FIG. 11 is a flowchart illustrating an example of the processing progress of the operation optimization device according to the embodiment of the present invention. Here, an example will be described in which the objective function storage unit 24 stores the objective function represented by the above Expression (3), and the explanatory variable value acquisition unit 23 acquires the value of “inference accuracy” and the value of “processing speed” as the values of the explanatory variables. Description of items already described will be omitted as appropriate.

First, the explanatory variable value acquisition unit 23 selects one unselected application pattern from a plurality of application patterns stored in advance (step S1).

Next, the explanatory variable value acquisition unit 23 acquires the value of the explanatory variable in the operation of the inference processing under the application pattern selected in step S1 (step S2). In this example, the explanatory variable value acquisition unit 23 acquires the value of “inference accuracy” and the value of “processing speed” in the operation under the application pattern selected in step S1.

The explanatory variable value acquisition unit 23 may acquire the value of the explanatory variable by actual measurement, or may acquire the value of the explanatory variable by simulation. The operations of obtaining the values of “inference accuracy” and “processing speed” by actual measurement and by simulation have already been described, so their description is omitted here.

After step S2, the objective function calculation unit 25 calculates the value of the objective function by substituting the values of the explanatory variables acquired in step S2 for the selected application pattern (in this example, the values of “inference accuracy” and “processing speed”) into the expression representing the objective function (in this example, the above-mentioned Expression (3)) (step S3). Then, the objective function calculation unit 25 associates the application pattern selected in step S1 with the value of the objective function and stores the association in the calculation result storage unit 26.

Next, the explanatory variable value acquiring unit 23 determines whether or not all of the application patterns stored in advance have been selected in step S1 (step S4).

If there is an unselected application pattern (No in step S4), the operation optimization device repeats the processing in step S1 and subsequent steps.

If all the application patterns have been selected (Yes in step S4), the application pattern determination unit 27 refers to the value of the objective function for each application pattern stored in the calculation result storage unit 26 and determines the application pattern that minimizes the value of the objective function (step S5). The process ends with step S5.
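Steps S1 to S5 can be summarized in a short sketch. Here `acquire` is a hypothetical callable standing in for the explanatory variable value acquisition unit (by measurement or by simulation), `toy_acquire` is a made-up model used only so the example runs, and the objective is again assumed to be the weighted sum of Expression (3).

```python
from typing import Callable, Dict, Iterable, Tuple

Pattern = Tuple[str, ...]
Acquire = Callable[[Pattern], Tuple[float, float]]  # -> (accuracy value, speed value)


def optimize(patterns: Iterable[Pattern], acquire: Acquire,
             alpha: float, beta: float) -> Tuple[Pattern, Dict[Pattern, float]]:
    stored: Dict[Pattern, float] = {}  # plays the role of the calculation result storage
    for pattern in patterns:                               # S1 (and the S4 loop check)
        accuracy_value, speed_value = acquire(pattern)     # S2
        stored[pattern] = accuracy_value * alpha + speed_value * beta  # S3
    if not stored:
        raise ValueError("no application patterns were given")
    best = min(stored, key=lambda p: stored[p])            # S5
    return best, stored


def toy_acquire(pattern: Pattern) -> Tuple[float, float]:
    """Made-up trade-off: more low-precision layers are faster but less accurate."""
    n_low = pattern.count("low")
    return 1.0 + 0.05 * n_low ** 2, 0.04 - 0.01 * n_low


candidates = [
    ("high", "high", "high"),
    ("low", "high", "high"),
    ("low", "low", "high"),
    ("low", "low", "low"),
]
best, stored = optimize(candidates, toy_acquire, alpha=1.0, beta=20.0)
```

Under this toy model the objective values are roughly 1.80, 1.65, 1.60, and 1.65, so the pattern with two low-precision layers is chosen.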

As described above, the operation using the neural network can be optimized by determining the application pattern. Moreover, since it is determined which of the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6 is applied to each layer of the neural network, it is possible to determine, for each layer of the neural network, whether the operation accuracy should be the first operation accuracy (for example, 8-bit integer operation by the low-precision arithmetic circuit 5) or the second operation accuracy (for example, 32-bit floating-point operation by the high-precision arithmetic circuit 6).

In addition, in the present embodiment, since the applied pattern is determined by the processing of steps S1 to S5, the applied pattern can be automatically determined. Therefore, it is possible to automatically determine whether the calculation accuracy of each layer of the neural network is the first calculation accuracy or the second calculation accuracy.

Next, as a modified example of the embodiment of the present invention, a case where the objective function is represented by another explanatory variable in addition to “inference accuracy” and “processing speed” will be described. In the following description of the modified example, the description of the items already described is appropriately omitted.

The objective function may also be expressed using “the amount of data transmitted and received between the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6” as an explanatory variable in addition to “inference accuracy” and “processing speed”. Hereinafter, “the amount of data transmitted and received between the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6” is simply referred to as the data transfer amount. As described above, the data transfer amount is represented, for example, by the product of the number of data items transferred and the number of bytes per data item.

In this example, the objective function storage unit 24 may store, for example, a function represented by the following equation (4) as the objective function.

Objective function = “inference accuracy” × α + “processing speed” × β + “data transfer amount” × γ   ... (4)

γ is the coefficient of “data transfer amount” and is determined in advance. In this example, a case where γ is defined as a positive value will be described.

In this modification, the explanatory variable value acquisition unit 23 acquires the value of “data transfer amount” in addition to the “inference accuracy” and “processing speed” for each application pattern.

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “data transfer amount” by actual measurement will be described. The explanatory variable value acquisition unit 23 specifies an application pattern to the processing device 18. The explanatory variable value acquisition unit 23 then inputs the neural network stored in the discrimination model storage unit 21 and the data stored in the data storage unit 22 to the processing device 18, causes the processing device 18 to execute the inference process, and measures the data transfer amount when the processing device 18 performs the inference process on the data. As a result, the explanatory variable value acquisition unit 23 obtains the value of the data transfer amount. At this time, the processing device 18 performs the inference process by an operation according to the specified application pattern. Note that the explanatory variable value acquisition unit 23 can acquire the value of the data transfer amount by causing the processing device 18 to execute the inference process for a single piece of data.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be specified, and acquires the value of the data transfer amount by actual measurement for each application pattern.

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “data transfer amount” by simulation will be described. The explanatory variable value acquisition unit 23 may select an application pattern and derive the value of the data transfer amount by simulating the operation of the processing device 18, as determined from the design information stored in the design information storage unit 19, under the selected application pattern. The explanatory variable value acquisition unit 23 can derive the value of the data transfer amount by simulating the operation of the processing device 18 for a single piece of data.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be selected, and derives the value of the data transfer amount by simulation for each application pattern.

In this modified example, the objective function calculation unit 25 may calculate the value of the objective function for each application pattern by substituting the value of “inference accuracy”, the value of “processing speed”, and the value of “data transfer amount” into Expression (4).
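Expression (4) can be evaluated as in the following sketch. The function names and all numeric values, including the coefficients, are illustrative assumptions; only the weighted-sum form and the definition of the data transfer amount (number of data items times bytes per item) come from the text.

```python
def data_transfer_amount(n_items: int, bytes_per_item: int) -> int:
    """Data transfer amount in bytes: number of items x bytes per item."""
    return n_items * bytes_per_item


def objective_eq4(accuracy_value: float, speed_value: float,
                  transfer_bytes: float,
                  alpha: float, beta: float, gamma: float) -> float:
    """Expression (4): weighted sum of the three explanatory variables."""
    return accuracy_value * alpha + speed_value * beta + transfer_bytes * gamma


# Hypothetical values for one application pattern.
transfer = data_transfer_amount(n_items=1024, bytes_per_item=4)  # 4096 bytes
value_eq4 = objective_eq4(accuracy_value=1.25, speed_value=0.02,
                          transfer_bytes=transfer,
                          alpha=1.0, beta=10.0, gamma=1e-4)
```

Because the explanatory variables have very different magnitudes (a ratio, seconds, bytes), the coefficients also act as unit scaling; a positive γ penalizes patterns that switch between the two circuits often.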

Other points are the same as in the above embodiment.

According to the present modification, it is possible to determine whether the calculation accuracy of each layer of the neural network is the first calculation accuracy or the second calculation accuracy in consideration of the “data transfer amount”.

The objective function may be expressed using “the circuit scale of the processing device 18 (hereinafter simply referred to as a circuit scale)” as an explanatory variable in addition to “inference accuracy” and “processing speed”.

In this example, the objective function storage unit 24 may store, for example, a function represented by the following equation (5) as the objective function.

Objective function = “inference accuracy” × α + “processing speed” × β + “circuit scale” × δ   ... (5)

δ is the coefficient of “circuit scale” and is determined in advance. In this example, a case where δ is defined as a positive value will be described.

In the following description, an example will be described in which “circuit scale” is a value that expresses the number of arithmetic units (for example, MACs) included in the low-precision arithmetic circuit 5 and the number of arithmetic units (for example, MACs) included in the high-precision arithmetic circuit 6 on the basis of either an arithmetic unit included in the low-precision arithmetic circuit 5 or an arithmetic unit included in the high-precision arithmetic circuit 6. In this example, the arithmetic unit included in the low-precision arithmetic circuit 5 is used as the reference. In that case, the number of arithmetic units included in the high-precision arithmetic circuit 6 is converted into an equivalent number of arithmetic units of the low-precision arithmetic circuit 5. How many arithmetic units of the low-precision arithmetic circuit 5 one arithmetic unit of the high-precision arithmetic circuit 6 corresponds to may be determined according to the occupied area. Hereinafter, for the sake of simplicity, a description will be given assuming that one arithmetic unit included in the high-precision arithmetic circuit 6 corresponds to J arithmetic units included in the low-precision arithmetic circuit 5.

In the present modification, the explanatory variable value acquisition unit 23 also acquires a value of “circuit scale” in addition to “inference accuracy” and “processing speed”.

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “circuit scale” by actual measurement will be described. When the processing device 18 exists, the number of arithmetic units included in the low-precision arithmetic circuit 5 of the processing device 18, the number of arithmetic units included in the high-precision arithmetic circuit 6, and how many arithmetic units of the low-precision arithmetic circuit 5 one arithmetic unit of the high-precision arithmetic circuit 6 corresponds to are all known information. It is assumed that the explanatory variable value acquisition unit 23 stores this known information in advance, for example. For simplicity, the description again assumes that one arithmetic unit included in the high-precision arithmetic circuit 6 corresponds to J arithmetic units included in the low-precision arithmetic circuit 5.

In this case, the explanatory variable value acquiring unit 23 may calculate the value of “circuit scale” by calculating the following equation (6).

Circuit scale = “number of arithmetic units included in low-precision arithmetic circuit 5” + “number of arithmetic units included in high-precision arithmetic circuit 6” × J   ... (6)

In the above example, since the value of the circuit scale does not depend on the application pattern, the explanatory variable value acquisition unit 23 may calculate the value of the circuit scale as a common value for each application pattern.
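Expression (6) amounts to a single line of arithmetic, sketched below. The function name and the unit counts are hypothetical, and J = 4 is only an illustrative conversion factor.

```python
def circuit_scale(n_low_units: int, n_high_units: int, j: float) -> float:
    """Expression (6): unit count expressed in low-precision-unit equivalents.

    `j` is the number of low-precision arithmetic units that one
    high-precision arithmetic unit corresponds to.
    """
    return n_low_units + n_high_units * j


# Hypothetical device: 64 low-precision units, 16 high-precision units, J = 4.
scale = circuit_scale(n_low_units=64, n_high_units=16, j=4)  # 64 + 16 * 4
```

Since none of the inputs depends on the application pattern, this term shifts the objective of Expression (5) by the same amount for every pattern of a given device; it becomes informative when candidate designs with different unit counts are compared.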

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “circuit scale” by simulation will be described. In this case, the design information stored in the design information storage unit 19 (see FIG. 10) may include a design value of the number of arithmetic units included in the low-precision arithmetic circuit 5, a design value of the number of arithmetic units included in the high-precision arithmetic circuit 6, and a design value indicating how many arithmetic units of the low-precision arithmetic circuit 5 one arithmetic unit of the high-precision arithmetic circuit 6 corresponds to. Also in this example, one arithmetic unit included in the high-precision arithmetic circuit 6 is assumed to be equivalent to J arithmetic units included in the low-precision arithmetic circuit 5.

In this case, the explanatory variable value acquisition unit 23 may calculate the value of “circuit scale” by calculating the following equation (7).

Circuit scale = “design value of the number of arithmetic units included in low-precision arithmetic circuit 5” + “design value of the number of arithmetic units included in high-precision arithmetic circuit 6” × J   ... (7)

When the value of “circuit scale” is obtained by simulation, the explanatory variable value acquisition unit 23 may obtain a value other than one that expresses the number of arithmetic units on the basis of, for example, an arithmetic unit included in the low-precision arithmetic circuit 5. For example, the explanatory variable value acquisition unit 23 may calculate the value of the circuit scale by using a function for calculating the value of “circuit scale” (hereinafter referred to as a circuit scale function). In this case, the explanatory variable value acquisition unit 23 holds the circuit scale function in advance. The circuit scale function is determined in advance. The circuit scale function uses as variables, for example, the number of arithmetic units provided in the low-precision arithmetic circuit 5, the number of arithmetic units provided in the high-precision arithmetic circuit 6, the memory size of the first memory 7 (see FIG. 3) accessed by the low-precision arithmetic circuit 5, the memory size of the second memory 8 (see FIG. 3) accessed by the high-precision arithmetic circuit 6, and the data transfer amount (the amount of data transmitted and received between the low-precision arithmetic circuit 5 and the high-precision arithmetic circuit 6). Hereinafter, a case where the circuit scale function is expressed with these variables will be described as an example. However, the variables used in the circuit scale function are not limited to the above example. The memory size of the first memory 7 and the memory size of the second memory 8 may be stored in the design information storage unit 19 as design information.

The explanatory variable value acquisition unit 23 may calculate the value of the circuit scale by substituting the values of the above variables into the circuit scale function. Among the variables, the number of arithmetic units provided in the low-precision arithmetic circuit 5, the number of arithmetic units provided in the high-precision arithmetic circuit 6, the memory size of the first memory 7, and the memory size of the second memory 8 may use the values determined in the design information. The data transfer amount may be derived as follows: the explanatory variable value acquisition unit 23 selects an application pattern and simulates the operation of the processing device 18, as determined from the design information stored in the design information storage unit 19, under the selected application pattern. The explanatory variable value acquisition unit 23 may then calculate the value of the circuit scale by substituting the numbers of arithmetic units, the memory sizes, and the data transfer amount derived for the selected application pattern into the circuit scale function. In this case, the explanatory variable value acquisition unit 23 sequentially changes the application pattern to be selected, and calculates a circuit scale value based on simulation for each application pattern.

In this modified example, the objective function calculation unit 25 may calculate the value of the objective function for each application pattern by substituting the value of “inference accuracy”, the value of “processing speed”, and the value of “circuit scale” into Expression (5).

Other points are the same as in the above embodiment.

According to the present modification, it is possible to determine whether the calculation accuracy of each layer of the neural network is the first calculation accuracy or the second calculation accuracy in consideration of the “circuit scale”.

The objective function may be expressed using “power consumption of the processing device 18 (hereinafter simply referred to as power consumption)” as an explanatory variable in addition to “inference accuracy” and “processing speed”.

In this example, the objective function storage unit 24 may store, for example, a function represented by the following equation (8) as the objective function.

Objective function = “inference accuracy” × α + “processing speed” × β + “power consumption” × ε   ... (8)

Here, ε is the coefficient of “power consumption” and is determined in advance. In this example, the case where ε is defined as a positive value will be described.
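The weighted sum of equation (8) can be sketched as follows. This is only an illustration: the function name, the numeric values, and the signs chosen for α and β are assumptions made for the example (only ε being positive is stated in the text).

```python
def objective_eq8(accuracy, speed, power, alpha, beta, epsilon):
    """Weighted sum corresponding to equation (8).

    accuracy, speed, power: explanatory variable values for one application pattern.
    alpha, beta, epsilon: predetermined coefficients.
    """
    return accuracy * alpha + speed * beta + power * epsilon


# Hypothetical values for a single application pattern (assumptions, not from the source).
value = objective_eq8(
    accuracy=0.9,   # "inference accuracy"
    speed=20.0,     # "processing speed" (unit left unspecified)
    power=3.0,      # "power consumption"
    alpha=-1.0,     # assumed negative so that higher accuracy lowers the objective
    beta=0.01,      # assumed sign and magnitude
    epsilon=0.1,    # positive, as in the example described in the text
)
print(value)
```

With ε positive, a pattern that consumes more power receives a larger objective value and is therefore less likely to be selected when the objective is minimized.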

In this modification, the explanatory variable value acquisition unit 23 acquires the value of “power consumption” in addition to “inference accuracy” and “processing speed” for each application pattern.

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “power consumption” by actual measurement will be described. The explanatory variable value acquisition unit 23 specifies an application pattern to the processing device 18. The explanatory variable value acquisition unit 23 may then input the neural network stored in the discrimination model storage unit 21 and the data stored in the data storage unit 22 to the processing device 18, cause the processing device 18 to execute the inference process, and measure the power consumption when the processing device 18 performs the inference process on the data. As a result, the explanatory variable value acquisition unit 23 acquires the value of the power consumption. At this time, the processing device 18 performs the inference process by operations according to the specified application pattern. Note that the explanatory variable value acquisition unit 23 can acquire the value of the power consumption by causing the processing device 18 to execute the inference process for one piece of data.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be specified, and acquires the value of power consumption by actual measurement for each application pattern.
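The per-pattern measurement loop can be sketched as follows. This is a minimal sketch under stated assumptions: it assumes every layer can be assigned independently to either circuit, so that the application patterns are all combinations of "low" and "high"; `enumerate_patterns`, `collect_power`, and `fake_measure_power` are hypothetical names, and the stand-in measurement simply returns made-up numbers instead of driving a real processing device.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# One entry per layer: "low" (low-precision circuit) or "high" (high-precision circuit).
Pattern = Tuple[str, ...]


def enumerate_patterns(num_layers: int) -> List[Pattern]:
    """All assignments of the two circuits to the layers (assumption: 2**num_layers patterns)."""
    return list(product(("low", "high"), repeat=num_layers))


def collect_power(num_layers: int,
                  measure_power: Callable[[Pattern], float]) -> Dict[Pattern, float]:
    """Specify each application pattern in turn and record the measured power."""
    return {pattern: measure_power(pattern) for pattern in enumerate_patterns(num_layers)}


def fake_measure_power(pattern: Pattern) -> float:
    """Hypothetical stand-in for a real measurement on the processing device."""
    return sum(1.0 if circuit == "low" else 4.0 for circuit in pattern)


power_by_pattern = collect_power(3, fake_measure_power)
for pattern, power in power_by_pattern.items():
    print(pattern, power)
```

In a real setup, `measure_power` would specify the pattern to the device, run inference on one piece of data, and read back the measured value.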

The operation in which the explanatory variable value acquisition unit 23 acquires the value of “power consumption” by simulation will be described. When the value of “power consumption” is derived by simulation, the data necessary for deriving the power consumption value, determined at the design stage, is included in the design information stored in the design information storage unit 19. The explanatory variable value acquisition unit 23 may derive the value of the power consumption by selecting an application pattern and simulating the operation of the processing device 18 that is determined from the design information stored in the design information storage unit 19 and that corresponds to the selected application pattern. Note that the explanatory variable value acquisition unit 23 can derive the value of power consumption by simulating the operation of the processing device 18 for one piece of data.

The explanatory variable value acquisition unit 23 sequentially changes the application pattern to be selected, and derives the value of power consumption by simulation for each application pattern.

In this modified example, the objective function calculation unit 25 may calculate the value of the objective function for each application pattern by substituting the value of “inference accuracy”, the value of “processing speed”, and the value of “power consumption” into Expression (8).

Other points are the same as in the above embodiment.

According to the present modification, it is possible to determine whether the calculation accuracy of each layer of the neural network is the first calculation accuracy or the second calculation accuracy in consideration of “power consumption”.

Each of the above modified examples has described a case where the objective function is represented using any one of “data transfer amount”, “circuit scale”, and “power consumption” as an explanatory variable in addition to “inference accuracy” and “processing speed”. The objective function may also be represented using any two or more of “data transfer amount”, “circuit scale”, and “power consumption” as explanatory variables in addition to “inference accuracy” and “processing speed”.

The objective function may be expressed with “inference accuracy”, “processing speed”, “data transfer amount”, “circuit scale”, and “power consumption” as explanatory variables. In this case, the objective function storage unit 24 may store, for example, a function represented by the following equation (9) as the objective function.

Objective function = “inference accuracy” × α + “processing speed” × β + “data transfer amount” × γ + “circuit scale” × δ + “power consumption” × ε   ... (9)

In this case, the explanatory variable value acquisition unit 23 may acquire the value of each explanatory variable (“inference accuracy”, “processing speed”, “data transfer amount”, “circuit scale”, and “power consumption”) for each application pattern, either by actual measurement or by simulation.

The objective function calculation unit 25 may calculate the value of the objective function for each application pattern by substituting the value of each explanatory variable acquired by the explanatory variable value acquisition unit 23 into Expression (9).

In this case, whether the calculation accuracy of each layer of the neural network is to be the first calculation accuracy or the second calculation accuracy can be determined in consideration of the “data transfer amount”, the “circuit scale”, and the “power consumption”.

In equation (9), the term “data transfer amount” × γ may not be included. In this case, the explanatory variable value acquisition unit 23 does not need to acquire the value of “data transfer amount”.

Also, in equation (9), the term “circuit scale” × δ may not be included. In this case, the explanatory variable value acquisition unit 23 does not need to acquire the value of “circuit scale”.

Also, in equation (9), the term “power consumption” × ε may not be included. In this case, the explanatory variable value acquisition unit 23 does not need to acquire the value of “power consumption”.
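Equation (9) with optional terms can be sketched as follows. This is an illustrative sketch only: the dictionary-based interface, the function name `objective_eq9`, and all numeric values are assumptions. A term is treated as omitted simply by leaving its coefficient out, in which case the corresponding value is never read and need not be acquired.

```python
from typing import Dict


def objective_eq9(values: Dict[str, float], coeffs: Dict[str, float]) -> float:
    """Weighted sum over the explanatory variables that have a coefficient.

    A term of equation (9) is omitted by leaving its name out of `coeffs`.
    """
    return sum(coeffs[name] * values[name] for name in coeffs)


# Hypothetical explanatory variable values for one application pattern.
values = {
    "inference accuracy": 0.9,
    "processing speed": 20.0,
    "data transfer amount": 50.0,
    "circuit scale": 10.0,
    "power consumption": 3.0,
}

# Hypothetical coefficients alpha, beta, gamma, delta, epsilon (assumed signs and magnitudes).
all_terms = {
    "inference accuracy": -1.0,
    "processing speed": 0.01,
    "data transfer amount": 0.002,
    "circuit scale": 0.01,
    "power consumption": 0.1,
}

# Same coefficients with the "data transfer amount" term omitted.
without_transfer = {k: v for k, v in all_terms.items() if k != "data transfer amount"}

full_value = objective_eq9(values, all_terms)
reduced_value = objective_eq9(values, without_transfer)
print(full_value, reduced_value)
```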

FIG. 12 is a schematic block diagram showing a configuration example of a computer according to the embodiment of the present invention or its modification. The computer 1000 includes a CPU 1001, a main storage device 1002, an auxiliary storage device 1003, and an interface 1004.

The operation optimization device of the present invention is implemented in the computer 1000. The operation of the operation optimization device is stored in the auxiliary storage device 1003 in the form of an operation optimization program. The CPU 1001 reads out the operation optimization program from the auxiliary storage device 1003, expands it in the main storage device 1002, and executes the processing described in the above embodiment and its modifications according to the operation optimization program.

The auxiliary storage device 1003 is an example of a non-transitory tangible medium. Other examples of non-transitory tangible media include a magnetic disk, a magneto-optical disk, a CD-ROM (Compact Disk Read Only Memory), a DVD-ROM (Digital Versatile Disk Read Only Memory), and a semiconductor memory connected via the interface 1004. When the program is distributed to the computer 1000 via a communication line, the computer 1000 that has received the program may load the program into the main storage device 1002 and execute the above-described processing.

The program may be for realizing a part of the above-described processing. Furthermore, the program may be a difference program that implements the above-described processing in combination with another program already stored in the auxiliary storage device 1003.

Some or all of the components may be realized by general-purpose or dedicated circuitry, a processor, or a combination thereof. These may be configured by a single chip, or may be configured by a plurality of chips connected via a bus. Some or all of the components may be realized by a combination of the above-described circuitry and the like and a program.

When a part or all of each component is realized by a plurality of information processing devices, circuits, and the like, the plurality of information processing devices, circuits, and the like may be centrally arranged or may be distributed. For example, the information processing device, the circuit, and the like may be realized as a form in which each is connected via a communication network, such as a client and server system or a cloud computing system.

Next, the outline of the present invention will be described. FIG. 13 is a block diagram showing an outline of the operation optimization device of the present invention. The operation optimization device of the present invention includes explanatory variable value acquisition means 73, objective function calculation means 75, and application pattern determination means 77.

The explanatory variable value acquisition means 73 (for example, the explanatory variable value acquisition unit 23) acquires the value of a predetermined explanatory variable for each application pattern. An application pattern is information that determines, in an operation using a discrimination model (for example, a neural network) in which a plurality of layers each composed of one or more units are combined, to which layer a first arithmetic circuit (for example, the low-precision arithmetic circuit 5) that performs operations with a first operation accuracy is applied, and to which layer a second arithmetic circuit (for example, the high-precision arithmetic circuit 6) that performs operations with a second operation accuracy higher than the first operation accuracy is applied.

The objective function calculation means 75 (for example, the objective function calculation unit 25) calculates, for each application pattern, the value of the objective function represented by the predetermined explanatory variable.

The application pattern determination means 77 (for example, the application pattern determination unit 27) determines, from among the application patterns, the application pattern that minimizes the value of the objective function.

With such a configuration, the calculation accuracy of each layer of the discrimination model can be determined automatically so that the calculation using the discrimination model is optimized.
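The final determination step can be sketched as a minimum search over already computed objective values. This is a sketch under assumptions: the function name `determine_application_pattern`, the tuple representation of a pattern, and the objective values are all hypothetical.

```python
from typing import Dict, Tuple

# One entry per layer: "low" (first arithmetic circuit) or "high" (second arithmetic circuit).
Pattern = Tuple[str, ...]


def determine_application_pattern(objective_by_pattern: Dict[Pattern, float]) -> Pattern:
    """Return the application pattern whose objective function value is smallest.

    If several patterns share the minimum, the first one in the mapping is returned.
    """
    return min(objective_by_pattern, key=objective_by_pattern.get)


# Hypothetical objective values for a two-layer model (made-up numbers).
objective_by_pattern = {
    ("low", "low"): 0.35,
    ("low", "high"): 0.20,
    ("high", "low"): 0.25,
    ("high", "high"): 0.40,
}

best = determine_application_pattern(objective_by_pattern)
print(best)
```

In this made-up example the selected pattern applies the first arithmetic circuit to the first layer and the second arithmetic circuit to the second layer.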

The objective function may be represented using at least the processing speed of the operation using the discriminant model and the accuracy of the operation result as the predetermined explanatory variables.

The objective function may be represented using the amount of data exchanged between the first arithmetic circuit and the second arithmetic circuit as a predetermined explanatory variable.

The objective function may be represented using the circuit scale of the circuit that performs the operation using the discriminant model as a predetermined explanatory variable.

The objective function may be represented using the power consumption of the operation using the discriminant model as a predetermined explanatory variable.

The explanatory variable value acquisition means 73 may acquire the value of the predetermined explanatory variable by actual measurement.

The explanatory variable value acquisition means 73 may acquire the value of the predetermined explanatory variable by simulation.

Although the present invention has been described with reference to the exemplary embodiments, the present invention is not limited to the above exemplary embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.

Industrial applicability

The present invention is suitably applied to, for example, an operation optimizing device that optimizes an operation using a discriminant model such as a neural network.

5 Low-precision arithmetic circuit
6 High-precision arithmetic circuit
7 First memory
8 Second memory
9 Third memory
18 Processing device
19 Design information storage unit
21 Discrimination model storage unit
22 Data storage unit
23 Explanatory variable value acquisition unit
24 Objective function storage unit
25 Objective function calculation unit
26 Calculation result storage unit
27 Application pattern determination unit

Claims

An operation optimization device comprising: explanatory variable value acquiring means for acquiring a value of a predetermined explanatory variable for each application pattern, the application pattern being information that determines, in an arithmetic operation using a discriminant model in which a plurality of layers each composed of one or more units are combined, to which layer a first arithmetic circuit that performs arithmetic operations with a first arithmetic accuracy is applied and to which layer a second arithmetic circuit that performs arithmetic operations with a second arithmetic accuracy higher than the first arithmetic accuracy is applied;
objective function calculating means for calculating, for each application pattern, the value of an objective function represented by the predetermined explanatory variable; and
application pattern determining means for determining the application pattern that minimizes the value of the objective function.
The operation optimization device according to claim 1, wherein the objective function is represented using at least the processing speed of the arithmetic operation using the discriminant model and the accuracy of the arithmetic result as the predetermined explanatory variables.
The operation optimization device according to claim 2, wherein the objective function is represented using the amount of data exchanged between the first arithmetic circuit and the second arithmetic circuit as a predetermined explanatory variable.
The operation optimization device according to claim 2, wherein the objective function is represented using the circuit scale of a circuit that performs the arithmetic operation using the discriminant model as a predetermined explanatory variable.
The operation optimization device according to any one of claims 2 to 4, wherein the objective function is represented using the power consumption of the arithmetic operation using the discriminant model as a predetermined explanatory variable.
The operation optimization device according to any one of claims 1 to 5, wherein the explanatory variable value acquiring means acquires the value of the predetermined explanatory variable by actual measurement.
The operation optimization device according to claim 1, wherein the explanatory variable value acquiring means acquires the value of the predetermined explanatory variable by simulation.
An operation optimization method comprising: acquiring a value of a predetermined explanatory variable for each application pattern, the application pattern being information that determines, in an arithmetic operation using a discriminant model in which a plurality of layers each composed of one or more units are combined, to which layer a first arithmetic circuit that performs arithmetic operations with a first arithmetic accuracy is applied and to which layer a second arithmetic circuit that performs arithmetic operations with a second arithmetic accuracy higher than the first arithmetic accuracy is applied;
calculating, for each application pattern, the value of an objective function represented by the predetermined explanatory variable; and
determining the application pattern that minimizes the value of the objective function.
An operation optimization program for causing a computer to execute:
an explanatory variable value acquisition process of acquiring a value of a predetermined explanatory variable for each application pattern, the application pattern being information that determines, in an arithmetic operation using a discriminant model in which a plurality of layers each composed of one or more units are combined, to which layer a first arithmetic circuit that performs arithmetic operations with a first arithmetic accuracy is applied and to which layer a second arithmetic circuit that performs arithmetic operations with a second arithmetic accuracy higher than the first arithmetic accuracy is applied;
an objective function calculation process of calculating, for each application pattern, the value of an objective function represented by the predetermined explanatory variable; and
an application pattern determination process of determining the application pattern that minimizes the value of the objective function.