WO2023188286A1 - Training device, estimation device, training method, and recording medium

Info

Publication number: WO2023188286A1
Application number: PCT/JP2022/016569
Authority: WO
Inventors: 智哉坂井
Original assignee: 日本電気株式会社
Priority date: 2022-03-31
Filing date: 2022-03-31
Publication date: 2023-10-05

Abstract

This training device comprises: a reference setting means that uses the value of an evaluation function indicating an evaluation of the accuracy of estimation by a model generated in machine learning to set a reference value specifying how much improvement in the accuracy of estimation by the model is to be achieved through further machine learning; and a training means that trains the model so as to cause the output value of the evaluation function to approach the reference value.

Description

Learning device, estimation device, learning method and recording medium

The present invention relates to a learning device, an estimation device, a learning method, and a recording medium.

As a method for setting hyperparameters for a machine learning model, there is a method in which learning is performed using each of a plurality of hyperparameter values, and one of the plurality of hyperparameter values is selected based on the learning results.
For example, the parameter adjustment device described in Patent Document 1 sets a plurality of values for hyperparameter A, one of two hyperparameters, and sets a fixed value for the other, hyperparameter B. The parameter adjustment device sends each combination of a value of hyperparameter A and the fixed value of hyperparameter B to the learning device, obtains the correct answer rate for each combination, and approximates the relationship between the value of hyperparameter A and the correct answer rate with a function. The parameter adjustment device likewise approximates the relationship between the value of hyperparameter A and the correct answer rate with a function for other values of hyperparameter B, and finds the combination of the value of hyperparameter A and the value of hyperparameter B that gives the highest correct answer rate.

Patent Document 1: Japanese Patent Application Publication No. 2018-15992

It is conceivable to provide a hyperparameter so that a wide range of the values that the parameters can take is searched during model learning. However, it is difficult to set hyperparameter values appropriately.

An example of the purpose of this disclosure is to provide a learning device, an estimation device, a learning method, and a recording medium that can solve the above-mentioned problems.

According to the first aspect of the present invention, the learning device includes: reference setting means that sets, based on an evaluation function value indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value for how far the estimation accuracy of the model is to be obtained through further machine learning; and learning means that performs machine learning to update the model so that the output value of the evaluation function approaches the reference value.

According to the second aspect of the present invention, the estimation device includes an estimation unit that calculates an estimated value regarding an estimation target using a model updated by machine learning that brings the output value of an evaluation function closer to a reference value, the reference value being set, based on an evaluation function value indicating an evaluation of the estimation accuracy of the model generated in machine learning, as a reference for how far the estimation accuracy of the model is to be obtained through further machine learning.

According to the third aspect of the present invention, the learning method includes a computer: setting, based on an evaluation function value indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value for how far the estimation accuracy of the model is to be obtained through further machine learning; and training the model so that the output value of the evaluation function approaches the reference value.

According to the fourth aspect of the present invention, the recording medium records a program for causing a computer to execute: setting, based on an evaluation function value indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value for how far the estimation accuracy of the model is to be obtained through further machine learning; and training the model so that the output value of the evaluation function approaches the reference value.

According to the present invention, hyperparameter values for adjusting the breadth of parameter value search can be set relatively easily.

FIG. 1 is a diagram illustrating an example of the configuration of a learning device according to an embodiment.
FIG. 2 is a diagram illustrating an example of division of learning data by the data acquisition unit according to the embodiment.
FIG. 3 is a diagram illustrating an example of robustness against changes in parameter values.
FIG. 4 is a diagram illustrating a first example of a processing procedure in which the learning device according to the embodiment performs model learning.
FIG. 5 is a diagram illustrating an example of start conditions and end conditions for learning using reference values according to the embodiment.
FIG. 6 is a diagram illustrating a second example of a processing procedure in which the learning device according to the embodiment performs model learning.
FIG. 7 is a diagram illustrating a third example of a processing procedure in which the learning device according to the embodiment performs model learning.
FIG. 8 is a diagram illustrating an example of the configuration of an estimation device according to an embodiment.
FIG. 9 is a diagram illustrating another example of the configuration of a learning device according to an embodiment.
FIG. 10 is a diagram illustrating an example of a processing procedure in a learning method according to an embodiment.
FIG. 11 is a schematic block diagram showing the configuration of a computer according to at least one embodiment.

Hereinafter, embodiments of the present invention will be described, but the following embodiments do not limit the invention according to the claims. Furthermore, not all combinations of features described in the embodiments are essential to the solution of the invention.
FIG. 1 is a diagram showing an example of the configuration of a learning device according to an embodiment. With the configuration shown in FIG. 1, the learning device 100 includes a communication unit 110, a display unit 120, an operation input unit 130, a storage unit 180, and a control unit 190. The storage unit 180 includes a model storage unit 181. The control unit 190 includes a data acquisition unit 191, a reference setting unit 192, and a learning unit 193.

The learning device 100 performs model learning. The learning device 100 may be configured using a computer such as a personal computer (PC) or a workstation.
The communication unit 110 communicates with other devices. For example, the communication unit 110 may communicate with a device that stores learning data and receive the learning data.

The display unit 120 includes a display screen such as a liquid crystal panel or an LED (Light Emitting Diode) panel, and displays various images.
The operation input unit 130 includes input devices such as a keyboard and a mouse, and receives user operations.

For example, the display unit 120 may display a hyperparameter value input screen for hyperparameters whose values are set by the user. The operation input unit 130 may also accept a user operation to input a hyperparameter value.

The storage unit 180 stores various data. The storage unit 180 is configured using a storage device included in the learning device 100.
The model storage unit 181 stores a learning target model. However, the models to be learned by the learning device 100 are not limited to those stored in the model storage unit 181. For example, a model to be learned by the learning device 100 may be implemented using hardware and configured as a separate device from the learning device 100. In this case, the storage unit 180 may be configured without the model storage unit 181.

The control unit 190 controls each unit of the learning device 100 to perform various processes. The functions of the control unit 190 may be executed, for example, by a CPU (Central Processing Unit) included in the learning device 100 reading a program from the storage unit 180 and executing it.

The data acquisition unit 191 acquires learning data. For example, when the communication unit 110 receives learning data from another device, the data acquisition unit 191 extracts the learning data from the data received by the communication unit 110. Furthermore, the data acquisition unit 191 divides the acquired learning data into training data, confirmation data, and test data.
In the following, a case where the data acquisition unit 191 acquires supervised sample data as learning data will be described as an example. The supervised sample data referred to here can be a set whose elements are combinations of inputs to a model and correct answers of model outputs for the inputs.

FIG. 2 is a diagram showing an example of division of learning data by the data acquisition unit 191. In the example of FIG. 2, the data acquisition unit 191 divides the supervised sample data acquired as learning data into training data, confirmation data, and test data. Specifically, the data acquisition unit 191 sorts each of the plurality of elements included in the supervised sample data, each element being a combination of an input to a model and the correct answer of the model output for that input, into an element of the training data, an element of the confirmation data, or an element of the test data.
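The division in FIG. 2 can be sketched as follows. This is a minimal illustration only: the function name `split_supervised_samples`, the 60/20/20 proportions, and the random shuffling are assumptions made for the example and are not specified in the embodiment.

```python
import random


def split_supervised_samples(samples, train_frac=0.6, confirm_frac=0.2, seed=0):
    """Split (input, correct_answer) pairs into training, confirmation, and test data.

    The fractions and the shuffling are hypothetical choices for illustration.
    """
    if not 0 < train_frac < 1 or not 0 <= confirm_frac < 1 or train_frac + confirm_frac > 1:
        raise ValueError("fractions must lie in [0, 1] and sum to at most 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = int(len(shuffled) * train_frac)
    n_confirm = int(len(shuffled) * confirm_frac)
    training = shuffled[:n_train]
    confirmation = shuffled[n_train:n_train + n_confirm]
    test = shuffled[n_train + n_confirm:]  # whatever remains becomes test data
    return training, confirmation, test


# Each element is a combination of a model input and the correct output for it.
samples = [(x, 2 * x) for x in range(10)]
training, confirmation, test = split_supervised_samples(samples)
```

Setting `train_frac + confirm_frac` to 1 leaves the test data empty, which corresponds to the case where the model is tested in a trial run instead.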

Training data is used as data for adjusting the values of model parameters. The parameters here are variables that are included in the model and whose values are subject to updating by the learning algorithm. For example, when a perceptron-type neural network is trained using error backpropagation, the weighting coefficients between nodes and the bias at each node are examples of parameters.

The confirmation data is used as data for adjusting hyperparameter values in model learning. The hyperparameters referred to here are parameters related to model learning, such as parameters for setting the behavior of a learning algorithm, other than parameters whose values are subject to update by the learning algorithm. For example, when a perceptron-type neural network is trained using error backpropagation, the learning rate is an example of a hyperparameter. Further, when the structure of the neural network is made variable, values related to the structure of the neural network, such as the number of hidden layers and the number of nodes per layer, can also be treated as hyperparameters.

For example, the learning device 100 may set a plurality of hyperparameter values and adopt one of the hyperparameter values using confirmation data. In this case, the learning device 100 uses the training data to adjust the parameter values for each of the set hyperparameter values, and then evaluates the model using the confirmation data. The learning device 100 may then adopt the hyperparameter value with the highest evaluation.
Note that the reference value of the evaluation function value, which will be described later, can also be treated as a hyperparameter. However, as will be described later, the learning device 100 sets only one value at a time for the reference value of the evaluation function value.
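The selection of a hyperparameter value using confirmation data can be sketched as follows. The one-parameter model y = w·x, the mean squared error, the plain gradient descent loop, and the candidate learning rates are all assumptions chosen to keep the example small; they are not taken from the embodiment.

```python
def mean_squared_error(w, data):
    """Evaluation function value of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(training, learning_rate, epochs=20):
    """Adjust the parameter w on the training data by gradient descent."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in training) / len(training)
        w -= learning_rate * grad
    return w


def select_hyperparameter(training, confirmation, candidates):
    """Train once per candidate value and keep the one rated best on confirmation data."""
    results = []
    for learning_rate in candidates:
        w = train(training, learning_rate)
        results.append((mean_squared_error(w, confirmation), learning_rate, w))
    best_error, best_learning_rate, best_w = min(results)  # smaller error = higher evaluation
    return best_learning_rate, best_w, best_error


training = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
confirmation = [(4.0, 8.0), (5.0, 10.0)]
best_lr, best_w, best_error = select_hyperparameter(
    training, confirmation, [0.001, 0.01, 0.1]
)
```

The training data is the only data that changes `w`; the confirmation data is used only to compare the candidates.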

Test data is used as data to evaluate the model obtained through learning. For example, the learning device 100 may evaluate a model obtained through learning using test data. Then, the learning device 100 or the user may decide whether to adopt the obtained model or re-learn the model based on the evaluation result.

Training data and confirmation data can be said to be data used to construct a model. Test data can be said to be data used for model evaluation.
Further, the training data can be said to be data for updating parameter values of the model. The confirmation data and test data can be said to be data other than data for updating parameter values of the model.

However, the division of supervised sample data performed by the data acquisition unit 191 is not limited to division into training data, confirmation data, and test data. For example, if a model is tested in a trial run in an actual usage environment, there is no need to secure test data from supervised sample data. In this case, the data acquisition unit 191 may divide the acquired supervised sample data into training data and confirmation data.

The reference setting unit 192 sets a reference value of the evaluation function value for the evaluation function used for model learning. The reference setting unit 192 corresponds to an example of reference setting means.
The evaluation function here is a function that indicates the evaluation of the estimation accuracy of the model.
The estimation accuracy of the model indicated by the evaluation function can also be said to be the degree of adaptation of the model to the data input to the model. It can be said that a high evaluation indicated by the evaluation function means that the degree of fit of the model to the input data is high.

Therefore, the evaluation function can also be called a function that indicates the degree of fit of the model to the data input to the model. For example, the evaluation function value when training data is input to a model can be said to indicate the degree of adaptation of the model to the training data.

In the following, a case will be explained in which the learning device 100 uses a function such as an error function or a cross entropy loss function, in which a smaller value indicates a higher evaluation, as an evaluation function. In this case, the smaller the value of the evaluation function, the better the model fits the input data.

Further, in the following, a case where the minimum value of the evaluation function value is 0 will be explained as an example.
The error here may be a value expressed as "1-accuracy". Alternatively, the loss or error here may be a value based on the distance between the output value of the model and the correct value, such as L1 loss or L2 loss.
However, the evaluation function used by the learning unit 193 is not limited to a specific one. For example, the learning unit 193 may use, as the evaluation function, a function in which a larger evaluation function value indicates a higher evaluation.
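The evaluation functions mentioned above, for which a smaller value indicates a higher evaluation and the minimum is 0, can be sketched as follows. The function names and the choice to average the L1 and L2 losses over the elements are assumptions for this illustration.

```python
def error_rate(predictions, answers):
    """Error expressed as 1 - accuracy."""
    correct = sum(1 for p, a in zip(predictions, answers) if p == a)
    return 1.0 - correct / len(answers)


def l1_loss(outputs, targets):
    """Mean absolute distance between model outputs and correct values."""
    return sum(abs(o - t) for o, t in zip(outputs, targets)) / len(targets)


def l2_loss(outputs, targets):
    """Mean squared distance between model outputs and correct values."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


example_error = error_rate([1, 0, 1, 1], [1, 1, 1, 1])   # one of four is wrong
example_l1 = l1_loss([1.0, 3.0], [2.0, 1.0])
example_l2 = l2_loss([1.0, 3.0], [2.0, 1.0])
```

All three return 0 when every output matches its correct value, which matches the case described here where the minimum evaluation function value is 0.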

The reference value here is a value representing the standard of how much estimation accuracy can be obtained for training data in model learning. The reference value can also be called a value that specifies the degree of adaptation of the model to the training data. In other words, the reference value can also be said to represent a criterion for preventing overfitting. The closer the reference value is to 0, the more suited the model is to the training data.

The learning unit 193 performs model learning. In particular, the learning unit 193 performs model learning to bring the evaluation function value as close to the reference value as possible.
The learning unit 193 corresponds to an example of learning means.

The reference setting unit 192 sets, as the reference value, an evaluation function value indicating that the degree of fit of the model to the training data is kept below the maximum degree of fit. The smaller the evaluation function value, the higher the degree of fit. If the minimum value of the evaluation function value is 0, the reference setting unit 192 sets a value larger than 0 as the reference value of the evaluation function value. The learning unit 193 performs model learning to bring the evaluation function value as close to the reference value as possible.

The reference value set by the reference setting unit 192 is used as a hyperparameter for adjusting the breadth of the search for parameter values. When the reference value is relatively small (that is, close to 0), the possibility that the learning unit 193, after reaching a local solution at which the evaluation function value equals or is close to the reference value, goes on to search for another local solution is considered to be relatively low. On the other hand, when the reference value is relatively large, the possibility that the learning unit 193 searches for another local solution after reaching such a local solution is considered to be relatively high.
The learning unit 193 may use a solution search method that determines the next search point probabilistically, such as a stochastic gradient method, so that it can search for another local solution after reaching a local solution in the search for parameter values.

It is expected that overfitting can be prevented by the learning unit 193 learning the model using the reference value of the evaluation function value. By preventing overfitting, it is expected that, for example, the learning unit 193 can obtain parameter values that are relatively robust against changes in parameter values. Here, being robust against changes in parameter values means that the evaluation function value does not increase much even if the parameter values change somewhat.

FIG. 3 is a diagram illustrating an example of robustness against changes in parameter values. The horizontal axis of the graph in FIG. 3 indicates parameter values. The vertical axis indicates the evaluation function value. FIG. 3 shows an example in which the evaluation function value becomes minimum when the value of the parameter w is w1.
A line L11 shows an example where the parameter value w1 is relatively robust against changes in the parameter value. On the other hand, line L12 shows an example where the parameter value w1 is not relatively robust against changes in the parameter value.

Comparing the case of line L11 with the case of line L12, the increase in the evaluation function value when the value of the parameter w changes somewhat is smaller in the case of line L11. From this, when learning progresses and new data is obtained, it is expected to be easier to obtain parameter values suited to the new data if the learning unit 193 has obtained the local solution of the parameter value indicated by line L11 than if it has obtained the local solution indicated by line L12.
In this way, parameter values that are robust to changes in parameter values are expected to have higher accuracy of the model as learning progresses than parameter values that are not robust to changes in parameter values.
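The notion of robustness against changes in parameter values can be illustrated numerically. The two quadratic curves below are hypothetical stand-ins for lines L11 and L12; the actual shapes in FIG. 3 are not given in the text, so the curvatures, the minimum position w1 = 1.0, and the shift of 0.5 are assumptions.

```python
def flat_curve(w, w1=1.0):
    """Hypothetical evaluation function with a wide minimum at w1 (like line L11)."""
    return 0.1 * (w - w1) ** 2


def sharp_curve(w, w1=1.0):
    """Hypothetical evaluation function with a narrow minimum at w1 (like line L12)."""
    return 10.0 * (w - w1) ** 2


def increase_under_shift(curve, w1, delta):
    """Worst-case rise of the evaluation function value when w moves by delta from w1."""
    return max(curve(w1 + delta), curve(w1 - delta)) - curve(w1)


flat_increase = increase_under_shift(flat_curve, 1.0, 0.5)
sharp_increase = increase_under_shift(sharp_curve, 1.0, 0.5)
```

Both curves reach the same minimum value at the same parameter value, but the same shift in w raises the evaluation function value far less on the flat curve, which is the sense of "robust" used here.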

It is expected that overfitting can be prevented by the learning unit 193 learning the model using the reference value of the evaluation function value. On the other hand, if the reference value is too large, it is possible that a model with a high evaluation indicated by the evaluation function value cannot be obtained, that is, the accuracy of the model obtained by learning may be reduced.

Regarding the setting of the reference value, the reference setting unit 192 sets the reference value based on the evaluation function value obtained through the learning up to that point.
For example, the reference setting unit 192 may set an evaluation function value obtained by applying confirmation data to the model in a past epoch as the reference value in a new epoch. As the evaluation function value obtained by applying data such as confirmation data to a model, the average of the evaluation function values obtained for the individual elements included in the data may be used.
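Averaging per-element evaluation function values to obtain a reference value can be sketched as follows. The helper names, the squared-error element loss, and the toy model are assumptions for this illustration.

```python
def mean_evaluation(model, data, element_loss):
    """Average of the evaluation function values obtained for each element of the data."""
    return sum(element_loss(model(x), y) for x, y in data) / len(data)


def squared_error(output, answer):
    """Hypothetical per-element evaluation function."""
    return (output - answer) ** 2


def past_epoch_model(x):
    """Hypothetical model as it stood at the end of a past epoch."""
    return 2.0 * x


confirmation = [(1.0, 2.0), (2.0, 5.0)]

# The confirmation value from the past epoch becomes the reference value for a new epoch.
reference_value = mean_evaluation(past_epoch_model, confirmation, squared_error)
```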

The epoch or one epoch here is one time of model learning that the learning unit 193 repeatedly performs using the same supervised sample data. The number of times an epoch is repeated is also referred to as the epoch number. An epoch corresponds to an example of one unit of learning of a model that is repeatedly performed by the learning unit 193.

When the learning unit 193 uses a loss function as the evaluation function, the evaluation function value obtained by applying training data to the model is also referred to as a training loss, the evaluation function value obtained by applying confirmation data to the model is also referred to as a confirmation loss, and the evaluation function value obtained by applying test data to the model is also referred to as a test loss.

When the learning unit 193 uses an error function as the evaluation function, the evaluation function value obtained by applying training data to the model is also referred to as a training error, the evaluation function value obtained by applying confirmation data to the model is also referred to as a confirmation error, and the evaluation function value obtained by applying test data to the model is also referred to as a test error.

The reference setting unit 192 may set any one of the training loss, confirmation loss, test loss, training error, confirmation error, or test error as the reference value of the evaluation function.
Alternatively, the reference setting unit 192 may set, as the reference value of the evaluation function, a value obtained by calculation based on any one of the training loss, confirmation loss, test loss, training error, confirmation error, or test error, such as a value obtained by multiplying the training loss by a predetermined coefficient.
Alternatively, the reference setting unit 192 may set, as the reference value of the evaluation function, a value obtained by calculation based on a combination of any two or more of the training loss, confirmation loss, test loss, training error, confirmation error, and test error, such as the average of the training loss and the confirmation loss.
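The two calculated variants described above can be sketched as follows. The function names, the coefficient 1.5, and the loss values are hypothetical; the text only states that a predetermined coefficient or an average may be used.

```python
def reference_from_scaled_loss(training_loss, coefficient):
    """Reference value obtained by multiplying the training loss by a predetermined coefficient."""
    return coefficient * training_loss


def reference_from_average(*indicator_values):
    """Reference value obtained as the average of two or more indicators."""
    if len(indicator_values) < 2:
        raise ValueError("at least two indicator values are required")
    return sum(indicator_values) / len(indicator_values)


training_loss = 0.2       # hypothetical value
confirmation_loss = 0.4   # hypothetical value

scaled_reference = reference_from_scaled_loss(training_loss, coefficient=1.5)
averaged_reference = reference_from_average(training_loss, confirmation_loss)
```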

The reference setting unit 192 may set the reference value by setting an evaluation function that includes the reference value. For example, the reference setting unit 192 may set the evaluation function J^*(g) shown by equation (1).

    J^*(g) = |J(g) − b| + b    (1)

g indicates the model to be learned. J(g) indicates the original evaluation function (an evaluation function that does not include the reference value). b indicates the reference value. "||" indicates an absolute value.
If J(g)≧b, then J^*(g)=J(g). In this case, learning by the learning unit 193 using the evaluation function J^*(g) searches for parameter values in the same way as when J(g) is used as the evaluation function.

On the other hand, when J(g)<b, J^*(g)=2b−J(g). In this case, the sign of the term "−J(g)" is negative, and the slope of the evaluation function J^*(g) is opposite to that of J(g).
When J(g) is used as the evaluation function in a solution search algorithm based on a gradient method such as error backpropagation, the learning unit 193 searches for parameter values so that the value of the evaluation function J(g) approaches the minimum value 0 as closely as possible. On the other hand, when J^*(g) is used as the evaluation function, the learning unit 193 searches for parameter values so as to bring the value of the evaluation function J(g) as close to the reference value b as possible.

In the portion of the domain of the evaluation function J(g) where the output value of J(g) is equal to or larger than the reference value b, the evaluation function J^*(g) outputs the same value as the output value of J(g). On the other hand, in the portion of the domain where the output value of J(g) is smaller than the reference value b, the evaluation function J^*(g) outputs a value greater than the output value of J(g).
The evaluation function J^*(g) is also referred to as a restricted evaluation function.
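The behavior of the restricted evaluation function, which equals J(g) when J(g)≧b and 2b−J(g) when J(g)<b, can be sketched as follows. The single-parameter evaluation function J(w) = (w − 3)², the learning rate, the step count, and the choice of a zero gradient exactly at J = b are assumptions made for this illustration.

```python
def restricted_evaluation(j_value, b):
    """Restricted evaluation function: J when J >= b, and 2b - J when J < b."""
    return abs(j_value - b) + b


def restricted_gradient(j_value, j_gradient, b):
    """Gradient of the restricted evaluation function, given the gradient of J."""
    if j_value > b:
        return j_gradient       # same slope as J
    if j_value < b:
        return -j_gradient      # slope opposite to that of J
    return 0.0                  # assumed subgradient choice exactly at J == b


def train_toward_reference(b, steps=200, learning_rate=0.05):
    """Gradient descent on a hypothetical evaluation function J(w) = (w - 3)^2."""
    w = 0.0
    for _ in range(steps):
        j_value = (w - 3.0) ** 2
        j_gradient = 2.0 * (w - 3.0)
        w -= learning_rate * restricted_gradient(j_value, j_gradient, b)
    return w, (w - 3.0) ** 2


_, j_with_reference = train_toward_reference(b=1.0)     # J settles near b
_, j_without_reference = train_toward_reference(b=0.0)  # ordinary minimization toward 0
```

With b = 1.0 the value of J does not keep decreasing: once it drops below b the sign of the update flips, so J hovers around the reference value instead of approaching 0.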

The reference setting unit 192 may set, as the reference value in the next epoch, the evaluation function value in an epoch that, among the epochs already executed by the learning unit 193, has an evaluation function value satisfying a predetermined criterion. At this time, the evaluation function value referred to for selecting the epoch and the evaluation function value set as the reference value may be evaluation function values for different data.

For example, the reference setting unit 192 may select an epoch in which the confirmation error satisfies a predetermined criterion. In this case, the predetermined criterion may be that, when the confirmation errors of the epochs are arranged in descending order of value, the rank of the epoch falls within a predetermined rank. The reference setting unit 192 may set the reference value based on the training error in the selected epoch. Further, the reference setting unit 192 may set, as the reference value, a value obtained by calculation using the training error, such as a value obtained by adding a predetermined value to the training error or a value obtained by subtracting a predetermined value from the training error.

For example, the reference setting unit 192 may set, as the reference value in the next epoch, the evaluation function value in the epoch with the smallest evaluation function value among the epochs already executed by the learning unit 193. At this time, the evaluation function value referred to for selecting the epoch and the evaluation function value set as the reference value may be evaluation function values for different data.

Furthermore, for example, the reference setting unit 192 may select the epoch with the smallest training error among the epochs already executed by the learning unit 193, and set the confirmation error in that epoch as the reference value in the next epoch.
In this case, because the epoch with the minimum training error is selected, the reference setting unit 192 can set the reference value by referring to a good learning result. Furthermore, since the confirmation error is generally considered to be larger than the training error, the reference setting unit 192 sets a relatively large reference value. By learning the model based on this reference value, the learning unit 193 is expected to be less prone to overfitting.

Alternatively, the reference setting unit 192 may select the epoch with the smallest confirmation error among the epochs already executed by the learning unit 193, and set the training error in that epoch as the reference value in the next epoch.
In epochs where the confirmation error is small, the generalization performance of the obtained model is expected to be relatively high. By setting the training error in the epoch with the minimum confirmation error as the reference value, the reference setting unit 192 sets as the reference value the training error at the time a model with relatively high generalization performance was obtained. By learning the model based on this reference value, the learning unit 193 is expected to find it easier to search for a solution that yields a model with relatively high generalization performance, and in this respect overfitting is expected to be avoided.
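The two epoch selection strategies described above can be sketched as follows. The per-epoch record layout and the error values are hypothetical.

```python
def reference_from_min_training_error(history):
    """Pick the epoch with the smallest training error; use its confirmation error."""
    best_epoch = min(history, key=lambda record: record["training_error"])
    return best_epoch["confirmation_error"]


def reference_from_min_confirmation_error(history):
    """Pick the epoch with the smallest confirmation error; use its training error."""
    best_epoch = min(history, key=lambda record: record["confirmation_error"])
    return best_epoch["training_error"]


# Hypothetical results of the epochs executed so far.
history = [
    {"training_error": 0.40, "confirmation_error": 0.45},
    {"training_error": 0.25, "confirmation_error": 0.30},
    {"training_error": 0.10, "confirmation_error": 0.35},
]

reference_a = reference_from_min_training_error(history)
reference_b = reference_from_min_confirmation_error(history)
```

In this example the first strategy selects the third epoch and yields the larger reference value, while the second selects the second epoch, where the confirmation error is lowest.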

Each time the learning unit 193 performs one epoch worth of learning, the reference setting unit 192 may determine whether or not to update the reference value.
FIG. 4 is a diagram showing a first example of a processing procedure in which the learning device 100 performs model learning.
In the process of FIG. 4, the learning unit 193 performs learning for the first epoch (step S101). In the first epoch of learning, there is no epoch that has been executed by the learning unit 193, and the reference setting unit 192 has not set a reference value. Therefore, the learning unit 193 performs model learning without setting a reference value. If no reference value is set, the learning unit 193 performs model learning so that the evaluation function value approaches the minimum value.

When the learning unit 193 finishes learning for the first epoch, the reference setting unit 192 sets a reference value for the evaluation function value based on the learning result for the first epoch (step S102).
The learning unit 193 performs learning for one epoch using the reference value set by the reference setting unit 192 (step S103).

Next, the reference setting unit 192 determines whether the index value in the epoch most recently executed by the learning unit 193 is the minimum value among the index values in the epochs executed by the learning unit 193 (step S104). As described above, the index value here may be any of the training loss, confirmation loss, test loss, training error, confirmation error, or test error. Furthermore, the index value that the reference setting unit 192 uses for the determination in step S104 may be different from the index value that is used to set the reference value.

If the reference setting unit 192 determines that the index value in the most recently executed epoch is the minimum value among the index values in the executed epochs (step S104: YES), the reference setting unit 192 updates the reference value to a value obtained from the learning result in the most recent epoch (step S111).

Next, the learning unit 193 determines whether a predetermined learning end condition is satisfied (step S112). The learning end condition here is a condition for the learning unit 193 to determine whether to end model learning. The learning end conditions here are not limited to specific conditions. For example, the learning end condition may be that the learning unit 193 has completed learning for a predetermined number of epochs. Alternatively, the learning termination condition may be that the error of the obtained model is less than or equal to a predetermined error threshold.

If the learning unit 193 determines that the learning end condition is not satisfied (step S112: NO), the process returns to step S103.
On the other hand, if the learning unit 193 determines that the learning end condition is satisfied (step S112: YES), the learning device 100 ends the process of FIG. 4.

On the other hand, if the reference setting unit 192 determines in step S104 that the index value in the epoch most recently executed by the learning unit 193 is not the minimum value among the index values in the epochs executed by the learning unit 193 (step S104: NO), the process proceeds to step S112.
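The flow of steps S101 to S112 in FIG. 4 can be sketched as follows. The callback-based interface, the use of the training error as the index value, the use of the confirmation error as the reference value, the fixed epoch count as the end condition, and the scripted epoch results are all assumptions made so the control flow can be followed without a real model.

```python
def run_fig4_procedure(train_one_epoch, index_of, reference_from, max_epochs):
    """Control flow of FIG. 4 with hypothetical callbacks standing in for real training."""
    result = train_one_epoch(reference=None)          # S101: first epoch, no reference value
    index_history = [index_of(result)]
    reference = reference_from(result)                # S102: set the reference value
    references_used = []
    epochs_done = 1
    while True:
        result = train_one_epoch(reference=reference)  # S103: one epoch using the reference
        references_used.append(reference)
        epochs_done += 1
        index = index_of(result)
        is_minimum = index <= min(index_history)      # S104: is the latest index the minimum?
        index_history.append(index)
        if is_minimum:
            reference = reference_from(result)        # S111: update the reference value
        if epochs_done >= max_epochs:                 # S112: learning end condition
            break
    return reference, references_used


# Scripted epoch results replace actual learning in this sketch.
scripted_results = iter([
    {"training_error": 0.50, "confirmation_error": 0.55},
    {"training_error": 0.30, "confirmation_error": 0.40},
    {"training_error": 0.35, "confirmation_error": 0.42},
    {"training_error": 0.20, "confirmation_error": 0.38},
])


def stub_train_one_epoch(reference):
    """Ignores the reference and returns the next scripted result."""
    return next(scripted_results)


final_reference, references_used = run_fig4_procedure(
    stub_train_one_epoch,
    index_of=lambda r: r["training_error"],
    reference_from=lambda r: r["confirmation_error"],
    max_epochs=4,
)
```

In the scripted run the third epoch does not improve the index value, so the reference value set after the second epoch is reused for the fourth epoch.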

In the early stages of model learning, the performance of the model may be unstable. Therefore, the learning unit 193 may perform learning using the reference value after learning has progressed to a certain extent. For example, as will be described later, the learning unit 193 may perform model learning so as to bring the evaluation function value closer to 0 without using the reference value until a predetermined reference value use start condition is satisfied. Then, after the reference value use start condition is satisfied, the learning unit 193 may perform model learning so as to bring the evaluation function value closer to the reference value. The reference value use start condition referred to here is a condition for the learning unit 193 to determine the timing to start learning using the reference value, or a condition for the learning unit 193 to determine whether or not to perform learning using the reference value.

Furthermore, the learning unit 193 may perform learning without using the reference value in the final stage of model learning to improve the performance of the model. For example, the learning unit 193 may perform learning for the last 100 epochs without using the reference value. For example, as will be described later, the learning unit 193 may perform model learning so as to bring the evaluation function value closer to the reference value until a predetermined reference value use end condition is satisfied. Then, after the reference value use end condition is satisfied, the learning unit 193 may perform model learning so as to bring the evaluation function value closer to 0 without using the reference value. The reference value use end condition referred to here is a condition by which the learning unit 193 determines the timing to end learning using the reference value, or a condition by which the learning unit 193 determines whether or not to end learning using the reference value.

FIG. 5 is a diagram showing an example of start conditions and end conditions for learning using reference values. The horizontal axis of the graph in FIG. 5 indicates the number of epochs. The vertical axis shows the error. Line L21 shows an example of training error. Line L22 shows an example of confirmation error.
When the training error becomes equal to or less than the threshold value Et, the reference setting unit 192 may set, as the reference value, the confirmation error in the epoch where the training error is the minimum. The learning unit 193 may perform learning without setting a reference value in epochs before the reference setting unit 192 sets the reference value, and may perform learning based on the reference value in epochs after the reference setting unit 192 has set the reference value.
Further, the learning unit 193 may perform learning without setting a reference value after the number of epochs reaches M, and may end the learning when the number of epochs reaches N. Both M and N here are positive integers, and M<N.
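
The schedule of FIG. 5 can be sketched as a small decision function. This is only an illustrative sketch under assumptions: the function name `learning_mode`, its return strings, and the 1-based epoch counting are hypothetical and do not appear in the embodiment; Et, M, and N correspond to the threshold and epoch counts described above.

```python
from typing import Sequence


def learning_mode(epoch: int, training_errors: Sequence[float],
                  e_t: float, m: int, n: int) -> str:
    """Decide how to run the given epoch (1-based) under a FIG. 5 style schedule.

    training_errors holds the training error of every epoch finished so far.
    The returned labels are hypothetical names used only in this sketch.
    """
    if not 0 < m < n:
        raise ValueError("M and N must be positive integers with M < N")
    if epoch > n:
        return "stop"                 # learning ends once N epochs are done
    if epoch > m:
        return "without_reference"    # final stage after epoch M
    # The reference value is used once the training error has reached Et.
    if any(err <= e_t for err in training_errors):
        return "with_reference"
    return "without_reference"        # early stage, model still unstable
```

For example, with Et = 0.5, M = 5, and N = 8, the function selects learning without a reference value while the training error stays above 0.5, learning with the reference value up to epoch 5, and learning without it again for epochs 6 to 8.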

FIG. 6 is a diagram illustrating a second example of a processing procedure in which the learning device 100 performs model learning.
In the process of FIG. 6, the learning unit 193 performs learning for one epoch without setting a reference value (step S201).
Next, the standard setting unit 192 determines whether a predetermined standard value use start condition is satisfied (step S202). The conditions for starting to use the reference value here are not limited to specific conditions. For example, as in the example of FIG. 5, the condition for starting to use the reference value may be that the training error is less than or equal to a predetermined threshold, but is not limited thereto.

If the standard setting unit 192 determines that the reference value use start condition is not satisfied (step S202: NO), the process returns to step S201.
On the other hand, if it is determined that the reference value use start condition is satisfied (step S202: YES), the reference setting unit 192 sets a reference value (step S211). Specifically, the standard setting unit 192 selects the epoch with the smallest index value among the epochs that have been executed by the learning unit 193, and sets the standard value based on the learning result for that epoch. As described above, the index value here may be any of training loss, validation loss, test loss, training error, validation error, or test error. Further, the index value used by the standard setting unit 192 to select an epoch and the index value used to set the standard value may be different.
The learning unit 193 performs learning for one epoch based on the reference value set by the reference setting unit 192 (step S212).

Next, the standard setting unit 192 determines whether the index value in the epoch most recently executed by the learning unit 193 is the minimum value among the index values in the epochs executed by the learning unit 193 (step S213). As described above, the index value here may be any of training loss, validation loss, test loss, training error, validation error, or test error. Furthermore, the index value that the standard setting unit 192 uses for the determination in step S213 may be different from the index value that is used for setting the reference value.

If the standard setting unit 192 determines that the index value in the most recently executed epoch is the minimum value among the index values in the executed epochs (step S213: YES), the standard setting unit 192 updates the reference value to the value obtained from the learning result in the most recent epoch (step S221).

Next, the learning unit 193 determines whether a predetermined reference value usage termination condition is satisfied (step S222). The reference value usage termination conditions here are not limited to specific conditions. For example, as in the example of FIG. 5, the reference value use termination condition may be that the learning unit 193 has completed learning for a predetermined number of epochs, but is not limited to this.

If the learning unit 193 determines that the reference value use end condition is not satisfied (step S222: NO), the learning unit 193 determines whether a predetermined learning end condition is satisfied (step S231). The learning end condition here is not limited to a specific condition. For example, the learning end condition may be that the learning unit 193 has completed learning for a predetermined number of epochs. Alternatively, the learning end condition may be that the error of the obtained model is less than or equal to a predetermined error threshold.

If the learning unit 193 determines that the learning end condition is not satisfied (step S231: NO), the process returns to step S212.
On the other hand, if the learning unit 193 determines that the learning end condition is satisfied (step S231: YES), the learning device 100 ends the process of FIG. 6.

On the other hand, if it is determined in step S222 that the reference value use end condition is satisfied (step S222: YES), the learning unit 193 determines whether a predetermined learning end condition is satisfied (step S241). The determination made by the learning unit 193 in step S241 is the same as that in step S231.

If it is determined that the learning end condition is not satisfied (step S241: NO), the learning unit 193 performs learning without setting a reference value for one epoch (step S251).
After step S251, the process returns to step S241.
On the other hand, if the learning unit 193 determines in step S241 that the learning end condition is satisfied (step S241: YES), the learning device 100 ends the process of FIG. 6.
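
The control flow of FIG. 6 can be sketched as follows. This is a minimal sketch under stated assumptions, not the embodiment's implementation: the callbacks `train_epoch`, `start_using`, `stop_using`, and `finished` are hypothetical stand-ins for one epoch of learning and for the three conditions; `train_epoch` is assumed to return a pair of (index value, value from which a reference value would be set); and the sketch assumes that the process proceeds to step S222 when step S213 is NO, and to step S231 when step S222 is NO.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical callback: runs one epoch with the given reference value
# (None = no reference value) and returns
# (index value, value from which a reference value would be set).
EpochFn = Callable[[Optional[float]], Tuple[float, float]]


def run_fig6_procedure(
    train_epoch: EpochFn,
    start_using: Callable[[List[float]], bool],  # reference value use start condition
    stop_using: Callable[[int], bool],           # reference value use end condition
    finished: Callable[[int], bool],             # learning end condition
) -> List[Optional[float]]:
    """Return the reference value applied in each epoch (None = none applied)."""
    applied: List[Optional[float]] = []
    index_values: List[float] = []
    candidates: List[float] = []

    # S201/S202: learn without a reference value until the start condition holds.
    while True:
        index, candidate = train_epoch(None)
        applied.append(None)
        index_values.append(index)
        candidates.append(candidate)
        if start_using(index_values):
            break

    # S211: take the reference value from the epoch with the smallest index value.
    best = min(range(len(index_values)), key=index_values.__getitem__)
    reference = candidates[best]

    while True:
        # S212: one epoch of learning based on the reference value.
        index, candidate = train_epoch(reference)
        applied.append(reference)
        index_values.append(index)
        # S213/S221: update the reference value when a new minimum appears.
        if index == min(index_values):
            reference = candidate
        # S222: leave this phase once the use end condition holds.
        if stop_using(len(applied)):
            break
        # S231: otherwise end learning here if the learning end condition holds.
        if finished(len(applied)):
            return applied

    # S241/S251: finish with epochs that do not use a reference value.
    while not finished(len(applied)):
        train_epoch(None)
        applied.append(None)
    return applied
```

The returned list makes the three phases visible: entries of None at the beginning and end correspond to learning without a reference value, and the values in between show when the reference value was set and updated.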

After the learning unit 193 performs learning for a predetermined number of epochs, the standard setting unit 192 may set a reference value, and the learning unit 193 may further perform learning for a predetermined number of epochs based on the reference value. For example, the reference setting unit 192 may set the reference value after the learning unit 193 performs learning for 500 epochs. The learning unit 193 may further perform learning for 500 epochs based on the reference value.

In this case, the standard setting unit 192 may set the training loss at the epoch where the error based on the confirmation data is the minimum (that is, the epoch where the accuracy based on the confirmation data is maximum) as the reference value. It is considered that the epoch with the smallest error based on the confirmation data corresponds to the time before the training loss becomes 0 and overfitting occurs. It is expected that overfitting can be avoided by the standard setting unit 192 setting the training loss in this epoch as a standard value and the learning unit 193 performing learning based on the standard value.
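
The selection described above can be sketched as follows. The function name `reference_from_history` and the list-based history are assumptions made for illustration; the sketch simply returns the training loss recorded at the epoch whose error on the confirmation data is smallest.

```python
from typing import Sequence


def reference_from_history(training_losses: Sequence[float],
                           confirmation_errors: Sequence[float]) -> float:
    """Return the training loss at the epoch with the smallest confirmation error."""
    if not training_losses or len(training_losses) != len(confirmation_errors):
        raise ValueError("both histories must be non-empty and of equal length")
    # Index of the epoch where the error on the confirmation data is minimal.
    best_epoch = min(range(len(confirmation_errors)),
                     key=confirmation_errors.__getitem__)
    return training_losses[best_epoch]


# Hypothetical history: the training loss keeps falling, while the
# confirmation error starts rising again after the third epoch.
train = [1.0, 0.6, 0.3, 0.1, 0.02]
confirm = [0.5, 0.3, 0.2, 0.25, 0.4]
reference = reference_from_history(train, confirm)  # training loss of the third epoch
```

In this made-up history the reference value becomes 0.3 rather than a value near 0, which reflects the idea that the selected epoch precedes the point where the training loss approaches 0 and overfitting occurs.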

FIG. 7 is a diagram illustrating a third example of a processing procedure in which the learning device 100 performs model learning.
In the process of FIG. 7, the learning unit 193 performs learning without setting a reference value for a predetermined number of epochs (step S301).
Next, the standard setting unit 192 sets a standard value (step S302). Specifically, the standard setting unit 192 selects the epoch with the smallest index value among the epochs that have been executed by the learning unit 193, and sets the standard value based on the learning result for that epoch. As described above, the index value here may be any of training loss, validation loss, test loss, training error, validation error, or test error. Further, the index value used by the standard setting unit 192 to select an epoch and the index value used to set the standard value may be different.
The learning unit 193 performs learning for a predetermined number of epochs based on the reference value set by the reference setting unit 192 (step S303).
After step S303, the learning device 100 ends the process of FIG. 7.

Alternatively, after step S303, the learning device 100 may further set a reference value and perform learning based on the reference value. For example, as in step S302, the standard setting unit 192 selects the epoch with the smallest index value among the epochs that have been executed by the learning unit 193, and sets the standard value based on the learning result for that epoch. The learning unit 193 performs learning for a predetermined number of epochs based on the reference value set by the reference setting unit 192, as in step S303.
For example, the user may instruct the learning device 100 whether to further set a reference value and perform learning based on the reference value.
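
The procedure of FIG. 7, including the optional repetition after step S303, can be sketched as follows. The names `run_fig7_procedure`, `train_epoch`, `warmup_epochs`, `reference_epochs`, and `rounds` are hypothetical; `train_epoch` is assumed to run one epoch and return a pair of (index value, value from which a reference value would be set).

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical callback: one epoch of learning with the given reference value
# (None = no reference value); returns (index value, reference candidate).
EpochFn = Callable[[Optional[float]], Tuple[float, float]]


def run_fig7_procedure(train_epoch: EpochFn, warmup_epochs: int,
                       reference_epochs: int, rounds: int = 1) -> List[float]:
    """Return the reference value chosen in each round of steps S302 and S303."""
    if warmup_epochs < 1:
        raise ValueError("at least one epoch is needed before setting a reference value")
    history: List[Tuple[float, float]] = []
    # S301: learning without a reference value for a predetermined number of epochs.
    for _ in range(warmup_epochs):
        history.append(train_epoch(None))
    chosen: List[float] = []
    for _ in range(rounds):
        # S302: use the epoch with the smallest index value among executed epochs.
        _, reference = min(history, key=lambda pair: pair[0])
        chosen.append(reference)
        # S303: learning based on the reference value for a predetermined number of epochs.
        for _ in range(reference_epochs):
            history.append(train_epoch(reference))
    return chosen
```

Setting `rounds` to 1 corresponds to ending the process after step S303; a larger value corresponds to setting a reference value again and continuing learning, for example when the user instructs the learning device 100 to do so.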

As described above, the standard setting unit 192 sets a standard value for determining the estimation accuracy of the model in further machine learning, based on the evaluation function value indicating the estimation accuracy of the model generated in machine learning. The learning unit 193 performs model learning so that the output value of the evaluation function approaches the reference value.

According to the learning device 100, since the evaluation function value is used, it is possible to relatively easily set the hyperparameter value for adjusting the width of the search for the parameter value.
Furthermore, in the learning device 100, it is expected that overfitting can be avoided by the learning unit 193 learning the model based on the reference value.
Furthermore, in the learning device 100, the reference value, which corresponds to an example of a hyperparameter value for adjusting the breadth of the search for parameter values, only needs to be set to one value at a time. In particular, with the learning device 100, there is no need to set a plurality of reference values, perform model learning for each, and select one of the reference values based on the learning results. In this respect, the learning device 100 is expected to be able to relatively shorten the time required for learning without requiring computational resources such as parallel processing.
Further, according to the learning device 100, the standard setting unit 192 can set the standard value based on the evaluation function value obtained by learning, so that the standard value can be set according to the model and the learning situation. It is expected that relatively appropriate standard values can be set in this regard.

Further, the standard setting unit 192 sets the standard value to a value indicating that the degree of conformity of the model to the training data is more limited than the maximum degree of conformity among the values that the evaluation function can take.
According to the learning device 100, the possibility of overfitting can be reduced by learning using the reference value.

Further, the standard setting unit 192 sets a standard value based on an evaluation function value obtained by applying confirmation data, which is data other than data for updating model parameter values (training data), to the model.
For example, in epochs where the evaluation function value obtained by applying data other than training data to the model, such as the confirmation error, is small, the generalization performance of the obtained model is expected to be relatively high. By setting the training error in the epoch with the minimum confirmation error as the standard value, the standard setting unit 192 sets, as the standard value, the training error at the time a model with relatively high generalization performance was obtained. By learning the model based on this reference value, the learning unit 193 is expected to more easily find a solution that yields a model with relatively high generalization performance, and in this respect, overfitting is expected to be avoided.

Further, after one epoch of the learning of the model that is repeatedly performed by the learning unit 193 is completed, the standard setting unit 192 determines whether the learning in that epoch yielded a learning result with a smaller evaluation function value than the learning result used for setting the standard value. If it is determined that a learning result with a smaller evaluation function value has been obtained, the standard setting unit 192 updates the standard value based on the learning result in that epoch.
Thereby, the standard setting unit 192 can update the standard value as learning of the model by the learning unit 193 progresses, and it is expected that an appropriate standard value can be set according to the progress of learning.

Furthermore, the learning unit 193 performs model learning to bring the evaluation function value as close to 0 as possible until a predetermined reference value use start condition is satisfied.
Thereby, the standard setting unit 192 can set the standard value after learning has progressed to a certain extent and the accuracy of the model has stabilized. According to the learning device 100, it is expected that the standard setting unit 192 can set an appropriate standard value in this regard.

Further, the learning unit 193 performs model learning so as to bring the evaluation function value as close to 0 as possible after a predetermined reference value usage termination condition is satisfied.
According to the learning device 100, learning can be performed without setting a reference value at the final stage of model learning, and in this respect it is expected that the accuracy of the model can be made relatively high.

Further, the standard setting unit 192 selects one epoch, based on the learning results, from among the predetermined number of epochs of learning of the model performed by the learning unit 193, and sets the standard value based on the evaluation function value shown in the learning result for the selected epoch.
According to the learning device 100, the standard setting unit 192 only needs to set the standard value once, and in this respect, it is expected that the time required for learning can be relatively shortened without requiring computational resources such as parallel processing. Further, since the standard setting unit 192 sets the standard value at a stage when the learning unit 193 has progressed to a certain extent in learning the model, it is expected that the standard setting unit 192 can set an appropriate standard value.

Further, the standard setting unit 192 generates a restricted evaluation function. In the portion of the evaluation function's domain where the output value of the evaluation function is equal to or larger than the reference value, the restricted evaluation function outputs the same value as the evaluation function. In the portion where the output value of the evaluation function is smaller than the reference value, the restricted evaluation function outputs a value larger than the reference value. The learning unit 193 performs model learning using the restricted evaluation function generated by the standard setting unit 192.
In the learning device 100, the reference value can be included in the restricted evaluation function, and the learning unit 193 does not need to refer to the reference value separately from the evaluation function. According to the learning device 100, in this respect, it is expected that the load on the learning unit 193 is relatively small.
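
One function that satisfies the two properties of the restricted evaluation function is a reflection of the evaluation function value about the reference value. The sketch below is an assumption made for illustration; the embodiment only states the two properties, and the name `restricted_evaluation` and the choice of reflection are hypothetical.

```python
def restricted_evaluation(value: float, reference: float) -> float:
    """Reflect an evaluation function value about the reference value.

    At or above the reference value, the evaluation function value is returned
    unchanged. Below it, the value 2 * reference - value is returned, which is
    always larger than the reference value.
    """
    if value >= reference:
        return value
    return 2.0 * reference - value


# With a reference value of 0.25, an evaluation function value of 0.75 is kept,
# while a value of 0.125 (below the reference) is mapped to 0.375.
kept = restricted_evaluation(0.75, 0.25)
reflected = restricted_evaluation(0.125, 0.25)
```

Because the reference value is built into the function, minimizing the restricted evaluation function drives the evaluation function value toward the reference value without the learning step having to consult the reference value separately.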

FIG. 8 is a diagram illustrating an example of a configuration of an estimation device according to an embodiment.
With the configuration shown in FIG. 8, the estimation device 200 includes a communication section 210, a display section 220, an operation input section 230, a storage section 280, and a control section 290. The storage unit 280 includes a model storage unit 181. The control unit 290 includes a data acquisition unit 291 and an estimation unit 292.

The estimation device 200 performs estimation using the model learned by the learning device 100. The use of estimation device 200 is not limited to a specific use. For example, the estimation device 200 may be configured as a face authentication device and use a model to calculate the degree of similarity between a face image to be authenticated and a registered face image. Alternatively, the estimation device 200 may input a given sentence into a model and estimate the emotion indicated by the sentence. In this way, the estimation device 200 can be applied to various fields such as computer vision or natural language processing.

The estimation device 200 may be configured using a computer such as a personal computer or a workstation. Estimation device 200 may be configured using the computer used as learning device 100. Alternatively, the estimation device 200 may be configured using a computer different from the computer used as the learning device 100.

The communication unit 210 communicates with other devices. For example, the communication unit 210 may communicate with another device to receive estimation target data.
The display unit 220 includes a display screen such as a liquid crystal panel or an LED panel, and displays various images. For example, the display unit 220 may display the estimation result by the estimation device 200.
The operation input unit 230 includes input devices such as a keyboard and a mouse, and receives user operations. For example, the operation input unit 230 may accept a user operation instructing the start of estimation.

The storage unit 280 stores various data. The storage unit 280 is configured using a storage device included in the estimation device 200.
The model storage unit 181 stores a model learned by the learning device 100. In this respect, the model storage unit 181 of the estimation device 200 stores the same model as the model storage unit 181 of the learning device 100. Therefore, in FIG. 8, 181 is used as the reference sign for the model storage unit, as in the case of FIG.

Alternatively, if the model to be learned by the learning device 100 is configured as a separate device from the learning device 100, the model used by the estimation device 200 may also be configured as a separate device from the estimation device 200. In this case, the storage unit 280 may be configured without the model storage unit 181.

The control unit 290 controls each unit of the estimation device 200 to perform various processes. The functions of the control unit 290 may be executed by, for example, a CPU included in the estimation device 200 reading a program from the storage unit 280 and executing it.

The data acquisition unit 291 acquires data to be estimated. For example, when the communication unit 210 receives data to be estimated from another device, the data acquisition unit 291 extracts the data to be estimated from the data received by the communication unit 210.
The estimation unit 292 performs estimation on the estimation target acquired by the data acquisition unit 291. The estimation unit 292 inputs the estimation target data obtained by the data acquisition unit 291 into a model stored in the model storage unit 181, and obtains the output of the model as an estimation result.

As described above, the estimating unit 292 calculates the estimated value regarding the estimation target using the learned model obtained by learning the model by the learning device 100.
The learned model obtained through model learning by the learning device 100 corresponds to an example of a model updated by machine learning that brings the output value of the evaluation function closer to a reference value, where the reference value is set based on the value of the evaluation function indicating the evaluation of the estimation accuracy of the model and determines how much estimation accuracy of the model is to be obtained through further machine learning.
The model used by the estimation device 200 is expected not to be overfitted. In this respect, the estimation device 200 is expected to be able to perform estimation with high accuracy.

FIG. 9 is a diagram showing another example of the configuration of the learning device according to the embodiment. With the configuration shown in FIG. 9, the learning device 610 includes a reference setting section 611 and a learning section 612.
With this configuration, the standard setting unit 611 sets, based on the value of the evaluation function that indicates the evaluation of the estimation accuracy of the model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained through further machine learning. The learning unit 612 updates the model by performing machine learning so as to bring the output value of the evaluation function closer to the reference value.
The standard setting unit 611 corresponds to an example of reference setting means. The learning unit 612 corresponds to an example of learning means.

In the learning device 610, it is expected that overfitting can be avoided by the learning unit 612 learning the model based on the reference value.
Furthermore, in the learning device 610, the reference value, which corresponds to an example of a hyperparameter value for adjusting the breadth of the search for parameter values, only needs to be set to one value at a time. In particular, in the learning device 610, there is no need to set a plurality of reference values, perform model learning for each, and select one of the reference values based on the learning results. In this respect, the learning device 610 is expected to be able to relatively shorten the time required for learning without requiring computational resources such as parallel processing.
Further, according to the learning device 610, the standard setting unit 611 can set the standard value based on the evaluation function value obtained by learning, so that the standard value can be set according to the model and the learning situation. It is expected that relatively appropriate standard values can be set in this regard.

The standard setting unit 611 can be realized using the functions of the standard setting unit 192 in FIG. 1, for example. The learning unit 612 can be realized using the functions of the learning unit 193 in FIG. 1, for example.

FIG. 10 is a diagram illustrating an example of a processing procedure in the learning method according to the embodiment.
The learning method shown in FIG. 10 includes setting a standard (step S611) and performing learning (step S612).
In setting the standard (step S611), the computer sets, based on the value of the evaluation function that indicates the evaluation of the estimation accuracy of the model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained through further machine learning.
In performing learning (step S612), the computer performs learning of the model so that the output value of the evaluation function approaches the reference value.
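
The effect of step S612 can be illustrated on a toy problem. The sketch below is hypothetical throughout: it assumes a one-parameter model whose evaluation function is J(w) = (w - 3)^2, plain gradient descent, and a reference value passed in as the result of step S611. When J is at or above the reference value the parameter is updated to decrease J, and when J is below the reference value the update direction is reversed, which is one simple way to make the output value of the evaluation function approach the reference value.

```python
def evaluation(w: float) -> float:
    """Toy evaluation function J(w) = (w - 3)^2 (an assumption of this sketch)."""
    return (w - 3.0) ** 2


def learn_toward_reference(w: float, reference: float,
                           learning_rate: float = 0.01,
                           steps: int = 200) -> float:
    """Update w so that J(w) approaches the reference value instead of 0."""
    for _ in range(steps):
        grad_j = 2.0 * (w - 3.0)            # derivative of J with respect to w
        # Descend while J is at or above the reference value, ascend below it.
        direction = 1.0 if evaluation(w) >= reference else -1.0
        w -= learning_rate * direction * grad_j
    return w


# S611 is represented here simply by choosing a reference value of 1.0.
w_with_reference = learn_toward_reference(0.0, reference=1.0)
# A reference value of 0 reduces to ordinary minimization of J.
w_plain = learn_toward_reference(0.0, reference=0.0)
```

In this toy setting, learning with the reference value leaves J(w) hovering near 1.0, whereas learning with a reference value of 0 continues to push J(w) toward 0.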

The learning method shown in FIG. 10 is expected to avoid overfitting by learning the model based on reference values.
Furthermore, in the learning method shown in FIG. 10, the reference value, which corresponds to an example of a hyperparameter value for adjusting the breadth of the search for parameter values, only needs to be set to one value at a time. In particular, with the learning method shown in FIG. 10, there is no need to set a plurality of reference values, perform model learning for each, and select one of the reference values based on the learning results. In this respect, the learning method shown in FIG. 10 is expected to be able to relatively shorten the time required for learning without requiring computational resources such as parallel processing.
Further, according to the learning method shown in FIG. 10, by setting the reference value based on the evaluation function value obtained in learning, the reference value can be set according to the model and the learning situation. It is expected that relatively appropriate standard values can be set.

FIG. 11 is a schematic block diagram showing the configuration of a computer according to at least one embodiment.
With the configuration shown in FIG. 11, the computer 700 includes a CPU 710, a main storage device 720, an auxiliary storage device 730, an interface 740, and a nonvolatile recording medium 750.

Any one or more of the learning device 100, estimation device 200, and learning device 610 described above, or a portion thereof, may be implemented in the computer 700. In that case, the operations of each processing section described above are stored in the auxiliary storage device 730 in the form of a program. The CPU 710 reads the program from the auxiliary storage device 730, expands it to the main storage device 720, and executes the above processing according to the program. Further, the CPU 710 secures storage areas corresponding to each of the above-mentioned storage units in the main storage device 720 according to the program. Communication between each device and other devices is performed by the interface 740 having a communication function and performing communication under the control of the CPU 710.

When the learning device 100 is installed in the computer 700, the operation of the control unit 190 and each part thereof is stored in the auxiliary storage device 730 in the form of a program. The CPU 710 reads the program from the auxiliary storage device 730, expands it to the main storage device 720, and executes the above processing according to the program.

Further, the CPU 710 secures storage areas corresponding to the storage unit 180 and each unit thereof in the main storage device 720 according to the program. The communication performed by the communication unit 110 is performed by the interface 740 having a communication function and performing communication under the control of the CPU 710. The display of the image by the display unit 120 is performed by the interface 740 having a display device and displaying the image under the control of the CPU 710. Acceptance of a user operation by the operation input unit 130 is executed by the interface 740 having an input device and accepting the user operation.

When the estimation device 200 is installed in the computer 700, the operation of the control unit 290 and each part thereof is stored in the auxiliary storage device 730 in the form of a program. The CPU 710 reads the program from the auxiliary storage device 730, expands it to the main storage device 720, and executes the above processing according to the program.

Further, the CPU 710 reserves storage areas corresponding to the storage section 280 and each section thereof in the main storage device 720 according to the program. The communication performed by the communication unit 210 is performed by the interface 740 having a communication function and performing communication under the control of the CPU 710. The image display performed by the display unit 220 is performed by the interface 740 having a display device and displaying the image under the control of the CPU 710. Acceptance of a user operation by the operation input unit 230 is executed by the interface 740 having an input device and accepting the user operation.

When the learning device 610 is installed in the computer 700, the operations of the standard setting section 611 and the learning section 612 are stored in the auxiliary storage device 730 in the form of a program. The CPU 710 reads the program from the auxiliary storage device 730, expands it to the main storage device 720, and executes the above processing according to the program.

Further, the CPU 710 secures a storage area in the main storage device 720 for the learning device 610 to perform processing according to the program. Communication between the learning device 610 and other devices is performed by the interface 740 having a communication function and operating under the control of the CPU 710. Interaction between the learning device 610 and the user is performed by the interface 740 having a display device and an input device, displaying various images under the control of the CPU 710, and accepting user operations.

Any one or more of the programs described above may be recorded on the nonvolatile recording medium 750. In this case, the interface 740 may read the program from the nonvolatile recording medium 750. Then, the CPU 710 may directly execute the program read by the interface 740, or may temporarily store the program in the main storage device 720 or the auxiliary storage device 730 and execute it.

Note that a program for executing all or part of the processing performed by the learning device 100, the estimation device 200, and the learning device 610 may be recorded on a computer-readable recording medium, and the processing of each unit may be performed by loading the program recorded on this recording medium into a computer system and executing it. Note that the "computer system" herein includes an OS (Operating System) and hardware such as peripheral devices.
Furthermore, "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROM (Read Only Memory), and CD-ROM (Compact Disc Read Only Memory), and to storage devices such as hard disks built into computer systems. Further, the above-mentioned program may be one for realizing a part of the above-mentioned functions, or may be one that can realize the above-mentioned functions in combination with a program already recorded in the computer system.

Although the embodiments of the present invention have been described above in detail with reference to the drawings, the specific configuration is not limited to these embodiments, and includes designs within the scope of the gist of the present invention.

Part or all of the above embodiments may be described as in the following supplementary notes, but are not limited to the following.

(Supplementary note 1)
A learning device comprising:
reference setting means for setting, based on the value of an evaluation function indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained in further machine learning; and
learning means for updating the model by performing machine learning so as to bring the output value of the evaluation function closer to the reference value.

(Supplementary note 2)
The reference setting means sets the reference value to a value that, among the values that the evaluation function can take, indicates that the degree of conformity of the model to training data, which is data for updating parameter values of the model, is more limited than the maximum degree of conformity.
The learning device according to Supplementary note 1.

(Supplementary note 3)
The reference setting means sets the reference value based on an evaluation function value obtained by applying, to the model, confirmation data, which is data other than the data for updating parameter values of the model.
The learning device according to Supplementary note 1 or 2.

(Supplementary note 4)
After one unit of learning of the model, which is repeatedly performed by the learning means, is completed, the reference setting means determines, based on the evaluation function value, whether that unit of learning has yielded a learning result indicating a higher degree of fit of the model to the input data than the learning result used for setting the reference value, and, if it determines that such a learning result has been obtained, updates the reference value based on the result of that unit of learning.
The learning device according to any one of Supplementary notes 1 to 3.
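
The update rule in the note above can be sketched in Python as follows. This is only an illustration under stated assumptions: the evaluation function is treated as a loss (a smaller value means a higher degree of fit), one unit of learning yields one evaluation value, and the reference value is simply set equal to the best evaluation value seen so far. The name `track_reference` is hypothetical and does not appear in the embodiments.

```python
from typing import Iterable, List, Optional


def track_reference(values: Iterable[float]) -> List[float]:
    """Return the reference value in effect after each unit of learning.

    Assumptions (illustrative only): smaller evaluation values mean a
    higher degree of fit, and the reference value is set equal to the
    evaluation value of the best unit of learning so far.
    """
    reference: Optional[float] = None
    history: List[float] = []
    for value in values:
        # Update only when this unit of learning shows a higher degree of
        # fit than the result that produced the current reference value.
        if reference is None or value < reference:
            reference = value
        history.append(reference)
    return history


history = track_reference([0.9, 0.7, 0.8, 0.6, 0.65])
# history == [0.9, 0.7, 0.7, 0.6, 0.6]
```

The reference value stays unchanged after the third and fifth units because neither of them improves on the result that set the current reference value.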

(Supplementary note 5)
Until a predetermined reference value use start condition is satisfied, the learning means trains the model so that the output value of the evaluation function approaches an evaluation function value indicating the maximum degree of fit.
The learning device according to Supplementary note 4.

(Supplementary note 6)
After a predetermined reference value use end condition is satisfied, the learning means trains the model so that the output value of the evaluation function approaches an evaluation function value indicating the maximum degree of fit.
The learning device according to Supplementary note 4 or 5.
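
As a rough sketch of how the use start and use end conditions in the two notes above might gate the reference value, the following Python snippet picks the target that the evaluation function output should approach at each epoch. It assumes that the evaluation function is a non-negative loss whose value 0.0 indicates the maximum degree of fit, and that both conditions are simple epoch thresholds. The names `training_target`, `start_epoch`, `end_epoch`, and `max_fit_value` are hypothetical and are not taken from the embodiments.

```python
def training_target(epoch: int, reference: float,
                    start_epoch: int, end_epoch: int,
                    max_fit_value: float = 0.0) -> float:
    """Return the value the evaluation function output should approach.

    Assumptions (illustrative only): the use start and use end conditions
    are epoch thresholds, and max_fit_value is the evaluation function
    value that indicates the maximum degree of fit.
    """
    if start_epoch <= epoch < end_epoch:
        # The reference value is in use: aim for the reference value.
        return reference
    # Before the start condition or after the end condition: aim for the
    # value indicating the maximum degree of fit.
    return max_fit_value


targets = [training_target(e, reference=0.5, start_epoch=2, end_epoch=4)
           for e in range(6)]
# targets == [0.0, 0.0, 0.5, 0.5, 0.0, 0.0]
```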

(Supplementary note 7)
The reference setting means selects, based on the learning results, one of the epochs from the learning of the model performed by the learning means for a predetermined number of epochs, and sets the reference value based on the evaluation function value shown in the learning result of the selected epoch.
The learning device according to any one of Supplementary notes 1 to 3.
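
One possible reading of the epoch selection in the note above is sketched below in Python. The selection criterion and the mapping from the selected epoch to the reference value are assumptions made for illustration: the epoch with the smallest loss on the confirmation data is selected, and the training loss recorded at that epoch is used as the reference value. The function name `select_reference` is hypothetical.

```python
from typing import Sequence, Tuple


def select_reference(train_losses: Sequence[float],
                     confirm_losses: Sequence[float]) -> Tuple[int, float]:
    """Select an epoch and derive a reference value from it.

    Assumptions (illustrative only): losses are recorded once per epoch,
    the epoch with the smallest confirmation loss is selected, and the
    training loss at that epoch becomes the reference value.
    """
    if not train_losses or len(train_losses) != len(confirm_losses):
        raise ValueError("expected two non-empty sequences of equal length")
    # Index of the epoch with the smallest confirmation loss
    # (the earliest epoch wins if there is a tie).
    best_epoch = min(range(len(confirm_losses)),
                     key=lambda i: confirm_losses[i])
    return best_epoch, train_losses[best_epoch]


epoch, reference = select_reference(
    train_losses=[0.875, 0.5, 0.25, 0.125],
    confirm_losses=[1.0, 0.75, 0.5, 0.625],
)
# epoch == 2, reference == 0.25
```

In this toy run the training loss keeps falling after epoch 2 while the confirmation loss rises, so the reference value stops the model from being fitted to the training data beyond the level reached at epoch 2.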

(Supplementary note 8)
The reference setting means generates a restricted evaluation function that, in the part of the domain of the evaluation function where the output value of the evaluation function is equal to the reference value or indicates that the degree of fit of the model to training data, which is data for updating parameter values of the model, is lower than the reference value, outputs the same value as that output value, and, in the part where the output value of the evaluation function indicates that the degree of fit is higher than the reference value, outputs a value indicating that the degree of fit is lower than the reference value.
The learning means trains the model using the restricted evaluation function.
The learning device according to any one of Supplementary notes 1 to 7.
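
A minimal Python sketch of a restricted evaluation function of the kind described in the note above is given here. It assumes that the evaluation function is a loss (a smaller value means a higher degree of fit) and, as one possible choice only, reflects values below the reference value to the other side of it. This reflection has the same form as the "flooding" regularizer, |L - b| + b, and is used purely for illustration; the note itself does not fix a particular form. The name `make_restricted_loss` is hypothetical.

```python
from typing import Callable


def make_restricted_loss(reference: float) -> Callable[[float], float]:
    """Build a restricted version of a loss for a given reference value.

    Assumption (illustrative only): smaller loss values mean a higher
    degree of fit of the model to the training data.
    """
    def restricted(loss_value: float) -> float:
        if loss_value >= reference:
            # Fit is equal to or lower than the reference: pass through.
            return loss_value
        # Fit is higher than the reference: reflect the value so that the
        # output indicates a fit lower than the reference.
        return 2.0 * reference - loss_value

    return restricted


restricted = make_restricted_loss(reference=0.5)
values = [restricted(v) for v in (0.75, 0.5, 0.25)]
# values == [0.75, 0.5, 0.75]
```

Minimizing such a restricted function pushes the loss down while it is above the reference value and back up once it falls below, so the loss settles near the reference value rather than near its minimum.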

(Supplementary note 9)
An estimation device comprising an estimation unit that calculates an estimated value regarding an estimation target using a model updated by machine learning that brings the output value of an evaluation function closer to a reference value, the reference value being set based on a value of the evaluation function indicating an evaluation of the estimation accuracy of the model generated in machine learning and specifying how much estimation accuracy of the model is to be obtained in further machine learning.

(Supplementary note 10)
A learning method comprising, by a computer:
setting, based on a value of an evaluation function indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained in further machine learning; and
training the model so that the output value of the evaluation function approaches the reference value.

(Supplementary note 11)
A recording medium recording a program for causing a computer to execute:
setting, based on a value of an evaluation function indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained in further machine learning; and
training the model so that the output value of the evaluation function approaches the reference value.

The present invention may be applied to a learning device, an estimation device, a learning method, and a recording medium.

100, 610 Learning device
110, 210 Communication section
120, 220 Display section
130, 230 Operation input section
180, 280 Storage section
181 Model storage section
190, 290 Control section
191, 291 Data acquisition section
192, 611 Reference setting section
193, 612 Learning section
292 Estimation section

Claims

1. A learning device comprising:
reference setting means for setting, based on a value of an evaluation function indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained in further machine learning; and
learning means for updating the model by performing machine learning so as to bring the output value of the evaluation function closer to the reference value.

2. The learning device according to claim 1, wherein
the reference setting means sets the reference value to a value that, among the values the evaluation function can take, indicates a degree of fit of the model to training data, which is data for updating parameter values of the model, that is more limited than the maximum degree of fit.

3. The learning device according to claim 1 or 2, wherein
the reference setting means sets the reference value based on an evaluation function value obtained by applying, to the model, confirmation data, which is data other than the data for updating parameter values of the model.

4. The learning device according to any one of claims 1 to 3, wherein
after one unit of learning of the model, which is repeatedly performed by the learning means, is completed, the reference setting means determines, based on the evaluation function value, whether that unit of learning has yielded a learning result indicating a higher degree of fit of the model to the input data than the learning result used for setting the reference value, and, if it determines that such a learning result has been obtained, updates the reference value based on the result of that unit of learning.

5. The learning device according to claim 4, wherein
until a predetermined reference value use start condition is satisfied, the learning means trains the model so that the output value of the evaluation function approaches an evaluation function value indicating the maximum degree of fit.

6. The learning device according to claim 4 or 5, wherein
after a predetermined reference value use end condition is satisfied, the learning means trains the model so that the output value of the evaluation function approaches an evaluation function value indicating the maximum degree of fit.

7. The learning device according to any one of claims 1 to 3, wherein
the reference setting means selects, based on the learning results, one of the epochs from the learning of the model performed by the learning means for a predetermined number of epochs, and sets the reference value based on the evaluation function value shown in the learning result of the selected epoch.

8. The learning device according to any one of claims 1 to 7, wherein
the reference setting means generates a restricted evaluation function that, in the part of the domain of the evaluation function where the output value of the evaluation function is equal to the reference value or indicates that the degree of fit of the model to training data, which is data for updating parameter values of the model, is lower than the reference value, outputs the same value as that output value, and, in the part where the output value of the evaluation function indicates that the degree of fit is higher than the reference value, outputs a value indicating that the degree of fit is lower than the reference value, and
the learning means trains the model using the restricted evaluation function.

9. An estimation device comprising an estimation unit that calculates an estimated value regarding an estimation target using a model updated by machine learning that brings the output value of an evaluation function closer to a reference value, the reference value being set based on a value of the evaluation function indicating an evaluation of the estimation accuracy of the model generated in machine learning and specifying how much estimation accuracy of the model is to be obtained in further machine learning.

10. A learning method comprising, by a computer:
setting, based on a value of an evaluation function indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained in further machine learning; and
training the model so that the output value of the evaluation function approaches the reference value.

11. A recording medium recording a program for causing a computer to execute:
setting, based on a value of an evaluation function indicating an evaluation of the estimation accuracy of a model generated in machine learning, a reference value that determines how much estimation accuracy of the model is to be obtained in further machine learning; and
training the model so that the output value of the evaluation function approaches the reference value.