WO2023073967A1

WO2023073967A1 - Accuracy determination program, accuracy determination device, and accuracy determination method

Info

Publication number: WO2023073967A1
Application number: PCT/JP2021/040157
Authority: WO
Inventors: 顕一郎成田; 理史新宮
Original assignee: 富士通株式会社
Priority date: 2021-10-29
Filing date: 2021-10-29
Publication date: 2023-05-04

Abstract

An accuracy determination program causes a computer to execute a process in which on the basis of a second plurality of data pieces obtained by converting a first plurality of data pieces in accordance with a first rule, a first machine learning model is updated to generate a second machine learning model, a fourth plurality of data pieces obtained by converting a third plurality of data pieces in accordance with the first rule are inputted into the second machine learning model to acquire a prediction result, feature values established on the basis of parameters of the first machine learning model for each of the fourth plurality of data pieces are clustered to decide correct labels for each of the fourth plurality of data pieces, and the accuracy of the second machine learning model is determined on the basis of the prediction result and the correct labels. The computer can thereby show the effect that a fairness correction process has on the accuracy of a machine learning model.

Description

Accuracy Determination Program, Accuracy Determination Device, and Accuracy Determination Method

The present invention relates to accuracy determination technology.

In some cases, such as loan screening, machine learning models are used to support screening and assessment. However, a machine learning model may be trained on unfairly biased data and, as a result, make decisions that depend on, for example, gender.

Therefore, fairness correction processing is required to eliminate data with unfair biases and ensure the fairness of judgments by machine learning models.

Japanese Patent Publication No. 2018-513490; JP 2017-068710 A; U.S. Patent Application Publication No. 2018/0285772

However, fairness correction processing may degrade the prediction accuracy of a machine learning model. The prediction accuracy after fairness correction processing cannot be measured without test data to which correct labels have been attached manually. Consequently, while the machine learning model is in operation, the influence of the fairness correction processing on its accuracy is unknown, and it is impossible to judge whether the processing can be applied to the model.

In one aspect, we aim to show the accuracy impact of fairness correction processing on machine learning models.

In one aspect, the accuracy determination program causes a computer to execute a process of: updating a first machine learning model, based on a second plurality of data obtained by converting a first plurality of data according to a first rule, to generate a second machine learning model; obtaining a prediction result by inputting, to the second machine learning model, a fourth plurality of data obtained by converting a third plurality of data according to the first rule; determining a correct label for each of the fourth plurality of data by clustering feature amounts determined, for each of the fourth plurality of data, based on the parameters of the first machine learning model; and determining the accuracy of the second machine learning model based on the prediction result and the correct labels.

In one aspect, it is possible to show the accuracy impact of fairness correction processing on machine learning models.

FIG. 1 is a diagram showing an example of correction processing according to this embodiment.
FIG. 2 is a diagram illustrating a configuration example of the accuracy determination device 5 according to the first embodiment.
FIG. 3 is a diagram illustrating a configuration example of the classification system 1 according to the second embodiment.
FIG. 4 is a diagram showing an example of an accuracy determination method according to this embodiment.
FIG. 5 is a diagram showing a display example of prediction accuracy and fairness score according to the present embodiment.
FIG. 6 is a diagram showing an example of a prediction accuracy and fairness score output method according to the present embodiment.
FIG. 7 is a flowchart showing an example of the flow of accuracy determination processing according to this embodiment.
FIG. 8 is a diagram showing a hardware configuration example of the information processing apparatus 10 according to this embodiment.

Examples of the accuracy determination program, the accuracy determination device, and the accuracy determination method according to the present embodiment will be described below in detail based on the drawings. Note that the present embodiment is not limited by these examples. The examples can also be combined as appropriate to the extent that no contradiction arises.

First, unfair judgment by machine learning and its correction processing will be described. FIG. 1 is a diagram showing an example of correction processing according to this embodiment. The table on the left side of FIG. 1 shows an example of the judgment results of a machine learning model generated by machine learning using attributes A to D as input data, that is, feature amounts, and using the classification results "A" or "B" as correct labels.

Referring to the table on the left side of FIG. 1, the woman in No. 3 and the man in No. 5 receive different determination results even though attributes B to D, that is, all attributes other than gender, are identical. This indicates that the machine learning model made an unfair, gender-dependent judgment.

Therefore, as a fairness correction process, the input data is converted as shown in the table on the right side of FIG. 1, and the converted data is used to retrain the machine learning model. Note that the numerical values of attributes B to D, that is, the attributes other than the protected attribute A, may be converted according to a predetermined rule or may be changed randomly within the numerically possible range. Not all of the numerical values of attributes B to D need to be changed.

Also, the effect of the fairness correction process can be determined, for example, by the DI score, which is an example of the fairness score of the fairness correction process. The DI score can be calculated using the following formula (1).

The effect of the fairness correction process can be determined by calculating and comparing the fairness scores of the determination results before and after correction using formula (1).
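Formula (1) is not reproduced in this text. As an illustration only, the following sketch assumes the commonly used definition of disparate impact: the rate of favorable outcomes in the unprivileged group divided by the rate in the privileged group, so that a value near 1 indicates parity. The function name `di_score`, the group labels, the choice of "A" as the favorable outcome, and the sample data are all assumptions and are not taken from the publication.

```python
def di_score(protected, outcomes, unprivileged, privileged, favorable):
    """Disparate impact: P(favorable | unprivileged) / P(favorable | privileged)."""
    def favorable_rate(group):
        members = [o for p, o in zip(protected, outcomes) if p == group]
        if not members:
            raise ValueError(f"no records for group {group!r}")
        return sum(o == favorable for o in members) / len(members)

    privileged_rate = favorable_rate(privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return favorable_rate(unprivileged) / privileged_rate


# Hypothetical judgment results before and after the correction
gender = ["F", "F", "F", "F", "M", "M", "M", "M"]
before = ["B", "B", "B", "A", "A", "A", "B", "A"]
after = ["A", "B", "A", "A", "A", "A", "B", "A"]

di_before = di_score(gender, before, "F", "M", favorable="A")
di_after = di_score(gender, after, "F", "M", favorable="A")
```

Under this assumed definition, comparing `di_before` with `di_after` corresponds to the before-and-after comparison described above.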

Such fairness correction processing can correct a machine learning model that has been trained to make unfair judgments, and its effect can be confirmed. However, because the fairness correction processing alters the input data, the prediction accuracy of the machine learning model may deteriorate. In particular, when a machine learning model has been introduced into a system and is in operation, applying fairness correction processing can therefore have a large influence on the system. One conceivable way to deal with this problem is to observe an accuracy evaluation index such as ACC (accuracy) while the machine learning model is in operation, and thereby evaluate the prediction accuracy of the model in operation. However, since the correct labels for the input data are unknown during operation, it is difficult to evaluate the output of the machine learning model, and attaching correct labels manually is costly. Therefore, one object of the present embodiment is to show the effect of the fairness correction process on the accuracy of the machine learning model.

[Functional configuration of accuracy determination device 5]
First, with reference to FIG. 2, the functional configuration of the accuracy determination device 5 according to Example 1 of the present embodiment will be described. FIG. 2 is a diagram illustrating a configuration example of the accuracy determination device 5 according to the first embodiment. Accuracy determination device 5 indicates the impact of the fairness correction process on the accuracy of the machine learning model.

To show the influence on the accuracy of the machine learning model, the accuracy determination device 5 compares correct labels, determined by clustering feature amounts based on the parameters of the machine learning model, with the prediction results of a correction model generated by retraining the machine learning model with correction data. The accuracy determination device 5 includes a model storage unit 11, a correction unit 12, a classification unit 13, a generation unit 14, a label assignment unit 15, a determination unit 16, and a learning unit 17.

The model storage unit 11 stores machine learning models. More specifically, the model storage unit 11 stores neural network parameters of the machine learning model. The parameters include weights between neurons. Weights between neurons are updated by machine learning.

For example, as shown in the table on the right side of FIG. 1, the correction unit 12 converts the input data for the machine learning model using a correction filter created according to a predetermined rule to generate correction data. To correct a machine learning model that has been trained to make unfair judgments, the correction data is input to the machine learning model as input data or is used as training data to retrain the machine learning model.

The classification unit 13 classifies the correction data generated by converting the input data by the correction unit 12 based on the machine learning model stored in the model storage unit 11.

The generation unit 14 plots a point corresponding to each piece of correction data in a DT (Durable Topology) space, which is the feature amount space of the correction data, based on the output values of the neurons in the output layer of the machine learning model and the judgment results. Here, the DT space is a feature amount space of the correction data that has an axis corresponding to the output value of each neuron in the output layer. The generation unit 14 then clusters the points plotted in the DT space based on the density of each classification. Details of this processing in the DT space will be described later.

The label assigning unit 15 determines the label of each cluster from the clustering result by the generating unit 14, and assigns the determined label to the correction data corresponding to each point belonging to each cluster.

The determination unit 16 determines the prediction accuracy of the machine learning model based on the classification result of the classification unit 13, that is, the prediction result of the machine learning model for the correction data, and on the label assigned by the label assignment unit 15. By using the label assigned by the label assignment unit 15 as the correct label, the determination unit 16 can determine the influence of the fairness correction process on the prediction accuracy of the machine learning model.

The learning unit 17 retrains and updates the machine learning model stored in the model storage unit 11, using the correction data as the feature amount and the label given by the labeling unit 15 as the correct label.

Next, the functional configuration of the classification system 1 according to Example 2 of this embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating a configuration example of the classification system 1 according to the second embodiment. As shown in FIG. 3, the classification system 1 has an input sensor 2, a data storage device 3, a classification device 4, an accuracy determination device 5, and a display device 6.

The input sensor 2 is a sensor that acquires data to be classified. For example, when classifying images, the input sensor 2 is a camera.

The data storage device 3 stores the input data acquired by the input sensor 2. The data storage device 3 stores image data, for example.

The classification device 4 is a device that classifies each piece of input data stored in the data storage device 3 using an operation model. Here, the operation model refers to the machine learning model operated in the classification system 1. For example, the classification device 4 inputs an image of a person captured by a camera into the operation model, determines whether the person is wearing a uniform, and outputs the determination as the classification result. The classification device 4 may also transmit the classification result to the display device 6.

The accuracy determination device 5 duplicates the operation model in advance and stores it as a correction model in order to show the accuracy impact of the fairness correction process on the operation model (t1). The corrective model is a copy of the operational model only for the first time, but after that, it is retrained based on the corrective data and the parameters of the corrective model are updated. An example of the operation model corresponds to the first machine learning model, and an example of the corrective model corresponds to the second machine learning model.

Also, the accuracy determination device 5 performs fairness correction processing on the input data by passing the input data through a correction filter created according to a predetermined rule, for example, and generates correction data (t2). Correction data is generated for each input data. If there are a plurality of predetermined rules as a plurality of correction proposals, a correction filter is created according to each rule, and input data is passed through each correction filter to generate a plurality of correction data corresponding to each correction proposal. An example of the predetermined rule corresponds to the first rule.
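The generation of correction data through one correction filter per rule can be sketched as follows. This is a minimal illustration, assuming each record is a dictionary of attributes A to D with A as the protected attribute. The helper `make_correction_filter`, the proposal names, and the concrete rules are hypothetical; the publication does not specify the contents of any rule.

```python
def make_correction_filter(rule, protected="A"):
    """Build a filter that rewrites non-protected attributes according to `rule`."""
    if protected in rule:
        raise ValueError("the rule must not rewrite the protected attribute")

    def correction_filter(record):
        corrected = dict(record)  # leave the original input data untouched
        for name, convert in rule.items():
            corrected[name] = convert(record[name])
        return corrected

    return correction_filter


# Hypothetical rules, one per correction proposal
rules = {
    "proposal_1": {"B": lambda v: v + 1},
    "proposal_2": {"C": lambda v: min(v, 5), "D": lambda v: v * 2},
}
filters = {name: make_correction_filter(rule) for name, rule in rules.items()}

# Each input record yields one piece of correction data per proposal
input_data = [{"A": "female", "B": 3, "C": 7, "D": 2}]
correction_data = {
    name: [f(record) for record in input_data] for name, f in filters.items()
}
```

Keeping the filter as a separate callable makes it possible to reuse the same conversion later on the input to the operation model.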

Also, the accuracy determination device 5 retrains and updates the correction model based on the correction data (t3). If there are multiple correction proposals, the correction model corresponding to each correction proposal is updated based on the correction data corresponding to that proposal. The retrained and updated correction model here is an example of the second machine learning model generated by updating the first machine learning model based on a second plurality of data obtained by converting a first plurality of data according to the first rule.

Next, the accuracy determination device 5 inputs the correction data to the correction model and determines the effect of the fairness correction process on the accuracy of the operation model (t4). If there are multiple correction proposals, each set of correction data is input to the corresponding correction model, and the accuracy impact is determined for each correction proposal. The accuracy impact could also be observed manually by periodically preparing data with correct answers, but creating such data is costly. Therefore, in this embodiment, the accuracy determination device 5 performs density-based clustering based on the output results of the correction model for the input data, and automatically labels the input data based on the clustering results.

FIG. 4 is a diagram showing an example of the accuracy determination method according to this embodiment. First, the accuracy determination device 5 inputs a plurality of correction data to the correction model, performs an individual determination for each, and assigns a label to each piece of correction data based on the determination result. The label assigned here is called an “individual label”. The individual label is an example of the prediction result acquired by inputting, to the second machine learning model, the fourth plurality of data obtained by converting the third plurality of data according to the first rule, that is, the correction data.

Judgments by the correction model are made based on the output values of the neurons in the output layer of the correction model. The accuracy determination device 5 also plots points 9 in the DT space based on the output values of the neurons in the output layer and the determination results, as shown in FIG. 4. The DT space is a feature amount space of the input data in which each axis corresponds to the output value of one neuron in the output layer. In the example of FIG. 4, there are three neurons in the output layer, so the DT space is three-dimensional; for convenience of explanation, however, it is drawn in two dimensions. In the example of FIG. 4, the judgment result of the correction model, that is, the individual label, is represented by the type of point 9, for example, white circles (◯) and black circles (●).

Next, the accuracy determination device 5 clusters the points 9 based on the density of each classification of the points 9 in the DT space to create each cluster. Note that the density of the points 9 is, for example, the number of points 9 per unit section of the feature amount. In the example of FIG. 4, a cluster A containing a white circle and a cluster B containing a black circle are created.
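As a rough illustration of density-based clustering of points in the DT space, the sketch below uses a generic DBSCAN-style procedure written in plain Python. It is a stand-in chosen for brevity, not the specific procedure of the publication, and for simplicity it clusters all points together rather than per classification. The function name `density_clusters`, the parameters `eps` and `min_points`, and the sample output-layer values are assumptions.

```python
import math


def density_clusters(points, eps, min_points):
    """Group points into clusters of density-connected neighbors (DBSCAN-style).

    Returns one cluster id per point; -1 marks points in sparse regions.
    """
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_points:
            labels[i] = -1  # too sparse to start a cluster
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # border point reached from a dense point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_points:
                queue.extend(reachable)  # dense point: keep expanding the cluster
        cluster_id += 1
    return labels


# Hypothetical output-layer values for seven inputs (two neurons for brevity)
points = [
    (0.90, 0.10), (0.85, 0.15), (0.80, 0.20),
    (0.10, 0.90), (0.15, 0.85), (0.20, 0.80),
    (0.50, 0.50),
]
cluster_ids = density_clusters(points, eps=0.15, min_points=2)
```

Here the density criterion is the number of points within distance `eps`, which plays the role of "points per unit section" in the description above.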

Next, the accuracy determination device 5 determines a new label for each cluster based on the ratio of individual labels in the cluster and, for each point 9 belonging to the cluster, assigns the new label to the corresponding input data. The new label assigned here is called a “pseudo label”. By using the pseudo label as the correct label, the accuracy determination device 5 can determine the accuracy impact of the fairness correction process on the operation model. The pseudo label is an example of the correct label determined by clustering feature amounts that are determined, for each of the fourth plurality of data, based on the parameters of the first machine learning model, that is, the operation model.
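One simple way to derive a cluster label from the ratio of individual labels is a majority vote, sketched below. The majority rule, the function name `assign_pseudo_labels`, and the sample data are assumptions; the publication only states that the label is based on the ratio of individual labels in the cluster.

```python
from collections import Counter


def assign_pseudo_labels(cluster_ids, individual_labels):
    """Give every point the most frequent individual label of its cluster."""
    votes = {}
    for cluster, label in zip(cluster_ids, individual_labels):
        votes.setdefault(cluster, Counter())[label] += 1
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return [cluster_label[c] for c in cluster_ids]


# Hypothetical clustering result and individual labels from the correction model
cluster_ids = [0, 0, 0, 0, 1, 1, 1]
individual = ["white", "white", "white", "black", "black", "black", "white"]
pseudo = assign_pseudo_labels(cluster_ids, individual)
```

In this sample, the one black point in cluster 0 and the one white point in cluster 1 receive pseudo labels that differ from their individual labels, and such points are what later count as incorrect answers.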

The prediction accuracy of the machine learning model is determined using an existing evaluation index such as Accuracy (correct answer rate). Accuracy can be calculated using the following formula (2).

In formula (2), with the pseudo labels taken as the correct labels, the number of correct answers is, for example, the total number of input data minus the number of incorrect answers, that is, the number of data whose pseudo label differs from the classification result of the correction model.
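Formula (2) is not reproduced in this text. The sketch below assumes the usual definition of accuracy, the number of correct answers divided by the total number of input data, and computes the number of correct answers as described above. The function name and the sample labels are hypothetical.

```python
def pseudo_label_accuracy(predictions, pseudo_labels):
    """Accuracy of the correction model, treating pseudo labels as correct labels."""
    if not predictions or len(predictions) != len(pseudo_labels):
        raise ValueError("predictions and pseudo labels must be non-empty and aligned")
    total = len(predictions)
    # Incorrect answers: pseudo label differs from the classification result
    incorrect = sum(p != y for p, y in zip(predictions, pseudo_labels))
    correct = total - incorrect
    return correct / total


# Hypothetical classification results and pseudo labels for eight inputs
predictions = ["A", "A", "B", "A", "B", "B", "A", "B"]
pseudo_labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
acc = pseudo_label_accuracy(predictions, pseudo_labels)
```

Two of the eight sample predictions disagree with their pseudo labels, so six are counted as correct.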

In this way, the accuracy determination device 5 can determine the impact of the fairness correction process on the accuracy of the operation model using, for example, formula (2). The accuracy determination device 5 can also determine the effect of the fairness correction process by calculating, using for example formula (1), the fairness score of the classification result of the correction model and that of the pseudo labels, and comparing the two.

Returning to the description of FIG. 3, the accuracy determination device 5 outputs the prediction accuracy and the fairness score of the correction model, calculated using formulas (1) and (2), to the display device 6 for display (t5). When the prediction accuracy and the fairness score of the operation model are output together with them, the accuracy determination device 5 or the user can grasp in more detail both the accuracy impact of the fairness correction process and its effect.

FIG. 5 is a diagram showing a display example of prediction accuracy and fairness score according to this embodiment. FIG. 5 is a two-axis graph with the fairness score on the x-axis and the prediction accuracy on the y-axis. The example of FIG. 5 shows the prediction accuracy and fairness score of the operation model together with those of the correction models corresponding to correction proposals 1 to 5. Because the accuracy determination device 5 outputs a plurality of correction proposals in this manner, the accuracy determination device 5 or the user can select the optimum correction proposal. In the example of FIG. 5, correction proposal 3 can be selected because it outperforms the operation model in both prediction accuracy and fairness score in a balanced manner. The accuracy determination device 5 can make such a selection by, for example, setting predetermined thresholds for the prediction accuracy and the fairness score.
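A threshold-based selection of this kind might look like the following sketch. It assumes that a higher fairness score is better and that ties among eligible proposals are broken by the largest sum of the two scores; both are assumptions, as are the function name, the thresholds, and the score values, none of which are read from FIG. 5.

```python
def select_proposal(scores, min_accuracy, min_fairness):
    """Return the name of a proposal that clears both thresholds, or None."""
    eligible = [
        name for name, (acc, fair) in scores.items()
        if acc >= min_accuracy and fair >= min_fairness
    ]
    if not eligible:
        return None
    # Assumed tie-break: prefer the largest combined score
    return max(eligible, key=lambda name: sum(scores[name]))


# Hypothetical (prediction accuracy, fairness score) pairs per correction proposal
scores = {
    "proposal_1": (0.90, 0.60),
    "proposal_2": (0.70, 0.95),
    "proposal_3": (0.88, 0.90),
    "proposal_4": (0.80, 0.85),
    "proposal_5": (0.60, 0.99),
}
selected = select_proposal(scores, min_accuracy=0.85, min_fairness=0.80)
```

If no proposal clears both thresholds, the function returns `None`, which would correspond to keeping the operation model unchanged.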

Here, how multiple correction proposals are output will be described in more detail. FIG. 6 is a diagram showing an example of a prediction accuracy and fairness score output method according to the present embodiment. As shown in FIG. 6, the accuracy determination device 5 passes the input data through correction filters 1 to 5, each created according to one of a plurality of different rules 1 to 5, to generate the respective correction data, and inputs each to the corresponding correction model. Then, the accuracy determination device 5 assigns a pseudo label to each piece of correction data as shown in FIG. 4, calculates the prediction accuracy and fairness score of each correction model, and outputs them to the display device 6 for display.

The accuracy determination device 5 also replaces the operation model with a copy of the correction model updated by retraining (t6). If there are multiple correction proposals, the operation model is replaced with a copy of the correction model corresponding to the one selected correction proposal. When replacing the operation model, the accuracy determination device 5 may also apply a correction filter to the classification device 4 so that input data to the operation model is converted into correction data by the correction filter. When there are multiple correction proposals, the correction filter applied to the classification device 4 is likewise the one corresponding to the selected correction proposal. In this manner, the accuracy determination device 5 can apply an appropriate fairness correction process to the machine learning model while indicating the accuracy impact of the fairness correction process on the machine learning model.

[Process flow]
Next, the flow of accuracy determination processing by the accuracy determination device 5 will be described with reference to FIG. 7. FIG. 7 is a flowchart showing an example of the flow of accuracy determination processing according to this embodiment. The accuracy determination process shown in FIG. 7 is executed, for example, when input data is input to the operation model, using that input data. The correction model used in the accuracy determination process is created by duplicating the operation model only the first time.

First, as shown in FIG. 7, the accuracy determination device 5 executes fairness correction processing on input data (step S101). More specifically, for example, the accuracy determination device 5 passes input data through each of correction filters created according to a plurality of different rules to generate correction data for each correction proposal.

Next, the accuracy determination device 5 retrains and updates the correction model based on the corresponding correction data for each correction proposal (step S102). This generates a corrective model for each corrective action plan.

Next, the accuracy determination device 5 inputs each of the correction data generated in step S101 to the corresponding correction model for each correction proposal, and determines the classification of each piece of correction data based on the output values of the correction model (step S103).

Next, the accuracy determination device 5 performs density-based clustering of the correction data using the determination results of step S103, assigns labels to the correction data, and, using those labels as the correct labels, calculates the accuracy of the correction model from the determination results (step S104). Step S104 is also executed for each correction proposal.

Next, the accuracy determination device 5 calculates the fairness score of the correction model based on the determination result of step S103 for each correction proposal (step S105). Note that the execution order of corrective model accuracy and fairness score calculation in steps S104 and S105 may be reversed, or may be executed in parallel.

Next, the accuracy determination device 5 selects a correction proposal to be applied to the operation model based on the accuracy calculated in step S104 and the fairness score calculated in step S105 (step S106). For example, the correction proposal with the highest prediction accuracy and fairness score relative to the operation model may be selected. Alternatively, the accuracy determination device 5 may present the prediction accuracy and fairness score of each correction proposal to the user together with those of the operation model, and the user may select one correction proposal.

Next, the accuracy determination device 5 duplicates the correction model corresponding to the correction proposal selected in step S106 and replaces the operation model with the copy, thereby updating the operation model (step S107). At this time, the accuracy determination device 5 may also duplicate the correction filter corresponding to the correction proposal selected in step S106 and apply it so that input data to the operation model is converted into correction data. The accuracy determination process shown in FIG. 7 ends at step S107; however, when input data is next input to the operation model, for example, the process is repeated from step S101 using the correction model selected in step S106.
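The order of steps S101 to S107 can be summarized in the following skeleton. It is only an outline under stated assumptions: the individual operations (retraining, prediction, pseudo-labeling, scoring, selection) are passed in as hypothetical callables in a `steps` dictionary, and the function names and data structures are illustrative rather than taken from the publication.

```python
import copy


def run_accuracy_determination(input_data, filters, correction_models, steps):
    """Run steps S101 to S106 once and return (selected proposal, scores)."""
    scores = {}
    for name, correction_filter in filters.items():
        data = [correction_filter(x) for x in input_data]          # S101
        model = steps["retrain"](correction_models[name], data)    # S102
        correction_models[name] = model
        predictions = steps["predict"](model, data)                # S103
        pseudo = steps["pseudo_label"](data, predictions)          # S104
        accuracy = steps["accuracy"](predictions, pseudo)
        fairness = steps["fairness"](data, predictions)            # S105
        scores[name] = (accuracy, fairness)
    return steps["select"](scores), scores                         # S106


def replace_operation_model(correction_models, selected):
    """S107: the operation model becomes a copy of the selected correction model."""
    return copy.deepcopy(correction_models[selected])
```

Because `correction_models` is updated in place, a later call starts from the retrained correction models, which mirrors the repetition from step S101 described above.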

[effect]
As described above, the accuracy determination device 5 updates the first machine learning model, based on the second plurality of data obtained by converting the first plurality of data according to the first rule, to generate the second machine learning model; obtains a prediction result by inputting, to the second machine learning model, the fourth plurality of data obtained by converting the third plurality of data according to the first rule; determines a correct label for each of the fourth plurality of data by clustering the feature amounts determined, for each of the fourth plurality of data, based on the parameters of the first machine learning model; and determines the accuracy of the second machine learning model based on the prediction result and the correct labels.

In this way, the accuracy determination device 5 determines the accuracy of the correction model based on the prediction result of the correction model generated by retraining with the correction data and on the correct labels determined by clustering the feature amounts based on the parameters of the operation model. As a result, the accuracy determination device 5 can indicate the influence of the fairness correction process on the accuracy of the machine learning model.

The accuracy determination device 5 also executes a process of acquiring the second plurality of data by converting, according to the first rule, at least one of the feature amount and the correct label included in the first plurality of data.

As a result, the accuracy determination device 5 can correct the fairness of the machine learning model.

Further, the process of generating the second machine learning model executed by the accuracy determination device 5 includes a process of generating a plurality of second machine learning models by updating the first machine learning model for each type of first rule, based on the second plurality of data obtained by converting the first plurality of data according to a plurality of types of first rules. The accuracy determination device 5 then executes a process of selecting one second machine learning model from the plurality of second machine learning models based on a predetermined condition.

As a result, the accuracy determination device 5 can correct fairness more appropriately while taking into consideration the deterioration of the prediction accuracy of the machine learning model.

Further, the process of selecting one second machine learning model executed by the accuracy determination device 5 includes a process of selecting the second machine learning model based on the fairness score of the second plurality of data for each type of first rule and the accuracy of the second machine learning model.

The accuracy determination device 5 also executes a process of outputting a graph whose axes are the fairness score of the second plurality of data and the accuracy of the second machine learning model.

As a result, the accuracy determination device 5 can present the prediction accuracy and fairness score of the machine learning model to the user in order to correct the fairness more appropriately while considering the deterioration of the prediction accuracy of the machine learning model.

[system]
Information including processing procedures, control procedures, specific names, and various data and parameters shown in the above documents and drawings may be arbitrarily changed unless otherwise specified. Further, the specific examples, distributions, numerical values, etc. described in the embodiments are merely examples, and may be arbitrarily changed.

Further, the specific forms of dispersion and integration of the components of the accuracy determination device 5 are not limited to those shown in the drawings. For example, the classification unit 13 of the accuracy determination device 5 may be distributed among a plurality of processing units, or the generation unit 14 and the labeling unit 15 of the accuracy determination device 5 may be integrated into one processing unit. In other words, all or part of the constituent elements may be functionally or physically distributed and integrated in arbitrary units according to various loads, usage conditions, and the like. Further, each processing function of each device may be implemented in whole or in part by a CPU and a program analyzed and executed by the CPU, or implemented as hardware based on wired logic.

FIG. 8 is a diagram showing a hardware configuration example of the information processing apparatus 10 according to this embodiment. Since the classification device 4 can employ the same hardware configuration as the accuracy determination device 5, both are described in FIG. 8 as the information processing device 10. As shown in FIG. 8, the information processing device 10 has a communication interface 10a, an HDD (Hard Disk Drive) 10b, a memory 10c, and a processor 10d. The units shown in FIG. 8 are interconnected by a bus or the like.

The communication interface 10a is a network interface card or the like, and communicates with other information processing devices. For example, when the information processing device 10 is the accuracy determination device 5, the HDD 10b stores programs and data for operating each function shown in FIG. 2 and the like.

The processor 10d is a CPU (Central Processing Unit), MPU (Micro Processing Unit), GPU (Graphics Processing Unit), or the like. The processor 10d may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). For example, when the information processing device 10 is the accuracy determination device 5, the processor 10d reads a program that executes the same processing as each processing unit shown in FIG. 2 and the like. The processor 10d can thereby operate as a hardware circuit that executes processes realizing each function described with reference to FIG. 2 and the like.

Further, the information processing apparatus 10 can also realize the same functions as the above embodiments by reading the program from the recording medium by the medium reading device and executing the read program. Note that the programs referred to in other embodiments are not limited to being executed by the information processing apparatus 10 . For example, the above-described embodiments may be similarly applied when other information processing apparatuses execute programs or when they cooperate to execute programs.

The program may be distributed via a network such as the Internet. The program may also be recorded on a computer-readable storage medium such as a hard disk, a flexible disk (FD), a CD-ROM, an MO (Magneto-Optical disk), or a DVD (Digital Versatile Disc). The program may then be executed by being read from the recording medium by the information processing device 10 or the like.

REFERENCE SIGNS LIST

1 classification system
2 input sensor
3 data storage device
4 classification device
5 accuracy determination device
6 display device
9 points
10 information processing device
10a communication interface
10b HDD
10c memory
10d processor
11 model storage unit
12 correction unit
13 classification unit
14 generation unit
15 labeling unit
16 determination unit
17 learning unit

Claims

1. An accuracy determination program that causes a computer to execute a process comprising:
updating a first machine learning model to generate a second machine learning model, based on a second plurality of data obtained by converting a first plurality of data according to a first rule;
obtaining a prediction result by inputting, into the second machine learning model, a fourth plurality of data obtained by converting a third plurality of data according to the first rule;
determining a correct label for each of the fourth plurality of data by clustering feature amounts determined, based on parameters of the first machine learning model, for each of the fourth plurality of data; and
determining an accuracy of the second machine learning model based on the prediction result and the correct labels.

2. The accuracy determination program according to claim 1, further causing the computer to execute a process of obtaining the second plurality of data by converting at least one of a feature amount and a correct label included in the first plurality of data according to the first rule.

3. The accuracy determination program according to claim 1, wherein
the process of generating the second machine learning model includes updating the first machine learning model for each type of the first rule, based on the second plurality of data obtained by converting the first plurality of data according to each of a plurality of types of the first rule, to generate a plurality of the second machine learning models, and
the program further causes the computer to execute a process of selecting one second machine learning model from the plurality of second machine learning models based on a predetermined condition.

4. The accuracy determination program according to claim 3, wherein the process of selecting the one second machine learning model includes selecting the one second machine learning model based on a fairness score of the second plurality of data for each type of the first rule and the accuracy of the second machine learning model.

5. The accuracy determination program according to claim 1, further causing the computer to execute a process of outputting a graph whose axes are a fairness score of the second plurality of data and the accuracy of the second machine learning model.

6. An accuracy determination device comprising a control unit that executes a process comprising:
updating a first machine learning model to generate a second machine learning model, based on a second plurality of data obtained by converting a first plurality of data according to a first rule;
obtaining a prediction result by inputting, into the second machine learning model, a fourth plurality of data obtained by converting a third plurality of data according to the first rule;
determining a correct label for each of the fourth plurality of data by clustering feature amounts determined, based on parameters of the first machine learning model, for each of the fourth plurality of data; and
determining an accuracy of the second machine learning model based on the prediction result and the correct labels.

7. An accuracy determination method in which a computer executes a process comprising:
updating a first machine learning model to generate a second machine learning model, based on a second plurality of data obtained by converting a first plurality of data according to a first rule;
obtaining a prediction result by inputting, into the second machine learning model, a fourth plurality of data obtained by converting a third plurality of data according to the first rule;
determining a correct label for each of the fourth plurality of data by clustering feature amounts determined, based on parameters of the first machine learning model, for each of the fourth plurality of data; and
determining an accuracy of the second machine learning model based on the prediction result and the correct labels.
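
For illustration only, the process recited in the independent claims above may be sketched as follows. This is a minimal sketch under stated assumptions, not the embodiment's implementation: the logistic regression model, the first rule (zeroing a sensitive column), the feature amounts (inputs weighted element-wise by the first model's parameters), the two-cluster k-means, and the rule that maps clusters to correct labels are all hypothetical choices, as are the names `first_rule`, `two_means`, `judge_accuracy`, and `make_data`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, w=None, steps=500, lr=0.5):
    # Gradient descent on the logistic loss. Passing w continues from an
    # existing model, which stands in here for "updating" the first model.
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def predict(w, X):
    return (X @ w > 0).astype(int)

def first_rule(X):
    # ASSUMPTION: the fairness correction zeroes the sensitive column 0.
    Xc = X.copy()
    Xc[:, 0] = 0.0
    return Xc

def two_means(F, iters=20):
    # Deterministic 2-means, seeded with the two rows whose feature sums
    # are smallest and largest.
    s = F.sum(axis=1)
    centers = np.stack([F[s.argmin()], F[s.argmax()]])
    for _ in range(iters):
        d = ((F[:, None, :] - centers[None]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        centers = np.stack([F[assign == k].mean(axis=0) for k in range(2)])
    return assign

def judge_accuracy(X1, y1, X3):
    w1 = fit_logistic(X1, y1)                    # first model
    w2 = fit_logistic(first_rule(X1), y1, w=w1)  # second model (updated)
    X4 = first_rule(X3)                          # fourth plurality of data
    pred = predict(w2, X4)                       # prediction result
    # ASSUMPTION: feature amounts are inputs weighted by first-model weights.
    feats = X4 * w1
    assign = two_means(feats)
    # ASSUMPTION: the cluster with the higher mean feature sum is labeled 1.
    score = [feats[assign == k].sum(axis=1).mean() for k in range(2)]
    labels = (assign == int(np.argmax(score))).astype(int)
    return float(np.mean(pred == labels))

def make_data(n=200, seed=0):
    # Synthetic data: column 0 is a sensitive attribute, columns 1-2 are
    # two well-separated blobs that determine the label.
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n)
    sensitive = rng.integers(0, 2, n).astype(float)
    feats = rng.normal(loc=(2 * y - 1)[:, None] * 2.0, scale=0.5, size=(n, 2))
    return np.column_stack([sensitive, feats]), y

X, y = make_data()
acc = judge_accuracy(X[:150], y[:150], X[150:])
print(f"estimated accuracy of the second model: {acc:.3f}")
```

In this sketch no ground-truth labels for the third plurality of data are passed to `judge_accuracy`; the correct labels come only from clustering, which is the point the claims turn on.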