WO2020121679A1

WO2020121679A1 - Mini-batch learning device, and operation program and operation method therefor

Info

Publication number: WO2020121679A1
Application number: PCT/JP2019/042937
Authority: WO
Inventors: 隆史涌井
Original assignee: 富士フイルム株式会社
Priority date: 2018-12-14
Filing date: 2019-10-31
Publication date: 2020-06-18
Also published as: JP7096362B2; CN113168698A; US20210287052A1; EP3896646A1; US11983880B2; JPWO2020121679A1; EP3896646A4

Abstract

Provided are a mini-batch learning device, and an operation program and operation method therefor, which are capable of suppressing reductions in the class determination accuracy of a machine learning model for implementing semantic segmentation. A CPU in this mini-batch learning device functions as a calculation unit, an identification unit and an evaluation unit when the operation program is started. The calculation unit calculates the area ratio of each of multiple classes in the mini-batch data. The identification unit identifies rare classes which have an area ratio of less than a set value as correction target classes. The evaluation unit uses a loss function to evaluate the class determination accuracy of the machine learning model. As correction processing, the evaluation unit increases the weighting of the loss value of rare classes more than for non-rare classes.

Description

Mini-batch learning device and its operating program and operating method

The technique of the present disclosure relates to a mini-batch learning device, an operating program thereof, and an operating method thereof.

Semantic segmentation is known in which a plurality of classes in an image are discriminated on a pixel-by-pixel basis. Semantic segmentation is realized by a machine learning model (hereinafter, simply model) such as a U-shaped convolutional neural network (U-Net; U-Shaped Neural Network).

In order to improve the discrimination accuracy of the model, it is necessary to give learning data to the model for learning and update the model. The learning data is composed of a learning input image and an annotation image in which a class in the learning input image is manually designated. In Patent Document 1, one learning input image that is the basis of the annotation image is extracted from the plurality of learning input images.

JP, 2017-107386, A

There is a method called mini-batch learning for learning. In mini-batch learning, mini-batch data is given to the model as learning data. The mini-batch data is a part (for example, 10,000 divided images divided by a frame having a size of 1/100 of the original image) obtained by dividing the learning input image and the annotation image. 100 sheets). A plurality of sets (for example, 100 sets) of mini-batch data are generated, and each set is sequentially given to the model.

Here, consider a case where the learning input image and the annotation image have class bias. For example, the learning input image is a phase-contrast microscope image showing a state of cell culture, and class 1 is classified into differentiated cells, class 2 is undifferentiated cells, class 3 is medium, and class 4 is classified as dead cells. It is an image. When the area ratio of each class in the entire learning input image and annotation image is 38% differentiated cells, 2% undifferentiated cells, 40% medium, 20% dead cells, and the area ratio of undifferentiated cells is relatively low. Is.

If the learning input image and the annotation image have a class bias in this way, class bias is likely to occur in the mini-batch data composed of the learning input image and the annotation image. When the class deviation occurs in the mini-batch data, the learning is performed without adding the rare class having a relatively small area ratio. As a result, a model with low discrimination accuracy for rare classes is created.

In Patent Document 1, as described above, one learning input image that is the source of the annotation image is extracted from the plurality of learning input images. However, with this method, if all of the plurality of learning input images have a class bias, a model with a low class discrimination accuracy is eventually created. Therefore, the method described in Patent Document 1 cannot solve the problem that a model with low rare class discrimination accuracy is created.

An object of the technology of the present disclosure is to provide a mini-batch learning device capable of suppressing a decrease in the accuracy of class discrimination of a machine learning model for performing semantic segmentation, an operating program and an operating method thereof.

In order to achieve the above object, a mini-batch learning device of the present disclosure is a mini-batch that performs learning by giving mini-batch data to a machine learning model for performing semantic segmentation for discriminating a plurality of classes in an image on a pixel-by-pixel basis. A learning device, in the mini-batch data, a calculating unit that calculates the area ratio of each of the plurality of classes, a specifying unit that specifies the correction target class based on the area ratio, and a loss function, using a plurality of classes Is an evaluation unit that evaluates the classification accuracy of the class of the machine learning model by calculating the loss value, and the first loss value of the correction target class and the second loss value of the class other than the correction target class. And an evaluation unit including a correction processing unit that executes a correction process for correcting the value of the first loss based on the result of the comparison.

The specifying unit specifies a rare class whose area ratio is lower than a preset setting value as the correction target class, and the correction processing unit sets the weight of the first loss value to the second loss value as the correction process. It is preferable to execute processing for making the value of the loss larger than the weight.

The specifying unit specifies a non-rare class whose area ratio is higher than a preset value as the correction target class, and the correction processing unit sets the weight of the first loss value to the second loss value as the correction process. It is preferable to execute processing for making the value of the loss smaller than the weight.

The specifying unit specifies a rare class whose area ratio is lower than the set value as the correction target class, and the correction processing unit determines the correct value and the predicted value when the first loss value is calculated as the correction process. It is preferable to execute enlargement processing for making the value larger than the correct value and the predicted value when the second loss value is calculated. In this case, the correction processing unit sets the enlargement ratio in the enlargement processing such that the area ratio of the rare class in the mini-batch data becomes the same as the area ratio of the rare class in the learning input image and the annotation image which are the sources of the mini-batch data. A value is preferable.

The specifying unit specifies a non-rare class whose area ratio is higher than the set value as the correction target class, and the correction processing unit determines the correct value and the predicted value when calculating the value of the first loss as the correction process. It is preferable to execute a reduction process for making the value smaller than the correct value and the predicted value when the second loss value is calculated. In this case, the correction processing unit determines the reduction ratio in the reduction process such that the area ratio of the non-rare class in the mini-batch data is the same as the area ratio of the non-rare class in the learning input image and the annotation image that are the source of the mini-batch data. It is preferable to set the value to

It is preferable that the correction processing unit includes a reception unit that receives an instruction to select whether or not to execute the correction process.

An operation program of a mini-batch learning device according to the present disclosure is an operation program of a mini-batch learning device for giving mini-batch data to a machine learning model for performing semantic segmentation for discriminating a plurality of classes in an image on a pixel-by-pixel basis. In the mini-batch data, the calculation unit that calculates the area ratio of each of the plurality of classes, the specifying unit that specifies the correction target class based on the area ratio, and the loss function for each of the plurality of classes using the loss function. Is an evaluation unit for evaluating the discrimination accuracy of the class of the machine learning model by calculating the value of, and comparing the first loss value of the correction target class and the second loss value of the class other than the correction target class. The computer is caused to function as an evaluation unit including a correction processing unit that executes a correction process that corrects the first loss value based on the result.

The operation method of the mini-batch learning apparatus according to the present disclosure is an operation method of a mini-batch learning apparatus that applies mini-batch data to a machine learning model for performing semantic segmentation for discriminating a plurality of classes in an image on a pixel-by-pixel basis. In the mini-batch data, the calculation step for calculating the area ratio of each of the plurality of classes, the specific step for specifying the correction target class based on the area ratio, and the loss function for each of the plurality of classes are performed. Is a step of evaluating the discrimination accuracy of the class of the machine learning model by calculating the value of, and comparing the first loss value of the correction target class and the second loss value of the class other than the correction target class. An evaluation step including a correction processing step of executing a correction processing for correcting the first loss value based on the result.

According to the technique of the present disclosure, it is possible to provide a mini-batch learning device capable of suppressing a decrease in the accuracy of class discrimination of a machine learning model for performing semantic segmentation, an operating program and an operating method thereof.

It is a figure which shows the outline of a mini-batch learning apparatus and its process. It is a figure showing an outline of an operating device and its processing. It is a figure which shows an image, FIG. 3A shows an input image for learning, and FIG. 3B shows an annotation image, respectively. It is a figure which shows a mode that the input image for division learning is produced|generated from the input image for learning. It is a figure which shows a mode that a division|segmentation annotation image is produced|generated from an annotation image. FIG. 6 is a diagram showing that a part of a plurality of divided learning input images constitutes a divided learning input image group. It is a figure which shows that a part of a some division|segmentation annotation image comprises a division|segmentation annotation image group. It is a block diagram which shows the computer which comprises a mini batch learning apparatus. It is a block diagram which shows the processing part of CPU of a mini-batch learning apparatus. It is a figure which shows the specific example of a process of a calculation part and a specific part. It is a figure which shows the specific example of a process of an evaluation part. It is a figure which shows the table of the loss value of each class and the calculated value of a loss function, and when FIG. 12A makes the weight of the loss value of each class the same, FIG. The case of increasing the weight of is shown respectively. It is a flow chart which shows a processing procedure of a mini-batch learning device. It is a figure which shows the modification of the process of an evaluation part. It is a figure which shows the specific example of a process of the calculation part and specific part in 2nd Embodiment. It is a figure which shows the specific example of a process of the evaluation part in 2nd Embodiment. It is a figure which shows the table of the loss value of each class in 2nd Embodiment, and the calculated value of a loss function, FIG. 17A shows the case where the weight of the loss value of each class is made the same, FIG. Each case is shown in which the weight of the class loss value is reduced. It is a figure which shows the specific example of a process of the evaluation part in 3rd Embodiment. It is a figure which shows notionally the process of the evaluation part in 3rd Embodiment. It is a figure which shows notionally the process of the evaluation part in 3rd Embodiment. It is a figure which shows the determination method of the expansion rate of an expansion process. It is a figure which shows the specific example of a process of the evaluation part in 4th Embodiment. It is a figure which shows notionally the process of the evaluation part in 4th Embodiment. It is a figure which shows notionally the process of the evaluation part in 4th Embodiment. It is a figure which shows the determination method of the reduction rate of reduction processing. It is a figure which shows 5th Embodiment which asks whether a correction process part is made to perform a correction process.

[First Embodiment]

In FIG. 1, the mini-batch learning device 2 uses the mini-batch data 11 for the model 10 in order to improve the discrimination accuracy of the model 10 for performing the semantic segmentation that discriminates a plurality of classes in the input image on a pixel-by-pixel basis. Have the mini-batch learning done. The mini-batch learning device 2 is, for example, a desktop personal computer. The model 10 is, for example, U-Net.

The class may be rephrased as the type of object shown in the input image. In short, the semantic segmentation discriminates the class of the object and the contour of the object shown in the input image, and the discrimination result is output from the model 10 as the output image. For example, when three objects of a cup, a book, and a mobile phone are reflected in the input image, the output image is ideally determined to be the class of the cup, the book, and the mobile phone, and the contours of these objects are faithfully reproduced. The contour line traced is drawn on each object.

The classification accuracy of the class of the model 10 can be improved by giving learning data to the model 10 for learning and updating the model 10. The learning data is composed of a set of a learning input image input to the model 10 and an annotation image in which a class in the learning input image is manually specified. The annotation image is an image for performing so-called answer matching with the learning output image output from the model 10 according to the learning input image, and is compared with the learning output image. The higher the classification accuracy of the model 10 is, the smaller the difference between the annotation image and the learning output image is.

As described above, the mini-batch learning device 2 uses the mini-batch data 11 as learning data. The mini-batch data 11 is composed of a divided learning input image group 12 and a divided annotation image group 13.

In the mini-batch learning, the divided learning input image group 12 is given to the model 10. As a result, the learning output image is output from the model 10 for each divided learning input image 20S (see FIG. 4) of the divided learning input image group 12. In this way, the learning output image group 14, which is a set of learning output images output from the model 10, and the divided annotation image group 13 are compared, and the class determination accuracy of the model 10 is evaluated. Then, the model 10 is updated according to the evaluation result of the discrimination accuracy of this class. The mini-batch learning device 2 inputs the divided learning input image group 12 to the model 10, outputs the learning output image group 14 from the model 10, evaluates the class discrimination accuracy of the model 10, and updates the model 10. Is performed while replacing the mini-batch data 11, and is repeated until the class discrimination accuracy of the model 10 reaches a desired level.

As shown in FIG. 2, the model 10 whose class discrimination accuracy is increased to a desired level as described above is incorporated into the operation device 15 as a learned machine learning model (hereinafter, learned model) 10T. The learned model 10T is provided with the input image 16 in which the class of the reflected object and its contour have not yet been determined. The learned model 10T discriminates the class of the object shown in the input image 16 and its contour, and outputs the output image 17 as the discrimination result. The operation device 15 is, for example, a desktop personal computer like the mini-batch learning device 2, and displays the input image 16 and the output image 17 side by side on the display. The operation device 15 may be a device different from the mini-batch learning device 2 or the same device as the mini-batch learning device 2. Further, even after the learned model 10T is incorporated in the operation device 15, the learned model 10T may be given the mini-batch data 11 for learning.

As shown in FIG. 3A, the learning input image 20 is, in this example, one image of a phase contrast microscope showing a state of cell culture. In the learning input image 20, differentiated cells, undifferentiated cells, medium, and dead cells are reflected as objects. In the annotation image 21 in this case, as shown in FIG. 3B, class 1 differentiated cells, class 2 undifferentiated cells, class 3 medium, and class 4 dead cells are manually designated. The input image 16 provided to the learned model 10T is also a phase-contrast microscope image showing the state of cell culture, like the learning input image 20.

As illustrated in FIG. 4, the divided learning input image 20S includes a region surrounded by a rectangular frame 25 that is sequentially moved in the horizontal direction by DX and in the vertical direction by DY in the learning input image 20S. , Each time it was cut out. The lateral movement amount DX of the frame 25 is, for example, ½ of the lateral size of the frame 25. Similarly, the vertical movement amount DY of the frame 25 is, for example, ½ of the vertical size of the frame 25. The frame 25 is, for example, 1/50 the size of the learning input image 20. In this case, there are 10,000 input images for divided learning 20S, that is, 20S_1 to 20S_10000.

Similarly, as shown in FIG. 5, the divided annotation image 21S is a region surrounded by a rectangular frame 25 that is sequentially moved in the annotation image 21 by DX in the horizontal direction and DY in the vertical direction. , Each time it was cut out. There are 10,000 divided annotation images 21S in total, 21S_1 to 21S_10000. In the following description, the learning input image 20 and the annotation image 21 are already prepared in the mini-batch learning device 2, and the split learning input image 20S and the split annotation image 21S are already generated.

As shown in FIG. 6, the divided learning input image group 12 is a part (for example, 10,000 divided learning input images) of the plurality of divided learning input images 20S generated as shown in FIG. It is composed of 100 sheets of 20S). Similarly, as shown in FIG. 7, the divided annotation image group 13 is part of the plurality of divided annotation images 21S generated as shown in FIG. 5 (for example, of the 10,000 divided annotation images 21S). 100 sheets). The divided learning input image 20S forming the divided learning input image group 12 and the divided annotation image 21S forming the divided annotation image group 13 have the same region cut out by the frame 25.

In FIG. 8, the computer configuring the mini-batch learning device 2 includes a storage device 30, a memory 31, a CPU (Central Processing Unit) 32, a communication unit 33, a display 34, and an input device 35. These are interconnected via a data bus 36.

The storage device 30 is a hard disk drive that is built in the computer that constitutes the mini-batch learning apparatus 2 or that is connected via a cable or a network. Alternatively, the storage device 30 is a disk array in which a plurality of hard disk drives are connected in series. The storage device 30 stores a control program such as an operating system, various application programs, and various data associated with these programs.

The memory 31 is a work memory for the CPU 32 to execute processing. The CPU 32 loads the program stored in the storage device 30 into the memory 31 and executes the process according to the program, thereby centrally controlling each unit of the computer.

The communication unit 33 is a network interface that controls transmission of various information via a network such as the Internet or a WAN (Wide Area Network) such as a public communication network. The display 34 displays various screens. Various screens are provided with an operation function by GUI (Graphical User Interface). The computer configuring the mini-batch learning device 2 receives input of an operation instruction from the input device 35 through various screens. The input device 35 is a keyboard, a mouse, a touch panel, or the like.

In FIG. 9, the storage device 30 stores a learning input image 20, an annotation image 21, a split learning input image 20S, a split annotation image 21S, and a model 10. The operation program 40 is stored in the storage device 30 as an application program. The operation program 40 is an application program for causing a computer to function as the mini-batch learning device 2.

When the operation program 40 is activated, the CPU 32 of the computer constituting the mini-batch learning device 2 cooperates with the memory 31 and the like to generate the generation unit 50, the calculation unit 51, the identification unit 52, the learning unit 53, the evaluation unit 54, And functions as the updating unit 55. A correction processing unit 56 is provided in the evaluation unit 54.

The generation unit 50 uses the divided learning input image 20S and the divided annotation image 21S generated from the learning input image 20 and the annotation image 21 as shown in FIGS. The mini-batch data 11 is generated by selecting a part of them. The generation unit 50 generates a plurality of sets (for example, 100 sets) of mini-batch data 11. The generation unit 50 outputs the generated mini-batch data 11 to the calculation unit 51, the learning unit 53, and the evaluation unit 54.

Note that the generation unit 50 may execute a method of increasing the choices of the split learning input image 20S and the split annotation image 21S to be the mini-batch data 11. Specifically, the input image for split learning 20S and the split annotation image 21S are subjected to image processing such as trimming, left-right reversal, and rotation to be made into another image, which is a new option of the mini-batch data 11. Such a technique is called data augmentation.

The calculation unit 51 calculates the area ratio of each of the plurality of classes in the mini-batch data 11. More specifically, the calculation unit 51 adds, for each class, the number of pixels in a region manually specified in the divided annotation image 21S that forms the divided annotation image group 13 of the mini-batch data 11 from the generation unit 50. Next, the area ratio is calculated by dividing the added pixel number by the total pixel number of the divided annotation image 21S. For example, when the number of added pixels is 10,000 and the total number of pixels is 50000 in a region designated as a class 1 differentiated cell, the area ratio of the class 1 differentiated cell is (10000/50000)×100= 20%. The calculating unit 51 outputs the calculated area ratio to the specifying unit 52.

The identifying unit 52 identifies the correction target class based on the area ratio. In the present embodiment, the identifying unit 52 identifies, as the correction target class, a rare class whose area ratio is lower than a preset setting value. The identification unit 52 outputs the identified rare class to the evaluation unit 54.

The learning unit 53 gives the input image group 12 for divided learning of the mini-batch data 11 from the generation unit 50 to the model 10 for learning. The learning unit 53 outputs the learning output image group 14 output from the model 10 to the evaluation unit 54.

The evaluation unit 54 compares the divided annotation image group 13 of the mini-batch data 11 from the generation unit 50 and the learning output image group 14 from the learning unit 53, and evaluates the classification accuracy of the class of the model 10. The evaluation unit 54 outputs the evaluation result to the update unit 55.

The evaluation unit 54 evaluates the class determination accuracy of the model 10 using the loss function L (TN, PN) shown below. The loss function L(TN, PN) is a function indicating the degree of difference between the divided annotation image group 13 and the learning output image group 14. The TN of the loss function L(TN, PN) represents the discrimination state of the class in the divided annotation image group 13, and corresponds to the correct value. PN represents a class discrimination state in the learning output image group 14, and corresponds to a predicted value. The closer the calculated value of the loss function L(TN, PN) is to 0, the higher the classification accuracy of the class of the model 10.

N is the number of classes, N=4 in this example. WK is a weighting factor. F(TK, PK) is, for example, a categorical cross entropy function. F(TK, PK) corresponds to the loss value of each class. That is, the loss function L(TN, PN) is the sum of products of the loss value F(TK, PK) of each class and the weighting coefficient WK. The evaluation unit 54 outputs the calculated value of the loss function L(TN, PN) to the update unit 55 as the evaluation result.

The correction processing unit 56 determines whether the first loss value, which is the loss value of the correction target class, and the second loss value, which is the loss value of the class other than the correction target class, are compared with each other. A correction process for correcting the loss value of is executed. The correction process includes a process of aligning the numbers of digits of the first loss value and the second loss value. For example, if the number of digits of the first loss value is 1 and the number of digits of the second loss value is 2, the number of digits of the first loss value is set to 2 Processing. The correction process also includes a process of setting the first loss value and the second loss value to the same value. The process of setting the same value includes not only the process of setting the first loss value and the second loss value to be completely the same value, but also the first loss value with respect to the second loss value. Within a prescribed error range, for example, within a range of ±50% (when the value of the second loss is 50, the value of the first loss is set to 25 to 75).

More specifically, the correction processing unit 56 executes, as the correction processing, processing for making the weight of the first loss value larger than the weight of the second loss value. Here, the “weight” is the weight coefficient WK. Further, in the present embodiment, as described above, the correction target class is a rare class whose area ratio is lower than the set value. Therefore, in the present embodiment, the first loss value is the rare class loss value F(TK, PK), and the second loss value is the loss value F(TK, PK) of a class other than the rare class. ). In other words, the correction processing unit 56 uses the weighting factor WK to the loss value F (TK, PK) of the rare class as the correction process and the loss value F of the class other than the rare class as the correction process. A process of increasing the weighting coefficient WK to (TK, PK) is executed. The correction processing unit 56 sets the weighting factor WK to the loss value F(TK, PK) of the rare class to 10, and sets the weighting factor WK to the loss value F(TK, PK) of the class other than the rare class, for example. 1 (see FIGS. 11 and 12).

The update unit 55 updates the model 10 according to the evaluation result from the evaluation unit 54. More specifically, the updating unit 55 changes the values of various parameters of the model 10 by a stochastic gradient descent method with a learning coefficient. The learning coefficient indicates the range of change in the values of various parameters of the model 10. That is, as the learning coefficient has a relatively large value, the range of change in the values of various parameters increases, and the degree of updating the model 10 also increases.

FIG. 10 and FIG. 11 show specific examples of the processing of each unit of the calculation unit 51, the identification unit 52, and the evaluation unit 54 (correction processing unit 56). First, in FIG. 10, the calculation unit 51 calculates the area ratio of each class for each

set

1, 2, 3,... Of the mini-batch data 11 as shown in Table 60. In FIG. 10, the area ratio of the differentiated cells of class 1 of the first set of mini-batch data 11 is 38%, the area ratio of the undifferentiated cells of class 2 is 2%, the area ratio of the medium of class 3 is 40%, and the class ratio is 40%. The case where the area ratio of the dead cells of No. 4 is calculated to be 20% or the like is illustrated.

The identifying unit 52 identifies a rare class whose area ratio is lower than the set value. In FIG. 10, since the set value is 5%, the area ratio is 2%, which is lower than the set value, and the undifferentiated cells of class 2 in the first set of mini-batch data 11 are specified as a rare class. ing. Although only one rare class is specified here as an example, if there are a plurality of classes whose area ratio is lower than the set value, the plurality of classes are naturally specified as the rare classes.

Then, in FIG. 11, the correction processing unit 56 of the evaluation unit 54, as shown in Table 61, the mini batch data of the

classes

1, 3, 4, the second set, and the third set of the mini batch data 11 of the first set. The weighting factor WK to the loss value F(TK, PK) of a class other than the rare class, such as 11 classes, is set to 1. On the other hand, the correction processing unit 56 sets the weighting factor WK to the loss value F(TK, PK) of a rare class such as class 2 of the first set of mini-batch data 11 to 10.

FIG. 12 shows a table of loss values F (TK, PK) and calculated values of loss function L (TN, PN) of each class. The table 65A in FIG. 12A shows a case where the weighting factor WK for the loss value F(TK, PK) of each class is set to 1 which is the same. On the other hand, the table 65B of FIG. 12B shows a case where the weighting factor WK for the loss value F(TK, PK) of the rare class is increased. The rare class is an undifferentiated cell of class 2, the loss value F(T2, P2) is 2, and the loss values F(T1, P1), F(T3, P3) of the

other classes

1, 3, 4 are ) And F(T4, P4) are 25, 20, and 15, respectively.

Thus, the loss value F(TK, PK) of the rare class is smaller than the loss value F(TK, PK) of the class other than the rare class. Such a difference occurs because the rare class has a limited learning opportunity for the model 10 as compared to other classes, and the learning (called an epoch) in which a set of mini-batch data 11 is given determines the model 10. This is because the degree of accuracy improvement or deterioration is small.

When there is a large difference in the loss value F (TK, PK) between the rare class and the other classes and the weighting factors WK are made the same as shown in FIG. 12A, the loss value F (TK, PK of the rare class is ) Has a relatively small effect on the calculated value (=62) of the loss function L(TN, PN). On the other hand, when the weighting factor WK for the rare class loss value F(TK, PK) is increased as shown in FIG. 12B, the rare class loss value F(TK, PK) is larger than that in FIG. 12A. ) Has a large effect on the calculated value (=80) of the loss function L(TN, PN). In this way, by increasing the weighting factor WK for the rare class loss value F(TK, PK), the evaluation unit 54 sets the rare class loss value F(TK, PK) to the loss of classes other than the rare class. The value F (TK, PK) is compared to the value F (TK, PK), the loss function L (TN, PN) is calculated, and the discrimination accuracy of the model 10 is evaluated.

Next, the operation of the above configuration will be described with reference to the flowchart shown in FIG. First, the operation program 40 is started, and as shown in FIG. 9, the CPU 32 of the computer constituting the mini-batch learning device 2 functions as the processing units 50 to 56.

The mini-batch data 11 is generated in the generation unit 50 (step ST100). The mini-batch data 11 is output from the generation unit 50 to the calculation unit 51, the learning unit 53, and the evaluation unit 54.

As shown in the table 60 of FIG. 10, the calculation unit 51 calculates the area ratio of each class for each set of the mini-batch data 11 (step ST110, calculation step). Subsequently, as also shown in FIG. 10, in the identifying unit 52, a rare class whose area ratio is lower than the set value is identified as a correction target class (step ST120, identifying step).

In the learning unit 53, the input image group 12 for divided learning of the mini-batch data 11 from the generation unit 50 is given to the model 10 and learning is performed (step ST130).

When the mini-batch data 11 given to the model 10 in step ST130 includes a rare class (YES in step ST140), as shown in table 61 of FIG. 11 and table 65B of FIG. The weighting factor WK for the loss value F(TK, PK) of the rare class is made larger than the weighting factor WK for the loss value F(TK, PK) of the class other than the rare class (step ST150, correction process). Step). On the other hand, when there is no rare class in the mini-batch data 11 given to the model 10 in step ST130 (NO in step ST140), the weighting factor WK to the loss value F(TK, PK) of the rare class is large. Instead, the normal weighting coefficient WK is set.

The evaluation unit 54 compares the learning output image group 14 output from the model 10 with the divided annotation image group 13 of the mini-batch data 11 from the generation unit 50, and evaluates the class determination accuracy of the model 10. (Step ST160, evaluation step). More specifically, the loss value F(TK, PK) is calculated for each of the plurality of classes. Then, the weighting factor WK set in step ST150 or the normal weighting factor WK is added to the loss value F(TK, PK), and the sum is calculated as the calculated value of the loss function L(TN, PN). It

When it is determined that the class determination accuracy of the model 10 has reached a desired level based on the calculated value of the loss function L(TN, PN) by the evaluation unit 54 (YES in ST170), the mini-batch learning is terminated. On the other hand, when it is determined that the class determination accuracy of the model 10 has not reached the desired level (NO in step ST170), the updating unit 55 updates the model 10 (step ST180). Then, the process is returned to step ST130, another set of mini-batch data 11 is given to the model 10, and the subsequent steps are repeated.

The case where the rare class is specified by the specifying unit 52 is that the class is biased in the mini-batch data 11. In the mini-batch data 11 having such class bias, learning is performed without adding rare classes. More specifically, the learning frequency of the rare class is relatively low compared to other classes. When the discrimination accuracy of the model 10 is evaluated by the evaluation unit 54 without any restrictions after such biased learning is performed, an evaluation result in which the rare class is not added so much is output as shown in FIG. 12A. The Rukoto. Then, the subsequent update of the model 10 will not include the rare class. As a result, the model 10 having a low discrimination accuracy of the rare class is completed.

However, in the present embodiment, as described above, the correction processing unit 56 uses the comparison result of the loss value F(TK, PK) of the rare class and the loss value F(TK, PK) of the class other than the rare class. The correction processing based on it is being executed. More specifically, in the correction processing unit 56, the weighting factor WK for the loss value F(TK, PK) of the rare class is calculated from the weighting factor WK for the loss value F(TK, PK) of the class other than the rare class. Is also getting bigger. By doing so, the evaluation result in which the rare class is sufficiently added can be output, and the subsequent update of the model 10 also tends to improve the accuracy of identifying the rare class. Therefore, it is possible to avoid a situation in which the model 10 having low discrimination accuracy of the rare class is created, and it is possible to suppress the deterioration of the discrimination accuracy of the class of the model 10.

It should be noted that the smaller the area ratio, the larger the degree of increasing the weighting factor WK for the rare class loss value F(TK, PK) may be increased. For example, as shown in Table 70 in FIG. 14, when the area ratio is 0% or more and less than 2.5% as in the 20th set of mini-batch data 11, the weight coefficient W2 of the rare class 2 is set to 100. And On the other hand, when the area ratio is 2.5% or more and less than 5% as in the 21st set of mini-batch data 11, the weight coefficient W4 of the rare class 4 is set to 10. It is considered that the smaller the area ratio, the smaller the loss value F(TK, PK). Therefore, by changing the weighting coefficient WK in accordance with the area ratio in this way, it is possible to output the evaluation result in which the rarer class is added, and as a result, it is possible to further suppress the deterioration of the classification accuracy of the model 10 class. Is possible.

[Second Embodiment]

In the second embodiment shown in FIGS. 15 to 17, contrary to the first embodiment, a non-rare class whose area ratio is higher than a preset value is specified as the correction target class, and the correction processing is performed. , The weight of the first loss value is made smaller than the weight of the second loss value.

In FIG. 15, the identifying unit 80 of the present embodiment identifies a non-rare class whose area ratio is higher than the set value, as a correction target class. In FIG. 15, since the set value is 50%, as shown in Table 75 and the like, the undifferentiated cells of class 2 of the mini-batch data 11 of the 30th set, which have an area ratio of 56% and are higher than the set value, are non-differentiated. The case where it is specified as a rare class is illustrated. Note that, like the rare class of the first embodiment, a plurality of classes may be specified as non-rare classes.

In FIG. 16, the correction processing unit 82 of the evaluation unit 81 of the present embodiment, as the correction processing, sets the weighting factor WK to the loss value F(TK, PK) of the non-rare class to the loss of the classes other than the non-rare class. The value F(TK, PK) is set to be smaller than the weighting coefficient WK. Specifically, as shown in Table 83, classes other than the non-rare class, such as all classes of the 30th set of mini-batch data 11 of

classes

1, 3, 4, 31st set, and 32nd set of mini-batch data 11. The weighting factor WK to the loss value F(TK, PK) of is set to 1. On the other hand, the correction processing unit 82 sets the weighting coefficient WK to the loss value F(TK, PK) of the non-rare class such as class 2 of the 30th set of mini-batch data 11 to 0.5.

17 shows a table of calculated values of the loss value F (TK, PK) and the loss function L (TN, PN) of each class, as in FIG. The table 85A of FIG. 17A shows a case where the weighting factor WK for the loss value F (TK, PK) of each class is set to the same value of 1. On the other hand, the table 85B of FIG. 17B shows a case where the weighting factor WK for the loss value F(TK, PK) of the non-rare class is reduced. Then, the non-rare class is an undifferentiated cell of class 2, the loss value F(T2, P2) is 42, and the loss values F(T1, P1), F(T3, In this example, P3) and F(T4, P4) are 19, 22, and 18, respectively.

Contrary to the first embodiment, the loss value F(TK, PK) of the non-rare class is larger than the loss value F(TK, PK) of the class other than the non-rare class. Therefore, the evaluation unit 81 reduces the weighting coefficient WK for the loss value F(TK, PK) of the non-rare class. As a result, as shown in FIG. 17B, the loss value F(TK, PK) of the non-rare class is reduced to a value comparable to the loss value F(TK, PK) of the class other than the non-rare class, and the weight is reduced. Compared to the case of FIG. 17A in which the coefficient WK has the same value, the influence of the loss value F(TK, PK) of the non-rare class on the calculated value of the loss function L(TN, PN) is reduced.

As described above, in the second embodiment, the non-rare class whose area ratio is higher than the set value is specified as the correction target class, and the weight of the first loss value is set to the weight of the second loss as the correction process. The process of making the value smaller than the weight is executed. Therefore, as in the first embodiment, it is possible to suppress a decrease in the accuracy of class determination of the model 10.

In this case, as in the first embodiment, the larger the area ratio, the smaller the weighting factor WK for the loss value F(TK, PK) of the non-rare class loss may be increased.

[Third Embodiment]

In the third embodiment shown in FIGS. 18 to 21, when a rare class whose area ratio is lower than a preset value is specified as the correction target class, and the first loss value is calculated as the correction process The enlarging process for increasing the correct value and the predicted value of is larger than the correct value and the predicted value when the second loss value is calculated.

FIG. 18 illustrates a case where undifferentiated cells of class 2 in the first set of mini-batch data 11 are specified as a rare class, as shown in FIG. 10. In this case, the correction processing unit 91 of the evaluation unit 90 of the present embodiment, as shown in Table 92,

class

1, 3, 4, second set, and third set of mini-batches of the first set of mini-batch data 11. Correct values and predicted values of classes other than rare classes, such as all classes of data 11, are left unchanged. On the other hand, the correction processing unit 91 executes the enlargement processing for multiplying the correct value and the predicted value of the rare class such as the class 2 of the first batch of mini-batch data 11 by 10.

19 and 20 are diagrams conceptually showing an enlargement process for multiplying the correct value and predicted value of class 2 of the first set of mini-batch data 11 of FIG. 18 by 10. As shown in FIG. 19, the size of the correct answer value T2 is set to 10 times that before the enlargement process by the enlargement process. Similarly, as shown in FIG. 20, the size of the predicted value P2 is set to 10 times that before the enlargement process by the enlargement process. As described above, the enlargement process is a process of increasing the number of target pixels of the correct value of the rare class and the number of target pixels of the predicted value of the rare class.

As shown in Table 95 in FIG. 21, the correction processing unit 91 determines the enlargement ratio in the enlargement process, the area ratio of the rare class in the mini-batch data 11 is the source of the mini-batch data 11, the learning input image 20 and the annotation. The value is the same as the area ratio of the rare class in the image 21. In FIG. 21, the undifferentiated cells of class 2 of the first set of mini-batch data 11 are identified as rare classes, the area ratio of the rare classes in the mini-batch data 11 is 2%, and the learning input image 20 and the annotation image 21. The case where the area ratio of the rare class is 20% is illustrated. In this case, the correction processing unit 91 sets the enlargement ratio in the enlargement processing to 20/2=10 times. In addition, the same value means that the area ratio of the rare class in the mini-batch data 11 and the area ratio of the rare class in the learning input image 20 and the annotation image 21 are completely the same. The area ratio of the rare class and the area ratio of the rare class in the learning input image 20 and the annotation image 21 also include values within a specified error range, for example, ±10%.

As described above, in the third embodiment, the correct value when the rare class whose area ratio is lower than the preset value is specified as the correction target class and the first loss value is calculated as the correction process. And the enlargement processing for increasing the predicted value to be larger than the correct value and the predicted value when the second loss value is calculated. Even by such a correction process, the imbalance of the loss values F(TK, PK) between the rare class and the other class can be corrected. Therefore, it is possible to suppress a decrease in the classification accuracy of the model 10 class. Further, such a correction process is effective when the loss value F(TK, PK) is not a linear function.

Further, in the third embodiment, the enlargement ratio in the enlargement processing is set to a value such that the area ratio of the rare class in the mini-batch data 11 becomes the same as the area ratio of the rare class in the learning input image 20 and the annotation image 21. Therefore, the enlargement ratio can be set to a reasonable value. It should be noted that such a method of determining the enlargement ratio is preferably adopted when there is no bias in the area ratio of each class in the learning input image 20 and the annotation image 21.

The case where there is no bias in the area ratio of each class in the learning input image 20 and the annotation image 21 is, for example, when the difference between the maximum value and the minimum value of the area ratio of each class is within 10%.

[Fourth Embodiment]

In the fourth embodiment shown in FIGS. 22 to 25, contrary to the third embodiment, a non-rare class whose area ratio is higher than a preset setting value is specified as the correction target class, and the correction processing is performed. A reduction process is executed to reduce the correct value and predicted value when calculating the first loss value to be smaller than the correct value and predicted value when calculating the second loss value.

FIG. 22 exemplifies a case where undifferentiated cells of class 2 in the mini-batch data 11 of the 30th set are specified as a non-rare class, as shown in FIG. In this case, the correction processing unit 101 of the evaluation unit 100 according to the present embodiment, as shown in Table 102,

class

1, 3, 4, 31st group, and 32nd group minibatch of the 30th group of minibatch data 11. Correct values and predicted values of classes other than the non-rare class, such as all classes of data 11, are left unchanged. On the other hand, the correction processing unit 101 executes a reduction process for multiplying the correct value and the predicted value of the non-rare class such as class 2 of the 30th set of mini-batch data 11 by 0.5.

23 and 24 are diagrams conceptually showing the reduction processing for multiplying the correct answer value and predicted value of class 2 of the mini-batch data 11 of the 30th set in FIG. 22 by 0.5. As shown in FIG. 23, the size of the correct answer value T2 is set to 0.5 times that before the reduction processing by the reduction processing. Similarly, as shown in FIG. 24, the size of the predicted value P2 is set to 0.5 times that before the reduction process by the reduction process. As described above, the reduction process is a process of reducing the number of target pixels of the correct value of the non-rare class and the number of target pixels of the predicted value of the non-rare class, contrary to the enlargement process of the third embodiment.

As shown in the table 105 of FIG. 25, the correction processing unit 101 determines the reduction ratio in the reduction process, the learning input image 20 from which the area ratio of the non-rare class in the mini-batch data 11 is the source of the mini-batch data 11, The value is the same as the area ratio of the non-rare class in the annotation image 21. In FIG. 25, the undifferentiated cells of class 2 of the 30th set of mini-batch data 11 are identified as the non-rare class, the area ratio of the non-rare class in the mini-batch data 11 is 56%, and the learning input image 20 and the annotation The case where the area ratio of the non-rare class in the image 21 is 28% is illustrated. In this case, the correction processing unit 101 sets the reduction ratio in the reduction process to 28/56=0.5 times. Also in this case, as in the third embodiment, the same value means that the area ratio of the rare class in the mini-batch data 11 and the area ratio of the rare class in the learning input image 20 and the annotation image 21 are completely. In addition to the same value, the area ratio of the rare class in the mini-batch data 11 and the area ratio of the rare class in the learning input image 20 and the annotation image 21 may fall within a specified error range, for example, ±10%. Including.

As described above, in the fourth embodiment, the correct answer in the case where the non-rare class whose area ratio is higher than the preset value is specified as the correction target class and the first loss value is calculated as the correction process A reduction process is performed to make the value and the predicted value smaller than the correct value and the predicted value when the second loss value is calculated. Therefore, as in the third embodiment, it is possible to suppress a decrease in the accuracy of class determination of the model 10. Further, like the third embodiment, it is effective when the loss value F(TK, PK) is not a linear function.

In the fourth embodiment, the reduction ratio in the reduction processing is set to a value at which the area ratio of the non-rare class in the mini-batch data 11 becomes the same as the area ratio of the non-rare class in the learning input image 20 and the annotation image 21. There is. Therefore, the reduction rate can be set to an appropriate value. Similar to the third embodiment, it is preferable to employ such a method of determining the reduction rate when the area ratios of the classes in the learning input image 20 and the annotation image 21 are not biased.

[Fifth Embodiment]

In the fifth embodiment shown in FIG. 26, it is inquired whether or not the correction processing unit is made to execute the correction processing.

In FIG. 26, the CPU of the mini-batch learning device of the fifth embodiment functions as a reception unit 110 in addition to the processing units of the above embodiments. When the specifying unit 52 specifies the correction target class, the receiving unit 110 receives an instruction to select whether or not the correction processing unit executes the correction process.

In the fifth embodiment, when the specifying unit 52 specifies the correction target class, the inquiry screen 111 is displayed on the display 34. On the inquiry screen 111, a message 112 for asking that the correction target class is specified and asking whether the correction processing for correcting the loss value of the correction target class may be executed, a Yes button 113, and a No button 114. Is displayed. The receiving unit 110 receives the selection instruction of the Yes button 113 and the No button 114 as a selection instruction of whether or not to execute the correction process. When the Yes button 113 is selected, the correction processing unit executes the correction processing. On the other hand, when the No button 114 is selected, the correction processing unit does not execute the correction processing.

When an annotation image is generated, the class is specified manually, so the class may be specified incorrectly. Although the model 10 was designated as a class at the beginning of development, some classes may become less important as the development progresses. In such a case, the correction target class is specified by the specifying unit 52, but it may not be necessary to execute the correction process.

Therefore, in the fifth embodiment, the receiving unit 110 receives a selection instruction as to whether or not the correction processing unit should execute the correction process. Therefore, it is possible to deal with the case where the correction target class is specified by the specifying unit 52, but it is not necessary to execute the correction process.

You may implement combining 1st Embodiment and 2nd Embodiment. That is, the weighting factor for the loss value of the rare class is made smaller than the weighting factor for the loss value of the classes other than the rare class, and the weighting factor for the loss value of the non-rare class is set to the non-rare class. It may be larger than the weighting factor for the loss values of other classes. Similarly, the third embodiment and the fourth embodiment may be combined and implemented. That is, the correct value and the predicted value when calculating the loss value of the rare class is made larger than the correct value and the predicted value when calculating the loss value of the class other than the rare class, and the non-rare class The correct value and the predicted value when calculating the loss value may be smaller than the correct value and the predicted value when calculating the loss value of a class other than the non-rare class.

In each of the above embodiments, the input image 16 and the learning input image 20 are exemplified by images of a phase contrast microscope showing the state of cell culture, and differentiated cells and medium are illustrated as classes, but the invention is not limited thereto. For example, an MRI (Magnetic Resonance Imaging) image may be used as the input image 16 and the learning input image 20, and an organ such as a liver or a kidney may be used as a class.

The model 10 is not limited to U-Net, but may be another convolutional neural network such as SegNet.

The hardware configuration of the computer that constitutes the mini-batch learning device 2 can be modified in various ways. For example, the mini-batch learning device 2 may be composed of a plurality of computers separated as hardware for the purpose of improving processing capacity and reliability. Specifically, the functions of the generation unit 50, the calculation unit 51, and the identification unit 52, and the functions of the learning unit 53, the evaluation unit 54, and the update unit 55 are distributed to two computers. In this case, the two computers form the mini-batch learning device 2.

In this way, the hardware configuration of the computer can be appropriately changed according to the required performance such as processing capacity, safety and reliability. Further, not only the hardware but also the application program such as the operation program 40 can be duplicated or stored in a plurality of storage devices in a distributed manner for the purpose of ensuring safety and reliability. is there.

In each of the above-described embodiments, for example, the generation unit 50, the calculation unit 51, the

identification units

52, 80, the learning unit 53, the

evaluation units

54, 81, 90, 100, the update unit 55, the

correction processing units

56, 82, 91, 101. As the hardware structure of a processing unit (Processing Unit) that executes various processes such as the reception unit 110, the following various processors (Processors) can be used. As described above, in addition to the CPU, which is a general-purpose processor that executes software (operation program 40) and functions as various processing units, various processors are manufactured after manufacturing FPGA (Field Programmable Gate Array) and the like. Programmable Logic Device (PLD), which is a processor whose circuit configuration can be changed, dedicated processor, which has a circuit configuration specifically designed to execute specific processing such as ASIC (Application Specific Integrated Circuit) An electric circuit etc. are included.

One processing unit may be configured by one of these various processors, or a combination of two or more processors of the same type or different types (for example, a combination of a plurality of FPGAs or a combination of a CPU and an FPGA). Combination). Further, the plurality of processing units may be configured by one processor.

As an example of configuring a plurality of processing units with one processor, firstly, one processor is configured with a combination of one or more CPUs and software, as represented by computers such as clients and servers. There is a form in which the processor functions as a plurality of processing units. Secondly, as represented by a system on chip (SoC) or the like, there is a form in which a processor that realizes the functions of the entire system including a plurality of processing units by one IC (Integrated Circuit) chip is used. is there. As described above, the various processing units are configured by using one or more of the various processors as a hardware structure.

Further, as a hardware structure of these various processors, more specifically, an electric circuit (circuitry) in which circuit elements such as semiconductor elements are combined can be used.

From the above description, the invention described in additional item 1 below can be understood.

[Appendix 1]

It is a mini-batch learning device that gives learning by giving mini-batch data to a machine learning model for performing semantic segmentation that performs discrimination of multiple classes in an image in pixel units,

A calculation processor for calculating the area ratio of each of the plurality of classes in the mini-batch data,

A specific processor for specifying a correction target class based on the area ratio,

A first loss value of the correction target class, which is an evaluation processor for evaluating the discrimination accuracy of the class of the machine learning model by calculating a loss value for each of the plurality of classes using a loss function. And an evaluation processor including a correction processing processor that executes a correction process for correcting the first loss value based on a comparison result of the second loss value of a class other than the correction target class. apparatus.

The technology of the present disclosure can be appropriately combined with the above-described various embodiments and various modifications. Further, it is needless to say that various configurations can be adopted without departing from the scope of the invention, without being limited to the above-mentioned respective embodiments. Furthermore, the technique of the present disclosure extends to a storage medium that stores the program non-temporarily, in addition to the program.

2 Mini batch learning device

10 Machine learning model (model)

10T learned machine learning model (learned model)

11 mini batch data

Input image group for 12-division learning

13-divided annotation image group

14 Output image group for learning

15 Operation equipment

16 Input image

17 Output image

20 Input image for learning

Input image for 20S split learning

21 Annotation image

21S split annotation image

25 frames

30 storage devices

31 memory

32 CPU

33 Communication unit

34 display

35 Input device

36 data bus

40 operating program

50 Generator

51 calculator

52,80 Specific section

53 Learning Department

54, 81, 90, 100 Evaluation Department

55 Update Department

56, 82, 91, 101 Correction processing unit

60, 61, 65A, 65B, 70, 75, 83, 85A, 85B, 92, 95, 102, 105 Table

110 Reception Department

111 Inquiry screen

112 messages

113 Yes button

114 No button

Horizontal movement amount of DX frame

Amount of vertical movement of DY frame

L(TN, PN) loss function

WK Weight coefficient of each class

F (TK, PK) Loss value of each class

TK Correct answer value for each class

PK Predicted value of each class

ST100 to ST180 steps

Claims

It is a mini-batch learning device that gives learning by giving mini-batch data to a machine learning model for performing semantic segmentation that performs discrimination of multiple classes in an image in pixel units,

In the mini-batch data, a calculation unit that calculates the area ratio of each of the plurality of classes,

A specifying unit that specifies a correction target class based on the area ratio,

A first loss value of the correction target class, which is an evaluation unit for evaluating the discrimination accuracy of the class of the machine learning model by calculating a loss value for each of the plurality of classes using a loss function. And an evaluation unit including a correction processing unit that executes a correction process for correcting the first loss value based on the comparison result of the second loss values of the classes other than the correction target class, and a mini-batch learning. apparatus.
The specifying unit, as the correction target class, specifies a rare class in which the area ratio is lower than a preset value,

The mini-batch learning device according to claim 1, wherein the correction processing unit executes, as the correction processing, processing for weighting the value of the first loss larger than weighting for the value of the second loss. ..
The identifying unit identifies, as the correction target class, a non-rare class in which the area ratio is higher than a preset value.

The mini-batch according to claim 1 or 2, wherein the correction processing unit performs, as the correction processing, processing for weighting the value of the first loss smaller than weighting for the value of the second loss. Learning device.
The identifying unit identifies, as the correction target class, a rare class in which the area ratio is lower than a set value,

As the correction processing, the correction processing unit makes the correct value and the predicted value when calculating the first loss value larger than the correct value and the predicted value when calculating the second loss value. The mini-batch learning device according to claim 1, which executes an enlarging process.
The correction processing unit, the enlargement ratio in the enlargement process, the area ratio of the rare class in the mini-batch data, the area ratio of the rare class in the learning input image and the annotation image that is the source of the mini-batch data. The mini-batch learning device according to claim 4, wherein the value is the same as
The identifying unit identifies, as the correction target class, a non-rare class in which the area ratio is higher than a set value,

As the correction processing, the correction processing unit makes the correct value and the predicted value when calculating the value of the first loss smaller than the correct value and the predicted value when calculating the value of the second loss. The mini-batch learning device according to claim 1, wherein the mini-batch learning device executes a reduction process.
The correction processing unit, the reduction ratio in the reduction processing, the area ratio of the non-rare class in the mini-batch data, the learning input image and the annotation image of the non-rare class of the source of the mini-batch data The mini-batch learning device according to claim 6, wherein the value is the same as the area ratio.
The mini-batch learning device according to any one of claims 1 to 7, further comprising: a receiving unit that receives an instruction to select whether or not to execute the correction processing in the correction processing unit.
It is an operating program of a mini-batch learning device for giving learning by giving mini-batch data to a machine learning model for carrying out semantic segmentation for discriminating a plurality of classes in an image in pixel units,

In the mini-batch data, a calculation unit that calculates the area ratio of each of the plurality of classes,

A specifying unit that specifies a correction target class based on the area ratio,

A first loss value of the correction target class, which is an evaluation unit for evaluating the discrimination accuracy of the class of the machine learning model by calculating a loss value for each of the plurality of classes using a loss function. And an evaluation unit including a correction processing unit that executes a correction process for correcting the value of the first loss based on the comparison result of the value of the second loss of the class other than the class to be corrected,

An operation program for a mini-batch learning device that causes a computer to function.
A method of operating a mini-batch learning device for giving learning by giving mini-batch data to a machine learning model for performing semantic segmentation for discriminating a plurality of classes in an image on a pixel-by-pixel basis,

In the mini-batch data, a calculation step of calculating the area ratio of each of the plurality of classes,

A specifying step of specifying a correction target class based on the area ratio;

It is an evaluation step for evaluating the discrimination accuracy of the class of the machine learning model by calculating a loss value for each of the plurality of classes using a loss function, and a first loss value of the correction target class. And an evaluation step including a correction processing step for executing a correction processing for correcting the value of the first loss based on the comparison result of the value of the second loss of the class other than the correction target class. How the device works.