WO2021095509A1

WO2021095509A1 - Inference system, inference device, and inference method

Info

Publication number: WO2021095509A1
Application number: PCT/JP2020/040205
Authority: WO
Inventors: 青郁; 祥孝牛久; 敦史橋本
Original assignee: オムロン株式会社
Priority date: 2019-11-14
Filing date: 2020-10-27
Publication date: 2021-05-20
Also published as: JP2021081795A; JP7472471B2

Abstract

This inference system includes: a learning unit for generating an inference model through machine learning by using a first data set that comprises a plurality of data items to which a class is assigned and a second data set that comprises a plurality of data items to which a class is not assigned; and an inference unit. The inference model includes: an encoder for calculating feature quantities from inputted data items; a first classifier for outputting a first probability that one of the inputted data items belongs to a first class; and a second classifier for outputting a second probability that one of the inputted data items belongs to a second class. The learning unit includes: a calculation means for calculating classification discordance on the basis of the first probability and the second probability that are outputted when one of the data items included in one of the first data set and the second data set is inputted to the inference model; and a determination means for determining a learning priority for the inputted data in accordance with the magnitude of the calculated classification discordance.

Description

Estimating system, estimation device and estimation method

The present invention relates to an estimation method of an estimation model suitable for a practical environment.

Due to the dramatic improvement in computing power in recent years, solutions called AI (Artificial Intelligence) that utilize computing power are being realized in various fields.

Such a solution includes a task of recognizing the type of the object included in the input image, a task of recognizing the area where the object included in the input image exists, and the like. In order to realize such a recognition task, various elemental technologies are required, and one of them is known as unsupervised domain adaptation (UDA). It is known that problems such as OpenSet problem, noise problem, and data imbalance problem may occur when unsupervised domain adaptation is realized in a practical environment (in the wild).

As a solution to the OpenSet problem, a method called "OpenSet DA" has been proposed (see Non-Patent Document 1 and the like). As a solution to the noise problem, a method called "Weakly-Supervised DA" has been proposed (see Non-Patent Document 2 and the like). A method called "Partial DA" has been proposed as a means for solving the data imbalance problem (see Non-Patent Document 3 and the like).

Each of the above-mentioned solutions focuses on a specific problem, and no solution that comprehensively considers a plurality of problems has been proposed.

One object of the present invention is to provide a technique for unsupervised domain adaptation that can provide a solution that comprehensively considers a plurality of problems as described above.

An estimation system according to an example of the present invention includes a learning unit that generates an estimation model by machine learning using a first data set consisting of a plurality of data to which a class is assigned and a second data set consisting of a plurality of data to which a class is not assigned, and an estimation unit that inputs estimation target data that can belong to the second data set into the estimation model and determines an estimation result. The estimation model includes an encoder that calculates a feature quantity from the input data, a first classifier that outputs, based on the feature quantity, a first probability that the input data belongs to a first class, and a second classifier that outputs, based on the feature quantity, a second probability that the input data belongs to a second class. The learning unit includes a calculating means for calculating a classification discrepancy based on the first and second probabilities that are output when data contained in either the first data set or the second data set is input to the estimation model, and a determining means for determining a learning priority for the input data according to the magnitude of the calculated classification discrepancy.

According to this configuration, even if the above-mentioned problems occur, data considered not to be affected by those problems can be preferentially used for learning, so that the second data set, to which no class is assigned, can also be leveraged to maintain or improve the estimation accuracy of the estimation model.

The learning unit may further include a first parameter updating means for updating the model parameters of the first classifier and the second classifier for the purpose of maximizing the classification discrepancy while the model parameters of the encoder are fixed. For the first parameter updating means, the determining means may assign a higher learning priority to data with a smaller calculated classification discrepancy. With this configuration, the model parameters of the first and second classifiers can be updated with higher accuracy.

The learning unit may further include a second parameter updating means for updating the model parameters of the encoder for the purpose of minimizing the classification discrepancy while the model parameters of the first classifier and the second classifier are fixed. For the second parameter updating means, the determining means may assign a higher learning priority to data with a larger calculated classification discrepancy. With this configuration, the encoder model parameters can be updated with higher accuracy.

As the learning priority, the determination means may determine, according to the magnitude of the classification discrepancy, a weighting coefficient by which the error back-propagated through the estimation model is multiplied. According to this configuration, the weighting coefficient that determines the update width of the model parameters can be adjusted, so that the model parameters can be updated with higher accuracy.
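The weighting idea can be sketched as follows. This is only an illustration under stated assumptions: the exponential mapping, the `sharpness` parameter, and all function names are hypothetical and do not come from the text, which only requires that the weight depend on the magnitude of the discrepancy (smaller discrepancy favored for the classifier update, larger discrepancy favored for the encoder update).

```python
import math


def classifier_update_weight(discrepancy, sharpness=5.0):
    """Weight for the classifier update: larger when the discrepancy is smaller.

    The exponential form and `sharpness` are assumptions made for illustration.
    """
    return math.exp(-sharpness * discrepancy)


def encoder_update_weight(discrepancy, sharpness=5.0):
    """Weight for the encoder update: larger when the discrepancy is larger."""
    return 1.0 - math.exp(-sharpness * discrepancy)


def weighted_error(errors, weights):
    """Multiply each per-sample error by its weight and average over the batch."""
    if len(errors) != len(weights):
        raise ValueError("errors and weights must have the same length")
    return sum(w * e for w, e in zip(weights, errors)) / len(errors)
```

The averaged value returned by `weighted_error` stands in for the error that would be back-propagated; a sample with weight 0 contributes nothing to the parameter update.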

The determination means may select, as the data used for updating the model parameters, only the data whose calculated classification discrepancy satisfies a predetermined condition. According to this configuration, only the error caused by the data satisfying the predetermined condition is used for adjusting the model parameters, so that the model parameters can be updated with higher accuracy.

The determination means may rank the classification discrepancies calculated for each of the plurality of data, and then select only the data within a predetermined range of the ranking as the data to be used for updating the model parameters. According to this configuration, even if conditions such as a predetermined threshold value are not set, only the data ranked at the upper level of the entire distribution is used for adjusting the model parameters, so that the model parameters can be updated with higher accuracy.
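The two selection strategies (a predetermined condition, and a predetermined range after ranking) can be sketched as below. The function names, the `keep_small` flag, and the use of a fraction of the batch as the "predetermined range" are assumptions for illustration only.

```python
def select_by_threshold(discrepancies, threshold, keep_small=True):
    """Return indices of samples whose discrepancy satisfies a threshold condition."""
    if keep_small:
        return [i for i, d in enumerate(discrepancies) if d <= threshold]
    return [i for i, d in enumerate(discrepancies) if d >= threshold]


def select_by_rank(discrepancies, fraction, keep_small=True):
    """Rank samples by discrepancy and return indices of the leading `fraction`.

    No absolute threshold is needed: only the relative order matters.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    order = sorted(
        range(len(discrepancies)),
        key=lambda i: discrepancies[i],
        reverse=not keep_small,
    )
    k = max(1, int(len(order) * fraction))
    return sorted(order[:k])


# Hypothetical per-sample discrepancies for a mini-batch of four inputs.
batch = [0.05, 0.7, 0.2, 0.4]
small_half = select_by_rank(batch, 0.5, keep_small=True)
large_half = select_by_rank(batch, 0.5, keep_small=False)
```

Only the samples at the returned indices would contribute their error to the parameter update; the others are skipped for that step.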

The learning unit may further include a third parameter updating means for updating the encoder model parameters, the first classifier model parameters, and the second classifier model parameters based on the data contained in the first data set. The third parameter updating means may input data into the estimation model and, based on the error output from one of the first and second classifiers, update the model parameters of the other of the first and second classifiers. According to this configuration, the model parameters of the first classifier and the second classifier can be updated based on common error information, so that the model parameters can be updated with higher accuracy.

The estimation unit may include an estimation result output unit that outputs an estimation result depending on whether or not the first probability and the second probability, output when the estimation target data is input to the estimation model, match each other. According to this configuration, it is possible to estimate data classified into an unknown class.
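A minimal sketch of this output rule follows. It assumes that each classifier outputs a probability vector over the same known classes and that "match" means the two classifiers pick the same most probable class; both are assumptions, as are the function name and the `unknown_label` parameter.

```python
def estimate_class(p1, p2, class_names, unknown_label="unknown"):
    """Return the agreed class name, or `unknown_label` if the classifiers disagree."""
    if not (len(p1) == len(p2) == len(class_names)):
        raise ValueError("p1, p2 and class_names must have the same length")
    top1 = max(range(len(p1)), key=lambda i: p1[i])
    top2 = max(range(len(p2)), key=lambda i: p2[i])
    if top1 != top2:
        # The two classifiers disagree: treat the input as an unknown class.
        return unknown_label
    return class_names[top1]
```

For example, with hypothetical classes `["ok", "defect"]`, outputs `[0.9, 0.1]` and `[0.8, 0.2]` agree on `"ok"`, while `[0.9, 0.1]` and `[0.3, 0.7]` disagree and yield `"unknown"`.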

An estimation device according to another example of the present invention includes a storage unit that holds an estimation model generated by machine learning using a first data set consisting of a plurality of data to which a class is assigned and a second data set consisting of a plurality of data to which a class is not assigned, and an estimation unit that inputs estimation target data that can belong to the second data set into the estimation model and determines the estimation result. The estimation model includes an encoder that calculates a feature quantity from the input data, a first classifier that outputs, based on the feature quantity, a first probability that the input data belongs to a first class, and a second classifier that outputs, based on the feature quantity, a second probability that the input data belongs to a second class. The estimation model is trained using a learning priority for the input data that is determined according to the magnitude of the classification discrepancy calculated based on the first probability and the second probability output when data contained in either the first data set or the second data set is input to the estimation model.

An estimation method according to yet another example of the present invention includes a learning step of generating an estimation model by machine learning using a first data set consisting of a plurality of data to which a class has been assigned and a second data set consisting of a plurality of data to which a class has not been assigned, and an estimation step of inputting estimation target data that can belong to the second data set into the estimation model and determining an estimation result. The estimation model includes an encoder that calculates a feature quantity from the input data, a first classifier that outputs, based on the feature quantity, a first probability that the input data belongs to a first class, and a second classifier that outputs, based on the feature quantity, a second probability that the input data belongs to a second class. The learning step includes a step of calculating a classification discrepancy based on the first and second probabilities that are output when data contained in either the first or second data set is input into the estimation model, and a step of determining the learning priority for the input data according to the magnitude of the calculated classification discrepancy.

According to the present invention, it is possible to provide a solution that comprehensively considers a plurality of problems as described above.

A schematic diagram showing an application example according to the present embodiment.
A schematic diagram showing an application example of unsupervised domain adaptation according to the present embodiment.
A schematic diagram showing a processing procedure for generating and operating the estimation model according to the present embodiment.
A schematic diagram showing a hardware configuration example of the image processing system shown in FIG. 2.
A diagram for explaining the basic idea of the solution of unsupervised domain adaptation (UDA) according to the MCD (Maximum Classifier Discrepancy) method.
A diagram for explaining the learning method according to the MCD (Maximum Classifier Discrepancy) method.
A diagram for explaining the learning method according to the MCD (Maximum Classifier Discrepancy) method.
A diagram for explaining the learning method according to the MCD (Maximum Classifier Discrepancy) method.
A schematic diagram showing an example of the learning network used in the learning method according to the present embodiment.
A schematic diagram showing an implementation example of the learning method according to the present embodiment.
A flowchart showing an outline of the processing procedure of the learning method according to the present embodiment.
A conceptual diagram for explaining input data for which the value of Loss 2 is small, based on FIG.
A conceptual diagram for explaining input data for which the value of Loss 2 is small, based on FIG.
A diagram outlining an example of the model parameter update process in step S3 of the learning method shown in FIG.
A schematic diagram showing an implementation example during operation of the estimation model according to the present embodiment.
A schematic diagram showing an implementation example in an application of the learning method according to the present embodiment.

An embodiment of the present invention will be described in detail with reference to the drawings. The same or corresponding parts in the drawings are designated by the same reference numerals and the description thereof will not be repeated.

<A. Application example>
First, an example of a situation in which the present invention is applied will be described.

FIG. 1 is a schematic diagram showing an application example of the learning method according to the present embodiment and the estimation model generated by the learning method. With reference to FIG. 1, the estimation model 60 is generated by machine learning using the learning network 10.

The estimation model 60 typically includes an encoder 70, a classifier 72, and a classifier 74. The encoder 70 calculates a feature amount from the input data (data x_s, data x_t). The classifier 72 outputs the probability p_1(y|x) that the input data is in the first class based on the feature amount from the encoder 70. The classifier 74 outputs the probability p_2(y|x) that the input data is in the second class.

In the training of the estimation model 60, the classification discrepancy is calculated based on the probability p_1(y|x) and the probability p_2(y|x) that are output when data (data x_s, data x_t) contained in either the source data set or the target data set is input to the estimation model 60. Then, the learning priority for the input data is determined according to the magnitude of the calculated classification discrepancy.

Finally, the error output from the classifier 72 or the classifier 74 is back-propagated according to the determined learning priority, and the model parameters that define the learning network 10 (at least one of the encoder 70, the classifier 72, and the classifier 74) are updated.

In the learning method according to the present embodiment, the learning priority is determined according to the magnitude of the discrepancy in identification, so that the estimation accuracy can be maintained or improved even if the above-mentioned problems occur.
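As a rough illustration of how a discrepancy between the two classifier outputs might be computed, the sketch below uses the mean absolute difference between two probability vectors. This particular measure is an assumption borrowed from common practice for MCD-style methods; the text here does not fix the formula, and the function name and sample values are hypothetical.

```python
def classification_discrepancy(p1, p2):
    """Mean absolute difference between two class-probability vectors.

    p1, p2: probabilities over the same classes from the two classifiers.
    The choice of this measure is an assumption made for illustration.
    """
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must have the same number of classes")
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)


# Hypothetical outputs of classifier 72 and classifier 74 for two inputs.
agree = classification_discrepancy([0.9, 0.1], [0.8, 0.2])
disagree = classification_discrepancy([0.9, 0.1], [0.2, 0.8])
```

Here `agree` is small because both classifiers favor the same class, and `disagree` is large because they favor different classes; the learning priority would then be derived from these magnitudes.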

<B. Application example>
Next, an application example of unsupervised domain adaptation according to the present embodiment will be described.

FIG. 2 is a schematic diagram showing an example of an application for unsupervised domain adaptation according to the present embodiment. FIG. 2 shows an image processing system 1 as an application example.

With reference to FIG. 2, the image processing system 1 images the work 8 with a camera 20 arranged at the tip of the arm of the robot 2, and uses the image obtained by the imaging to perform an appearance inspection of the work 8 (for example, to recognize the presence or absence of defects and the type of defects).

The robot 2 is, for example, an articulated robot that has a plurality of axes 4 corresponding to joints. By rotating or moving each axis 4, the camera 20 arranged at the tip can be placed in any position and posture.

In the image processing system 1, an estimation model for realizing recognition processing, which is a learned model generated in advance by machine learning as described later, is used. In order to improve the estimation accuracy of the estimation model, it is necessary to perform machine learning using a learning data set containing a large number of teacher data.

On the other hand, it is necessary to give a correct answer (for example, a label indicating the type of defect) to the teacher data in advance. Typically, it is provided by annotation. More specifically, teacher data can be generated by manually assigning a correct answer (label) to an image collected by an arbitrary method. As a method of collecting images, a method of actually taking an image using an arbitrary device or a method of virtually taking an image on a simulator may be used. Further, necessary images may be collected from a website or the like. When collecting images from a website, the collected images may be given the correct answer in advance.

The estimation accuracy can be improved by generating an estimation model using a large number of teacher data to which correct answers are given in advance.

However, in a practical environment, the work 8 is imaged by the camera 20. The image collected by any means and the image actually captured by the camera 20 do not have exactly the same imaging conditions and the like. Therefore, in many cases, the estimation model generated by using the data collected in advance in a specific environment and a large number of teacher data acquired by annotations or the like cannot be used as it is in a practical environment. Therefore, we provide a method that can generate an estimation model that can be operated in a practical environment by using unsupervised domain adaptation.

FIG. 3 is a schematic diagram showing a processing procedure related to the generation and operation of the estimation model according to the present embodiment. With reference to FIG. 3, first, a data set (hereinafter, also referred to as “source data set 30”) composed of images collected by the information processing apparatus 200 is prepared. In addition, a data set (hereinafter, also referred to as “target data set 50”) composed of images used in actual operation is prepared by actually taking an image with the camera 20.

Using the source data set 30 and the target data set 50, the estimation model 60 is generated by machine learning 40. In actual operation, the estimation result 64 is obtained by inputting data (hereinafter, also referred to as “estimation target data 62”) to the generated estimation model 60. The estimation target data 62 corresponds to data that can belong to the target data set 50.

Next, an example of the hardware configuration of the image processing system 1 shown in FIG. 2 will be described.
FIG. 4 is a schematic diagram showing a hardware configuration example of the image processing system 1 shown in FIG. 2. With reference to FIG. 4, the image processing system 1 includes a robot 2 and an image processing device 100 that controls the robot 2.

The robot 2 has as many sets of servo drivers 12 and motors 14 as the number of axes in addition to the camera 20.

The image processing device 100 is a device that constitutes the estimation system according to the present embodiment, and performs image recognition processing based on the image captured by the camera 20. More specifically, the image processing device 100 performs image recognition processing on an image of the work 8 captured by the camera 20 as a subject, determines whether or not there is a defect in the work 8, and, if there is a defect, identifies the type of defect. The image processing device 100 outputs a command for positioning the camera 20 to a predetermined position and posture to one or a plurality of servo drivers 12 in response to the arrival of the work 8. When each of the servo drivers 12 supplies electric power according to a command, the associated motor 14 is rotationally driven, and the joint or arm of the robot 2 mechanically coupled to the motor 14 operates.

The image processing device 100 is typically realized by using a computer that follows a general-purpose architecture (for example, an industrial personal computer based on a general-purpose personal computer).

The image processing device 100 includes a processor 102, a main memory 104, a storage 110, a communication interface 122, an input unit 124, an output unit 126, a camera interface 128, and a motor interface 130 as components.

The processor 102 is composed of a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a GPU (Graphics Processing Unit), and the like. As the processor 102, a configuration having a plurality of cores may be adopted, or a plurality of processors 102 may be arranged.

The main memory 104 is composed of a volatile storage device such as a DRAM (Dynamic Random Access Memory) or a SRAM (Static Random Access Memory). The storage 110 is composed of, for example, a non-volatile storage device such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive). The processor 102 reads various programs stored in the storage 110, expands them in the main memory 104, and executes them to realize various processes as described later.

In addition to the OS 112 for realizing the basic functions, the storage 110 stores the machine learning program 114, the model parameters 116 for defining the estimation model 60, and the recognition application 118 for performing the image recognition process. The storage 110 corresponds to a storage unit that holds the estimation model 60. In addition, the storage 110 may store the source data set 30.

The processor 102 executes the machine learning program 114 to generate the estimation model 60 by the learning process. Further, by executing the recognition application 118, the processor 102 functions as an estimation unit that inputs the estimation target data 62 into the estimation model 60 and determines the estimation result.

The communication interface 122 mediates the exchange of data with other devices via an arbitrary network.

The input unit 124 is composed of a keyboard, a mouse, and the like, and accepts user operations. The output unit 126 is composed of a display, various indicators, a printer, and the like, and outputs a processing result from the processor 102 and the like.

The camera interface 128 receives the image captured by the camera 20 and outputs a necessary command to the camera 20.

The motor interface 130 outputs a necessary command to the servo driver 12 according to the instruction from the processor 102.

The program of the image processing device 100 may be installed via a computer-readable recording medium (for example, an optical recording medium such as a DVD (Digital Versatile Disc)), or may be downloaded from a server device or the like on the network and then installed. Further, the function provided by the image processing device 100 according to the present embodiment may be realized by using a part of the modules provided by the OS.

FIG. 4 shows a configuration example in which the functions required of the image processing device 100 are provided by the processor 102 executing the program, but some or all of these functions may be implemented using a dedicated hardware circuit (for example, an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array)).

<C. Unsupervised domain adaptation>
Next, the outline and issues of unsupervised domain adaptation will be explained.

Unsupervised domain adaptation is a method that, for multiple data sets with different biases (tendencies), makes the correct-answer information of one of them, the source data set 30 (consisting of a plurality of data to which classes are assigned), usable for the other, the target data set 50 (consisting of a plurality of data to which no class is assigned). Here, the bias typically arises from the difference between the environment in which the source data set 30 is acquired (hereinafter also referred to as the "source domain") and the environment in which the target data set 50 is acquired (hereinafter also referred to as the "target domain").

In the application examples shown in FIGS. 2 and 3 described above, the environment in which the application is generated by manual annotation or the like corresponds to the source domain, and the environment in which the camera 20 actually captures images corresponds to the target domain.

It is assumed that the data of the source domain is given by the set of (x_s, y_s), and the data of the target domain is given only by (x_t). Here, x_s and x_t indicate input vectors of data included in the source domain and the target domain, respectively, and y_s means the correct answer (class) given to the corresponding x_s.

Under such a premise, the goal of unsupervised domain adaptation is to generate a trained model that can estimate the correct answer y_t to be given to the data x_t belonging to the target domain.

It is known that problems such as OpenSet problem, noise problem, and data imbalance problem may occur when unsupervised domain adaptation is realized in a practical environment (in the wild).

The first OpenSet problem means a decrease in estimation accuracy caused by the inclusion of data classified into a class (unknown class) other than the class assigned to the data included in the source domain in the target domain.

The second noise problem means a decrease in estimation accuracy due to errors or deterioration caused by various reasons. The noise of interest in the noise problem is typically label noise and characteristic noise. Label noise is an error that occurs in the correct answer given to the data contained in the source domain. That is, the problem is that the wrong class is given as the correct answer. In addition, characteristic noise is data deterioration (blurring, etc.) that is different from that generated in other data at the time of observation. That is, there is a problem that data deterioration different from the others occurs only for a part of the data contained in the data set.

The third data imbalance problem means a decrease in estimation accuracy due to an imbalance in the number of data contained in the data sets. A data imbalance problem typically covers an imbalance between the number of data contained in the source domain and the number of data contained in the target domain, and an imbalance among the per-class numbers of data contained in the target domain. The former has a great influence on the method called MCD (Maximum Classifier Discrepancy) disclosed in Non-Patent Document 4. The latter has a large effect on MCD and on methods that match the distributions of generated features.

The learning method according to the present embodiment and the estimation model generated by the learning method provide a solution that comprehensively considers the above-mentioned plurality of problems. More specifically, the learning method according to the present embodiment and the estimation model generated by the learning method are basically based on the discriminative model-based unsupervised domain adaptation method. Typical examples of the discriminative model-based unsupervised domain adaptation method include a method called MCD (Maximum Classifier Discrepancy) disclosed in Non-Patent Document 4 and a method called ADR (Adversarial Dropout Regularization) disclosed in Non-Patent Document 5.

Below, as an example of the discriminative model-based unsupervised domain adaptation method, the method based on MCD disclosed in Non-Patent Document 4 will be described. However, the technical scope of the present invention is not limited to methods such as MCD and ADR, but includes methods based on the same technical ideas as described below.

FIG. 5 is a diagram for explaining the basic concept of the solution of unsupervised domain adaptation (UDA) according to the MCD (Maximum Classifier Discrepancy) method. With reference to FIG. 5, the source data set 30 and the target data set 50 are assumed as the pre-adaptation states.

It is assumed that the source data set 30 includes one or more data 32 classified into the first class as the correct answer and one or more data 34 classified into the second class as the correct answer.

On the other hand, the target dataset 50 shall include one or more data 52 to be classified in the first class and one or more data 54 to be classified in the second class. However, it is unknown to which class the data contained in the target data set 50 is classified.

It is assumed that, in an arbitrary feature space, there exist for the data included in the source data set 30 and the target data set 50 a first class identification surface 42 for identifying data classified into the first class and a second class identification surface 44 for identifying data classified into the second class.

In discriminative model-based unsupervised domain adaptation such as MCD, learning is performed not for the purpose of matching the distribution of the entire domain between the source data set 30 and the target data set 50, but for the purpose of matching the class identification surfaces between the source data set 30 and the target data set 50.

More specifically, an encoder that extracts features from the source data set 30 and the target data set 50 is generated by learning so that common class identification surfaces can be used, and the class identification surfaces that can be used in common are also generated by learning.

FIGS. 6 to 8 are diagrams for explaining a learning method according to the MCD (Maximum Classifier Discrepancy) method. FIGS. 6 to 8 assume a first source data group 301 consisting of the data 32 classified into the first class included in the source data set 30, and a second source data group 302 consisting of the data 34 classified into the second class. Similarly, they assume a first target data group 501 consisting of the data 52 to be classified into the first class included in the target data set 50, and a second target data group 502 consisting of the data 54 to be classified into the second class.

In the learning method according to the MCD method, learning of the class identification surface and learning of the encoder for extracting the feature amount are alternately performed.

First, the first class identification surface 42 and the second class identification surface 44 are determined by learning using the source data set 30. As shown in FIG. 6, the first class identification surface 42 does not cross the first source data group 301, and the second class identification surface 44 does not cross the second source data group 302.

However, the first class identification surface 42 may cross the first target data group 501, and the second class identification surface 44 may cross the second target data group 502. That is, the first class identification surface 42 and / or the second class identification surface 44 determined by the source data set 30 may generate a discrepancy region with respect to the target data set 50.

In FIG. 6, the mismatch area 53 means an area in which the data 52 to be classified in the first class is erroneously determined not to be classified in the first class, and the mismatch area 55 is in the second class. It means an area where the data 54 to be classified is erroneously determined not to be classified in the second class.

Therefore, the first class identification surface 42 and the second class identification surface 44 are updated for the purpose of minimizing the mismatch areas 53 and 55. At this time, the model parameters of the encoder are fixed. FIG. 6 schematically shows the first class identification surface 42' before the update and the first class identification surface 42 after the update, as well as the second class identification surface 44' before the update and the second class identification surface 44 after the update.

Subsequently, as shown in FIG. 7, it is preferable that, in the feature space, the distribution of the first source data group 301 consisting of the data 32 classified into the first class and the distribution of the first target data group 501 consisting of the data 52 to be classified into the first class match as closely as possible. Similarly, it is preferable that, in the feature space, the distribution of the second source data group 302 consisting of the data 34 classified into the second class and the distribution of the second target data group 502 consisting of the data 54 to be classified into the second class match as closely as possible.

That is, in the feature space, the model parameters of the encoder are updated for the purpose of minimizing the discrepancy between the data classified in the same class.

FIG. 7 schematically shows the first target data group 501' before the update and the first target data group 501 after the update, as well as the second target data group 502' before the update and the second target data group 502 after the update.

By repeatedly executing the update of the class identification surface shown in FIG. 6 and the update of the encoder shown in FIG. 7, the model of the encoder and the class identification surface can be determined as shown in FIG.

<D. Solution>
In the present embodiment, a learning method capable of maintaining or improving the estimation accuracy even when an OpenSet problem, a noise problem, a data imbalance problem, etc. exist is provided.

FIG. 9 is a schematic diagram showing an example of the learning network 10 used in the learning method according to the present embodiment. With reference to FIG. 9, the learning network 10 is a type of adversarial (hostile) network and typically includes an encoder 70, a classifier 72, and a classifier 74.

The encoder 70 corresponds to the feature amount generation unit (G) and calculates a feature amount from the data x_s (vector) included in the source data set 30 and/or the data x_t (vector) included in the target data set 50. The encoder 70 may be given input in the form of a mini-batch in which a plurality of data are collected.

The classifier 72 and the classifier 74 each define a class identification surface for the feature amount output from the encoder 70. The classifier 72 outputs, as its estimation result, the probability p_1(y | x) that the estimated value y of the data x input to the encoder 70 is the first class, and the classifier 74 outputs, as its estimation result, the probability p_2(y | x) that the estimated value y of the data x input to the encoder 70 is the second class. In this way, the classifier 72 functions as the discriminant function F1, and the classifier 74 functions as the discriminant function F2.
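As a rough illustration of the data flow through the encoder 70 and the two classifiers, the following minimal sketch passes one input vector through a shared encoder and two classifier heads. The layer shapes (one linear layer with ReLU for the encoder, logistic regression for each classifier), the function names, and all parameter values are assumptions made for illustration only and are not taken from the embodiment.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def encode(x: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical encoder G: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def classify(feature: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Hypothetical classifier head: logistic regression on the feature amount."""
    return sigmoid(sum(w * f for w, f in zip(weights, feature)) + bias)


def forward(x, enc_w, f1, f2) -> Tuple[float, float]:
    """Return (p_1(y | x), p_2(y | x)) for one input vector x."""
    feature = encode(x, enc_w)  # shared feature amount used by both heads
    return classify(feature, *f1), classify(feature, *f2)


# Made-up parameters, for illustration only.
enc_w = [[1.0, 0.0], [0.0, 1.0]]
f1 = ([2.0, -2.0], 0.0)   # (weights, bias) standing in for classifier 72
f2 = ([-2.0, 2.0], 0.0)   # (weights, bias) standing in for classifier 74
p1, p2 = forward([1.0, 0.0], enc_w, f1, f2)
```

Both heads read the same feature amount, which is what allows the later steps to update the heads and the encoder separately.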

FIG. 10 is a schematic diagram showing an implementation example of the learning method according to the present embodiment. The configuration shown in FIG. 10 is typically realized by the processor 102 executing the machine learning program 114.

With reference to FIG. 10, the input data selection unit 1141 is arranged on the input side of the learning network 10, and the Loss1 calculation unit 1142, the Loss2 calculation unit 1143, the error buffer 1144, the curriculum determination unit 1145, and the parameter update unit 1146 are arranged on the output side of the learning network 10.

The input data selection unit 1141 samples the data contained in the source data set 30 and the target data set 50, and generates one or more data (mini-batch) to be input to the learning network 10 (encoder 70). The input data selection unit 1141 may output the information of the selected data to the error buffer 1144. Further, the input data selection unit 1141 may determine the data to be selected according to the instruction from the curriculum determination unit 1145.

The Loss1 calculation unit 1142 calculates the discrimination error by the classifier 72 and the classifier 74 as Loss1. It should be noted that Loss1 can be calculated only for the data x_s included in the source data set 30, to which the correct answer is given.

The Loss2 calculation unit 1143 calculates the error between the estimation result by the classifier 72 and the estimation result by the classifier 74 as Loss2. Loss2 means a classification mismatch (Classifier Discrepancy). As a method for calculating Loss2, MAE (Mean Absolute Error), RMSE (Root Mean Squared Error), or the like can be typically used.
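The two error measures named above can be sketched as follows. The sketch assumes that each estimation result is represented as an equal-length sequence of probabilities; that representation and the function names are illustrative assumptions.

```python
import math
from typing import Sequence


def loss2_mae(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Mean absolute error between two estimation results."""
    if len(p1) != len(p2) or not p1:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)


def loss2_rmse(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Root mean squared error between two estimation results."""
    if len(p1) != len(p2) or not p1:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1))


# Example: the two classifiers disagree by 0.3 on every entry.
mae = loss2_mae([0.9, 0.1], [0.6, 0.4])
rmse = loss2_rmse([0.9, 0.1], [0.6, 0.4])
```

Both measures are zero when the two classifiers agree exactly and grow as their outputs diverge.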

The error buffer 1144 temporarily stores the errors (Loss1 and Loss2) calculated by the Loss1 calculation unit 1142 and the Loss2 calculation unit 1143. The error buffer 1144 may store the calculated error in association with the information of the data input to the encoder 70.

The curriculum determination unit 1145 determines the learning curriculum for the learning network 10 based on the error calculated by the Loss1 calculation unit 1142 and/or the Loss2 calculation unit 1143. More specifically, the curriculum determination unit 1145 determines the type and order of the data to be input, the update targets and update order of the model parameters that define the learning network 10 (encoder 70, classifier 72, and classifier 74), and so on.

The parameter update unit 1146 back-propagates the error calculated by the Loss1 calculation unit 1142 and/or the Loss2 calculation unit 1143 to update the model parameters that define the learning network 10 (encoder 70, classifier 72, and classifier 74).

In the learning method according to the present embodiment, forward learning and hostile learning are alternately and repeatedly executed on the learning network 10 shown in FIG.

In the forward learning, the model parameters of the encoder 70 are fixed, and the model parameters of the classifier 72 and the classifier 74 are optimized. More specifically, the model parameters of the classifier 72 and the classifier 74 are updated for the purpose of maximizing Loss2 (the error between the estimation result by the classifier 72 and the estimation result by the classifier 74, that is, the classification mismatch).

On the other hand, in hostile learning, the model parameters of the classifier 72 and the classifier 74 are fixed, and the model parameters of the encoder 70 are optimized. More specifically, the model parameters of the encoder 70 are updated for the purpose of minimizing Loss2.

In the learning method according to the present embodiment, in at least one of forward learning and hostile learning, the learning priority is adjusted according to the magnitude of Loss2 calculated for the input data. This prevents the deterioration of estimation accuracy caused by the various problems described above (OpenSet problem, noise problem, data imbalance problem, etc.).

FIG. 11 is a flowchart showing a schematic processing procedure of the learning method according to the present embodiment. The process shown in FIG. 11 is typically realized by the processor 102 executing the machine learning program 114.

With reference to FIG. 11, the source data set 30 and the target data set 50 are prepared (step S1).

First, the processor 102 initializes the model parameters of the encoder 70, the classifier 72, and the classifier 74 (step S2).

The processor 102 updates the model parameters of the encoder 70, the classifier 72, and the classifier 74 based on the plurality of data to which the correct answer is given included in the source data set 30 (step S3). At this time, the model parameters of the encoder 70, the classifier 72, and the classifier 74 are updated so as to minimize Loss 1 (discrimination error by the classifier 72 and the classifier 74).

Subsequently, the processor 102 selects data (or a mini-batch composed of a plurality of data) to be used for forward learning from the source data set 30 and the target data set 50 (step S4). Then, the processor 102 inputs the data selected in step S4 into the learning network 10 to calculate the estimation result (step S5), and calculates Loss2 based on the calculated estimation result (step S6).

Then, the processor 102 determines the learning priority based on the calculated Loss 2 (step S7). Finally, the processor 102 updates the model parameters of the classifier 72 and the classifier 74 for the purpose of maximizing Loss 2 based on the learning priority determined in step S7 (step S8). Here, the model parameters of the encoder 70 are fixed.

The processor 102 determines whether or not the end condition of the forward learning of steps S4 to S8 is satisfied (step S9). If the end condition of the forward learning of steps S4 to S8 is not satisfied (NO in step S9), the processor 102 re-executes the processes of step S4 and subsequent steps.

If the end condition of the forward learning of steps S4 to S8 is satisfied (YES in step S9), the processor 102 selects data (or a mini-batch composed of a plurality of data) to be used for hostile learning from the source data set 30 and the target data set 50 (step S10). Then, the processor 102 inputs the data selected in step S10 into the learning network 10 to calculate the estimation result (step S11), and calculates Loss2 based on the calculated estimation result (step S12).

Then, the processor 102 determines the learning priority based on the calculated Loss 2 (step S13). Finally, the processor 102 updates the model parameters of the encoder 70 with the aim of minimizing Loss 2 based on the learning priority determined in step S13 (step S14). Here, the model parameters of the classifier 72 and the classifier 74 are fixed.

The processor 102 determines whether or not the end condition of the hostile learning in steps S10 to S14 is satisfied (step S15). If the end condition of the hostile learning in steps S10 to S14 is not satisfied (NO in step S15), the processor 102 re-executes the processes of step S10 and subsequent steps.

If the end condition of the hostile learning in steps S10 to S14 is satisfied (YES in step S15), the processor 102 determines whether or not the convergence condition of the learning process is satisfied (step S16). If the convergence condition of the learning process is not satisfied (NO in step S16), the processor 102 re-executes the processes of step S4 and subsequent steps.

If the convergence condition of the learning process is satisfied (YES in step S16), the processor 102 outputs, as the learning result, an estimation model including the encoder 70, the classifier 72, and the classifier 74 defined by the current model parameters (step S17). Then, the learning process ends.

Note that step S3 may be incorporated as part of the forward learning process.
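The control flow of steps S3 to S17 can be summarized by the following skeleton. It is only a sketch: the callback names are hypothetical, the end conditions of steps S9 and S15 are assumed to be fixed iteration counts, and the actual parameter updates are left to the callbacks.

```python
from typing import Callable, List


def run_learning(
    pretrain_on_source: Callable[[], None],   # step S3: minimize Loss1
    forward_step: Callable[[], None],         # steps S4 to S8: maximize Loss2
    hostile_step: Callable[[], None],         # steps S10 to S14: minimize Loss2
    converged: Callable[[int], bool],         # step S16
    forward_iters: int = 3,                   # assumed end condition for S9
    hostile_iters: int = 3,                   # assumed end condition for S15
    max_rounds: int = 100,
) -> int:
    """Alternate forward and hostile learning; return the number of rounds run."""
    pretrain_on_source()
    for round_idx in range(1, max_rounds + 1):
        for _ in range(forward_iters):    # encoder fixed, classifiers updated
            forward_step()
        for _ in range(hostile_iters):    # classifiers fixed, encoder updated
            hostile_step()
        if converged(round_idx):          # YES in S16 leads to S17
            return round_idx
    return max_rounds


# Usage with stub callbacks that only record the order of calls.
log: List[str] = []
rounds = run_learning(
    lambda: log.append("S3"),
    lambda: log.append("F"),
    lambda: log.append("H"),
    lambda r: r >= 2,
    forward_iters=2,
    hostile_iters=1,
)
```

With these stubs, the recorded order is S3, then two forward steps and one hostile step per round, for two rounds.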

<E. Adjusting learning priorities>
Next, the details of the learning method according to the present embodiment will be described.

(E1: Basic idea)
In the above-mentioned forward learning (update of the model parameters of the classifier 72 and the classifier 74), the model parameters are updated for the purpose of maximizing Loss2. Therefore, it is preferable to give a higher learning priority to input data having a smaller Loss2 value. That is, for forward learning, a higher learning priority is determined for data whose calculated Loss2 (classification mismatch) is smaller.

Loss2 is the error between the estimation result by the classifier 72 and the estimation result by the classifier 74, and input data having a small Loss2 value means that there is little difference between its distances to the class identification surfaces defined by the classifier 72 and the classifier 74.

FIG. 12 is a conceptual diagram for explaining input data in which the value of Loss 2 is small based on FIG. With reference to FIG. 12, data having a small Loss2 value exists, for example, in the region 56, where the difference between the distance to the first class identification surface 42 and the distance to the second class identification surface 44 is small. The region 56 is located near the boundary between the first target data group 501 and the second target data group 502, and is a region in which it is relatively difficult to identify which class the data belongs to. By preferentially using the data in such a region 56 for learning, the first class identification surface 42 (classifier 72) and the second class identification surface 44 (classifier 74) can be learned efficiently.

In the above-mentioned hostile learning (update of the model parameters of the encoder 70), the model parameters are updated for the purpose of minimizing Loss2. Therefore, it is preferable to give a higher learning priority to input data having a larger Loss2 value. That is, for hostile learning, a higher learning priority is determined for data whose calculated Loss2 (classification mismatch) is larger.

Loss2 is the error between the estimation result by the classifier 72 and the estimation result by the classifier 74, and input data having a large Loss2 value means that there is a large difference between its distances to the class identification surfaces defined by the classifier 72 and the classifier 74.

FIG. 13 is a conceptual diagram for explaining input data having a large value of Loss2 based on FIG. 7. With reference to FIG. 13, data having a large Loss2 value exists, for example, in the region 57 and the region 58, where the difference between the distance to the first class identification surface 42 and the distance to the second class identification surface 44 is large. The region 57 is located in the vicinity of the first class identification surface 42, and is a region in which it is relatively difficult to discriminate whether or not the data belongs to the first class. Similarly, the region 58 is located in the vicinity of the second class identification surface 44, and is a region in which it is relatively difficult to discriminate whether or not the data belongs to the second class. By preferentially using the data in the regions 57 and 58 for learning, the region onto which the first target data group 501 and the second target data group 502 are projected (encoder 70) can be learned efficiently.

"Adjusting the learning priority" or "preferentially using for learning" in the present embodiment includes not only changing the magnitude of the weight assigned to each input data, but also assigning no weight at all, that is, not using the calculated error for learning. Some implementation examples of the method of "adjusting the learning weight" are described below.

(E2: Weighting coefficient depending on the size of Loss2)
As an example of the method of adjusting the learning weight, the weighting coefficient by which the error back-propagated through the learning network 10 is multiplied may be determined depending on the magnitude of Loss2. That is, as the learning priority, a weighting coefficient by which the error back-propagated through the estimation model 60 is multiplied may be determined according to the magnitude of Loss2 (classification mismatch).

For example, in forward learning (update of the model parameters of the classifier 72 and the classifier 74), it is preferable to give a higher learning priority to input data having a smaller Loss2 value. Therefore, the weighting coefficient by which the error used for updating the model parameters is multiplied may be made inversely proportional to Loss2. That is, it may be determined as weighting coefficient ∝ 1 / Loss2.

However, the weight coefficient may be determined to be larger as the value of Loss2 is smaller by any method, not limited to the case where it is inversely proportional to Loss2.

On the other hand, in hostile learning (update of the model parameter of the encoder 70), it is preferable to set so that the input data having a larger Loss 2 value has a higher learning priority. Therefore, the weighting coefficient to be multiplied by the error used for updating the model parameters may be determined in proportion to Loss2. That is, it may be determined as the weighting coefficient ∝Loss2.

However, the weight coefficient may be determined to be larger as the value of Loss2 is larger by any method, not limited to the case where it is proportional to Loss2.

As described above, as a method of adjusting the learning weight, the weighting coefficient by which the error back-propagated through the learning network 10 is multiplied may be determined depending on the magnitude of Loss2.
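The two weighting rules of this section (weighting coefficient ∝ 1 / Loss2 for forward learning and weighting coefficient ∝ Loss2 for hostile learning) can be sketched as below. Normalizing the weights so that they sum to 1 and adding a small constant EPS to avoid division by zero are assumptions of this sketch, as are the function names.

```python
from typing import List, Sequence

EPS = 1e-8  # hypothetical constant guarding against division by zero


def forward_weights(loss2: Sequence[float]) -> List[float]:
    """Weights inversely proportional to Loss2 (smaller Loss2, larger weight)."""
    raw = [1.0 / (v + EPS) for v in loss2]
    total = sum(raw)
    return [r / total for r in raw]


def hostile_weights(loss2: Sequence[float]) -> List[float]:
    """Weights proportional to Loss2 (larger Loss2, larger weight)."""
    total = sum(loss2)
    if total <= 0.0:
        # All discrepancies are zero: fall back to uniform weights.
        return [1.0 / len(loss2)] * len(loss2)
    return [v / total for v in loss2]


def weighted_error(errors: Sequence[float], weights: Sequence[float]) -> float:
    """Scalar error to back-propagate: per-sample errors times their weights."""
    return sum(e * w for e, w in zip(errors, weights))


loss2_values = [0.1, 0.2, 0.4]
fw = forward_weights(loss2_values)
hw = hostile_weights(loss2_values)
```

Any other monotone mapping (decreasing for forward learning, increasing for hostile learning) would fit the description equally well.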

(E3: Learning enabled / disabled)
As an extension of the above method of determining, depending on the magnitude of Loss2, the weighting coefficient by which the error back-propagated through the learning network 10 is multiplied, whether or not to use the target error for learning may be decided according to the magnitude or rank of the priority. That is, only the data whose calculated Loss2 (classification mismatch) satisfies a predetermined condition may be determined as the data to be used for updating the model parameters.

For example, in forward learning (update of the model parameters of the classifier 72 and the classifier 74), it is preferable to give a higher learning priority to input data having a smaller Loss2 value. Therefore, the model parameters may be updated with the corresponding error only when the calculated Loss2 value is smaller than a predetermined threshold value. Conversely, if the calculated Loss2 value is greater than or equal to the predetermined threshold value, the corresponding error may not be used for learning.

On the other hand, in hostile learning (update of the model parameter of the encoder 70), it is preferable to set so that the input data having a larger Loss 2 value has a higher learning priority. Therefore, the model parameters may be updated with the corresponding error only when the calculated Loss 2 value is equal to or greater than a predetermined threshold value. Conversely, if the calculated Loss 2 value is less than a predetermined threshold, the corresponding error may not be used for learning.

In this way, the corresponding error may be used for learning the model parameters only when the magnitude of the calculated Loss 2 value meets the predetermined conditions.
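The threshold rules above amount to a simple filter over the per-sample Loss2 values, as in the following sketch. The function names and the example threshold are illustrative assumptions.

```python
from typing import List, Sequence


def select_for_forward(loss2: Sequence[float], threshold: float) -> List[int]:
    """Indices of samples used in forward learning: Loss2 below the threshold."""
    return [i for i, v in enumerate(loss2) if v < threshold]


def select_for_hostile(loss2: Sequence[float], threshold: float) -> List[int]:
    """Indices of samples used in hostile learning: Loss2 at or above the threshold."""
    return [i for i, v in enumerate(loss2) if v >= threshold]


loss2_values = [0.05, 0.30, 0.12, 0.50]
forward_idx = select_for_forward(loss2_values, threshold=0.2)
hostile_idx = select_for_hostile(loss2_values, threshold=0.2)
```

Errors of samples that are not selected are simply not back-propagated, which corresponds to a weight of zero.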

(E4: Ranking)
Instead of evaluating the magnitude of the calculated Loss2 value directly as described above, the values may be evaluated as a distribution to determine which input data is to be prioritized.

For example, the set of Loss2 values calculated from the estimation results obtained by inputting a predetermined number of input data (or mini-batches) into the learning network 10 may be ranked from the largest or the smallest value, and only the errors corresponding to the Loss2 values within a predetermined ratio (for example, several to several tens of percent) from the top of the ranking may be used for learning. By determining the errors used for learning by ranking, the errors (that is, the input data) to be used for learning can be appropriately determined according to the distribution of the calculated Loss2.

In this way, after ranking the Loss2 (classification mismatch) calculated for each of the plurality of data, only the data within a predetermined range may be selected as the data to be used for updating the model parameters.
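A ranking-based selection of this kind might look like the following sketch. The function name, the rounding rule (ceiling, with at least one sample kept), and the example values are assumptions.

```python
import math
from typing import List, Sequence


def select_top_fraction(
    loss2: Sequence[float], fraction: float, largest_first: bool
) -> List[int]:
    """Rank samples by Loss2 and keep the indices in the top `fraction`."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Small tolerance so that e.g. 10 * 0.2 is never rounded up to 3.
    k = max(1, math.ceil(len(loss2) * fraction - 1e-9))
    order = sorted(range(len(loss2)), key=lambda i: loss2[i], reverse=largest_first)
    return order[:k]


loss2_values = [0.05, 0.30, 0.12, 0.50, 0.20, 0.01, 0.44, 0.09, 0.27, 0.33]
top_large = select_top_fraction(loss2_values, 0.2, largest_first=True)
top_small = select_top_fraction(loss2_values, 0.2, largest_first=False)
```

Ranking from the largest value suits hostile learning, and ranking from the smallest value suits forward learning, in line with the priorities described in E1.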

(E5: Curriculum)
An arbitrary curriculum may be determined by combining one or more of the methods described above. For example, based on a set of Loss2 (for example, 100 epochs) calculated from the estimation results obtained by inputting a predetermined number of input data (or mini-batches) into the learning network 10, the model parameters may be updated using the errors of the top 5% of the set in the first training, and using the errors of the top 10% of the set in the second training. The targets and the order may be scheduled in advance in this way. By predetermining such a curriculum, model parameters can be learned efficiently.
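The 5% then 10% schedule in this example can be written down as a small, predetermined plan. In the sketch below, the names CURRICULUM, top_fraction, and plan_stages are hypothetical, and ranking from the largest Loss2 value is an assumed default.

```python
import math
from typing import List, Sequence

# Fraction of the ranked errors used in the first and second training (from the text).
CURRICULUM = [0.05, 0.10]


def top_fraction(values: Sequence[float], fraction: float, largest_first: bool = True) -> List[int]:
    """Indices of the top `fraction` of values after ranking."""
    k = max(1, math.ceil(len(values) * fraction - 1e-9))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=largest_first)
    return order[:k]


def plan_stages(
    loss2: Sequence[float],
    curriculum: Sequence[float] = CURRICULUM,
    largest_first: bool = True,
) -> List[List[int]]:
    """For each stage of the curriculum, list the sample indices whose errors are used."""
    return [top_fraction(loss2, f, largest_first) for f in curriculum]


loss2_values = [i / 100 for i in range(100)]
stages = plan_stages(loss2_values)
```

Because the stages are computed from one ranking, the samples of an earlier stage are always contained in the later, larger stage.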

(E6: Other)
The implementation of the method of "adjusting the learning weight" is not limited to the above-mentioned form, and any form may be adopted.

<F. Optimization of estimation model by source dataset>
In the process of updating the model parameters of the encoder 70, the classifier 72, and the classifier 74 in step S3 of the learning method shown in FIG. 11, the estimation result from one of the two classifiers 72 and 74 may be used to train the other for the purpose of minimizing Loss1 (discrimination error by the classifier 72 and the classifier 74).

FIG. 14 is a diagram illustrating an example of model parameter update processing in step S3 of the learning method shown in FIG. With reference to FIG. 14, the data contained in the source data set 30 is input to the encoder 70, the discrimination error for the estimation result output from the classifier 72 is calculated, and the model parameters of the other classifier 74 may be updated by back-propagating the error calculated from that discrimination error to the classifier 74.

Similarly, the data contained in the source data set 30 is input to the encoder 70, the discrimination error for the estimation result output from the classifier 74 is calculated, and the model parameters of the other classifier 72 may be updated by back-propagating the error calculated from that discrimination error to the classifier 72.

That is, in the learning procedure shown in FIG. 14, data is input to the estimation model, and the model parameters of one of the classifier 72 and the classifier 74 are updated based on the error output from the other. By updating the model parameters of the classifier 72 and the classifier 74 based on common error information in this way, the discrimination error by the classifiers (classifier 72 and classifier 74) can be minimized even in a noisy situation. For details of the learning method shown in FIG. 14, refer to Non-Patent Document 6.

<G. Modification example>
The learning network 10 and the learning method are not limited to the above-described embodiments, and various modifications as shown below are possible.

(G1: Learning network)
In the learning network 10 shown in FIG. 9, data from the source data set 30 and the target data set 50 is input to the common encoder 70, but an encoder for the source data set 30 and an encoder for the target data set 50 may be arranged separately.

In the learning network 10 shown in FIG. 9, a configuration using two classifiers is illustrated, but the configuration is not limited to this, and three or more classifiers may be used. Further, a random selection element such as DropOut may be introduced. Introducing DropOut has the same effect as arranging a virtually innumerable number of classifiers.

(G2: Learning method)
In forward learning and hostile learning, data selected from both the source data set 30 and the target data set 50 may be used, or only data selected from one of the data sets may be used. That is, in forward learning only, in hostile learning only, or in both forward learning and hostile learning, only one of the source data set 30 and the target data set 50 may be used. At this time, the data set used in the forward learning and the data set used in the hostile learning may be different.

<H. Operation of estimation model>
Next, a configuration example of the estimation model 60 generated by the above-mentioned learning method during operation (estimation phase) will be described.

FIG. 15 is a schematic diagram showing an implementation example of the estimation model 60 according to the present embodiment during operation. The configuration shown in FIG. 15 is typically realized by the processor 102 executing the recognition application 118.

With reference to FIG. 15, when the estimation target data 62 (data x_t) is input to the estimation model 60, the classifier 72 outputs the probability p_1(y | x_t) of being the first class, and the classifier 74 outputs the probability p_2(y | x_t) of being the second class.

The probabilities output from the classifier 72 and the classifier 74 are input to the estimation result output unit 84. When the probabilities from the respective classifiers indicate results that are consistent with each other, the estimation result output unit 84 outputs the matched result as the estimation result 64. That is, the estimation result output unit 84 outputs the estimation result according to whether or not the probability p_1(y | x_t) and the probability p_2(y | x_t), which are output when the estimation target data 62 is input to the estimation model 60, are consistent with each other.

The probabilities from the classifiers are consistent with each other when, for the same data x_t, the probability of being the first class is high and the probability of being the second class is low, or the probability of being the first class is low and the probability of being the second class is high.

On the other hand, when, for the same data x_t, the probability of being the first class and the probability of being the second class are both high or both low, the probabilities are not consistent with each other.

When the probabilities from the respective classifiers are consistent with each other, the estimation result output unit 84 outputs the class corresponding to the matched result as the estimation result 64. On the other hand, the estimation result output unit 84 may output an estimation result indicating that the input data x_t belongs to an unknown class when the probabilities from the respective classifiers are not consistent with each other.
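The consistency check performed by the estimation result output unit 84 can be sketched as a small decision function. The text only speaks of "high" and "low" probabilities, so the threshold of 0.5 used here, like the function name and the returned labels, is an assumption made for illustration.

```python
def decide(p1: float, p2: float, threshold: float = 0.5) -> str:
    """Map (p_1(y | x_t), p_2(y | x_t)) to an estimation result.

    The two probabilities are treated as consistent when exactly one of them
    is high; both high or both low yields the unknown class.
    """
    high1 = p1 >= threshold
    high2 = p2 >= threshold
    if high1 and not high2:
        return "class 1"
    if high2 and not high1:
        return "class 2"
    return "unknown"


results = [decide(0.9, 0.1), decide(0.2, 0.8), decide(0.9, 0.8), decide(0.1, 0.2)]
```

The last two cases illustrate how data from a class that neither classifier was trained on can be reported as unknown instead of being forced into the first or second class.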

Further, a reliability calculation unit 86 for calculating the reliability of the estimation result may be arranged. More specifically, the reliability calculation unit 86 may calculate the reliability from the magnitude of the classification mismatch (corresponding to Loss2) calculated based on the probabilities from each classifier.

By calculating such reliability, it can be easily determined whether or not the estimation result of the estimation model 60 can be used as it is.

<I. Experimental example of performance evaluation>
Next, an experimental example of performance evaluation of the estimation model generated by the learning method according to the present embodiment will be described. In this experimental example, unsupervised domain adaptation for number recognition tasks was performed.

The SVHN (Street View House Numbers) dataset was used as the source domain. The source data set 30 was 250 samples (250 samples × 5 classes) arbitrarily selected for each of 5 classes (0, 1, 2, 3, 4, 5) from the SVHN data set.

The MNIST (Modified National Institute of Standards and Technology database) dataset was used as the target domain. More specifically, [200, 200, 500, 500, 1000, 1000, 2000, 2000, 5000, 5000] samples for the 10 classes (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), respectively, were used as the target data set 50.

That is, there is an imbalance in the number of data between the source domain and the target domain (the source domain has 1000 samples, while the target domain has 17400 samples), and there is also an imbalance among the classes included in the target domain (classes with only 200 samples and classes with 5000 samples are mixed). Furthermore, the target domain contains classes (unknown classes) that are not included in the source domain.

Furthermore, noise indicated by Pxx and Sxx is intentionally added to the label (class) given to the source domain.

P20: The labels of 20% of all samples are randomly changed to another label
P45: The labels of 45% of all samples are randomly changed to another label
S20: The labels of 20% of all samples are randomly replaced with those of other samples
S45: The labels of 45% of all samples are randomly replaced with those of other samples

In addition, the following five methods were used for performance comparison.

-DANN (Domain-Adversarial Neural Network) (see Non-Patent Document 7)
-ADDA (Adversarial Discriminative Domain Adaptation) (see Non-Patent Document 8)
-MCD (Maximum Classifier Discrepancy) (see Non-Patent Document 4)
-TCL (Transferable Curriculum for Weakly-Supervised Domain Adaptation) (see Non-Patent Document 9)
-OSBP (Open Set Domain Adaptation by Backpropagation) (see Non-Patent Document 10)
Further, as a reference for comparison, the performance when only the source data set 30 is used is also shown (Source Only).
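For reference, the label noise settings Pxx and Sxx described above could be generated along the following lines. This is one possible reading of the two settings (Pxx replaces a label with a different, randomly chosen class; Sxx shuffles labels among the selected samples); the function names, the seeding, and the rounding of the sample count are assumptions.

```python
import random
from typing import List, Sequence


def noise_p(labels: Sequence[int], classes: Sequence[int], ratio: float, seed: int = 0) -> List[int]:
    """Pxx: change the labels of `ratio` of the samples to another class at random."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_noisy = round(len(labels) * ratio)
    for i in rng.sample(range(len(labels)), n_noisy):
        # Pick any class other than the current one (needs at least two classes).
        noisy[i] = rng.choice([c for c in classes if c != noisy[i]])
    return noisy


def noise_s(labels: Sequence[int], ratio: float, seed: int = 0) -> List[int]:
    """Sxx: shuffle the labels among a randomly chosen `ratio` of the samples."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_noisy = round(len(labels) * ratio)
    idx = rng.sample(range(len(labels)), n_noisy)
    shuffled = idx[:]
    rng.shuffle(shuffled)
    for src, dst in zip(idx, shuffled):
        noisy[dst] = labels[src]
    return noisy


clean = [i % 5 for i in range(100)]
p20 = noise_p(clean, classes=range(5), ratio=0.20)
s20 = noise_s(clean, ratio=0.20)
```

Under this reading, Pxx changes the class distribution of the labels, while Sxx keeps the overall label counts unchanged.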

Each value shown in the table below indicates the correct answer rate by the estimation model according to each method.

In this way, it can be seen that the learning method according to the present embodiment and the estimation model generated by the learning method achieve higher estimation performance than the related techniques even in a situation where problems such as the OpenSet problem, the noise problem, and the data imbalance problem occur.

<J. Implementation example in application>
Next, a configuration example in the case of implementing the learning method according to the present embodiment in the application will be described.

FIG. 16 is a schematic diagram showing an implementation example of the learning method according to the present embodiment in the application. FIG. 16 shows an example of mounting on the above-mentioned image processing device 100 (FIG. 4).

FIG. 16A shows a configuration example in which the image processing device 100 executes the data collection process 150 for collecting the source data set 30 and the target data set 50, the machine learning 40 for generating the estimation model 60, and the estimation process using the estimation model 60.

FIG. 16B shows a configuration example in which the image processing device 100 and an external device 250 such as a server are linked. In this configuration example, the image processing device 100 executes the data collection process 150 for collecting the source data set 30 and the target data set 50 and the estimation process using the estimation model 60, and the external device 250 executes the machine learning 40 for generating the estimation model 60.

FIG. 16C also shows a configuration example in which the image processing device 100 and an external device 250 such as a server are linked. In this configuration example, the external device 250 executes the data collection process 150 for collecting the source data set 30 and the target data set 50 and the machine learning 40 for generating the estimation model 60, and the image processing device 100 executes the estimation process using the estimation model 60.

Note that FIG. 16 shows some typical implementation examples, and the technical scope of the present invention is not limited to these implementation examples. Any mounting form can be adopted according to the required requirements, specifications, and purposes.

<K. Application example>
In the above description, an example of absorbing the bias difference between images generated by annotating images collected by an arbitrary method and images actually captured by a camera (live-action images) was described as an application example. However, the estimation model according to the present embodiment can be applied to any application, not limited to this implementation example. That is, "environment" or "domain" can be interpreted as broadly as possible.

The method according to the present embodiment can be applied to arbitrary information observed by an arbitrary sensing device, even if the observation conditions and the observation environment differ. For example, in the technical field of FA (Factory Automation), applying the learning method according to the present embodiment makes it possible to compensate for environmental differences between the factories or equipment to which it is applied.

Furthermore, the method can be applied not only to physical information observed by a sensing device, but also to artificial information such as sales results on an EC (electronic commerce) site. For example, an application that estimates the sales results on another EC site based on the sales results on one EC site is conceivable.

Furthermore, in a device for determining whether or not a person has a lifestyle-related disease, the difference in lifestyle due to age difference, gender difference, regional difference, etc. may be compensated for. In addition, various biases caused by individual differences may be compensated for.

As described above, the learning method according to the present embodiment and the estimation model generated by the learning method can be applied to various kinds of observable information existing in the real world.

<L. Addendum>
The present embodiment as described above includes the following technical ideas.
[Structure 1]
An estimation system (1) comprising:
a learning unit (40; 114) that generates an estimation model (60) by machine learning (40) using a first data set (30) consisting of a plurality of data to which a class is assigned and a second data set (50) consisting of a plurality of data to which no class is assigned; and
an estimation unit (118) that inputs estimation target data (62) that can belong to the second data set into the estimation model and determines an estimation result (64),
wherein the estimation model includes:
an encoder (70) that calculates a feature quantity from input data;
a first classifier (72) that outputs, based on the feature quantity, a first probability that the input data belongs to a first class; and
a second classifier (74) that outputs, based on the feature quantity, a second probability that the input data belongs to a second class, and
the learning unit includes:
a calculation means (1143) that calculates a discrimination mismatch based on the first probability and the second probability output when data contained in either the first data set or the second data set is input to the estimation model; and
a determination means (1145) that determines a learning priority for the input data according to the magnitude of the calculated discrimination mismatch.
[Structure 2]
The estimation system according to Structure 1, wherein the learning unit further includes a first parameter updating means (S4 to S8) that updates the model parameters of the first classifier and the second classifier for the purpose of maximizing the discrimination mismatch while the model parameters of the encoder are fixed, and
the determination means determines, for the first parameter updating means, a higher learning priority for data having a smaller calculated discrimination mismatch.
[Structure 3]
The estimation system according to Structure 1 or 2, wherein the learning unit further includes a second parameter updating means (S10 to S14) that updates the model parameters of the encoder for the purpose of maximizing the discrimination mismatch while the model parameters of the first classifier and the second classifier are fixed, and
the determination means determines, for the second parameter updating means, a higher learning priority for data having a larger calculated discrimination mismatch.
[Structure 4]
The estimation system according to any one of Structures 1 to 3, wherein the determination means determines, as the learning priority, a weighting coefficient by which the error to be back-propagated through the estimation model is multiplied, according to the magnitude of the discrimination mismatch.
[Structure 5]
The estimation system according to any one of Structures 1 to 4, wherein the determination means determines only data for which the magnitude of the calculated discrimination mismatch satisfies a predetermined condition as the data to be used for updating the model parameters.
[Structure 6]
The estimation system according to any one of Structures 1 to 5, wherein the determination means ranks the discrimination mismatches calculated for each of a plurality of data, and then selects only the data within a predetermined range as the data to be used for updating the model parameters.
[Structure 7]
The estimation system according to any one of Structures 1 to 6, wherein the learning unit further includes a third parameter updating means (S4) that updates the model parameters of the encoder, the model parameters of the first classifier, and the model parameters of the second classifier based on data contained in the first data set, and
the third parameter updating means updates, based on an error output from one of the first classifier and the second classifier when data is input to the estimation model, the model parameters of the other of the first classifier and the second classifier.
[Structure 8]
The estimation system according to any one of Structures 1 to 7, wherein the estimation unit includes an estimation result output unit (84) that outputs an estimation result according to whether or not the first probability and the second probability, which are output when the estimation target data is input to the estimation model, match each other.
[Structure 9]
An estimation device comprising:
a storage unit (110) that holds an estimation model generated by machine learning (40) using a first data set (30) consisting of a plurality of data to which a class is assigned and a second data set (50) consisting of a plurality of data to which no class is assigned; and
an estimation unit (118) that inputs estimation target data (62) that can belong to the second data set into the estimation model and determines an estimation result (64),
wherein the estimation model includes:
an encoder (70) that calculates a feature quantity from input data;
a first classifier (72) that outputs, based on the feature quantity, a first probability that the input data belongs to a first class; and
a second classifier (74) that outputs, based on the feature quantity, a second probability that the input data belongs to a second class, and
the estimation model has been trained based on a learning priority determined for input data according to the magnitude of a discrimination mismatch, the discrimination mismatch being calculated based on the first probability and the second probability output when data contained in either the first data set or the second data set is input to the estimation model.
[Structure 10]
An estimation method comprising:
a learning step of generating an estimation model by machine learning (40) using a first data set (30) consisting of a plurality of data to which a class is assigned and a second data set (50) consisting of a plurality of data to which no class is assigned; and
an estimation step (118) of inputting estimation target data (62) that can belong to the second data set into the estimation model and determining an estimation result (64),
wherein the estimation model includes:
an encoder (70) that calculates a feature quantity from input data;
a first classifier (72) that outputs, based on the feature quantity, a first probability that the input data belongs to a first class; and
a second classifier (74) that outputs, based on the feature quantity, a second probability that the input data belongs to a second class, and
the learning step includes:
a step (S6, S12) of calculating a discrimination mismatch based on the first probability and the second probability output when data contained in either the first data set or the second data set is input to the estimation model; and
a step (S7, S13) of determining a learning priority for the input data according to the magnitude of the calculated discrimination mismatch.
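
As a rough illustration of the calculation step shared by Structures 1, 9, and 10, the following Python sketch computes a per-sample mismatch from the outputs of the two classifiers. The formula is not fixed by the text at this point; the absolute difference between the first and second probabilities, the function name `discrimination_mismatch`, and the sample values are assumptions made only for illustration.

```python
import numpy as np


def discrimination_mismatch(p_first, p_second):
    """Per-sample mismatch between the two classifier outputs.

    Assumption: the mismatch is taken as the absolute difference of the
    two probabilities. The text only states that the mismatch is
    calculated based on both probabilities.
    """
    p_first = np.asarray(p_first, dtype=float)
    p_second = np.asarray(p_second, dtype=float)
    if p_first.shape != p_second.shape:
        raise ValueError("both classifiers must score the same batch")
    return np.abs(p_first - p_second)


# Hypothetical outputs of the first and second classifiers for four inputs.
p1 = [0.9, 0.5, 0.2, 0.7]
p2 = [0.1, 0.5, 0.3, 0.2]
mismatch = discrimination_mismatch(p1, p2)
```

Under this assumption, a value near zero means the two classifiers agree on the input, and a large value marks an input on which they disagree.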

<M. Effect>
According to the learning method according to the present embodiment, a learning priority for the input data is determined according to the magnitude of the discrimination mismatch, and the model parameters are updated according to the determined priority.
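
One way to picture such a priority, in line with the weighting coefficient of Structure 4, is a per-sample weight applied to the error before back-propagation. The sketch below is only an illustration under assumptions: the linear min-max scaling, the `prefer` switch, and the function names `priority_weights` and `weighted_error` are hypothetical and are not taken from the text.

```python
import numpy as np


def priority_weights(mismatch, prefer="small"):
    """Map per-sample mismatch values to weights in [0, 1].

    prefer="small": a smaller mismatch gets a larger weight (cf. Structure 2).
    prefer="large": a larger mismatch gets a larger weight (cf. Structure 3).
    Assumption: linear min-max scaling; the text does not fix the mapping.
    """
    if prefer not in ("small", "large"):
        raise ValueError("prefer must be 'small' or 'large'")
    m = np.asarray(mismatch, dtype=float)
    span = m.max() - m.min()
    if span == 0.0:
        return np.ones_like(m)  # all samples are prioritized equally
    scaled = (m - m.min()) / span
    return 1.0 - scaled if prefer == "small" else scaled


def weighted_error(errors, weights):
    """Weighted mean of per-sample errors, i.e. the value to back-propagate."""
    errors = np.asarray(errors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * errors) / max(np.sum(weights), 1e-12))


# Hypothetical mismatch values and per-sample errors for three inputs.
mismatch = [0.0, 0.5, 1.0]
w_small = priority_weights(mismatch, prefer="small")
w_large = priority_weights(mismatch, prefer="large")
loss = weighted_error([1.0, 2.0, 3.0], w_small)
```

A weight of zero removes a sample from the update entirely, so this weighting also covers hard selection as a limiting case.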

According to the learning method according to the present embodiment, a classifier is provided for each class, and the probabilities from the respective classifiers are evaluated. Therefore, even when the target domain contains data belonging to a class other than the classes assigned to the data in the source domain (an unknown class), that is, in the case of the OpenSet problem, the possibility of estimating the wrong class can be reduced.

According to the learning method according to the present embodiment, in updating the model parameters in the forward learning and the adversarial learning, the data having a smaller discrimination mismatch or the data having a larger discrimination mismatch is preferentially used, so that data containing noise (the noise problem) can be relatively excluded. This can prevent a decrease in estimation accuracy due to errors or deterioration arising from various causes.
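
The preferential use described above can also be realized by selection rather than weighting, as in Structure 6. The following is a minimal sketch assuming that a fixed share of the ranked data is kept for the update; the parameter `keep_ratio`, the `prefer` switch, and the function name `select_by_rank` are hypothetical.

```python
import numpy as np


def select_by_rank(mismatch, keep_ratio=0.5, prefer="small"):
    """Return the indices of the samples kept for the parameter update.

    Samples are ranked by mismatch, and the first `keep_ratio` share of
    the ranking is kept, counted from the smallest mismatch
    (prefer="small") or from the largest mismatch (prefer="large").
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if prefer not in ("small", "large"):
        raise ValueError("prefer must be 'small' or 'large'")
    m = np.asarray(mismatch, dtype=float)
    order = np.argsort(m, kind="stable")  # ascending mismatch
    if prefer == "large":
        order = order[::-1]
    n_keep = max(1, int(round(keep_ratio * m.size)))
    return np.sort(order[:n_keep])


# Hypothetical mismatch values for four inputs.
mismatch = [0.8, 0.0, 0.1, 0.5]
kept_small = select_by_rank(mismatch, keep_ratio=0.5, prefer="small")
kept_large = select_by_rank(mismatch, keep_ratio=0.5, prefer="large")
```

With these values, the small-mismatch selection keeps the inputs at indices 1 and 2, and the large-mismatch selection keeps those at indices 0 and 3.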

According to the learning method according to the present embodiment, in updating the model parameters in the forward learning and the adversarial learning, the data having a smaller discrimination mismatch or the data having a larger discrimination mismatch is preferentially used, so that even if the data are imbalanced, the effect on the learning process is small. That is, it is possible to prevent a decrease in estimation accuracy due to data imbalance.

In this way, by using the learning method according to the present embodiment and the estimation model generated by the learning method, unsupervised domain adaptation can be more reliably realized in a practical environment (in the wild).

The embodiments disclosed herein should be considered exemplary in all respects and not restrictive. The scope of the present invention is indicated by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

1 image processing system, 2 robot, 4 axis, 8 workpiece, 10 learning network, 12 servo driver, 14 motor, 20 camera, 30 source data set, 32, 34, 52, 54 data, 40 machine learning, 42 first class identification surface, 44 second class identification surface, 50 target data set, 53, 55 mismatch area, 56, 57, 58 area, 60 estimation model, 62 estimation target data, 64 estimation result, 70 encoder, 72, 74 classifier, 84 estimation result output unit, 86 reliability calculation unit, 100 image processing device, 102 processor, 104 main memory, 110 storage, 114 machine learning program, 116 model parameters, 118 recognition application, 122 communication interface, 124 input unit, 126 output unit, 128 camera interface, 130 motor interface, 150 data collection processing, 200 information processing device, 250 external device, 301 first source data group, 302 second source data group, 501 first target data group, 502 second target data group, 1141 input data selection unit, 1142, 1143 calculation unit, 1144 error buffer, 1145 curriculum determination unit, 1146 parameter update unit.

Claims

[Claim 1]
An estimation system comprising:
a learning unit that generates an estimation model by machine learning using a first data set consisting of a plurality of data to which a class is assigned and a second data set consisting of a plurality of data to which no class is assigned; and
an estimation unit that inputs estimation target data that can belong to the second data set into the estimation model and determines an estimation result,
wherein the estimation model includes:
an encoder that calculates a feature quantity from input data;
a first classifier that outputs, based on the feature quantity, a first probability that the input data belongs to a first class; and
a second classifier that outputs, based on the feature quantity, a second probability that the input data belongs to a second class, and
the learning unit includes:
a calculation means that calculates a discrimination mismatch based on the first probability and the second probability output when data contained in either the first data set or the second data set is input to the estimation model; and
a determination means that determines a learning priority for the input data according to the magnitude of the calculated discrimination mismatch.
[Claim 2]
The estimation system according to claim 1, wherein the learning unit further includes a first parameter updating means that updates the model parameters of the first classifier and the second classifier for the purpose of maximizing the discrimination mismatch while the model parameters of the encoder are fixed, and
the determination means determines, for the first parameter updating means, a higher learning priority for data having a smaller calculated discrimination mismatch.
[Claim 3]
The estimation system according to claim 1 or 2, wherein the learning unit further includes a second parameter updating means that updates the model parameters of the encoder for the purpose of maximizing the discrimination mismatch while the model parameters of the first classifier and the second classifier are fixed, and
the determination means determines, for the second parameter updating means, a higher learning priority for data having a larger calculated discrimination mismatch.
[Claim 4]
The estimation system according to any one of claims 1 to 3, wherein the determination means determines, as the learning priority, a weighting coefficient by which the error to be back-propagated through the estimation model is multiplied, according to the magnitude of the discrimination mismatch.
[Claim 5]
The estimation system according to any one of claims 1 to 4, wherein the determination means determines only data for which the magnitude of the calculated discrimination mismatch satisfies a predetermined condition as the data to be used for updating the model parameters.
[Claim 6]
The estimation system according to any one of claims 1 to 5, wherein the determination means ranks the discrimination mismatches calculated for each of a plurality of data, and then selects only the data within a predetermined range as the data to be used for updating the model parameters.
[Claim 7]
The estimation system according to any one of claims 1 to 6, wherein the learning unit further includes a third parameter updating means that updates the model parameters of the encoder, the model parameters of the first classifier, and the model parameters of the second classifier based on data contained in the first data set, and
the third parameter updating means updates, based on an error output from one of the first classifier and the second classifier when data is input to the estimation model, the model parameters of the other of the first classifier and the second classifier.
[Claim 8]
The estimation system according to any one of claims 1 to 7, wherein the estimation unit includes an estimation result output unit that outputs an estimation result according to whether or not the first probability and the second probability, which are output when the estimation target data is input to the estimation model, match each other.
[Claim 9]
An estimation device comprising:
a storage unit that holds an estimation model generated by machine learning using a first data set consisting of a plurality of data to which a class is assigned and a second data set consisting of a plurality of data to which no class is assigned; and
an estimation unit that inputs estimation target data that can belong to the second data set into the estimation model and determines an estimation result,
wherein the estimation model includes:
an encoder that calculates a feature quantity from input data;
a first classifier that outputs, based on the feature quantity, a first probability that the input data belongs to a first class; and
a second classifier that outputs, based on the feature quantity, a second probability that the input data belongs to a second class, and
the estimation model has been trained based on a learning priority determined for input data according to the magnitude of a discrimination mismatch, the discrimination mismatch being calculated based on the first probability and the second probability output when data contained in either the first data set or the second data set is input to the estimation model.
[Claim 10]
An estimation method comprising:
a learning step of generating an estimation model by machine learning using a first data set consisting of a plurality of data to which a class is assigned and a second data set consisting of a plurality of data to which no class is assigned; and
an estimation step of inputting estimation target data that can belong to the second data set into the estimation model and determining an estimation result,
wherein the estimation model includes:
an encoder that calculates a feature quantity from input data;
a first classifier that outputs, based on the feature quantity, a first probability that the input data belongs to a first class; and
a second classifier that outputs, based on the feature quantity, a second probability that the input data belongs to a second class, and
the learning step includes:
a step of calculating a discrimination mismatch based on the first probability and the second probability output when data contained in either the first data set or the second data set is input to the estimation model; and
a step of determining a learning priority for the input data according to the magnitude of the calculated discrimination mismatch.