WO2021255778A1

WO2021255778A1 - Learning data selection method, learning data selection device, and learning data selection program

Info

Publication number: WO2021255778A1
Application number: PCT/JP2020/023371
Authority: WO
Inventors: 俊介塚谷; 和彦村崎; 慎吾安藤; 潤島村
Original assignee: 日本電信電話株式会社
Priority date: 2020-06-15
Filing date: 2020-06-15
Publication date: 2021-12-23

Abstract

This learning data selection device: calculates an expected range of improvement in AUC score for each candidate item of data to be labeled, using a range of improvement in AUC score for the candidate item of data to be labeled, the probability that the candidate item of data to be labeled belongs to the positive class, and the probability that the candidate item of data to be labeled belongs to the negative class; and selects, as data to be labeled that is used to generate additional learning data for a learning model, the candidate item of data to be labeled for which the expected range of improvement in AUC score is the highest among a set of candidate items of data to be labeled.

Description

Training data selection method, training data selection device, and training data selection program

The present invention relates to a learning data selection method, a learning data selection device, and a learning data selection program.

When supervised learning is applied as a machine learning method, information indicating the correct answer to the data, that is, learning data with a label is used.

In order to further improve the accuracy of the training model generated by machine learning using the training data, it is necessary to retrain using the additional training data generated by adding a label to new data different from the trained data. good.

However, there are limits to the human and time resources for associating data with labels. Therefore, it is often difficult to generate additional training data by associating labels with all new data different from the trained data, that is, all labeling candidate data.

For the above reasons, in order to improve the accuracy of the training model within the limited resources, labels are given to the data with the highest learning effect from among multiple data that are not used for the trained data. , It is preferable to use additional learning data.

Under such circumstances, for example, Non-Patent Document 1 discloses a method of selecting data having as high a learning effect as possible in the index of AUC (Area Under the Curve) score represented by the area under the ROC (Receiving Operating Characteristics). There is.

<Non-Patent Document 1>
Culver, Matt, Deng Kun, and Stephen Scott. "Active learning to maximize area under the ROC curve." Sixth International Conference on Data Mining (ICDM'06). IEEE, 2006.

The selection method of Non-Patent Document 1 is a method of eliminating the class imbalance of learning data, but the improvement range of the AUC score is not explicitly optimized. Therefore, if the selected labeling candidate data is similar to the data contained in the trained data, additional training of the training model is performed using the additional training data generated by adding a label to the selected labeling candidate data. However, the AUC score may not improve.

Therefore, from the label assignment candidate data prepared for the additional learning of the learning model, the label assignment candidate data that improves the learning accuracy of the learning model more efficiently than before the additional learning is performed by causing the learning model to perform additional learning. Disclose a learning data selection method, a learning data selection device, and a learning data selection program that can preferentially select.

The first aspect of the present disclosure is a training data selection method, which is prepared for each feature amount of the trained data included in the trained data set used for training the training model and for additional training of the training model. The step of extracting the feature amount of each of the label assignment candidate data included in the label assignment candidate data set, the feature amount of the trained data belonging to the positive class indicating the event to be estimated by the training model, and the feature amount of the trained data. Using the feature quantity of the trained data belonging to the negative class indicating an event other than the estimation target in the training model, a parameter representing the probability distribution of the feature quantity of the trained data in the positive class and the negative class is estimated. The probability that the label assignment candidate data belongs to the positive class and the probability that the label assignment candidate data belongs to the negative class are determined for each of the label assignment candidate data by using the step to be performed, the feature amount of the label assignment candidate data, and the parameter. The trained data and the trained data and the trained data using the trained model parameters, the trained data, and the labeling candidate data of the function representing the input / output relationship of the training model for estimating the event-likeness to be estimated and the step to be estimated. For each of the label assignment candidate data, the step of calculating the score which is the output of the training model for each of the label assignment candidate data and the score for each of the trained data and the label assignment candidate data are used. The step of calculating the improvement range of the AUC score in each case assuming that the label assignment candidate data belongs to either the positive class or the negative class, and the improvement of the AUC score for each of the label assignment candidate data. Using the width, the probability that the label assignment candidate data belongs to the positive class, and the probability that the label assignment candidate data belongs to the negative class, the expected value of the improvement width of the AUC score is calculated for each of the label assignment candidate data. , The step of selecting the label assignment candidate data having the highest expected value of the improvement range of the AUC score from the label assignment candidate data set as the label assignment target data used for generating the additional training data of the training model. including.

The second aspect of the present disclosure is a training data selection device, which is prepared for each feature amount of the trained data included in the trained data set used for training the training model and for additional training of the training model. The feature amount extraction unit that extracts the feature amount of each of the label assignment candidate data included in the labeled candidate data set, and the feature amount extraction unit that belongs to the positive class indicating the event to be estimated by the learning model. Using the feature amount of the trained data extracted in the above and the feature amount of the trained data extracted by the feature amount extraction unit belonging to the negative class indicating an event other than the estimation target in the learning model, the feature amount is used. The distribution estimation unit that estimates the parameters representing the probability distribution of the feature amount of the learned data in the positive class and the negative class, the feature amount of the label assignment candidate data extracted by the feature amount extraction unit, and the distribution. Using the parameters estimated by the estimation unit, the probability estimation unit that estimates the probability that the label assignment candidate data belongs to the positive class and the probability that the label assignment candidate data belongs to the negative class for each of the label assignment candidate data, and the estimation. Using the trained model parameters of the function representing the input / output relationship of the training model for estimating the likelihood of the target event, the trained data, and the label assignment candidate data, the trained data and the label assignment candidate data can be obtained. The label assignment candidate is used by using the score calculation unit that calculates the score that is the output of the learning model for each, and the scores for each of the trained data and the label assignment candidate data calculated by the score calculation unit. For each data, the AUC improvement width calculation unit that calculates the improvement width of the AUC score in each case assuming that the label assignment candidate data belongs to either the positive class or the negative class, and the AUC improvement The improvement width of the AUC score for each of the label assignment candidate data calculated by the width calculation unit, the probability that the label assignment candidate data estimated by the probability estimation unit belongs to the positive class, and the estimation by the probability estimation unit. Using the probability that the label assignment candidate data belongs to the negative class, the expected value of the improvement range of the AUC score is calculated for each of the label assignment candidate data, and the improvement range of the AUC score in the label assignment candidate data set is calculated. It includes a selection unit that selects the label assignment candidate data having the highest expected value as the label assignment target data used for generating the additional training data of the learning model.

The third aspect of the present disclosure is a learning data selection program, in which a computer functions as each part of a learning data selection device.

According to the learning data selection method, the learning data selection device, and the learning data selection program of the present disclosure, the learning model is additionally trained from the labeling candidate data prepared for the additional learning of the learning model. It has the effect of being able to preferentially select labeling candidate data that efficiently improves the learning accuracy of the learning model compared to before learning.

It is a figure which shows the functional configuration example of the learning data selection apparatus. It is a figure which shows the example of the main part structure of the computer applied to the learning data selection apparatus. It is a flowchart which shows an example of the flow of a learning data selection process.

Hereinafter, an embodiment of the learning data selection device 100 of the present disclosure will be described with reference to the drawings. The same components and the same processing are given the same reference numerals throughout the drawings, and duplicate description will be omitted.

FIG. 1 is a diagram showing a functional configuration example of the learning data selection device 100. As shown in FIG. 1, the learning data selection device 100 includes an input unit 10 and a calculation unit 20.

The input unit 10 includes a trained data set A, which is a set of trained data a used for learning a learning model generated by supervised learning, and labeling candidate data prepared for additional learning of the learning model. Accepts the label assignment candidate data set B, which is a set of b. Further, the input unit 10 receives the trained model parameter θ of the function f representing the input / output relationship of the learning model generated by the learning of the trained data a, that is, the function f for estimating the event-likeness to be estimated. The trained model parameter θ is a parameter that defines the input / output relationship of the generated training model.

There are no restrictions on the estimation target by the learning model, and any event may be the estimation target. Later, an event that the estimated target of the learning model is referred to as a "positive class C ^+", an event other than the estimation target of the learning model "negative class C ^-" that. For example, when the animal represented by the image is estimated by the learning model whether the "cat", event of a "cat" is a positive class C ^+, and the an event that is other than cat negative Class C ^- a.

The calculation unit 20 performs an operation for selecting label assignment candidate data b to be used for generating additional learning data of the learning model from the label assignment candidate data set B using various data received by the input unit 10. ..

As an example, the calculation unit 20 includes a feature amount extraction unit 21, a distribution estimation unit 22, a probability estimation unit 23, a score calculation unit 24, an AUC improvement width calculation unit 25, and a selection unit 26.

The feature amount extraction unit 21 inputs the trained data set A and the label assignment candidate data set B received by the input unit 10, and inputs each trained data a and the label assignment candidate data set B included in the trained data set A. The feature amount of each label addition candidate data b included is extracted. There are no restrictions on the method of extracting the features of the trained data a and the label assignment candidate data b. As an example of the extraction method, a feature extractor using a convolutional neural network (CNN) learned from ImageNet data is used. Instead of extracting the features using CNN, even if the features of the trained data a and the labeling candidate data b are extracted using the heuristically designed features such as the local image features. good. Further, instead of extracting the features using CNN, the features of the trained data a and the label assignment candidate data b are extracted by using the reconstruction error of the Variational AutoEncoder obtained from the trained data set A. You may.

Hereinafter, the feature amount of each trained data a included in the trained data set A is represented by g (a), and the feature amount of each label assignment candidate data b included in the label assignment candidate data set B is g ( It is expressed as b). The feature amount extraction unit 21 may be separated into an extraction unit for extracting the feature amount g (a) of the learned data a and an extraction unit for extracting the feature amount g (b) of the label assignment candidate data b. ..

Distribution estimating unit 22, each of the feature quantity g of the learned data a extracted by the feature amount extracting section 21 (a) as input for the learned data a, the positive class C ⁺ and the negative class C ^- each in parameters of the probability distribution of ω ^+, ω ^- to estimate.

Specifically, the distribution estimation unit 22 uses each feature amount g (a) as a feature amount of the ^{trained data a +} (a ⁺ ∈ A ⁺ ) included in the trained data set A ⁺ ^{belonging to the positive class C +.} It is classified into g (a ⁺ ) and the feature quantity g (a ⁻ ^{) of the trained data a −} (a ⁻ ∈ A ⁻ ) contained in the trained data set A ⁻ belonging to the negative class C ^−. The distribution estimation unit 22 has a positive class C ⁺ probability distribution h ⁺ (g (a ⁺ ); ω ⁺ ) and a negative class C indicated by the classified feature amount g (a ⁺ ) and feature amount g (a ^{−), respectively.} ^- probability distribution h of ^{^{- (g (a -);}} ω -) for each of, for example, the parameters of the probability distribution modeled by applying a normal distribution model omega ^+, omega ^- estimated. Incidentally, the probability distribution ^{^{h + (g (a +)}} ; ω +) and probability distributions ^{^{h - (g (a -)}} ; ω -) of the modeling, other such Bernoulli distribution or a Poisson distribution instead of the normal distribution A probability distribution may be applied.

Further, the distribution estimation unit 22 has an estimation unit that estimates the parameter ω ⁺ of the probability distribution h ⁺ (g (a ⁺ ); ω ⁺ ^{) using the feature amount g (a +} ), and the feature amount g (a ⁻ ). parameters of omega ^- may be separated in the estimation unit for estimating ^{^{a; (- - ω g (a}} -)) probability distribution h using.

The probability estimation unit 23 extracts each feature amount g (b) of the label assignment candidate data b extracted by the feature amount extraction unit 21, and the parameters ω ⁺ and ω ⁻ estimated by the distribution estimation unit 22, respectively, in the feature amount extraction unit 21. And received from the distribution estimation unit 22. Then, the probability estimation unit 23 inputs the feature amount g (b) and the parameters ω ⁺ and ω ^−, and the probability p that the label assignment candidate data b belongs to the ^{positive class C + for each label assignment candidate data b,} And the probability p belonging to the negative class C ^{-is estimated respectively.}

For convenience of explanation below, the case of performing description by focusing on each of the labeling candidate data b, which may in particular labeling candidate data b representing the "labeling candidate data b _i". “I” is an index for uniquely indicating the label assignment candidate data b.

Class contained in the probability distribution that is modeled by the trained data set A ^- labeling candidate data b _i is a positive class C ⁺ or negative class C ^- when estimating the probability belonging to p, each class C ^+, C And, it is considered that it is composed of two kinds of sets of classes not included in the probability distribution.

Here, class C ^{+ I} included in the probability distribution in the positive class C ^{^+,} the class as a probability distribution outside the C ^{+ O,} minus Class C ^- consisting of classes in the probability distribution in C ^-I, and the probability distribution outside ^{Let the} class be CO.

Probability _p class _{C i} which belongs labeling candidate data _{b i} is a positive class ^{^{C + (c i = C +}} | g (b i)) , the probability p (c generated from class ^{C + I} contained in the probability distribution _i = ^C + _I | is the sum of ^{_{| (g (b i c i}} = C + O)) and g _(b i)), the probability generated from class ^{C + O} outside the probability distribution _p. Thus, the probability _{^{p (c i = C + |}} g (b i)) is represented by equation (1) is expanded as (2).

Similarly, labeling candidate data _{b i} class _{C i} is negative class belongs C ^- become probability _{^{p (c i = C - |}} g (b i)) is generated from the class ^{C -I} contained in the probability distribution probability _p is the sum of _{| | (g (b i)} c i = C -O) and ^{_{(c i = C -I g (}} b i)), the probability generated from the probability distribution outside of class ^{C -O} _p. Thus, the probability _{^{p (c i = C - |}} g (b i)) is represented by equation (3).

_{_{Here, p (g (b i)}} | c i = C + I) is a positive class ^{C +} probability distribution ^{^{h + (g (a +)}} ; ω +) and by using the parameter omega ^+, labeling candidate data b _i for the probability distribution ^{^{h + (g (a +)}} ; ω +) probabilities generated from the distribution ^{_{h + (g (b i)}} ; ω +) may be used a value obtained by calculating the. _{_{p (g (b i) |}} c i = C -I) , a negative class C ^- probability distributions ^{^{h - (g (a -)}} ; ω -) as a parameter omega ^- using a labeling candidate data _{b i} probability distributions ^{^{h - (g (a -)}} ; ω -) probabilities generated from the distribution ^{_{h - (g (b i)}} ; ω -) may be used a value obtained by calculating the.

On the other _{_{hand, p (g (b i)}} | c i = C + O) and _{_{p (g (b i) |}} c i = C -O) respect, labeling candidate data _{b i} is a positive class ^{C +} and the negative class C ^- If _{_{belonging, p (g (b i)}} | c i = C + O) and _{_{p (g (b i) |}} c i = C -O) each of the probability distribution of a uniformly distributed. If the average value of the normal distribution model representing the probability distribution h ⁺ (g (a ⁺ ); ω ⁺ ) of the positive class C ⁺ _{is μ +} and the variance is σ ₊ ² , then 99.7% within the range of _{3σ +.} from be included learned data _{a, p (g (b i} ) | c i = C + O) probability distribution of ^{h +;} is represented by _{_{^{(μ + + 3σ + ω +}}} ), the other definitions It may be expressed using. Similarly, the negative class C ^- probability distributions ^{^{h - (g (a -)}} ; ω -) the average value of the normal distribution model representing the mu _-, the variance sigma _- if ^2, p _{(g (b} i) The probability distribution of | c _i = C − ^O ) is represented by, for example, h ⁻ (μ ₋ + 3σ ₋ ; ω ⁻ ).

Here, the learned data ^{a +} number of n ^(A +), the learned data a ^- the number of n (A ^-), among the labeling candidate data set B, the labeling candidate data _{b i} is a probability distribution h ^{_{+ (g (b i);}} ω +) and ^{_{h - (g (b i)}} ; ω -) , respectively outside a proportion of the t. In this _{^{_{case, p (c i = C +}}} I), p (c i = C -I), p (c i = C + O), and _p (c i ^{= C -O),} respectively (4) to (7 ) Is expressed by the formula. The ratio t is a value determined by an experiment.

The score calculation unit 24 includes the trained model parameter θ of the function f represented by the learning model for estimating the degree to which the event belongs to the ^{positive class C +} ^{, that is, the positive class C + of the event, the trained data set A, and the trained data set A.} The label assignment candidate data set B is received from the input unit 10.

The score calculation unit 24 has, in the function f represented by the trained model parameter θ, each of the trained data a included in the trained data set A and each of the label assignment candidate data b included in the label assignment candidate data set B. Enter. _{As a result, the score calculation unit 24 calculates the score f θ} (a) and the score f _θ (b), which are the outputs of the training model for each of the trained data a and the label assignment candidate data b.

The score calculation unit 24 is separated into a calculation unit that calculates the _{score f θ} (a) using the learned data a and a calculation unit that calculates the score f _{θ (b) using the label assignment candidate data b.} You may.

The AUC improvement width calculation unit 25 receives the calculated score f _θ (a) and score f _θ (b) from the score calculation unit 24. The AUC improvement width calculation unit 25 uses the score f _θ (a) and the score f _θ (b), and the label assignment candidate data b is ^{either positive class C +} or negative class C ⁻ for each label assignment candidate data b. The improvement width I of the AUC score in each case assuming that it belongs to one is calculated. Improvements width I of the AUC score, and AUC score calculated by the current learning model for current learning model, the additional learning model obtained by additional learning using label assignment candidate data b _i imparted with some label It is represented by the difference between the calculated AUC scores.

If the function H is a heavyside function, the AUC score AUC in the current learning model is expressed by Eq. (8).

For any one label assignment candidate data b _i, labels labeling candidate data b _i represents belong to positive class C ^+, i.e. when applying the C ⁺ labels, is calculated by the current learning model AUC score ^AUC + _{(b i)} it is represented by equation (9).

Next, consider the ^{C +} label imparted with labeling candidate data _{b i} AUC score is calculated by adding the learning model in addition to the additional learning data added learning the current status of the learning models ^{AUC + target} _{(b i).} Assuming that there is no variation element in AUC score than C ⁺ Labels imparted with grant candidate data _{b i,} all learned data a ^- and until ^- against _{f (b i)> f (} a) The learning model will be updated. Is calculated by such additional learning model AUC score ^{AUC + target} _{(b i)} is represented by equation (10).

Accordingly, + improving width of AUC scores If granted the ^{C +} labels labeling candidate data _{b i} ^I _{(b i)} is represented by equation (11).

Similarly, C to labeling candidate data _{b i} ^- improving width of AUC scores If granted the label ^I - _{(b i)} is represented by equation (12).

AUC improvements width calculating unit 25, for each of the labeling candidate data _{b i,} improved width ^I + _{(b i)} of the AUC scores and ^I - to calculate the _{(b i).}

Selecting unit 26, the probability from a probability estimation section 23 labeling candidate data _{b i} belonging to the positive class ^{C +} _p | a ^{_{(c i = C + g (}} b i)), the labeling candidate data _{b i} is negative class C ^- probability belong to _p - | accepts ^{_{(c i = C g (b}} i)). The selection unit 26, improved from AUC improve width calculating section 25 of the AUC scores for each labeling candidate data _{b i} width ^I + _{(b i)} and ^I - accepting _{(b i).}

Selecting unit 26, improved width ^I + _{(b i)} and ^I of AUC scores for each labeling candidate data _{_b i} - _{_(b i),} and the probability _{^{p (c i = C + |}} g (b i)) and probability _p - | with ^{_{(c i = C g (b}} i)), calculates the expected value of improving the width I of the AUC score E _{(b i)} for each label applying candidate data _{b i.} On top of that, the selection unit 26 of the labeling candidate data set B, and expectation E (b _i) is the highest labeling candidate data b _i improvements width I of AUC scores, the learning model of additional training data Select as the data to be labeled as used for generation.

Specifically, the expected value E of the improved width I of AUC score in the label applying candidate data _{b i} _{(b i)} is represented by (13).

That is, the selection unit 26, the labeling candidate data _{b i} is a positive class ^C probability _p that belongs to ⁺ | when the ^{C +} label given to ^{_{(= c i C + g (}} b i)) and labeling candidate data _{b i} calculating the product of the improved width ^I + _{(b i)} of the AUC score. On top of that, the selection unit 26, the calculated product to the label applying candidate data _{b i} is negative Class C ^- probability _p that belongs to ^{(c i = C - | g} (b i)) and the label assignment candidate data _{b i} C ^- improvement width of AUC scores If granted the label ^I - by adding the product of _{(b i),} and calculates the expected value of improving the width I of the AUC score E _{(b i).}

Note that (13) is an example of a calculation formula for calculating the expected value E _{(b i)} improvements width I of the AUC score. For example selecting section 26 may calculate the (13) by integrating the coefficients representing the weighting sections the right side of the equation, the expected value of the improvements width I of AUC score E (b _i).

By performing the labeling labeling candidate data b _i, which is selected from the label assignment candidate data set B by the learning data selecting apparatus 100, additional training data to maximize the improvement width I of the AUC scores in learning model is obtained Will be. Therefore, by performing the additional learning of the learning model using the additional learning data, as compared with the case of using the additional learning data generated by the label applying candidate data b _i selected at random from the label assignment candidate data set B , The learning accuracy of the learning model will be improved efficiently.

The learning data selection device 100 is configured by using a computer 30 as an example.

FIG. 2 is a diagram showing a configuration example of a main part of the computer 30 applied to the learning data selection device 100.

The computer 30 includes a CPU (Central Processing Unit) 31 that is responsible for processing in each part of the learning data selection device 100 shown in FIG. Further, the computer 30 has a ROM (Read Only Memory) 32 for storing a learning data selection program that causes the computer 30 to function as a learning data selection device 100, and a RAM (Random Access Memory) used as a temporary work area of the CPU 31. Includes 33. Further, the CPU 31 includes a non-volatile memory 34 and an input / output interface (I / O) 35. Then, the CPU 31, ROM 32, RAM 33, non-volatile memory 34, and I / O 35 are each connected by the bus 36.

The non-volatile memory 34 is an example of a storage device in which the stored information is maintained even if the power supplied to the non-volatile memory 34 is cut off. For example, a semiconductor memory is used, but a hard disk may be used. The non-volatile memory 34 does not have to be included in the computer 30, and for example, a portable storage device that can be attached to and detached from the computer 30 may be used as the non-volatile memory 34.

For example, a communication unit 37, an input unit 38, and a display unit 39 are connected to the I / O 35.

The communication unit 37 is connected to a communication line such as the Internet and a LAN (Local Area Network), and includes a communication protocol for performing data communication with an external device connected to the communication line. Wired communication or wireless communication such as Wi-Fi (registered trademark) is used as the communication line.

The input unit 38 is a device that receives a user's instruction and notifies the CPU 31, for example, a button, a touch panel, a keyboard, and a mouse are used. When receiving an instruction by voice, a microphone may be used as the input unit 38.

The display unit 39 is an example of a device that visually displays information processed by the CPU 31, and for example, a liquid crystal display, an organic EL (Electroluminescence) display, or a projector is used.

The trained data set A, the label assignment candidate data set B, and the trained model parameter θ are input to the input unit 10 via, for example, a portable non-volatile memory 34 that can be attached to and detached from the communication unit 37 or the computer 30. Will be. In particular, when various data are input to the input unit 10 via the portable non-volatile memory 34 that can be attached to and detached from the computer 30, the computer 30 does not necessarily have to include the communication unit 37. Further, for example, when the learning data selection device 100 is installed in an unmanned data center and receives control from a remote location through a communication line, the computer 30 does not necessarily have to include the input unit 38 and the display unit 39.

Further, various units connected to the I / O 35 are an example, and for example, an image forming unit that forms information on a recording medium as characters or images may be connected to the I / O 35.

Next, the operation of the learning data selection device 100 of the present disclosure will be described. When the input unit 10 receives the trained data set A, the label assignment candidate data set B, and the trained model parameter θ of the training model, the CPU 31 of the training data selection device 100 performs the training data selection process according to the flowchart shown in FIG. To execute.

The learning data selection program that defines the learning data selection process is stored in advance in, for example, the ROM 32 of the learning data selection device 100. The CPU 31 of the learning data selection device 100 reads the learning data selection program stored in the ROM 32 and executes the learning data selection process. The trained data set A, the label assignment candidate data set B, and the trained model parameter θ received by the input unit 10 are stored in the RAM 33.

First, in step S10, the CPU 31 features the feature amount g (a) of each trained data a included in the trained data set A and the features of each label assignment candidate data b included in the label assignment candidate data set B. The amount g (b) is extracted and stored in the RAM 33.

In step S20, the CPU 31 classifies the trained data a into the trained data a ⁺ and the trained data a ^−. Further, CPU 31 is learned data ^{a +} feature quantity g ^{(a +)} and the learned data a ^- feature quantity g (a ^-) with respect to the learned data a, the positive class ^{C +} and the negative ^{The parameters ω +} and ω ⁻ representing the probability distribution of each feature amount g (a) in class C ⁻ are estimated, and the estimation result is stored in the RAM 33.

In step S30, the CPU 31 acquires the feature amount g (b) of the label assignment candidate data b extracted in step S10 and the parameters ω ⁺ and ω ⁻ estimated in step S20 from the RAM 33. On top of that, CPU 31, for each label applying candidate data _{b i,} the probability labeling candidate data _{b i} belong to the positive class ^{_{^{C + p (c i = C}}} + | g (b i)), and the negative class C ^- probability _p belonging to ^{(c i = C - | g} (b i)) and (2) were respectively estimated according to expression (7), stores the estimation result to RAM 33.

In step S40, the CPU 31 acquires the trained data set A, the label assignment candidate data set B, and the trained model parameter θ received by the input unit 10 from the RAM 33. _{Then, the CPU 31 has a score f θ} (a) and a score for each trained data a included in the trained data set A and each label assignment candidate data b included in the label assignment candidate data set B. f _θ (b) is calculated, and the score f _θ (a) and the score f _θ (b) are stored in the RAM 33.

In step S50, the CPU 31 _{acquires the score f θ} (a) and the score f _θ (b) calculated in step S40 from the RAM 33. CPU31, for each label applying candidate data _{b i,} improved width ^I + _(b i) of the AUC scores If granted the ^{C +} labels labeling candidate data _{b i,} and C in labeling candidate data _{b i} ^- label improvements width of AUC scores If granted the ^I - _{(b i)} the calculated according respectively (11) and (12). CPU31 is improved calculated AUC score width ^I + _{(b i)} and ^I - storing _{(b i)} into RAM 33.

In step S60, CPU 31 is improved calculated AUC score step S50 width ^I + _{(b i)} and ^I - _(b i), and the probability estimated in step _{^{S30 p (c i = C +}} | g (b _i)) and probability _p ^(c i = ^C - | acquiring g of _(b i)) from the RAM 33. On top of that, CPU 31, for each label applying candidate data _{b i,} the expected value of the improvements width I of AUC score E a _{(b i)} (13) is calculated according to equation of the calculated respectively in labeling candidate data _{b i} storing the expected value E of the improved width I of scores _{(b i)} into RAM 33.

CPU31, of the labeling candidate data set B, labeling used the expectation E (b _i) is the highest labeling candidate data b _i improvements width I of AUC score, the generation of the additional learning data of the learning model It is selected as the target data, and the learning data selection process shown in FIG. 3 is completed.

Thus, according to the example of the training data selecting apparatus 100 of the present disclosure, the expected value of the improvements width I of AUC score if the label for each label applying candidate data b _i is assigned E (b _i) It is calculated and selects the highest labeling candidate data b _i of the expected value E (b _i) as a labeling target data.

Training data selecting apparatus 100, the labeling candidate data _{b i} is a positive class ^{C +} and the negative class C ^- calculating the expected value of improving the width I of the AUC score based on the likelihood belonging to E _{(b i).} Accordingly, the learning data selecting apparatus 100, the labeling candidates from the data set B, the identification difficult labeling the current learning model candidate data b _i, and the positive class C ⁺ and the negative class C ^- likelihood of the same degree in selecting labeling candidate data b _i to be outliers preferentially. As a result, the learning data selecting apparatus 100, from among the label assignment candidate data set B, and compared with the case in which the additional learning using all labeling candidate data b _i, to improve the learning accuracy of the learning model efficiently It will select the labeling candidate data b _i preferentially.

Although one aspect of the learning data selection device 100 has been described above using the embodiment, the disclosed form of the learning data selection device 100 is an example, and the form of the learning data selection device 100 is limited to the range described in the embodiment. Not done. Various changes or improvements may be made to the embodiments without departing from the gist of the present disclosure, and the modified or improved forms are also included in the technical scope of the disclosure. For example, the order of the learning data selection processing shown in FIG. 3 may be changed without departing from the gist of the present disclosure.

Further, in this disclosure, as an example, a form in which learning data selection processing is realized by software has been described. However, the same processing as the flowchart shown in FIG. 3 can be implemented by, for example, ASIC (Application Specific Integrated Circuit), FPGA (Field Programmable Gate Array), or PLD (Programmable Logic Device). May be good. In this case, the processing speed can be increased as compared with the case where the learning data selection process is realized by software.

As described above, the CPU 31 of the learning data selection device 100 may be replaced with a dedicated processor specialized for a specific process such as ASIC, FPGA, PLD, GPU (Graphics Processing Unit), and FPU (Footing Point Unit).

The processing of the learning data selection device 100 is executed by a combination of two or more processors of the same type or different types, such as a plurality of CPU 31, or a combination of the CPU 31 and the FPGA, in addition to the form realized by one CPU 31. May be good. Further, the processing of the learning data selection device 100 may be realized by the cooperation of a processor located outside the housing of the learning data selection device 100 and located at a physically distant place.

In the embodiment, an example in which the learning data selection program is stored in the ROM 32 of the learning data selection device 100 has been described, but the storage destination of the learning data selection program is not limited to the ROM 32. The learning data selection program of the present disclosure can also be provided in a form recorded on a storage medium readable by a computer 30. For example, the learning data selection program may be provided in the form of being recorded on an optical disk such as a CD-ROM (Compact Disk Read Only Memory) and a DVD-ROM (Digital Versaille Disk Ready Memory). Further, the learning data selection program may be provided in the form of being recorded in a portable semiconductor memory such as a USB (Universal Serial Bus) memory and a memory card. The ROM 32, the non-volatile memory 34, the CD-ROM, the DVD-ROM, the USB, and the memory card are examples of non-transitory storage media.

Further, the learning data selection device 100 may acquire a learning data selection program from an external device through the communication unit 37 and store the downloaded learning data selection program in, for example, the ROM 32 or the non-volatile memory 34. In this case, the learning data selection device 100 reads the learning data selection program downloaded from the external device and executes the learning data selection process.

All documents, patent applications, and technical standards described herein are to the same extent as if the individual documents, patent applications, and technical standards were specifically and individually stated to be incorporated by reference. Incorporated by reference herein.

Regarding the above embodiments, the following additional notes will be further disclosed.

(Appendix 1)
With memory
With at least one processor connected to the memory
Including
The processor
Each feature amount of the trained data included in the trained data set used for training the training model, and each of the labeling candidate data included in the labeling candidate data set prepared for the additional learning of the training model. Extract the feature amount of
Using the feature amount of the trained data belonging to the positive class indicating the event to be estimated in the learning model and the feature amount of the trained data belonging to the negative class indicating the event other than the estimation target in the learning model. Then, a parameter representing the probability distribution of the feature amount of the learned data in the positive class and the negative class is estimated.
Using the feature amount of the label assignment candidate data and the parameter, the probability that the label assignment candidate data belongs to the positive class and the probability that the label assignment candidate data belongs to the negative class are estimated for each label assignment candidate data.
The trained data and the label assignment candidate data are used using the trained model parameters of the function representing the input / output relationship of the learning model for estimating the event-likeness to be estimated, the trained data, and the label assignment candidate data. The score, which is the output of the learning model, is calculated for each of the above.
When it is assumed that the label assignment candidate data belongs to either the positive class or the negative class for each label assignment candidate data by using the scores for each of the trained data and the label assignment candidate data. Calculate the improvement range of the AUC score in each case,
For each label assignment candidate data, the improvement width of the AUC score for each label assignment candidate data, the probability that the label assignment candidate data belongs to the positive class, and the probability that the label assignment candidate data belongs to the negative class are used. The expected value of the improvement range of the AUC score is calculated, and the label assignment candidate data having the highest expected value of the improvement range of the AUC score among the label assignment candidate data sets is used to generate additional training data of the training model. A training data selection device that is configured to be selected as the data to be labeled.

(Appendix 2)
A non-temporary storage medium that stores a program that can be executed by a computer to perform training data selection processing.
The learning data selection process is
Each feature amount of the trained data included in the trained data set used for training the training model, and each of the labeling candidate data included in the labeling candidate data set prepared for the additional learning of the training model. Extract the feature amount of
Using the feature amount of the trained data belonging to the positive class indicating the event to be estimated in the learning model and the feature amount of the trained data belonging to the negative class indicating the event other than the estimation target in the learning model. Then, a parameter representing the probability distribution of the feature amount of the learned data in the positive class and the negative class is estimated.
Using the feature amount of the label assignment candidate data and the parameter, the probability that the label assignment candidate data belongs to the positive class and the probability that the label assignment candidate data belongs to the negative class are estimated for each label assignment candidate data.
The trained data and the label assignment candidate data are used using the trained model parameters of the function representing the input / output relationship of the learning model for estimating the event-likeness to be estimated, the trained data, and the label assignment candidate data. The score, which is the output of the learning model, is calculated for each of the above.
When it is assumed that the label assignment candidate data belongs to either the positive class or the negative class for each label assignment candidate data by using the scores for each of the trained data and the label assignment candidate data. Calculate the improvement range of the AUC score in each case,
For each label assignment candidate data, the improvement width of the AUC score for each label assignment candidate data, the probability that the label assignment candidate data belongs to the positive class, and the probability that the label assignment candidate data belongs to the negative class are used. The expected value of the improvement range of the AUC score is calculated, and the label assignment candidate data having the highest expected value of the improvement range of the AUC score among the label assignment candidate data sets is used to generate additional training data of the training model. The non-temporary storage medium selected for the data to be labeled.

Claims

Each feature amount of the trained data included in the trained data set used for training the training model, and each of the labeling candidate data included in the labeling candidate data set prepared for the additional learning of the training model. Steps to extract the feature amount of
Using the feature amount of the trained data belonging to the positive class indicating the event to be estimated in the learning model and the feature amount of the trained data belonging to the negative class indicating the event other than the estimation target in the learning model. Then, the step of estimating the parameter representing the probability distribution of the feature amount of the learned data in the positive class and the negative class, and
A step of estimating the probability that the label assignment candidate data belongs to the positive class and the probability that the label assignment candidate data belongs to the negative class for each label assignment candidate data using the feature amount of the label assignment candidate data and the parameter. ,
The trained data and the label assignment candidate data are used using the trained model parameters of the function representing the input / output relationship of the learning model for estimating the event-likeness to be estimated, the trained data, and the label assignment candidate data. And the step of calculating the score which is the output of the learning model for each of
When it is assumed that the label assignment candidate data belongs to either the positive class or the negative class for each label assignment candidate data by using the scores for each of the trained data and the label assignment candidate data. Steps to calculate the improvement range of AUC score in each case,
For each label assignment candidate data, the improvement width of the AUC score for each label assignment candidate data, the probability that the label assignment candidate data belongs to the positive class, and the probability that the label assignment candidate data belongs to the negative class are used. The expected value of the improvement range of the AUC score is calculated, and the label assignment candidate data having the highest expected value of the improvement range of the AUC score among the label assignment candidate data sets is used to generate additional training data of the training model. The steps to be selected as the data to be labeled and
Training data selection method including.
In the label assignment candidate data when the probability that the label assignment candidate data belongs to the positive class and the label that the label assignment candidate data belongs to the positive class are given in the step of selecting the label assignment target data from the label assignment candidate data set. The product of the product of the improvement width of the AUC score and the improvement width of the AUC score in the label assignment candidate data when the label that the label assignment candidate data belongs to the negative class is given and the label belongs to the negative class. The training data selection method according to claim 1, wherein the expected value of the improvement range of the AUC score in the label assignment candidate data is calculated by adding.
The positive class and the negative class are treated as a set of classes included in the probability distribution and classes not included in the probability distribution corresponding to each class.
The probability that the label assignment candidate data belongs to the positive class is generated from the probability that the label assignment candidate data is generated from the class included in the probability distribution in the positive class and the probability that the label assignment candidate data is generated from the class outside the probability distribution in the positive class. Estimated by the sum of probabilities,
The probability that the label assignment candidate data belongs to the negative class is generated from the probability that the label assignment candidate data is generated from the class included in the probability distribution in the negative class and the probability that the label assignment candidate data is generated from the class outside the probability distribution in the negative class. The training data selection method according to claim 1 or claim 2, which is estimated by the sum of probabilities.
Each feature of the trained data included in the trained data set used for training the training model, and each of the labeling candidate data included in the labeling candidate data set prepared for the additional learning of the training model. The feature amount extraction unit that extracts the feature amount of
The feature amount of the trained data extracted by the feature amount extraction unit, which belongs to the positive class indicating the event to be estimated by the learning model, and the negative class indicating the event other than the estimation target by the learning model. A distribution estimation unit that estimates a parameter representing the probability distribution of the feature amount of the learned data in the positive class and the negative class by using the feature amount of the learned data extracted by the feature amount extraction unit to which the feature amount belongs. ,
Using the feature amount of the label assignment candidate data extracted by the feature amount extraction unit and the parameter estimated by the distribution estimation unit, the label assignment candidate data is the positive class for each label assignment candidate data. A probability estimation unit that estimates the probability of belonging to the negative class and the probability of belonging to the negative class, respectively.
The trained data and the label assignment candidate data are used using the trained model parameters of the function representing the input / output relationship of the learning model for estimating the event-likeness to be estimated, the trained data, and the label assignment candidate data. A score calculation unit that calculates the score that is the output of the learning model for each of
Using the scores for each of the learned data and the label assignment candidate data calculated by the score calculation unit, the label assignment candidate data is either the positive class or the negative class for each label assignment candidate data. The AUC improvement width calculation unit that calculates the improvement width of the AUC score in each case assuming that it belongs to one of them,
The improvement width of the AUC score for each label assignment candidate data calculated by the AUC improvement width calculation unit, the probability that the label assignment candidate data estimated by the probability estimation unit belongs to the positive class, and the probability estimation unit. Using the probability that the estimated label assignment candidate data belongs to the negative class, the expected value of the improvement range of the AUC score is calculated for each label assignment candidate data, and the AUC score of the label assignment candidate data set is calculated. A selection unit that selects the label assignment candidate data having the highest expected value of improvement as the label assignment target data used for generating additional training data of the training model.
Training data selection device including.
The selection unit assigns the label to the product of the probability that the label assignment candidate data belongs to the positive class and the improvement range of the AUC score in the label assignment candidate data when the label that the label assignment candidate data belongs to the positive class is given. By adding the product of the probability that the candidate data belongs to the negative class and the improvement range of the AUC score in the label assignment candidate data when the label that the candidate data belongs to the negative class is given, the AUC in the label assignment candidate data is added. The expected value of the improvement range of the score is calculated, and the label assignment candidate data having the highest expected value of the improvement range of the AUC score among the label assignment candidate data sets is used to generate additional training data of the training model. The training data selection device according to claim 4, which is selected as the data to be labeled.
The probability estimation unit treats the positive class and the negative class as a set of classes included in the probability distribution corresponding to each class and classes not included in the probability distribution.
The probability that the label assignment candidate data belongs to the positive class is generated from the probability that the label assignment candidate data is generated from the class included in the probability distribution in the positive class and the probability that the label assignment candidate data is generated from the class outside the probability distribution in the positive class. Estimated by the sum of probabilities,
The probability that the label assignment candidate data belongs to the negative class is generated from the probability that the label assignment candidate data is generated from the class included in the probability distribution in the negative class and the probability that the label assignment candidate data is generated from the class outside the probability distribution in the negative class. The learning data selection device according to claim 4 or claim 5, which is estimated by the sum of probabilities.
A learning data selection program for making a computer function as each part of the learning data selection device according to any one of claims 4 to 6.