WO2022044233A1 - Estimation device, estimation method, and program - Google Patents

Estimation device, estimation method, and program Download PDF

Info

Publication number
WO2022044233A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
machine learning
inference result
data set
inference
Prior art date
Application number
PCT/JP2020/032485
Other languages
French (fr)
Japanese (ja)
Inventor
Yosuke Takahashi (高橋 洋介)
Original Assignee
Nippon Telegraph and Telephone Corporation (日本電信電話株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corporation (日本電信電話株式会社)
Priority to PCT/JP2020/032485
Publication of WO2022044233A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Definitions

  • The present invention relates to an estimation device, an estimation method, and a program.
  • For example, by multiplying the features x2 and x3 of a feature quantity vector (x1, x2, x3, x4), a new feature quantity vector (x1, x2 × x3, x4) can be created.
  • Non-Patent Documents 1 and 2 describe techniques called LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), respectively. These techniques create multiple copies of the input data with noise added, feed them into the machine learning model, observe the outputs, and fit a linear approximation in order to estimate the contribution of each feature to a specific inference result.
  • LIME: Local Interpretable Model-agnostic Explanations
  • SHAP: SHapley Additive exPlanations
  • In Non-Patent Document 3, the inference result obtained when a specific feature of the input data is randomly shuffled is observed, and the degree to which inference accuracy deteriorates due to the shuffling is evaluated as the importance of that feature.
  • Feature quantity synthesis function: the function used for feature quantity synthesis.
  • One embodiment of the present invention has been made in view of the above points, and an object thereof is to estimate a feature amount synthesis function.
  • To achieve the above object, an estimation device according to one embodiment is connected via a communication network to a machine learning service that performs feature quantity synthesis on data containing one or more feature quantities and provides inference results for a predetermined task. The estimation device has: a first acquisition unit that transmits a first data set composed of data containing one or more feature quantities to the machine learning service and acquires first inference result data indicating the inference result for each piece of data constituting the first data set; a generation unit that generates a candidate function for the feature quantity synthesis function used for feature quantity synthesis in the machine learning service; a creation unit that, based on the candidate function, performs a predetermined conversion on each piece of data constituting the first data set to create a second data set composed of the converted data; a second acquisition unit that transmits the second data set to the machine learning service and acquires second inference result data indicating the inference result for each piece of data constituting the second data set; and an estimation unit that calculates the similarity between the first inference result data and the second inference result data and estimates whether or not the candidate function is used for feature quantity synthesis in the machine learning service.
  • With this configuration, the feature quantity synthesis function can be estimated.
  • In the present embodiment, a feature quantity synthesis function estimation system 1 capable of estimating the feature quantity synthesis function used by a machine learning service will be described.
  • The feature quantity synthesis function is a function used for feature quantity synthesis, which is one of the data preprocessing steps.
  • FIG. 1 is a diagram showing an example of the overall configuration of the feature quantity synthesis function estimation system 1 according to the present embodiment.
  • the feature quantity synthesis function estimation system 1 includes an estimation device 10, a user terminal 20, and a machine learning service providing device 30. These are communicably connected via any communication network, including the Internet N.
  • the estimation device 10 is a computer or a computer system that estimates the feature amount synthesis function used in the machine learning service.
  • the user terminal 20 is various terminals (for example, a PC (personal computer), a smartphone, a tablet terminal, etc.) used by the user of the machine learning service.
  • The user operates the user terminal 20 to transmit a set of training data for creating a machine learning model (hereinafter also referred to as a "learning data set") to the machine learning service providing device 30, or to transmit a set of data for obtaining inference results from an already trained machine learning model (hereinafter also referred to as an "inference data set") to the machine learning service providing device 30.
  • The learning data set is composed of one or more pieces of learning data (training data), and each piece of learning data includes a data ID, one or more feature quantities (Features), and an objective variable (Target).
  • The inference data set is composed of one or more pieces of inference data, and each piece of inference data includes a data ID and one or more feature quantities.
  • the objective variable is a variable indicating the purpose of the machine learning model, and the feature quantity is a numerical value that characterizes the objective variable.
  • the machine learning model infers the value of the objective variable from the feature quantity contained in the inference data by learning the relationship between the feature quantity contained in the training data and the objective variable.
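  • As a concrete illustration of the data layout described above, a learning data set and an inference data set might be represented as follows. This is a minimal sketch; the field names are illustrative, not taken from the patent.

```python
# Hypothetical layout of a learning data set and an inference data set.
# Each piece of learning data carries a data ID, feature quantities, and
# an objective variable (target); inference data has no target, since the
# trained machine learning model infers it from the features.
learning_data_set = [
    {"data_id": 1, "features": [5.1, 3.5, 1.4, 0.2], "target": 0.0},
    {"data_id": 2, "features": [6.7, 3.1, 4.7, 1.5], "target": 1.0},
]
inference_data_set = [
    {"data_id": 101, "features": [5.9, 3.0, 5.1, 1.8]},
]
```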
  • the machine learning service providing device 30 is a server managed by an external machine learning vendor or the like that provides the machine learning service.
  • the machine learning service has a learning phase in which a machine learning model is created using a learning data set and an inference phase in which an inference result is obtained from a machine learning model that has been trained using an inference data set.
  • In the learning phase, the machine learning service providing device 30 performs predetermined preprocessing, including feature quantity synthesis, on each piece of learning data constituting the learning data set transmitted from the user terminal 20, and then trains a machine learning model using the preprocessed learning data.
  • In the inference phase, the machine learning service providing device 30 performs the same preprocessing, including feature quantity synthesis, on each piece of inference data constituting the inference data set transmitted from the user terminal 20, obtains inference results from the trained machine learning model using the preprocessed inference data, and returns them to the user terminal 20.
  • The machine learning service is, for example, an AutoML service or a service in which an external machine learning vendor creates a machine learning model. In such services, the user cannot know the feature quantity synthesis function used by the machine learning service.
  • the configuration of the feature quantity synthesis function estimation system 1 shown in FIG. 1 is an example, and may be another configuration.
  • the estimation device 10 may be included in the user terminal 20.
  • FIG. 2 is a diagram showing an example of the hardware configuration of the estimation device 10 according to the present embodiment.
  • The estimation device 10 is realized by the hardware of a general computer or computer system and has an input device 11, a display device 12, an external I/F 13, a communication I/F 14, a processor 15, and a memory device 16. These hardware components are communicably connected via a bus 17.
  • the input device 11 is, for example, a keyboard, a mouse, a touch panel, or the like.
  • the display device 12 is, for example, a display or the like.
  • The estimation device 10 may be configured without the input device 11, the display device 12, or both.
  • The external I/F 13 is an interface with external devices.
  • The external devices include a recording medium 13a and the like.
  • The estimation device 10 can read from and write to the recording medium 13a via the external I/F 13.
  • the recording medium 13a includes, for example, a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD memory card (Secure Digital memory card), a USB (Universal Serial Bus) memory card, and the like.
  • the communication I / F 14 is an interface for connecting the estimation device 10 to the communication network.
  • the processor 15 is, for example, various arithmetic units such as a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit).
  • the memory device 16 is, for example, various storage devices such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), a RAM (Random Access Memory), a ROM (Read Only Memory), and a flash memory.
  • By having the hardware configuration shown in FIG. 2, the estimation device 10 according to the present embodiment can realize the estimation process described later.
  • the hardware configuration shown in FIG. 2 is an example, and the estimation device 10 may have another hardware configuration.
  • the estimation device 10 may have a plurality of processors 15 or a plurality of memory devices 16.
  • FIG. 3 is a diagram showing an example of the functional configuration of the estimation device 10 according to the present embodiment.
  • the estimation device 10 includes an inference result acquisition unit 101, a function generation unit 102, a conversion unit 103, a distance calculation unit 104, and a determination unit 105.
  • Each of these parts is realized, for example, by a process of causing the processor 15 to execute one or more programs installed in the estimation device 10.
  • the estimation device 10 has a storage unit 106.
  • the storage unit 106 is realized by, for example, a memory device 16.
  • the storage unit 106 may be realized by, for example, a database server connected to the estimation device 10 via a communication network or the like.
  • the storage unit 106 stores the inference data set D used when estimating the feature amount synthesis function used in the machine learning service.
  • one or more inference data sets D may be stored in the storage unit 106.
  • The inference result acquisition unit 101 transmits the inference data set D stored in the storage unit 106 to the machine learning service providing device 30 and, in reply, acquires the inference result data R of the trained machine learning model for the inference data set D. The inference result acquisition unit 101 also transmits the inference data set D' created by the conversion unit 103 (described later) to the machine learning service providing device 30 and, in reply, acquires the inference result data R' of the trained machine learning model for the inference data set D'.
  • the function generation unit 102 generates candidates for the feature amount synthesis function used in the machine learning service.
  • In the following, the feature quantity synthesis function actually used in the machine learning service (that is, the true feature quantity synthesis function) is denoted by trans_t, and a candidate for the feature quantity synthesis function created by the function generation unit 102 is denoted by trans.
  • The distance calculation unit 104 calculates the similarity distance(R, R') between the inference result data R and the inference result data R' using a predetermined distance function distance.
  • The determination unit 105 determines whether or not the similarity distance(R, R') calculated by the distance calculation unit 104 is less than or equal to a predetermined threshold ε. When distance(R, R') ≤ ε, the determination unit 105 determines that the function trans created by the function generation unit 102 may be used in the machine learning service. On the other hand, when distance(R, R') > ε, the determination unit 105 determines that the function trans created by the function generation unit 102 is not used in the machine learning service. In this way, the feature quantity synthesis function used in the machine learning service is estimated.
  • FIG. 4 is a flowchart showing an example of the estimation process according to the present embodiment. It is assumed that the trained machine learning model has been created by the machine learning service.
  • First, the inference result acquisition unit 101 transmits the inference data set D stored in the storage unit 106 to the machine learning service providing device 30 and, in reply, acquires the inference result data R of the trained machine learning model for the inference data set D (step S101).
  • Here, the inference result of the trained machine learning model for each piece of inference data d is assumed to be a scalar value, so the inference result data R is represented by a one-dimensional vector. For example, when the inference data set D is composed of n pieces of inference data d, the inference result data R is a one-dimensional vector with n elements (inference results).
  • Next, the function generation unit 102 generates a candidate trans for the feature quantity synthesis function used in the machine learning service (step S102).
  • For example, the function generation unit 102 may randomly create a function trans from combinations of the four arithmetic operations on the features included in the inference data d constituting the inference data set D.
  • Suppose that the inference data d is represented by a four-dimensional feature quantity vector (x1, x2, x3, x4) having four features x1, x2, x3, and x4.
  • In this case, the function generation unit 102 generates, as candidates trans for the feature quantity synthesis function, functions that apply one of the four arithmetic operations to two arbitrary features and output a three-dimensional feature quantity vector. Specifically, it is conceivable to generate the following functions as trans.
  • In addition to the four arithmetic operations, a function that applies an arbitrary operation (for example, a logical operation, logarithmic transformation, exponential transformation, trigonometric function, exponentiation, root, etc.) may also be generated.
  • If the user has some knowledge about the inference data or the machine learning model (for example, knowledge that feature quantity synthesis is performed between two features), candidates trans for the feature quantity synthesis function may also be generated using this knowledge.
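  • The candidate generation described above can be sketched as follows. This is an illustrative implementation, not code from the patent: it enumerates candidates that apply one of the four arithmetic operations to an arbitrary pair of the four features and output a three-dimensional feature quantity vector; the name `make_candidate` and the in-place replacement of the pair are the author's assumptions.

```python
import itertools
import operator

# Four arithmetic operations that a candidate trans may combine.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def make_candidate(i, j, op_name):
    """Return a trans that replaces features i and j (i < j) with op(x[i], x[j])."""
    op = OPS[op_name]
    def trans(x):
        out = [v for k, v in enumerate(x) if k != j]  # drop feature j
        out[i] = op(x[i], x[j])                       # valid because i < j
        return out
    return trans

# Enumerate every candidate over feature index pairs (i, j) with i < j.
candidates = [make_candidate(i, j, op_name)
              for i, j in itertools.combinations(range(4), 2)
              for op_name in OPS]
```

For instance, `make_candidate(1, 2, "*")` maps (x1, x2, x3, x4) to (x1, x2 × x3, x4), matching the multiplication example in the text.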
  • In the following, it is assumed that a candidate trans of a certain feature quantity synthesis function has been generated in step S102 above.
  • The input of the candidate trans of the feature quantity synthesis function is also referred to as a "feature quantity vector", and its output is also referred to as a "composite feature quantity vector".
  • Next, the conversion unit 103 converts each piece of inference data d so that its composite feature quantity vector does not change, creates new inference data d', and creates an inference data set D' composed of these inference data d' (step S103).
  • The conversion that leaves the composite feature quantity vector of the inference data d unchanged differs depending on the candidate trans of the feature quantity synthesis function.
  • Below, assuming that the inference data d is represented by a four-dimensional feature quantity vector (x1, x2, x3, x4), examples of the conversion for candidates trans that perform simple arithmetic operations are shown in FIGS. 5 to 8.
  • For example, when trans divides x1 by x2, the Hadamard product of the feature quantity vector (x1, x2, x3, x4) of the inference data d and the conversion vector (1/x2, 1/x2, 1, 1), that is, the vector (x1/x2, 1, x3, x4), may be used as the feature quantity vector of the inference data d' having the same data ID as the inference data d.
  • In this case, trans(D) = trans(D') and D ≠ D' hold.
  • Similarly, when trans multiplies x2 and x3, the Hadamard product of the feature quantity vector (x1, x2, x3, x4) of the inference data d and the conversion vector (1, x3, 1/x3, 1), that is, the vector (x1, x2 × x3, 1, x4), may be used as the feature quantity vector of the inference data d' having the same data ID as the inference data d.
  • In this case, trans(D) = trans(D') and D ≠ D' hold.
  • Likewise, when trans adds x2 and x3, the vector (x1, x2 + x3, 0, x4) may be used as the feature quantity vector of the inference data d' having the same data ID as the inference data d.
  • In this case, trans(D) = trans(D') and D ≠ D' hold.
  • Similarly, when trans subtracts x3 from x2, the vector (x1, x2 - x3, 0, x4) may be used as the feature quantity vector of the inference data d' having the same data ID as the inference data d.
  • In this case, trans(D) = trans(D') and D ≠ D' hold.
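  • A minimal sketch of the conversion for the multiplication case above: assuming the candidate is trans(x) = (x1, x2 × x3, x4), the Hadamard (element-wise) product of d with the vector (1, x3, 1/x3, 1) yields d' with trans(d') = trans(d) while d' ≠ d. The function names are illustrative.

```python
# Candidate under test: multiply x2 and x3, outputting a 3-dimensional vector.
def trans(x):
    return (x[0], x[1] * x[2], x[3])

def convert_for_multiplication(x):
    # Hadamard product with (1, x3, 1/x3, 1); assumes x[2] != 0.
    mask = (1.0, x[2], 1.0 / x[2], 1.0)
    return tuple(a * b for a, b in zip(x, mask))

d = (2.0, 3.0, 4.0, 5.0)
d_prime = convert_for_multiplication(d)       # (2.0, 12.0, 1.0, 5.0)
assert trans(d_prime) == trans(d) and d_prime != d
```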
  • Next, the inference result acquisition unit 101 transmits the inference data set D' created in step S103 to the machine learning service providing device 30 and, in reply, acquires the inference result data R' of the trained machine learning model for the inference data set D' (step S104).
  • Like the inference result data R, the inference result data R' is also represented by a one-dimensional vector.
  • Next, the distance calculation unit 104 calculates the similarity distance(R, R') between the inference result data R and the inference result data R' using a predetermined distance function distance (step S105).
  • As the distance function distance, for example, the root mean square error (RMSE: Root Mean Square Error) or the mean absolute percentage error (MAPE: Mean Absolute Percentage Error) may be used.
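  • Hedged sketches of the two distance functions just mentioned, computed over the inference result vectors R and R' (assumed to have equal length, with nonzero entries for MAPE):

```python
import math

def rmse(r, r_prime):
    """Root mean square error between two equal-length result vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r, r_prime)) / len(r))

def mape(r, r_prime):
    """Mean absolute percentage error; assumes no element of r is zero."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(r, r_prime)) / len(r)
```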
  • Next, the determination unit 105 determines whether or not the similarity distance(R, R') calculated in step S105 is less than or equal to a predetermined threshold ε. When distance(R, R') ≤ ε, the determination unit 105 determines that the function trans generated in step S102 may be used as a feature quantity synthesis function in the machine learning service; when distance(R, R') > ε, it determines that the function trans is not used as a feature quantity synthesis function in the machine learning service (step S106). As a result, when distance(R, R') ≤ ε, the feature quantity synthesis function used in the machine learning service has been estimated.
  • As described above, the estimation device 10 according to the present embodiment can estimate the feature quantity synthesis function used in a machine learning service by analyzing the output information (inference result data R) for the inference data set D input to the external machine learning service and the output information (inference result data R') for the converted inference data set D'. This makes it possible to clarify the feature quantity synthesis function used in the machine learning service (that is, to clarify the feature engineering process) and to improve the explainability of the inference results of the machine learning model.
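  • Putting steps S101 to S106 together, the whole estimation procedure can be sketched as follows. This is an illustrative stand-in, not the patent's implementation: `ml_service` stubs the remote machine learning service as a callable mapping one piece of inference data to a scalar inference result, `convert` produces the trans-preserving conversion of step S103, and the similarity of step S105 uses a simple maximum absolute error in place of RMSE or MAPE.

```python
def estimate(ml_service, dataset, candidates, convert, epsilon):
    """Return the candidate functions that may be in use by the service."""
    R = [ml_service(d) for d in dataset]                 # S101: baseline results
    plausible = []
    for trans in candidates:                             # S102: candidate trans
        dataset2 = [convert(trans, d) for d in dataset]  # S103: build D'
        R2 = [ml_service(d) for d in dataset2]           # S104: results for D'
        dist = max(abs(a - b) for a, b in zip(R, R2))    # S105: similarity
        if dist <= epsilon:                              # S106: judge
            plausible.append(trans)
    return plausible
```

With a stub service whose model internally synthesizes x2 × x3, only the conversion that preserves that product leaves the outputs unchanged, so only that candidate survives the threshold test.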

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An estimation device according to one embodiment is an estimation device to be connected, through a communication network, to a machine learning service for providing an inference result of a predetermined task by performing feature amount combining on data including one or more feature amounts, the estimation device comprising: a first acquisition unit for transmitting a first data set made up of data including one or more feature amounts to the machine learning service, and acquiring first inference result data indicating an inference result for each piece of data forming the first data set; a generation unit for generating a candidate function for a feature amount combining function used for feature amount combining in the machine learning service; a creation unit for performing, on the basis of the candidate function, predetermined conversion on each piece of data forming the first data set to create a second data set made up of the converted data; a second acquisition unit for transmitting the second data set to the machine learning service and acquiring second inference result data indicating an inference result for each piece of data forming the second data set; and an estimation unit for calculating a degree of similarity between the first inference result data and the second inference result data and estimating whether the candidate function is used for feature amount combining in the machine learning service.

Description

Estimation device, estimation method, and program
The present invention relates to an estimation device, an estimation method, and a program.
Research is being conducted on machine learning models that perform tasks such as prediction and classification by learning features and patterns from training data. Conventionally, when creating a machine learning model, a model creator with specialized skills has created an appropriate model through trial and error to improve task accuracy. However, since creating a machine learning model requires advanced skills, in recent years there have been an increasing number of cases where the necessary training data is provided to an external machine learning vendor and model creation is outsourced. For example, various machine learning vendors provide services that automatically create machine learning models when the user simply uploads training data to the cloud and specifies the task. Such a service is generally called an AutoML (Automated Machine Learning) service.
In an AutoML service, generally, (1) multiple pieces of training data uploaded by the user are read, (2) each piece of training data is preprocessed, (3) multiple machine learning models are trained in parallel on the preprocessed training data with various parameters, and (4) the most accurate of these machine learning models is provided to the user. The preprocessing in (2) often includes processing such as data normalization and feature quantity synthesis. Among these preprocessing steps, feature quantity synthesis is important in that it affects the accuracy of the machine learning model's inference results. Feature quantity synthesis is the process of converting a feature quantity vector to create a new feature quantity vector: for example, by multiplying the features x2 and x3 of a feature quantity vector (x1, x2, x3, x4), a new feature quantity vector (x1, x2 × x3, x4) can be created.
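The feature quantity synthesis example in this paragraph (multiplying x2 and x3 of (x1, x2, x3, x4)) can be written out directly; the function name is illustrative, not from the patent.

```python
# Multiply x2 and x3 of (x1, x2, x3, x4) to form the new
# feature quantity vector (x1, x2 * x3, x4).
def synthesize(x):
    x1, x2, x3, x4 = x
    return (x1, x2 * x3, x4)

synthesize((1.0, 2.0, 3.0, 4.0))  # -> (1.0, 6.0, 4.0)
```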
On the other hand, as machine learning models are increasingly applied in the real world, their explainability is becoming an important issue. That is, for the inference results (prediction results, classification results, etc.) output by a machine learning model, it is required to present the basis of the judgment, that is, which features in the data influenced the inference result and to what extent.
As conventional techniques relating to the explainability of machine learning models, the techniques described in Non-Patent Documents 1 to 3 are known. Non-Patent Documents 1 and 2 describe techniques called LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), respectively, which create multiple copies of the input data with noise added, feed them into the machine learning model, observe the outputs, and fit a linear approximation in order to estimate the contribution of each feature to a specific inference result. In Non-Patent Document 3, the inference result obtained when a specific feature of the input data is randomly shuffled is observed, and the degree to which inference accuracy deteriorates due to the shuffling is evaluated as the importance of that feature.
When using an AutoML service or an external machine learning vendor, it is difficult to know afterwards what function was used for feature quantity synthesis when the machine learning model was trained. On the other hand, if the function used for feature quantity synthesis (hereinafter also referred to as the "feature quantity synthesis function") could be known, it would improve the explainability of the machine learning model's inference results. However, conventional techniques on the explainability of machine learning models have not considered the influence of feature quantity synthesis.
One embodiment of the present invention has been made in view of the above points, and its object is to estimate the feature quantity synthesis function.
To achieve the above object, an estimation device according to one embodiment is connected via a communication network to a machine learning service that performs feature quantity synthesis on data containing one or more feature quantities and provides inference results for a predetermined task. The estimation device has: a first acquisition unit that transmits a first data set composed of data containing one or more feature quantities to the machine learning service and acquires first inference result data indicating the inference result for each piece of data constituting the first data set; a generation unit that generates a candidate function for the feature quantity synthesis function used for feature quantity synthesis in the machine learning service; a creation unit that, based on the candidate function, performs a predetermined conversion on each piece of data constituting the first data set to create a second data set composed of the converted data; a second acquisition unit that transmits the second data set to the machine learning service and acquires second inference result data indicating the inference result for each piece of data constituting the second data set; and an estimation unit that calculates the similarity between the first inference result data and the second inference result data and estimates whether or not the candidate function is used for feature quantity synthesis in the machine learning service.
The feature quantity synthesis function can thus be estimated.
FIG. 1 is a diagram showing an example of the overall configuration of the feature quantity synthesis function estimation system according to the present embodiment. FIG. 2 is a diagram showing an example of the hardware configuration of the estimation device according to the present embodiment. FIG. 3 is a diagram showing an example of the functional configuration of the estimation device according to the present embodiment. FIG. 4 is a flowchart showing an example of the estimation process according to the present embodiment. FIG. 5 is a diagram (part 1) for explaining an example of conversion of inference data. FIG. 6 is a diagram (part 2) for explaining an example of conversion of inference data. FIG. 7 is a diagram (part 3) for explaining an example of conversion of inference data. FIG. 8 is a diagram (part 4) for explaining an example of conversion of inference data.
Hereinafter, an embodiment of the present invention will be described. The present embodiment describes a feature quantity synthesis function estimation system 1 that can estimate the feature quantity synthesis function used by an external machine learning service (for example, an AutoML service or a service in which an external machine learning vendor creates a machine learning model) when a machine learning model for a predetermined task has been created by that service. The feature quantity synthesis function is a function used for feature quantity synthesis, which is one of the data preprocessing steps.
<Overall configuration of the feature quantity synthesis function estimation system 1>
First, the overall configuration of the feature quantity synthesis function estimation system 1 according to the present embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of the overall configuration of the feature quantity synthesis function estimation system 1 according to the present embodiment.
As shown in FIG. 1, the feature synthesis function estimation system 1 according to the present embodiment includes an estimation device 10, a user terminal 20, and a machine learning service providing device 30. These are communicably connected via an arbitrary communication network including the Internet N.
The estimation device 10 is a computer or computer system that estimates the feature synthesis function used by the machine learning service.
The user terminal 20 is any of various terminals (for example, a PC (personal computer), a smartphone, or a tablet terminal) used by a user of the machine learning service. By operating the user terminal 20, the user can transmit a set of training data for creating a machine learning model (hereinafter also referred to as a "training data set") to the machine learning service providing device 30, or transmit a set of data for obtaining inference results from a trained machine learning model (hereinafter also referred to as an "inference data set") to the machine learning service providing device 30.
The training data set is composed of one or more pieces of training data, and each piece of training data contains a data ID, one or more features, and a target variable. The inference data set, on the other hand, is composed of one or more pieces of inference data, and each piece of inference data contains a data ID and one or more features. The target variable is a variable indicating the objective of the machine learning model, and a feature is a value that characterizes the target variable. By learning the relationship between the features and the target variable contained in the training data, the machine learning model infers the value of the target variable from the features contained in the inference data.
The machine learning service providing device 30 is a server managed by, for example, an external machine learning vendor that provides the machine learning service.
Here, the machine learning service has a learning phase in which a machine learning model is created using a training data set, and an inference phase in which inference results are obtained from the trained machine learning model using an inference data set. In the learning phase, the machine learning service providing device 30 performs predetermined preprocessing, including feature synthesis, on each piece of training data constituting the training data set transmitted from the user terminal 20, and then trains the machine learning model using the preprocessed training data. In the inference phase, the machine learning service providing device 30 performs preprocessing, including feature synthesis, on each piece of inference data constituting the inference data set transmitted from the user terminal 20, obtains inference results from the trained machine learning model using the preprocessed inference data, and returns them to the user terminal 20.
As described above, the machine learning service is, for example, an AutoML service or a service in which an external machine learning vendor creates machine learning models, and it is assumed that the user cannot know the feature synthesis function used within the machine learning service. The configuration of the feature synthesis function estimation system 1 shown in FIG. 1 is an example, and other configurations are possible. For example, the estimation device 10 may be included in the user terminal 20.
<Hardware configuration of estimation device 10>
Next, the hardware configuration of the estimation device 10 according to the present embodiment will be described with reference to FIG. 2. FIG. 2 is a diagram showing an example of the hardware configuration of the estimation device 10 according to the present embodiment.
As shown in FIG. 2, the estimation device 10 according to the present embodiment is realized by the hardware of a general computer or computer system, and has an input device 11, a display device 12, an external I/F 13, a communication I/F 14, a processor 15, and a memory device 16. These hardware components are communicably connected to one another via a bus 17.
The input device 11 is, for example, a keyboard, a mouse, or a touch panel. The display device 12 is, for example, a display. The estimation device 10 need not have at least one of the input device 11 and the display device 12.
The external I/F 13 is an interface with external devices, such as a recording medium 13a. The estimation device 10 can read from and write to the recording medium 13a via the external I/F 13. Examples of the recording medium 13a include a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD memory card (Secure Digital memory card), and a USB (Universal Serial Bus) memory card.
The communication I/F 14 is an interface for connecting the estimation device 10 to the communication network. The processor 15 is any of various arithmetic units such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit). The memory device 16 is any of various storage devices such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), a RAM (Random Access Memory), a ROM (Read Only Memory), or a flash memory.
Having the hardware configuration shown in FIG. 2, the estimation device 10 according to the present embodiment can realize the estimation process described later. The hardware configuration shown in FIG. 2 is an example, and the estimation device 10 may have other hardware configurations. For example, the estimation device 10 may have a plurality of processors 15 or a plurality of memory devices 16.
<Functional configuration of estimation device 10>
Next, the functional configuration of the estimation device 10 according to the present embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram showing an example of the functional configuration of the estimation device 10 according to the present embodiment.
As shown in FIG. 3, the estimation device 10 according to the present embodiment has an inference result acquisition unit 101, a function generation unit 102, a conversion unit 103, a distance calculation unit 104, and a determination unit 105. Each of these units is realized, for example, by processing that one or more programs installed on the estimation device 10 cause the processor 15 to execute.
The estimation device 10 according to the present embodiment also has a storage unit 106. The storage unit 106 is realized by, for example, the memory device 16. However, the storage unit 106 may instead be realized by, for example, a database server connected to the estimation device 10 via a communication network.
The storage unit 106 stores an inference data set D used when estimating the feature synthesis function used by the machine learning service. The storage unit 106 may store one or more inference data sets D.
The inference result acquisition unit 101 transmits the inference data set D stored in the storage unit 106 to the machine learning service providing device 30 and, as the reply, acquires inference result data R of the trained machine learning model for the inference data set D. The inference result acquisition unit 101 also transmits an inference data set D' created by the conversion unit 103, described later, to the machine learning service providing device 30 and, as the reply, acquires inference result data R' of the trained machine learning model for the inference data set D'.
The function generation unit 102 generates candidates for the feature synthesis function used by the machine learning service. Hereinafter, the feature synthesis function used by the machine learning service (that is, the true feature synthesis function) is denoted trans_t, and a candidate feature synthesis function created by the function generation unit 102 is denoted trans.
The conversion unit 103 creates a new inference data set D' by converting the inference data set D. Specifically, for each piece of inference data d included in the inference data set D, the conversion unit 103 creates inference data d' such that trans(d) = trans(d') and d ≠ d', and creates the inference data set D' composed of these pieces of inference data d'. Hereinafter, when trans(d) = trans(d') and d ≠ d' hold between every piece of inference data d and the inference data d' created from it, the inference data set D and the inference data set D' are said to satisfy trans(D) = trans(D') and D ≠ D'.
The distance calculation unit 104 calculates the similarity distance(R, R') between the inference result data R and the inference result data R' using a predetermined distance function distance.
The determination unit 105 determines whether the similarity distance(R, R') calculated by the distance calculation unit 104 is less than or equal to a predetermined threshold σ. When distance(R, R') ≤ σ, the determination unit 105 determines that the function trans created by the function generation unit 102 may be being used by the machine learning service. When distance(R, R') > σ, the determination unit 105 determines that the function trans created by the function generation unit 102 is not used by the machine learning service. In this way, the feature synthesis function used by the machine learning service is estimated. This works because the inference data d' is created such that trans(d) = trans(d') for each piece of inference data d constituting the inference data set D; when the candidate feature synthesis function trans is identical to the true feature synthesis function trans_t, the inference result data R and the inference result data R' are identical.
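Taken together, the units 101 to 105 can be pictured as the following sketch, assuming a hypothetical `query_service` callable that stands in for the round trip to the machine learning service providing device 30 (all names here are illustrative assumptions, not part of the disclosure):

```python
import math

def estimate_candidate(D, make_d_prime, query_service, sigma=1e-9):
    """Return True if the candidate synthesis function may be in use.

    D            : inference data set (list of feature vectors d)
    make_d_prime : builds d' with trans(d) == trans(d') and d != d'
    query_service: sends a data set to the service, returns a list of
                   scalar inference results (one per piece of data)
    sigma        : similarity threshold used by the determination unit
    """
    R = query_service(D)                          # inference result data R
    D_prime = [make_d_prime(d) for d in D]        # converted data set D'
    R_prime = query_service(D_prime)              # inference result data R'
    # distance calculation unit: RMSE between R and R'
    dist = math.sqrt(sum((r - rp) ** 2 for r, rp in zip(R, R_prime)) / len(R))
    return dist <= sigma                          # determination unit
```

If `make_d_prime` implements a conversion that truly preserves the service's own synthesis function, R and R' coincide and the distance is zero; any other candidate will, in general, perturb the inference results.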
<Estimation process>
Next, the flow of the process by which the estimation device 10 according to the present embodiment estimates the feature synthesis function used by the machine learning service will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the estimation process according to the present embodiment. It is assumed that a trained machine learning model has been created by the machine learning service.
First, the inference result acquisition unit 101 transmits the inference data set D stored in the storage unit 106 to the machine learning service providing device 30 and, as the reply, acquires the inference result data R of the trained machine learning model for the inference data set D (step S101).
Hereinafter, for simplicity, the inference result of the trained machine learning model for a piece of inference data d is represented by a scalar value, and the inference result data R is represented by a one-dimensional vector. Thus, for example, when the inference data set D is composed of n pieces of inference data d, the inference result data R is represented by a one-dimensional vector composed of n elements (inference results).
Next, the function generation unit 102 generates a candidate trans for the feature synthesis function used by the machine learning service (step S102). The function generation unit 102 may, for example, randomly create the function trans from combinations of the four arithmetic operations over the features contained in the inference data d constituting the inference data set D.
For example, suppose a piece of inference data d is represented by a four-dimensional feature vector (x1, x2, x3, x4) having four features x1, x2, x3, and x4. In this case, the function generation unit 102 may generate, as a candidate trans for the feature synthesis function, a function that computes one of the four arithmetic operations on any two features and outputs a three-dimensional feature vector. Specifically, the following functions could be generated as trans:
・trans(x1, x2, x3, x4) = (x1 / x2, x3, x4)
・trans(x1, x2, x3, x4) = (x1, x2 × x3, x4)
・trans(x1, x2, x3, x4) = (x1, x2 + x3, x4)
・trans(x1, x2, x3, x4) = (x1, x2 − x3, x4)
Similarly, when generating, as a candidate trans for the feature synthesis function, a function that computes a combination of the four arithmetic operations on any three features and outputs a two-dimensional feature vector, the following functions could be generated as trans:
・trans(x1, x2, x3, x4) = (x1, x2 / x3 + x4)
・trans(x1, x2, x3, x4) = (x1, x2 × (x3 − x4))
・trans(x1, x2, x3, x4) = (x1 + x2 + x3, x4)
・trans(x1, x2, x3, x4) = (x1 × (x2 + x3), x4)
These functions are only examples, however; a function that computes any of the four arithmetic operations, or any combination thereof, on any number of features contained in the inference data d and outputs a feature vector of any dimension can be generated as trans. Besides the four arithmetic operations, functions that compute arbitrary operations (for example, logical operations, logarithmic transformations, exponential transformations, trigonometric functions, powers, or roots) may also be generated as trans. Furthermore, if the user has some knowledge about the inference data or the machine learning model (for example, knowledge that feature synthesis is performed between two particular features), candidate feature synthesis functions trans may be generated using that knowledge as well.
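As a sketch of one possible generator of such candidates (the embodiment prescribes no particular generation scheme; the pairwise reduction below is an illustrative assumption):

```python
import operator
import random

# the four arithmetic operations used to combine features
OPS = [("+", operator.add), ("-", operator.sub),
       ("*", operator.mul), ("/", operator.truediv)]

def random_pairwise_candidate(dim, rng):
    """Generate a candidate trans that combines two randomly chosen
    features with one of the four arithmetic operations, mapping a
    dim-dimensional feature vector to a (dim - 1)-dimensional one."""
    i, j = rng.sample(range(dim), 2)
    symbol, op = rng.choice(OPS)

    def trans(x):
        combined = op(x[i], x[j])
        rest = [v for k, v in enumerate(x) if k not in (i, j)]
        rest.insert(min(i, j), combined)  # keep the combined feature in place
        return tuple(rest)

    trans.description = "x%d %s x%d" % (i + 1, symbol, j + 1)
    return trans
```

Repeatedly calling such a generator, and testing each candidate against the service as described below, amounts to a random search over the space of pairwise synthesis functions.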
In the following, it is assumed that one candidate trans for the feature synthesis function has been generated in step S102 above. The input of the candidate feature synthesis function trans is also called the "feature vector", and its output the "synthesized feature vector".
Next, the conversion unit 103 converts each piece of inference data d without changing its synthesized feature vector, creating a new piece of inference data d' for each, and thereby creates the inference data set D' composed of these pieces of inference data d' (step S103). That is, the conversion unit 103 creates an inference data set D' such that trans(D) = trans(D') and D ≠ D'.
Here, the conversion that keeps the synthesized feature vector of a piece of inference data d unchanged depends on the candidate feature synthesis function trans. Below, assuming the inference data d is represented by a four-dimensional feature vector (x1, x2, x3, x4), examples of the conversion when trans performs a simple arithmetic operation are shown in FIGS. 5 to 8.
FIG. 5 shows the case where trans(x1, x2, x3, x4) = (x1 / x2, x3, x4). In this case, the vector (x1 / x2, 1, x3, x4) representing the Hadamard product (element-wise product) of the feature vector (x1, x2, x3, x4) of the inference data d and the transformation vector (1/x2, 1/x2, 1, 1) is used as the feature vector of the inference data d' having the same data ID as the inference data d. As a result, trans(D) = trans(D') and D ≠ D' hold.
FIG. 6 shows the case where trans(x1, x2, x3, x4) = (x1, x2 × x3, x4). In this case, the vector (x1, x2 × x3, 1, x4) representing the Hadamard product of the feature vector (x1, x2, x3, x4) of the inference data d and the transformation vector (1, x3, 1/x3, 1) is used as the feature vector of the inference data d' having the same data ID as the inference data d. As a result, trans(D) = trans(D') and D ≠ D' hold.
FIG. 7 shows the case where trans(x1, x2, x3, x4) = (x1, x2 + x3, x4). In this case, the vector (x1, x2 + x3, 0, x4) obtained by adding the transformation vector (0, x3, −x3, 0) to the feature vector (x1, x2, x3, x4) of the inference data d is used as the feature vector of the inference data d' having the same data ID as the inference data d. As a result, trans(D) = trans(D') and D ≠ D' hold.
FIG. 8 shows the case where trans(x1, x2, x3, x4) = (x1, x2 − x3, x4). In this case, the vector (x1, x2 − x3, 0, x4) obtained by adding the transformation vector (0, −x3, −x3, 0) to the feature vector (x1, x2, x3, x4) of the inference data d is used as the feature vector of the inference data d' having the same data ID as the inference data d. As a result, trans(D) = trans(D') and D ≠ D' hold.
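The conversions of FIGS. 5 to 8 share a common shape: an element-wise (Hadamard) product with a transformation vector in the multiplicative cases, and an element-wise addition in the additive cases. A sketch of two of them (function names are illustrative):

```python
def hadamard(d, t):
    """Element-wise product of feature vector d and transformation vector t."""
    return tuple(a * b for a, b in zip(d, t))

def shift(d, t):
    """Element-wise sum of feature vector d and transformation vector t."""
    return tuple(a + b for a, b in zip(d, t))

def d_prime_fig5(d):
    # trans(x1, x2, x3, x4) = (x1/x2, x3, x4): multiply by (1/x2, 1/x2, 1, 1)
    return hadamard(d, (1 / d[1], 1 / d[1], 1, 1))

def d_prime_fig7(d):
    # trans(x1, x2, x3, x4) = (x1, x2+x3, x4): add (0, x3, -x3, 0)
    return shift(d, (0, d[2], -d[2], 0))
```

For example, applying `d_prime_fig5` to d = (2, 4, 6, 8) yields (0.5, 1, 6, 8), and (0.5/1, 6, 8) equals (2/4, 6, 8), so the synthesized feature vector is unchanged while d ≠ d'.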
Returning to FIG. 4: following step S103, the inference result acquisition unit 101 transmits the inference data set D' created in step S103 to the machine learning service providing device 30 and, as the reply, acquires the inference result data R' of the trained machine learning model for the inference data set D' (step S104). Like the inference result data R, the inference result data R' is represented by a one-dimensional vector.
Hereinafter, assuming the inference data set D is composed of n pieces of inference data d with data IDs "1" to "n", the inference result data is written R = (r_1, r_2, ..., r_n), where r_k is the inference result of the trained machine learning model for the inference data d with data ID "k" (k = 1, ..., n). Similarly, letting r_k' be the inference result of the trained machine learning model for the inference data d' with data ID "k" (k = 1, ..., n), the inference result data is written R' = (r_1', r_2', ..., r_n').
Next, the distance calculation unit 104 calculates the similarity distance(R, R') between the inference result data R and the inference result data R' using a predetermined distance function distance (step S105). As the distance function distance, for example, the root mean square error (RMSE) or the mean absolute percentage error (MAPE) may be used. For example, when using the root mean square error, the distance function distance(R, R') can be calculated as follows.
distance(R, R') = √( (1/n) Σ_{k=1}^{n} (r_k − r_k')² )

Next, the determination unit 105 determines whether the similarity distance(R, R') calculated in step S105 is less than or equal to the predetermined threshold σ. When distance(R, R') ≤ σ, the determination unit 105 determines that the function trans generated in step S102 may be being used as the feature synthesis function by the machine learning service; when distance(R, R') > σ, it determines that the function trans is not used as the feature synthesis function by the machine learning service (step S106). Thus, when distance(R, R') ≤ σ, the feature synthesis function used by the machine learning service is estimated. This works because the inference data set D' is created such that trans(D) = trans(D') and D ≠ D': when trans = trans_t, then R = R' and distance(R, R') = 0, whereas when trans ≠ trans_t, then R ≠ R' and distance(R, R') > σ for some threshold σ.
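The RMSE distance and the threshold test of step S106 can be written directly; a minimal sketch (in practice the threshold σ would be tuned to the noise level of the service):

```python
import math

def distance_rmse(R, R_prime):
    """Root mean square error between two inference result vectors."""
    n = len(R)
    return math.sqrt(sum((r - rp) ** 2 for r, rp in zip(R, R_prime)) / n)

def may_be_in_use(R, R_prime, sigma):
    """Step S106: True if the candidate may be the synthesis function in use."""
    return distance_rmse(R, R_prime) <= sigma
```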
As described above, the estimation device 10 according to the present embodiment can estimate the feature synthesis function used by an external machine learning service by analyzing the output information (inference result data R) for an inference data set D input to that machine learning service and the output information (inference result data R') for a modified version of that inference data set. This makes it possible to reveal the feature synthesis function used by the machine learning service (that is, to reveal its feature engineering processing) and to improve the explainability of the inference results of the machine learning model.
The present invention is not limited to the specifically disclosed embodiment above, and various modifications and changes, combinations with known techniques, and the like are possible without departing from the scope of the claims.
1 Feature synthesis function estimation system
10 Estimation device
11 Input device
12 Display device
13 External I/F
13a Recording medium
14 Communication I/F
15 Processor
16 Memory device
17 Bus
20 User terminal
30 Machine learning service providing device
101 Inference result acquisition unit
102 Function generation unit
103 Conversion unit
104 Distance calculation unit
105 Determination unit
106 Storage unit
N Internet

Claims (6)

1.  An estimation device connected via a communication network to a machine learning service that performs feature synthesis on data containing one or more features and provides an inference result for a predetermined task, the estimation device comprising:
    a first acquisition unit that transmits, to the machine learning service, a first data set composed of data each containing one or more features, and acquires first inference result data indicating an inference result for each piece of data constituting the first data set;
    a generation unit that generates a candidate function for the feature synthesis function used for feature synthesis by the machine learning service;
    a creation unit that performs, based on the candidate function, a predetermined conversion on each piece of data constituting the first data set, to create a second data set composed of the converted data;
    a second acquisition unit that transmits the second data set to the machine learning service and acquires second inference result data indicating an inference result for each piece of data constituting the second data set; and
    an estimation unit that calculates a similarity between the first inference result data and the second inference result data and estimates whether the candidate function is used for feature synthesis by the machine learning service.
2.  The estimation device according to claim 1, wherein the creation unit performs the predetermined conversion on each piece of data constituting the first data set such that the output of the candidate function for each piece of data constituting the first data set is unchanged.
3.  The estimation device according to claim 1 or 2, wherein the estimation unit:
    calculates the similarity between the first inference result data and the second inference result data using a predetermined distance function;
    estimates that the candidate function is used for feature synthesis by the machine learning service when the similarity is less than or equal to a predetermined threshold; and
    estimates that the candidate function is not used for feature synthesis by the machine learning service when the similarity is greater than the predetermined threshold.
  4.  The estimation device according to any one of claims 1 to 3, wherein the generation unit randomly generates, as the candidate function, a function that performs a predetermined operation between the one or more features.
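One possible sketch of such random generation: pick two distinct feature indices and an arithmetic operation between them at random. The particular operation set (add, multiply, subtract) is an assumption for illustration.

```python
import operator
import random

def generate_candidate(num_features, rng=None):
    # Randomly pick two distinct feature indices and an arithmetic
    # operation between them (the operation set is an assumption).
    rng = rng or random.Random()
    ops = {"add": operator.add, "mul": operator.mul, "sub": operator.sub}
    i, j = rng.sample(range(num_features), 2)
    name, op = rng.choice(sorted(ops.items()))
    # Return both a human-readable spec and the callable candidate function.
    return (i, j, name), lambda row: op(row[i], row[j])

spec, f = generate_candidate(4, rng=random.Random(0))
print(spec, f([1.0, 2.0, 3.0, 4.0]))
```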
  5.  An estimation method in which an estimation device, connected via a communication network to a machine learning service that performs feature synthesis on data containing one or more features and provides an inference result for a predetermined task, executes:
    a first acquisition procedure of transmitting a first data set composed of data containing one or more features to the machine learning service and acquiring first inference result data indicating an inference result for each data item constituting the first data set;
    a generation procedure of generating a candidate function for the feature synthesis function used for feature synthesis by the machine learning service;
    a creation procedure of creating a second data set composed of the converted data by performing a predetermined conversion, based on the candidate function, on each data item constituting the first data set;
    a second acquisition procedure of transmitting the second data set to the machine learning service and acquiring second inference result data indicating an inference result for each data item constituting the second data set; and
    an estimation procedure of calculating a similarity between the first inference result data and the second inference result data and estimating whether or not the candidate function is used for feature synthesis by the machine learning service.
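The procedures above can be sketched end to end against a mock service. `MockService` is a hypothetical stand-in for the remote machine learning service (which would normally be queried over the network); here it secretly synthesizes the product of features 2 and 3, and the Euclidean distance and threshold are illustrative choices.

```python
import math
import random

class MockService:
    # Hypothetical service whose inference depends only on the
    # synthesized feature x[1] * x[2].
    def predict(self, dataset):
        return [x[1] * x[2] for x in dataset]

def estimate(service, dataset, convert, threshold=1e-6):
    first = service.predict(dataset)                         # first acquisition
    second = service.predict([convert(x) for x in dataset])  # second acquisition
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)))
    return dist <= threshold                                 # estimation

random.seed(0)
data = [[random.random() for _ in range(4)] for _ in range(10)]

# Conversion that preserves the candidate x[1] * x[2] (swap the factors).
swap = lambda x: [x[0], x[2], x[1], x[3]]
# Conversion that does not preserve it (shift one factor).
shift = lambda x: [x[0], x[1] + 1.0, x[2], x[3]]

assert estimate(MockService(), data, swap)       # candidate estimated in use
assert not estimate(MockService(), data, shift)  # candidate estimated unused
```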
  6.  A program that causes a computer to function as the estimation device according to any one of claims 1 to 4.
PCT/JP2020/032485 2020-08-27 2020-08-27 Estimation device, estimation method, and program WO2022044233A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/032485 WO2022044233A1 (en) 2020-08-27 2020-08-27 Estimation device, estimation method, and program

Publications (1)

Publication Number Publication Date
WO2022044233A1 true WO2022044233A1 (en) 2022-03-03

Family

ID=80352918

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/032485 WO2022044233A1 (en) 2020-08-27 2020-08-27 Estimation device, estimation method, and program

Country Status (1)

Country Link
WO (1) WO2022044233A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017529583A (en) * 2014-06-30 2017-10-05 アマゾン・テクノロジーズ・インコーポレーテッド Feature processing trade-off management
US10510022B1 (en) * 2018-12-03 2019-12-17 Sas Institute Inc. Machine learning model feature contribution analytic system

Similar Documents

Publication Publication Date Title
Pal et al. Practical time series analysis: master time series data processing, visualization, and modeling using python
Loper et al. SMPL: A skinned multi-person linear model
Desislavov et al. Compute and energy consumption trends in deep learning inference
Talagala et al. Meta-learning how to forecast time series
US11373117B1 (en) Artificial intelligence service for scalable classification using features of unlabeled data and class descriptors
Ajdari et al. An adaptive exploration-exploitation algorithm for constructing metamodels in random simulation using a novel sequential experimental design
Dong et al. Adaptive neural network-based approximation to accelerate eulerian fluid simulation
CN111401700A (en) Data analysis method, device, computer system and readable storage medium
KR102192949B1 (en) Apparatus and method for evaluating start-up companies using artifical neural network
US20230040564A1 (en) Learning Causal Relationships
US20210151128A1 (en) Learning Method, Mixing Ratio Prediction Method, and Prediction Device
CN111652453A (en) Intelligent workflow advisor for part design, simulation and manufacturing
Wheeler Bayesian additive adaptive basis tensor product models for modeling high dimensional surfaces: an application to high-throughput toxicity testing
CN115545114A (en) Training method of multi-task model, content recommendation method and device
WO2022044233A1 (en) Estimation device, estimation method, and program
CN116662538A (en) Text abstract generation method, device, equipment and medium based on multitask learning
US20230139396A1 (en) Using learned physical knowledge to guide feature engineering
Liang et al. Long-term RNN: Predicting hazard function for proactive maintenance of water mains
Leung et al. Theoretical and practical data science and analytics: challenges and solutions
Zhu et al. A hybrid model for nonlinear regression with missing data using quasilinear kernel
JP7544274B2 (en) Accumulation calculation device, accumulation calculation method, and program
Eisenhauer The approximate solution of finite‐horizon discrete‐choice dynamic programming models
Humphries et al. Machine Learning and ‘The Cloud’for Natural Resource Applications: Autonomous Online Robots Driving Sustainable Conservation Management Worldwide?
JP6142878B2 (en) Information system performance evaluation apparatus, method and program
Kokko et al. PYLFIRE: Python implementation of likelihood-free inference by ratio estimation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20951485

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20951485

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP