WO2023166564A1 - Estimation device - Google Patents

Estimation device

Info

Publication number
WO2023166564A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
result
inference
estimation
candidate data
Prior art date
Application number
PCT/JP2022/008617
Other languages
French (fr)
Japanese (ja)
Inventor
バトニヤマ エンケタイワン
光 土田
邦大 伊東
勇 寺西
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to PCT/JP2022/008617
Publication of WO2023166564A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • the present invention relates to an estimation device, an estimation method, and a recording medium.
  • a known technique is to estimate the data used during learning based on the output from a learning model for the purpose of risk assessment of a learning model learned using machine learning.
  • Non-Patent Document 1 describes a method of outputting a plausible attribute value by executing a predetermined process with known attributes and true labels of target data as inputs.
  • an estimated label to be output from a decision tree is calculated by fixing an unknown attribute to be estimated at a certain value.
  • the error function assumed is used to calculate the deviation between the true label and the estimated label, and the marginal probability is evaluated using the calculated deviation as a weight.
  • a likely attribute value is identified as a result of the above processing.
  • there is also a technique such as Non-Patent Document 2.
  • Patent Document 1 describes giving acquired data to a trained machine learning model, causing the trained machine learning model to perform predetermined inference, and thereby obtaining an inference result for the data.
  • In the techniques described in Non-Patent Document 1 and Non-Patent Document 2, it is necessary to assume the shape of the error function, to know the marginal probability, and so on.
  • In other words, various prior knowledge and assumptions are required for estimation. Therefore, there is a problem that data cannot be estimated accurately when the above knowledge is not available or the assumptions are not made.
  • an object of the present invention is to provide an estimating device, an estimating method, and a recording medium that can solve the above-described problems.
  • The estimating device, which is one aspect of the present disclosure, has a configuration including: an acquisition unit that acquires a plurality of inference results, each inferred as a result of inputting to a learning model one of a plurality of candidate data created based on information indicating candidates for an unknown attribute; a calculation unit that calculates, for each inference result acquired by the acquisition unit, a distance between the inference result and a label corresponding to the candidate data; and an estimation unit that estimates the value of the unknown attribute according to the result calculated by the calculation unit.
  • An estimation method, which is another aspect of the present disclosure, is configured such that an information processing device: acquires a plurality of inference results, each inferred as a result of inputting to a learning model one of a plurality of candidate data created based on information indicating candidates for an unknown attribute; calculates, for each acquired inference result, a distance between the inference result and a label corresponding to the candidate data; and estimates the value of the unknown attribute according to the calculated result.
  • A recording medium, which is another aspect of the present disclosure, is a computer-readable recording medium recording a program for causing an information processing device to realize processes of: acquiring a plurality of inference results, each inferred as a result of inputting to a learning model one of a plurality of candidate data created based on information indicating candidates for an unknown attribute; calculating, for each acquired inference result, a distance between the inference result and a label corresponding to the candidate data; and estimating the value of the unknown attribute according to the calculated result.
  • FIG. 2 is a block diagram showing a configuration example of a model storage device.
  • FIG. 3 is a block diagram showing a configuration example of a risk evaluation device.
  • FIG. 4 is a diagram showing an example of prior information.
  • FIG. 5 is a diagram showing another example of prior information.
  • FIG. 6 is a flowchart showing an operation example of the risk evaluation device during attribute estimation.
  • FIG. 7 is a flowchart showing an operation example of the risk evaluation device during risk evaluation.
  • FIG. 8 is a diagram showing another example of prior information.
  • FIG. 10 is a block diagram showing a configuration example of an estimation device.
  • FIG. 1 is a diagram showing a configuration example of a risk evaluation system 100.
  • FIG. 2 is a block diagram showing a configuration example of the model storage device 200.
  • FIG. 3 is a block diagram showing a configuration example of the risk evaluation device 300.
  • FIG. 4 is a diagram showing an example of the prior information 341.
  • FIG. 5 is a diagram showing another example of the prior information 341.
  • FIG. 6 is a flowchart showing an operation example of the risk evaluation device 300 during attribute estimation.
  • FIG. 7 is a flowchart showing an operation example of the risk evaluation device 300 during risk evaluation.
  • FIG. 8 is a diagram showing another example of the prior information 341.
  • This embodiment describes a risk assessment system 100 that estimates the values of missing attributes using known attributes. For example, suppose the risk assessment system 100 knows the values (x 2 , . . . , x d ) of some of the attributes (x 1 , x 2 , . . . , x d ) that make up the training data, and knows that an unknown attribute x 1 can take any of k values (v 11 , . . . , v 1k ). In such a case, the risk assessment system 100 assumes that the unknown attribute x 1 takes each of the values (v 11 , . . . , v 1k ) in turn and creates candidate data for each value.
  • the risk evaluation system 100 inputs each created candidate data to the learning model 241 and obtains an inference result corresponding to each candidate data. Then, the risk evaluation system 100 calculates the distance (for example, the residual) between each acquired inference result and the known label, and based on the calculation result, estimates which of (v 11 , . . . , v 1k ) the unknown attribute is. In this way, the risk evaluation system 100 described in the present embodiment estimates an unknown attribute value by calculating the distance between the known label and the inference result for candidate data created based on known knowledge. Moreover, the risk evaluation system 100 can evaluate the risk of leakage of training data based on the result of attribute value estimation.
  • the learning model 241 is generated by supervised learning using a plurality of training data.
  • For example, the learning model 241 is trained using a plurality of training data, each including a plurality of attributes and a label, so that it outputs a label indicating whether or not a patient is ill in response to the input of attributes such as gender, age, height, and weight.
  • attributes and labels are not limited to the above examples, and may be set arbitrarily. Any model such as a decision tree or a neural network may be used as the model trained using the training data.
  • An attribute can also be called an explanatory variable or a feature amount.
  • a label can also be called an objective variable.
  • the risk evaluation system 100 described in the present embodiment estimates unknown attributes when, for example, the learning model 241 is provided in a black box setting.
  • a model generated by machine learning may have a black box setting in which only the output for the input is disclosed to the user, and a white box setting in which model information such as the model structure and branching conditions are also disclosed.
  • the risk evaluation system 100 in this embodiment can estimate unknown attributes without using information disclosed by white box setting.
  • FIG. 1 shows a configuration example of the risk assessment system 100 in this embodiment.
  • the risk assessment system 100 has, for example, a risk assessment device 300 and a model storage device 200 .
  • the risk evaluation device 300 and the model storage device 200 are connected, for example, via a network or the like so that they can communicate with each other.
  • the model storage device 200 is an information processing device that stores a learning model 241 learned using training data.
  • FIG. 2 shows a configuration example of the model storage device 200 .
  • the model storage device 200 has a storage unit 240 in which a learning model 241 is stored, a receiving unit 210 , an inference unit 220 and an output unit 230 .
  • the model storage device 200 has an arithmetic device such as a CPU (Central Processing Unit) and a storage device.
  • the model storage device 200 can realize each of the above-described processing units by executing the program stored in the storage device by the arithmetic device.
  • the learning model 241 stored in the storage unit 240 is learned in advance using a plurality of training data including a plurality of attributes and labels.
  • the learning model 241 may be learned within the model storage device 200 or may be learned outside the model storage device 200 .
  • Receiving unit 210 receives candidate data, which will be described later, from risk evaluation device 300 .
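  • For example, the receiving unit 210 receives candidate data such as “v 11 , x 2 , . . . , x d ” and “v 12 , x 2 , . . . , x d ”, each of which combines the values of attributes known to the risk assessment device 300 with one candidate value of the unknown attribute.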
  • the receiving unit 210 receives from the risk assessment device 300 a number of pieces of candidate data corresponding to the number of unknown attribute candidates for the risk assessment device 300 .
  • the receiving unit 210 may receive information other than the above examples, such as identification information, together with the candidate data.
  • the inference unit 220 inputs each candidate data received by the reception unit 210 to the learning model 241 . As a result of the input, the inference unit 220 acquires an inference label, which is an inference result corresponding to each candidate data.
  • the output unit 230 transmits the inference label acquired by the inference unit 220 to the risk evaluation device 300 .
  • the output unit 230 may transmit the inference label to the risk assessment device 300 together with the identification information of the candidate data, so that it can be determined which candidate data each inference label is based on.
  • the model storage device 200 has a learning model 241 learned using training data. Also, upon receiving candidate data from the risk evaluation device 300, the model storage device 200 obtains an inference label corresponding to the candidate data by performing inference using the learning model 241 based on the received candidate data. The model storage device 200 then transmits the acquired inference label to the risk evaluation device 300 .
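  • The receive, infer, and output roles of the model storage device 200 can be sketched as follows. This is a minimal illustration, not the implementation in the source: the class name `ModelStorageDevice`, the method `infer`, the candidate IDs, and `toy_model` are all hypothetical, and the model is reduced to a plain callable so that only labels are visible to the caller (the black box setting).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical stand-in for learning model 241: any callable that maps an
# attribute tuple to a label. The source allows any model type.
Model = Callable[[Sequence[float]], float]


@dataclass
class ModelStorageDevice:
    """Illustrative black-box wrapper: callers see labels, never the model."""

    model: Model

    def infer(self, candidates: Dict[str, Tuple[float, ...]]) -> Dict[str, float]:
        # Receiving unit 210: candidate data keyed by identification information.
        # Inference unit 220: run the model on each candidate.
        # Output unit 230: return each inference label with its candidate ID.
        return {cid: self.model(x) for cid, x in candidates.items()}


def toy_model(x: Sequence[float]) -> float:
    # Assumed toy decision rule standing in for a trained model.
    return 1.0 if x[0] + x[1] > 1.5 else 0.0


device = ModelStorageDevice(model=toy_model)
labels = device.infer({"c1": (0.0, 1.0), "c2": (1.0, 1.0)})
```

  • Returning the labels keyed by candidate ID mirrors the option of sending identification information alongside each inference label.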
  • the risk evaluation device 300 is an information processing device that estimates the values of hidden attributes using known knowledge such as information about known attributes. Also, the risk assessment device 300 can perform risk assessment based on the estimation results.
  • FIG. 3 shows a configuration example of the risk evaluation device 300.
  • the risk assessment device 300 has, as main components, for example, an operation input unit 310, a screen display unit 320, a communication I/F unit 330, a storage unit 340, and an arithmetic processing unit 350.
  • FIG. 3 illustrates a case where the function of the risk evaluation device 300 is realized using one information processing device.
  • the risk evaluation device 300 may be implemented using a plurality of information processing devices, such as being implemented on a cloud.
  • For example, the functions of the risk evaluation device 300 may be realized by two information processing devices: an estimation device having the functions of the candidate data creation unit 351, the candidate data transmission unit 352, the inference result acquisition unit 353, the distance calculation unit 354, and the estimation unit 355, and an evaluation device having the functions of the evaluation unit 356 and the output unit 357.
  • the risk assessment device 300 may not include a part of the above-exemplified configuration such as having no operation input unit or screen display unit, or may have a configuration other than the above-exemplified configuration.
  • the operation input unit 310 consists of operation input devices such as a keyboard and a mouse.
  • the operation input unit 310 detects the operation of the operator who operates the risk evaluation device 300 and outputs it to the arithmetic processing unit 350 .
  • the screen display unit 320 consists of a screen display device such as an LCD (Liquid Crystal Display).
  • the screen display unit 320 can display various information stored in the storage unit 340 on the screen in accordance with instructions from the arithmetic processing unit 350 .
  • the communication I/F unit 330 consists of a data communication circuit and the like.
  • the communication I/F unit 330 performs data communication with an external device such as the model storage device 200 connected via a communication line.
  • the storage unit 340 is a storage device such as a hard disk or memory.
  • the storage unit 340 stores processing information and programs 345 necessary for various processes in the arithmetic processing unit 350 .
  • the program 345 realizes various processing units by being read and executed by the arithmetic processing unit 350 .
  • the program 345 is read in advance from an external device or recording medium via a data input/output function such as the communication I/F unit 330 and stored in the storage unit 340 .
  • Main information stored in the storage unit 340 includes, for example, prior information 341, inference result information 342, distance information 343, estimation information 344, and the like.
  • the prior information 341 includes previously known information about training data used during training of the learning model 241 stored in the model storage device 200 .
  • the prior information 341 is acquired in advance, for example by being obtained from an external device via the communication I/F unit 330 or by being input using the operation input unit 310, and is stored in the storage unit 340.
  • FIG. 4 shows an example of the prior information 341.
  • the prior information 341 includes partial training data information and missing attribute information.
  • the prior information 341 can include a plurality of pieces of information in which partial training data information and missing attribute information are associated with each other.
  • the partial training data information indicates known attribute values and corresponding labels in a state in which some attributes of the training data used for learning the learning model 241 are concealed (deleted).
  • FIG. 4 illustrates a case where attributes (x 2 , . . . , x d ) and label y are known and attribute x 1 is missing. Missing attribute information indicates information about the value of the missing attribute.
  • FIG. 4 shows that the missing attribute x 1 takes one of k values (v 11 , . . . , v 1k ).
  • missing attributes are, for example, categorical variables (discrete variables).
  • the prior information 341 can include information other than the information illustrated in FIG. 4.
  • FIG. 5 shows another example of the prior information 341.
  • the prior information 341 can include, in addition to the information exemplified above, information indicating the marginal probability that the missing attribute takes the value of each candidate.
  • the prior information 341 can include information indicating the marginal probabilities corresponding to each candidate (v 11 , . . . , v 1k ) for the unknown attribute x 1 , as shown in FIG. 5.
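  • One way to picture a single entry of the prior information 341 is a small record type. The sketch below is an assumption for illustration only: the class name `PriorRecord` and its field names do not come from the source, and the marginal probabilities are optional because they appear only in the FIG. 5 variant.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PriorRecord:
    """One entry of prior information 341 (all field names are hypothetical)."""

    known_attributes: Tuple[float, ...]  # partial training data: (x_2, ..., x_d)
    label: float                         # known label y
    candidates: Tuple[float, ...]        # missing attribute info: (v_11, ..., v_1k)
    # Optional marginal probability per candidate value (FIG. 5 variant).
    marginals: Optional[Dict[float, float]] = None

    def __post_init__(self) -> None:
        if not self.candidates:
            raise ValueError("at least one candidate value is required")
        if self.marginals is not None and set(self.marginals) != set(self.candidates):
            raise ValueError("marginals must cover exactly the candidate values")


record = PriorRecord(
    known_attributes=(35.0, 170.0),
    label=1.0,
    candidates=(0.0, 1.0),
    marginals={0.0: 0.4, 1.0: 0.6},
)
```

  • A list of such records would correspond to prior information that associates several pieces of partial training data with their missing attribute information.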
  • the prior information 341 may include information other than the above examples.
  • the inference result information 342 includes information indicating an inference label obtained by inputting, to the learning model 241, candidate data created by the candidate data creation unit 351 (described later) based on the prior information 341.
  • the inference result information 342 may include information indicating inference labels corresponding to the number of candidates for missing attributes.
  • the inference result information 342 is generated and updated in response to an inference label acquired from the model storage device 200 by an inference result acquisition unit 353 (to be described later).
  • the distance information 343 includes information indicating the result of the distance calculation unit 354 (described later) calculating the distance between the inference label included in the inference result information 342 and the label of the training data.
  • the distance information 343 may include information indicating distances according to the number of candidates for missing attributes.
  • the distance information 343 is generated and updated as the distance calculation unit 354 calculates the distance between the inference label and the label.
  • the estimation information 344 includes information indicating the result of estimation based on the distance information 343 by the estimation unit 355, which will be described later.
  • the estimation information 344 may include information indicating values of attributes estimated by the estimation unit 355 among unknown attribute candidates.
  • the estimation information 344 is generated and updated when the estimation unit 355 estimates a plausible value for an unknown attribute among the candidates based on the distance between the inference label and the label.
  • the arithmetic processing unit 350 has an arithmetic device such as a CPU and its peripheral circuits.
  • the arithmetic processing unit 350 reads the program 345 from the storage unit 340 and executes it, so that the hardware and the program 345 work together to realize various processing units.
  • Main processing units realized by the arithmetic processing unit 350 include, for example, a candidate data creation unit 351, a candidate data transmission unit 352, an inference result acquisition unit 353, a distance calculation unit 354, an estimation unit 355, an evaluation unit 356, and an output unit 357.
  • the candidate data creation unit 351 creates candidate data based on the prior information 341. For example, the candidate data creation unit 351 creates candidate data according to the number of candidates indicated by the missing attribute information. The candidate data creation unit 351 may create candidate data at any timing.
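  • For example, suppose that, as illustrated in FIG. 4, partial training data information containing the known attributes (x 2 , . . . , x d ) and the label y is stored together with missing attribute information indicating that x 1 takes one of (v 11 , . . . , v 1k ).
  • In this case, the candidate data creation unit 351 assumes that the unknown attribute x 1 takes each value of (v 11 , . . . , v 1k ) in turn, and creates candidate data accordingly. That is, the candidate data creation unit 351 creates candidate data (v 11 , x 2 , . . . , x d ), . . . , (v 1k , x 2 , . . . , x d ).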
  • the prior information 341 can include a plurality of pieces of information in which partial training data information and missing attribute information are associated with each other.
  • the candidate data creation unit 351 may create candidate data using the method described above for each of the associated information.
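  • The candidate creation described above amounts to placing each candidate value in the position of the missing attribute. A minimal sketch, assuming the missing attribute is the first one (x 1 ) as in FIG. 4; the function name `create_candidates` is hypothetical.

```python
from typing import List, Sequence, Tuple


def create_candidates(
    known: Sequence[float], values: Sequence[float]
) -> List[Tuple[float, ...]]:
    """Return one candidate per value v_1i, laid out as (v_1i, x_2, ..., x_d)."""
    return [(v, *known) for v in values]


# Known attributes (x_2, x_3) and three candidate values for x_1.
candidates = create_candidates(known=(35.0, 170.0), values=(0.0, 1.0, 2.0))
```

  • The number of candidates equals the number of values in the missing attribute information, so k values yield k queries to the learning model.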
  • the candidate data transmission unit 352 transmits the candidate data created by the candidate data creation unit 351 to the model storage device 200 .
  • the candidate data transmission unit 352 may transmit, together with the candidate data, identification information of the candidate data according to the partial training data information used when creating the candidate data.
  • the inference result acquisition unit 353 receives and acquires an inference label from the model storage device 200 as a result of inference based on candidate data. For example, the inference result acquisition unit 353 acquires the inference label from the model storage device 200 together with the identification information so that the inference target candidate data can be identified. The inference result acquisition unit 353 also stores the received inference label as the inference result information 342 in the storage unit 340 . The inference result acquisition unit 353 may store the inference label in the storage unit 340 together with the identification information of the corresponding candidate data.
  • the distance calculation unit 354 calculates the distance between the inference label and the label included in the partial training data information from which the inferred candidate data was created. That is, the distance calculation unit 354 calculates the distance between the inference label and the label corresponding to the inference target candidate data. Further, the distance calculation unit 354 stores the calculated distance in the storage unit 340 as the distance information 343 . The distance calculation unit 354 may store the calculation result in the storage unit 340 together with the identification information of the corresponding candidate data.
  • the distance calculation unit 354 calculates a residual between the inference label and the label as the distance between the inference label and the label. For example, let the label be denoted as y and the inference label be denoted as number 1. In this case, the distance calculation unit 354 calculates the residual between the inference label and the label by calculating Equation 2, which will be described later. Note that i takes any value from 1 to k.
  • In this way, the distance calculation unit 354 calculates the residual between the inference label and the label as the distance between the inference label and the label.
  • the distance calculation unit 354 may be configured to calculate the distance between the inference label and the label using a known method other than the exemplified one, such as calculating a value that is twice the value of Equation 2 as the distance.
  • the estimation unit 355 estimates, among the candidates, the value that the unknown attribute is likely to take.
  • the estimation unit 355 also stores the result of estimation in the storage unit 340 as estimation information 344 .
  • the estimation unit 355 identifies a candidate with the smallest distance based on the distance information 343, and estimates a value according to the identified result. Specifically, for example, the estimation unit 355 identifies i′ by solving Equation 3 below. Then, v 1i' corresponding to the specified i' is output as a plausible attribute value. Note that i' takes any value from 1 to k.
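  • The formulas referred to as number 1, Equation 2, and Equation 3 are not present in this text. One reading that is consistent with the surrounding description (a residual used as the distance, and the candidate with the smallest distance selected) is sketched below. The symbols f for the learning model 241, \hat{y}_i for the inference label of the i-th candidate, and r_i for its residual are assumptions, as is the choice of the absolute difference as the residual.

```latex
\hat{y}_i = f(v_{1i}, x_2, \ldots, x_d), \qquad
r_i = \lvert y - \hat{y}_i \rvert, \qquad
i' = \operatorname*{arg\,min}_{1 \le i \le k} r_i
```

  • Under this reading, v 1i' is the candidate whose inference label lies closest to the known label y.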
  • When there are a plurality of i' that give the smallest distance, the estimating unit 355 can select one of the plurality of i' at random, and output v 1i' according to the selected result.
  • the prior information 341 may include information indicating marginal probabilities.
  • the estimator 355 may select one of the multiple i' based on marginal probabilities. For example, the estimating unit 355 can select i′ having the maximum marginal probability among a plurality of i′. Also, the estimation unit 355 may be configured to select i' with a probability corresponding to the marginal probability.
  • In this way, when there are a plurality of i', the estimating unit 355 may be configured to select one of them by an arbitrary method and to output v 1i' according to the selected result.
  • the estimator 355 may be configured to output a plurality of v 1i ' corresponding to each of the plurality of i'.
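  • The tie-breaking options described above (random choice, maximum marginal probability, or sampling in proportion to the marginal probability) can be sketched as follows. The function names, the `strategy` strings, and the floating-point tolerance `tol` are hypothetical choices made for this illustration.

```python
import random
from typing import List, Optional, Sequence


def minimisers(distances: Sequence[float], tol: float = 1e-12) -> List[int]:
    """Indices i' whose distance is within tol of the smallest distance."""
    best = min(distances)
    return [i for i, d in enumerate(distances) if d - best <= tol]


def pick(
    tied: Sequence[int],
    marginals: Optional[Sequence[float]] = None,
    strategy: str = "random",
    rng: Optional[random.Random] = None,
) -> int:
    """Select one index from the tied minimisers using the given strategy."""
    rng = rng or random.Random()
    if len(tied) == 1:
        return tied[0]
    if strategy == "random":
        return rng.choice(list(tied))
    if marginals is None:
        raise ValueError("marginal probabilities are required for this strategy")
    if strategy == "max_marginal":
        # Candidate with the largest marginal probability among the ties.
        return max(tied, key=lambda i: marginals[i])
    if strategy == "sample_marginal":
        # Sample with probability proportional to the marginal probability.
        weights = [marginals[i] for i in tied]
        return rng.choices(list(tied), weights=weights, k=1)[0]
    raise ValueError(f"unknown strategy: {strategy}")


distances = [1.0, 0.0, 0.0]   # candidates 1 and 2 tie for the smallest distance
marginals = [0.2, 0.3, 0.5]
tied = minimisers(distances)
chosen = pick(tied, marginals, strategy="max_marginal")
```

  • Returning `tied` itself, instead of calling `pick`, corresponds to the variant that outputs every v 1i' among the minimisers.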
  • the evaluation unit 356 performs evaluation based on the estimation information 344. In other words, the evaluation unit 356 performs risk evaluation based on the result of estimation by the estimation unit 355 .
  • the evaluation unit 356 has correct answer information, which is information indicating the actual value of the unknown attribute indicated by the prior information 341. For example, in the case of FIG. 4, the evaluation unit 356 has correct answer information indicating which value of (v 11 , . . . , v 1k ) x 1 is.
  • the evaluation unit 356 can compare the result of estimation by the estimation unit 355 and the actual value indicated by the correct answer information, and perform risk evaluation based on the comparison result. For example, the evaluation unit 356 can evaluate that the risk is high when the result of estimation by the estimation unit 355 and the actual value indicated by the correct answer information match. On the other hand, when the result of estimation by the estimation unit 355 and the actual value indicated by the correct answer information do not match, the evaluation unit 356 can evaluate that the risk is low.
  • the prior information 341 includes a plurality of pieces of information in which partial training data information and missing attribute information are associated with each other. Therefore, the estimating unit 355 can estimate a candidate for each of the associated information. Therefore, for example, the evaluation unit 356 may perform risk evaluation based on a comparison result between a plurality of estimation results by the estimation unit 355 and correct information corresponding to each estimation. Specifically, for example, the evaluation unit 356 calculates the percentage of correct answers indicating the percentage of matches between the estimation results and the correct information, according to the results of a plurality of comparisons. Then, the evaluation unit 356 can output, for example, the calculated percentage of correct answers as the information indicating the risk. The evaluation unit 356 may be configured to evaluate risk according to whether or not the calculated percentage of correct answers exceeds a predetermined threshold, and output the evaluation result.
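  • The percentage of correct answers and the threshold comparison can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, "exceeds" is read as strictly greater than, a rate above the threshold is treated as high risk (consistent with a match being evaluated as high risk), and the threshold value 0.5 is arbitrary.

```python
from typing import Sequence


def correct_rate(estimates: Sequence[float], truths: Sequence[float]) -> float:
    """Fraction of estimation results that match the correct answer information."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("need equally sized, non-empty sequences")
    hits = sum(1 for e, t in zip(estimates, truths) if e == t)
    return hits / len(estimates)


def is_high_risk(rate: float, threshold: float) -> bool:
    # Assumption: a correct rate strictly above the threshold means high risk.
    return rate > threshold


rate = correct_rate(estimates=[0, 1, 1, 2], truths=[0, 1, 2, 2])
high = is_high_risk(rate, threshold=0.5)
```

  • Here three of four estimates match, so the rate is 0.75 and the example is flagged as high risk under the assumed threshold.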
  • the output unit 357 outputs information indicating candidates estimated by the estimation unit 355, information indicating evaluation results by the evaluation unit 356, and the like. For example, the output unit 357 displays each of the above information on the screen display unit 320 or transmits the information to an external device via the communication I/F unit 330 .
  • The above is a configuration example of the risk evaluation device 300. Next, an operation example of the risk assessment device 300 will be described with reference to FIGS. 6 and 7.
  • FIG. 6 is a flowchart showing an operation example of the risk evaluation device 300 when estimating an unknown attribute.
  • the candidate data creating unit 351 creates candidate data based on the prior information 341 (step S101). For example, the candidate data creation unit 351 creates candidate data according to the number of candidates indicated by the missing attribute information.
  • the candidate data transmission unit 352 transmits each candidate data created by the candidate data creation unit 351 to the model storage device 200 (step S102).
  • the inference result acquisition unit 353 acquires an inference label for each candidate data from the model storage device 200 as an inference result based on the candidate data (step S103).
  • the distance calculation unit 354 calculates the distance between the inference label and the training label indicated by the corresponding partial training data information (step S104). For example, the distance calculation unit 354 calculates the residual between each received inference label and the label as the distance.
  • the estimation unit 355 estimates a plausible value for the unknown attribute among the candidates (step S105). For example, the estimation unit 355 identifies a candidate with the smallest distance based on the distance information 343, and estimates a value according to the identified result.
  • the above is an operation example of the risk evaluation device 300 at the time of attribute estimation.
  • the risk evaluation device 300 can perform the processing from step S101 to step S105 for each target to be estimated.
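  • Steps S101 to S105 can be strung together in a few lines. The sketch below is a toy illustration, not the implementation in the source: `estimate_attribute` and `toy_model` are hypothetical, the model is queried as a local callable in place of communication with the model storage device 200, the absolute difference is assumed as the residual, and ties are resolved by taking the first minimiser.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]


def estimate_attribute(
    query: Model, known: Sequence[float], label: float, values: Sequence[float]
) -> Tuple[float, List[float]]:
    """Run S101 to S105 for one estimation target; return (estimate, distances)."""
    candidates = [(v, *known) for v in values]                 # S101: create candidates
    inferred = [query(c) for c in candidates]                  # S102, S103: query, get labels
    distances = [abs(label - y_hat) for y_hat in inferred]     # S104: residual as distance
    best = min(range(len(values)), key=distances.__getitem__)  # S105: smallest distance
    return values[best], distances


def toy_model(x: Sequence[float]) -> float:
    # Assumed toy regression standing in for learning model 241.
    return 2.0 * x[0] + x[1]


value, distances = estimate_attribute(
    toy_model, known=(1.0,), label=5.0, values=(0.0, 1.0, 2.0)
)
```

  • Calling `estimate_attribute` once per entry of the prior information corresponds to repeating the steps for each estimation target.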
  • FIG. 7 is a flowchart showing an operation example of the risk evaluation device 300 during risk evaluation.
  • the risk evaluation device 300 performs the process of estimating unknown attributes described with reference to FIG. 6 (step S201).
  • When an estimation target remains in the prior information 341 (step S202, No), the risk evaluation device 300 returns to the process of step S201 and performs the estimation process. On the other hand, when no estimation target remains in the prior information 341 (step S202, Yes), the risk evaluation device 300 performs risk evaluation according to each estimation result (step S203). For example, the risk assessment device 300 can calculate the percentage of correct answers by comparing the result of each estimation with the corresponding correct answer information, and produce output according to the calculated percentage.
  • step S203 does not necessarily have to be performed continuously after the processes of steps S201 and S202.
  • the process of step S203 may be performed at any timing after the processes of steps S201 and S202.
  • the risk evaluation device 300 has the distance calculation unit 354 and the estimation unit 355.
  • Based on the distance between the inference label and the label calculated by the distance calculation unit 354, the estimation unit 355 can estimate, among the candidates, a plausible value for the unknown attribute. That is, according to the above configuration, unknown attribute values can be estimated without assuming the shape of an error function or knowing marginal probabilities. As a result, the data can be estimated more accurately even when such knowledge is unavailable or such assumptions are not made, that is, even when no prior knowledge is assumed.
  • the present embodiment has exemplified the case where there is one unknown attribute x 1 .
  • the present invention can be applied without problems even when there are multiple unknown attributes.
  • FIG. 8 shows an example of prior information 341 when there are multiple unknown attributes from x1 to xn .
  • FIG. 8 illustrates a case where attributes (x n+1 , . . . , x d ) and label y are known and attributes (x 1 , . . . , x n ) are missing.
  • the missing attribute information indicates information about the value of each missing attribute.
  • the prior information 341 may include information indicating the marginal probability of each candidate even when there are a plurality of unknown attributes.
  • In this case, the candidate data creation unit 351 assumes that each unknown attribute takes one of its candidates, and creates a number of candidate data corresponding to the combinations of the unknown attribute candidates. From the candidate data transmission unit 352 onward, processing can be performed in the same manner as when there is one unknown attribute. For example, when there are a plurality of i', the estimating unit 355 may select one of them at random, as in the case where there is one unknown attribute, or may select one according to the marginal probabilities. As described above, even when there are a plurality of unknown attributes, the unknown attribute values can be estimated by the same processing as when there is one unknown attribute, except that the number of candidate data created by the candidate data creation unit 351 increases.
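  • With several unknown attributes, the candidates are the combinations of the per-attribute candidate values. A minimal sketch using the standard library's `itertools.product`, assuming the unknown attributes occupy the leading positions (x 1 , . . . , x n ) as in FIG. 8; the function name `create_joint_candidates` is hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def create_joint_candidates(
    candidate_sets: Sequence[Sequence[float]], known: Sequence[float]
) -> List[Tuple[float, ...]]:
    """One candidate per combination of values for the unknown attributes."""
    return [(*combo, *known) for combo in product(*candidate_sets)]


# Two unknown attributes with 2 and 3 candidate values, one known attribute.
joint = create_joint_candidates(
    candidate_sets=[(0.0, 1.0), (10.0, 20.0, 30.0)], known=(170.0,)
)
```

  • The number of candidates is the product of the candidate counts (here 2 × 3 = 6), which is why the query count grows with the number of unknown attributes.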
  • the case where the risk evaluation system 100 has the model storage device 200 and the risk evaluation device 300 has been exemplified.
  • the risk evaluation system 100 may be composed of, for example, one information processing device having the functions of the model storage device 200 and the risk evaluation device 300 described in this embodiment.
  • Risk assessment system 100 may employ other known variations.
  • FIG. 9 is a diagram illustrating a hardware configuration example of the estimation device 400.
  • FIG. 10 is a block diagram showing a configuration example of the estimation device 400.
  • FIG. 9 shows a hardware configuration example of the estimation device 400.
  • the estimating device 400 has the following hardware configuration as an example.
  • the estimating apparatus 400 can realize the functions of the acquiring unit 421, the calculating unit 422, and the estimating unit 423 shown in FIG.
  • the program group 404 is stored in the storage device 405 or the ROM 402 in advance, for example, and is loaded into the RAM 403 or the like by the CPU 401 as necessary and executed.
  • the program group 404 may be supplied to the CPU 401 via the communication network 411, or may be stored in the recording medium 410 in advance, in which case the drive device 406 reads the program and supplies it to the CPU 401.
  • FIG. 9 shows a hardware configuration example of the estimation device 400.
  • the hardware configuration of estimation device 400 is not limited to the case described above.
  • the estimating device 400 may be configured from part of the configuration described above, such as not having the drive device 406 .
  • the acquisition unit 421 acquires a plurality of inference results that are respectively inferred as a result of inputting a plurality of candidate data created based on information indicating unknown attribute candidates to the learning model.
  • the calculation unit 422 calculates, for each inference result, the distance between the inference result acquired by the acquisition unit 421 and the label corresponding to the candidate data. For example, the calculation unit 422 calculates the residual as the distance.
  • the estimation unit 423 estimates the unknown attribute value according to the result calculated by the calculation unit 422.
  • the estimation device 400 has the calculation unit 422 and the estimation unit 423.
  • the estimation unit 423 can estimate the value of the unknown attribute based on the distances calculated by the calculation unit 422. That is, according to the above configuration, unknown attribute values can be estimated without assuming an error function or knowing marginal probabilities. As a result, the data can be estimated more accurately even when the user lacks such knowledge or makes no such assumptions, that is, even when no prior knowledge is assumed.
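As a rough illustration, the three units of the estimation device 400 can be sketched as three functions. The sketch assumes numeric labels, uses the absolute residual as the distance, and assumes the estimate is the candidate with the smallest residual (the first such candidate if several tie); the function names, the toy model, and the sample data are hypothetical.

```python
def acquire_inference_results(model, candidate_data):
    """Acquisition step: query the learning model once per candidate record."""
    return [model(record) for record in candidate_data]


def calculate_distances(inference_results, label):
    """Calculation step: absolute residual between each inference result and the label."""
    return [abs(result - label) for result in inference_results]


def estimate_unknown_value(candidate_values, distances):
    """Estimation step: choose the candidate whose inference result is closest to the label."""
    best = min(range(len(distances)), key=distances.__getitem__)
    return candidate_values[best]


def toy_model(record):
    # Hypothetical black-box model: only its output is visible to the caller.
    return 0.9 if record["x1"] == "smoker" and record["age"] > 50 else 0.1


candidate_values = ["smoker", "non-smoker"]  # candidates for the unknown attribute
candidate_data = [{"x1": v, "age": 62} for v in candidate_values]
label = 1.0                                  # known label of the target record

results = acquire_inference_results(toy_model, candidate_data)
distances = calculate_distances(results, label)
estimated = estimate_unknown_value(candidate_values, distances)
print(estimated)
```

No error function or marginal probability appears anywhere in the sketch; only the model outputs and the known label are used.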
  • the estimation device 400 described above can be realized by installing a predetermined program in an information processing device such as the estimation device 400.
  • a program that is another aspect of the present invention is a program for causing an information processing device such as the estimation device 400 to realize processing of: acquiring a plurality of inference results respectively inferred as a result of inputting, to a learning model, a plurality of pieces of candidate data created based on information indicating unknown attribute candidates; calculating, for each inference result, the distance between the acquired inference result and the label corresponding to the candidate data; and estimating the value of the unknown attribute according to the calculated result.
  • the estimation method executed by an information processing device such as the estimation device 400 is a method in which the information processing device acquires a plurality of inference results respectively inferred as a result of inputting, to a learning model, a plurality of pieces of candidate data created based on information indicating unknown attribute candidates; calculates, for each inference result, the distance between the acquired inference result and the label corresponding to the candidate data; and estimates the value of the unknown attribute according to the calculated result.
  • (Appendix 1) An estimation device comprising: an acquisition unit that acquires a plurality of inference results respectively inferred as a result of inputting, to a learning model, a plurality of pieces of candidate data created based on information indicating unknown attribute candidates; a calculation unit that calculates, for each inference result, a distance between the inference result acquired by the acquisition unit and a label corresponding to the candidate data; and an estimation unit that estimates the value of an unknown attribute according to the result calculated by the calculation unit.
  • the estimation device, wherein the calculation unit calculates a residual between the inference result and the label as the distance between the inference result and the label, and the estimation unit estimates the value of the unknown attribute according to the residual calculated by the calculation unit.
  • (Appendix 9) The estimation method according to Appendix 8, comprising: calculating a residual between the inference result and the label as the distance between the inference result and the label; and estimating the value of the unknown attribute according to the calculated residual.
  • (Appendix 10) A computer-readable recording medium recording a program for causing an information processing device to realize processing of: acquiring a plurality of inference results respectively inferred as a result of inputting, to a learning model, a plurality of pieces of candidate data created based on information indicating unknown attribute candidates; calculating, for each inference result, a distance between the acquired inference result and a label corresponding to the candidate data; and estimating the value of an unknown attribute according to the calculated result.
  • 100 Risk evaluation system; 200 Model storage device; 210 Reception unit; 220 Inference unit; 230 Output unit; 240 Storage unit; 241 Learning model; 300 Risk evaluation device; 310 Operation input unit; 320 Screen display unit; 330 Communication I/F unit; 340 Storage unit; 341 Prior information; 342 Inference result information; 343 Distance information; 344 Estimation information; 350 Arithmetic processing unit; 351 Candidate data creation unit; 352 Candidate data transmission unit; 353 Inference result acquisition unit; 354 Distance calculation unit; 355 Estimation unit; 356 Evaluation unit; 357 Output unit; 400 Estimation device; 401 CPU; 402 ROM; 403 RAM; 404 Program group; 405 Storage device; 406 Drive device; 407 Communication interface; 408 Input/output interface; 409 Bus; 410 Recording medium; 411 Communication network; 421 Acquisition unit; 422 Calculation unit; 423 Estimation unit

Abstract

An estimation device 400 has: an acquisition unit 421 that acquires a plurality of inference results that are respectively inferred as a result of inputting, to a learning model, multiple pieces of candidate data created on the basis of information indicating an unknown attribute candidate; a calculation unit 422 that for each inference result calculates the distance between the inference result acquired by the acquisition unit 421 and a label corresponding to the candidate data; and an estimation unit 423 that estimates an unknown attribute value in accordance with the result calculated by the calculation unit 422.

Description

Estimation device
The present invention relates to an estimation device, an estimation method, and a recording medium.
A known technique estimates the data used during training from the output of a learning model, for purposes such as risk evaluation of a learning model trained using machine learning or the like.
For example, Non-Patent Document 1 describes a method that takes the known attributes and the true label of target data as inputs and outputs a plausible attribute value by executing predetermined processing. According to Non-Patent Document 1, the unknown attribute to be estimated is fixed at a certain value, and the estimated label output by a decision tree is calculated. An assumed error function is then used to calculate the deviation between the true label and the estimated label, and the marginal probability is evaluated using the calculated deviation as a weight. As a result of such processing, a plausible attribute value is identified. Non-Patent Document 2, for example, describes a technique related to Non-Patent Document 1.
As a document describing machine learning, there is, for example, Patent Document 1. Patent Document 1 describes giving acquired data to a trained machine learning model, causing the trained machine learning model to perform predetermined inference, and thereby obtaining an inference result for the data.
International Publication No. 2021/014878
With the techniques described in Non-Patent Document 1 and Non-Patent Document 2, estimation requires various prior knowledge and assumptions: for example, the form of the error function must be assumed, and the marginal probabilities must be known. Consequently, there has been a problem that data may not be estimated accurately when such knowledge is unavailable or such assumptions are not made.
Therefore, an object of the present invention is to provide an estimation device, an estimation method, and a recording medium that can solve the above-described problem.
In order to achieve this object, an estimation device according to one aspect of the present disclosure has:
an acquisition unit that acquires a plurality of inference results respectively inferred as a result of inputting, to a learning model, a plurality of pieces of candidate data created based on information indicating unknown attribute candidates;
a calculation unit that calculates, for each inference result, a distance between the inference result acquired by the acquisition unit and a label corresponding to the candidate data; and
an estimation unit that estimates the value of an unknown attribute according to the result calculated by the calculation unit.
In an estimation method according to another aspect of the present disclosure, an information processing device:
acquires a plurality of inference results respectively inferred as a result of inputting, to a learning model, a plurality of pieces of candidate data created based on information indicating unknown attribute candidates;
calculates, for each inference result, a distance between the acquired inference result and a label corresponding to the candidate data; and
estimates the value of an unknown attribute according to the calculated result.
A recording medium according to another aspect of the present disclosure is a computer-readable recording medium recording a program for causing an information processing device to realize processing of:
acquiring a plurality of inference results respectively inferred as a result of inputting, to a learning model, a plurality of pieces of candidate data created based on information indicating unknown attribute candidates;
calculating, for each inference result, a distance between the acquired inference result and a label corresponding to the candidate data; and
estimating the value of an unknown attribute according to the calculated result.
According to each of the configurations described above, it is possible to provide an estimation device, an estimation method, and a recording medium capable of accurately estimating data.
FIG. 1 is a diagram showing a configuration example of the risk evaluation system in the first embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration example of the model storage device.
FIG. 3 is a block diagram showing a configuration example of the risk evaluation device.
FIG. 4 is a diagram showing an example of prior information.
FIG. 5 is a diagram showing another example of prior information.
FIG. 6 is a flowchart showing an operation example of the risk evaluation device during attribute estimation.
FIG. 7 is a flowchart showing an operation example of the risk evaluation device during risk evaluation.
FIG. 8 is a diagram showing another example of prior information.
FIG. 9 is a diagram showing a hardware configuration example of the estimation device in the second embodiment of the present disclosure.
FIG. 10 is a block diagram showing a configuration example of the estimation device.
[First embodiment]
A first embodiment of the present disclosure will be described with reference to FIGS. 1 to 8. FIG. 1 is a diagram showing a configuration example of a risk evaluation system 100. FIG. 2 is a block diagram showing a configuration example of the model storage device 200. FIG. 3 is a block diagram showing a configuration example of the risk evaluation device 300. FIG. 4 is a diagram showing an example of the prior information 341. FIG. 5 is a diagram showing another example of the prior information 341. FIG. 6 is a flowchart showing an operation example of the risk evaluation device 300 during attribute estimation. FIG. 7 is a flowchart showing an operation example of the risk evaluation device 300 during risk evaluation. FIG. 8 is a diagram showing another example of the prior information 341.
The first embodiment of the present disclosure describes a risk evaluation system 100 that, when some of the attributes constituting the training data used to train the learning model 241 are missing, for example because they are concealed, estimates the values of the missing attributes using the known attributes. For example, suppose the risk evaluation system 100 knows the values (x_2, ..., x_d) of some of the attributes (x_1, x_2, ..., x_d) constituting the training data, and also knows that the unknown attribute x_1 can take any one of k values (v_11, ..., v_1k). In this case, the risk evaluation system 100 assumes that the unknown attribute x_1 takes each of the values (v_11, ..., v_1k) in turn and creates candidate data corresponding to each value. The risk evaluation system 100 then inputs each piece of created candidate data to the learning model 241 and acquires the inference result corresponding to each piece of candidate data. Finally, the risk evaluation system 100 calculates the distance (for example, the residual) between each acquired inference result and the known label, and estimates, based on the calculation results, which of the values (v_11, ..., v_1k) the unknown attribute takes.
In this way, the risk evaluation system 100 described in the present embodiment estimates an unknown attribute value by calculating the distance between the known label and the inference result for candidate data created from known knowledge. The risk evaluation system 100 can also perform risk evaluation, for example of the risk that training data will leak, based on the result of the attribute value estimation.
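One way to write this estimation compactly is shown below. The notation is an assumption introduced for illustration: f denotes the learning model 241 viewed as a black box, y the known label, the absolute residual is used as one possible distance, and choosing the candidate with the smallest distance is assumed as the selection rule.

```latex
\hat{x}_1 = v_{1i'}, \qquad
i' \in \operatorname*{arg\,min}_{j \in \{1, \dots, k\}}
\bigl| f(v_{1j}, x_2, \dots, x_d) - y \bigr|
```

Only k queries to f and the known label y enter this expression; no error function or marginal probability is required.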
In this embodiment, the learning model 241 is assumed to have been generated by supervised learning using a plurality of pieces of training data. For example, the learning model 241 has been trained, using a plurality of pieces of training data each including a plurality of attributes and a label, to output a label indicating, for example, whether or not a person is ill in response to the input of a plurality of attributes such as gender, age, height, and weight. The attributes and labels are not limited to these examples and may be set arbitrarily. The model trained using the training data may be of any type, such as a decision tree or a neural network. An attribute may also be called an explanatory variable or a feature. A label may also be called an objective variable.
The risk evaluation system 100 described in the present embodiment estimates unknown attributes when, for example, the learning model 241 is used in a black-box setting. A model generated by machine learning may be provided in a black-box setting, in which only the output for a given input is disclosed to the user, or in a white-box setting, in which model information such as the model structure and branching conditions is also disclosed. As will be described later, the risk evaluation system 100 in this embodiment can estimate unknown attributes without using the information disclosed in a white-box setting.
FIG. 1 shows a configuration example of the risk evaluation system 100 in this embodiment. Referring to FIG. 1, the risk evaluation system 100 has, for example, a risk evaluation device 300 and a model storage device 200. As shown in FIG. 1, the risk evaluation device 300 and the model storage device 200 are connected, for example via a network, so that they can communicate with each other.
The model storage device 200 is an information processing device that stores a learning model 241 trained using training data. FIG. 2 shows a configuration example of the model storage device 200. Referring to FIG. 2, the model storage device 200 has, for example, a storage unit 240 in which the learning model 241 is stored, a receiving unit 210, an inference unit 220, and an output unit 230. For example, the model storage device 200 has an arithmetic device such as a CPU (Central Processing Unit) and a storage device, and realizes each of the above processing units by having the arithmetic device execute a program stored in the storage device.
As shown in FIG. 2, the learning model 241 stored in the storage unit 240 has been trained in advance using a plurality of pieces of training data each including a plurality of attributes and a label. The learning model 241 may have been trained inside or outside the model storage device 200.
The receiving unit 210 receives candidate data, described later, from the risk evaluation device 300. For example, the receiving unit 210 receives training data that includes the values of attributes known to the risk evaluation device 300 together with a candidate for the unknown attribute, such as "v_11, x_2, ..., x_d" or "v_12, x_2, ..., x_d". As an example, the receiving unit 210 receives from the risk evaluation device 300 as many pieces of candidate data as there are candidates for the attribute unknown to the risk evaluation device 300. The receiving unit 210 may also receive, together with the candidate data, information other than that exemplified above, such as identification information.
The inference unit 220 inputs each piece of candidate data received by the receiving unit 210 to the learning model 241. As a result of this input, the inference unit 220 acquires an inference label, which is the inference result corresponding to each piece of candidate data.
The output unit 230 transmits the inference labels acquired by the inference unit 220 to the risk evaluation device 300. For example, the output unit 230 may transmit each inference label together with the identification information of the candidate data, so that it can be determined from which candidate data the inference label was inferred.
As described above, the model storage device 200 has the learning model 241 trained using training data. Upon receiving candidate data from the risk evaluation device 300, the model storage device 200 performs inference with the learning model 241 on the received candidate data to acquire the corresponding inference labels, and then transmits the acquired inference labels to the risk evaluation device 300.
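The receive, infer, and output flow of the model storage device 200 can be sketched as a single function that pairs each inference label with the identifier of the candidate data it came from. The function name `serve_inference`, the identifier strings, and the toy model are hypothetical and stand in for whatever interface an actual deployment provides.

```python
def serve_inference(model, candidate_batch):
    """Return (candidate_id, inference_label) pairs for the received candidate data.

    model:           callable standing in for learning model 241 (hypothetical)
    candidate_batch: iterable of (candidate_id, record) pairs
    """
    # Keeping the identifier next to each label lets the requester tell
    # which candidate record produced which inference label.
    return [(candidate_id, model(record)) for candidate_id, record in candidate_batch]


def toy_model(record):
    # Hypothetical stand-in: outputs "ill" (1) only for one attribute pattern.
    return 1 if record["x1"] == "v12" and record["x2"] >= 60 else 0


batch = [
    ("c1", {"x1": "v11", "x2": 64}),
    ("c2", {"x1": "v12", "x2": 64}),
]
responses = serve_inference(toy_model, batch)
print(responses)
```

The caller sees only the returned labels, which matches the black-box setting assumed in this embodiment.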
The risk evaluation device 300 is an information processing device that estimates the values of concealed attributes using known knowledge, such as information about known attributes. The risk evaluation device 300 can also perform risk evaluation based on the estimation results.
FIG. 3 shows a configuration example of the risk evaluation device 300. Referring to FIG. 3, the risk evaluation device 300 has, as main components, for example, an operation input unit 310, a screen display unit 320, a communication I/F unit 330, a storage unit 340, and an arithmetic processing unit 350.
FIG. 3 illustrates a case where the functions of the risk evaluation device 300 are realized using one information processing device. However, the risk evaluation device 300 may be realized using a plurality of information processing devices, for example on a cloud. For example, the functions of the risk evaluation device 300 may be realized by two information processing devices: an estimation device having the functions of the candidate data creation unit 351, the candidate data transmission unit 352, the inference result acquisition unit 353, the distance calculation unit 354, and the estimation unit 355; and an evaluation device having the functions of the evaluation unit 356 and the output unit 357. The risk evaluation device 300 may also omit part of the configuration exemplified above, for example the operation input unit or the screen display unit, or may have components other than those exemplified above.
The operation input unit 310 consists of operation input devices such as a keyboard and a mouse. The operation input unit 310 detects operations by the operator of the risk evaluation device 300 and outputs them to the arithmetic processing unit 350.
The screen display unit 320 consists of a screen display device such as an LCD (Liquid Crystal Display). The screen display unit 320 can display various information stored in the storage unit 340 on the screen in accordance with instructions from the arithmetic processing unit 350.
The communication I/F unit 330 consists of a data communication circuit and the like. The communication I/F unit 330 performs data communication with external devices, such as the model storage device 200, connected via a communication line.
The storage unit 340 is a storage device such as a hard disk or memory. The storage unit 340 stores the processing information and the program 345 necessary for the various processes in the arithmetic processing unit 350. The program 345 realizes the various processing units by being read and executed by the arithmetic processing unit 350. The program 345 is read in advance from an external device or a recording medium via a data input/output function such as the communication I/F unit 330 and is stored in the storage unit 340. The main information stored in the storage unit 340 includes, for example, prior information 341, inference result information 342, distance information 343, and estimation information 344.
The prior information 341 includes information known in advance about the training data used to train the learning model 241 stored in the model storage device 200. For example, the prior information 341 is acquired in advance, by a method such as acquisition from an external device via the communication I/F unit 330 or input using the operation input unit 310, and is stored in the storage unit 340.
FIG. 4 shows an example of the prior information 341. Referring to FIG. 4, the prior information 341 includes partial training data information and missing attribute information. For example, as shown in FIG. 4, the prior information 341 can include a plurality of entries, each associating partial training data information with missing attribute information.
Here, the partial training data information indicates the known attribute values and the corresponding label of training data used to train the learning model 241, in a state where some of the attributes are concealed (missing). For example, FIG. 4 illustrates a case where attributes (x_2, ..., x_d) and label y are known and attribute x_1 is missing. The missing attribute information indicates information about the value of the missing attribute. For example, FIG. 4 shows that the missing attribute x_1 takes one of k values (v_11, ..., v_1k). In the present embodiment, the missing attribute is, for example, a categorical variable (discrete variable).
The prior information 341 can also include information other than that illustrated in FIG. 4. For example, FIG. 5 shows another example of the prior information 341. Referring to FIG. 5, the prior information 341 can include, in addition to the information exemplified above, information indicating the marginal probability that the missing attribute takes each candidate value. For example, as shown in FIG. 5, the prior information can include information indicating the marginal probability corresponding to each of the candidates (v_11, ..., v_1k) for the unknown attribute x_1. The prior information 341 may include information other than that exemplified above.
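One possible in-memory layout for a single entry of the prior information 341 is sketched below. The class name `PriorInfoEntry`, its field names, and the consistency check are assumptions made for illustration; the specification only states which pieces of information the entry associates.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class PriorInfoEntry:
    known_attributes: dict   # partial training data: known attribute values
    label: int               # label corresponding to the partial training data
    candidates: list         # candidate values for the missing attribute
    marginal_probabilities: Optional[list] = None  # optional, one per candidate

    def is_consistent(self) -> bool:
        """Check that candidates exist and any probabilities line up with them."""
        if not self.candidates:
            return False
        if self.marginal_probabilities is None:
            return True  # marginal probabilities are optional
        return (
            len(self.marginal_probabilities) == len(self.candidates)
            and math.isclose(sum(self.marginal_probabilities), 1.0)
        )


# Hypothetical entry: x_1 is missing and has three candidate values.
entry = PriorInfoEntry(
    known_attributes={"x2": 34, "x3": 170},
    label=1,
    candidates=["v11", "v12", "v13"],
    marginal_probabilities=[0.5, 0.3, 0.2],
)
print(entry.is_consistent())
```

Leaving `marginal_probabilities` as `None` corresponds to the FIG. 4 style of entry, while supplying it corresponds to the FIG. 5 style.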
The inference result information 342 includes information indicating the inference labels obtained by inputting, to the learning model 241, the candidate data that the candidate data creation unit 351 (described later) creates based on the prior information 341. For example, the inference result information 342 may include a number of inference labels corresponding to the number of candidates for the missing attribute. The inference result information 342 is generated and updated, for example, when the inference result acquisition unit 353 (described later) acquires inference labels from the model storage device 200.
The distance information 343 includes information indicating the result of the distance calculation unit 354 (described later) calculating the distance between an inference label included in the inference result information 342 and the label used as training data. For example, the distance information 343 may include a number of distances corresponding to the number of candidates for the missing attribute. The distance information 343 is generated and updated, for example, when the distance calculation unit 354 calculates the distance between an inference label and the label.
The estimation information 344 includes information indicating the result of the estimation that the estimation unit 355 (described later) performs based on the distance information 343. For example, the estimation information 344 may include information indicating the attribute value that the estimation unit 355 estimated from among the candidates for the unknown attribute. The estimation information 344 is generated and updated, for example, when the estimation unit 355 estimates, based on the distance between each inference label and the label, which candidate is a plausible value of the unknown attribute.
The arithmetic processing unit 350 has an arithmetic device such as a CPU and its peripheral circuits. The arithmetic processing unit 350 reads the program 345 from the storage unit 340 and executes it, so that the hardware and the program 345 cooperate to realize various processing units. The main processing units realized by the arithmetic processing unit 350 include, for example, a candidate data creation unit 351, a candidate data transmission unit 352, an inference result acquisition unit 353, a distance calculation unit 354, an estimation unit 355, an evaluation unit 356, and an output unit 357.
The candidate data creation unit 351 creates candidate data based on the prior information 341. For example, the candidate data creation unit 351 creates a number of candidate data corresponding to the number of candidates indicated by the missing attribute information. The candidate data creation unit 351 may create candidate data at any timing.
Specifically, suppose, for example, that the prior information 341 stores the partial training data information (x_2, …, x_d, y) and that the missing attribute information states that the unknown attribute x_1 takes one of the values (v_11, …, v_1k). In this case, the candidate data creation unit 351 assumes that the unknown attribute x_1 takes one of the values (v_11, …, v_1k) and creates candidate data corresponding to each of them. That is, the candidate data creation unit 351 creates the candidate data (v_11, x_2, …, x_d), …, (v_1k, x_2, …, x_d).
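The candidate data creation described above can be sketched as follows. This is a minimal illustration, not the implementation of the candidate data creation unit 351: the function name `make_candidates` and the sample values are hypothetical, and each record is assumed to be a plain tuple with the missing attribute x_1 in the first position.

```python
from typing import List, Sequence, Tuple


def make_candidates(known: Sequence, candidates: Sequence) -> List[Tuple]:
    """Return one record (v_1i, x_2, ..., x_d) for each candidate value v_1i."""
    return [(v, *known) for v in candidates]


# Hypothetical example: two known attributes and three candidate values.
known_attrs = (5.0, 7.0)        # x_2, x_3
missing_candidates = (0, 1, 2)  # v_11, v_12, v_13
candidate_data = make_candidates(known_attrs, missing_candidates)
```

The number of records equals the number of candidates k, which matches the statement that the amount of candidate data follows the number of candidates in the missing attribute information.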
As described above, the prior information 341 can include a plurality of pieces of information in which partial training data information and missing attribute information are associated with each other. The candidate data creation unit 351 may create candidate data by the method described above for each piece of associated information.
The candidate data transmission unit 352 transmits the candidate data created by the candidate data creation unit 351 to the model storage device 200. The candidate data transmission unit 352 may transmit, together with the candidate data, identification information of the candidate data corresponding to, for example, the partial training data information used when the candidate data was created.
The inference result acquisition unit 353 receives and acquires, from the model storage device 200, an inference label as the result of inference based on the candidate data. For example, the inference result acquisition unit 353 acquires the inference label from the model storage device 200 together with identification information or the like, so that the candidate data that was the target of the inference can be identified. The inference result acquisition unit 353 also stores the received inference label in the storage unit 340 as the inference result information 342. The inference result acquisition unit 353 may store the inference label in the storage unit 340 together with the identification information of the corresponding candidate data.
Based on the prior information 341 and the inference result information 342, the distance calculation unit 354 calculates the distance between an inference label and the label included in the partial training data information from which the inferred candidate data was created. That is, the distance calculation unit 354 calculates the distance between the inference label and the label corresponding to the candidate data that was the target of the inference. The distance calculation unit 354 also stores the calculated distance in the storage unit 340 as the distance information 343. The distance calculation unit 354 may store the calculation result in the storage unit 340 together with the identification information of the corresponding candidate data.
Specifically, for example, the distance calculation unit 354 calculates the residual between the inference label and the label as the distance between them. For example, let the label be denoted by y and the inference label be denoted as in Equation 1. In this case, the distance calculation unit 354 calculates the residual between the inference label and the label by computing Equation 2.
Note that i takes a value from 1 to k.
As described above, the distance calculation unit 354 calculates, for example, the residual between the inference label and the label as the distance between them. The distance calculation unit 354 may instead be configured to calculate the distance using a known method other than the one illustrated, such as using twice the value of Equation 2 as the distance.
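The residual-based distance can be sketched as below. The form of the residual is an assumption: this sketch uses the absolute difference between the label y and each inference label, which is one common definition, and the function names `residual` and `distances` are hypothetical.

```python
from typing import List, Sequence


def residual(label: float, inferred: float) -> float:
    """Assumed residual: absolute difference between label and inference label."""
    return abs(label - inferred)


def distances(label: float, inferred_labels: Sequence[float]) -> List[float]:
    """Return one distance per inference label, in candidate order i = 1..k."""
    return [residual(label, y_hat) for y_hat in inferred_labels]


# Hypothetical example: label y = 1.0 and three inference labels.
example_distances = distances(1.0, [0.5, 1.0, 3.0])
```

Scaling every distance by a positive constant, such as the factor of two mentioned above, leaves the ordering of the candidates unchanged, so the candidate with the smallest distance stays the same.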
Based on the distance information 343, the estimation unit 355 estimates which candidate is a plausible value of the unknown attribute. The estimation unit 355 also stores the estimation result in the storage unit 340 as the estimation information 344.
For example, the estimation unit 355 identifies, based on the distance information 343, the candidate with the smallest distance and estimates the value corresponding to the identified candidate. Specifically, for example, the estimation unit 355 identifies i' by solving Equation 3. It then outputs v_1i', corresponding to the identified i', as the plausible attribute value. Note that i' takes a value from 1 to k.
There may be more than one i', for example when the residual is 0 for several candidates. In this case, the estimation unit 355 can, for example, select one of the i' at random and output the corresponding v_1i'. As described above, the prior information 341 may also include information indicating marginal probabilities. In that case, the estimation unit 355 may select one of the i' based on the marginal probabilities. For example, the estimation unit 355 can select the i' with the largest marginal probability. The estimation unit 355 may also be configured to select an i' with a probability that follows the marginal probabilities. In this way, when there is more than one i', the estimation unit 355 may be configured to select one of them by any method and output the corresponding v_1i'. Alternatively, when there is more than one i', the estimation unit 355 may be configured to output the plurality of values v_1i' corresponding to each of them.
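The selection of i' and the tie-breaking options can be sketched as follows. This is an illustrative sketch only: the function names, the 0-based indices, and the optional `marginals` argument are assumptions, and exact equality of distances is used to detect ties.

```python
import random
from typing import List, Optional, Sequence


def argmin_candidates(dists: Sequence[float]) -> List[int]:
    """Return every index whose distance equals the minimum distance."""
    best = min(dists)
    return [i for i, d in enumerate(dists) if d == best]


def pick(dists: Sequence[float],
         marginals: Optional[Sequence[float]] = None,
         rng: Optional[random.Random] = None) -> int:
    """Pick one index i' with minimal distance, breaking ties if needed."""
    ties = argmin_candidates(dists)
    if len(ties) == 1:
        return ties[0]
    if marginals is not None:
        # Prefer the tied candidate with the largest marginal probability.
        return max(ties, key=lambda i: marginals[i])
    # Without marginal probabilities, choose uniformly at random.
    return (rng or random.Random()).choice(ties)
```

For example, `pick([0.5, 0.0, 0.0], marginals=[0.2, 0.3, 0.5])` resolves the tie between the second and third candidates in favor of the third. Returning `argmin_candidates(dists)` directly corresponds to the variant that outputs all tied values.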
The evaluation unit 356 performs an evaluation based on the estimation information 344. In other words, the evaluation unit 356 performs a risk evaluation based on the result of the estimation by the estimation unit 355.
For example, the evaluation unit 356 holds correct answer information, which indicates what value the unknown attribute indicated by the prior information 341 actually had. In the case illustrated in FIG. 4, for example, the evaluation unit 356 holds correct answer information indicating which of the values (v_11, …, v_1k) x_1 actually is. The evaluation unit 356 can compare the result of the estimation by the estimation unit 355 with the actual value indicated by the correct answer information and evaluate the risk based on the comparison. For example, the evaluation unit 356 can evaluate the risk as high when the estimation result matches the actual value indicated by the correct answer information. Conversely, when the estimation result does not match the actual value, the evaluation unit 356 can evaluate the risk as low.
As described above, the prior information 341 can include a plurality of pieces of information in which partial training data information and missing attribute information are associated with each other. The estimation unit 355 can therefore estimate a candidate for each piece of associated information. Accordingly, the evaluation unit 356 may, for example, evaluate the risk based on the results of comparing the plurality of estimation results by the estimation unit 355 with the correct answer information corresponding to each estimation. Specifically, for example, the evaluation unit 356 calculates, from the results of the plurality of comparisons, a correct answer rate indicating the proportion of cases in which the estimation result matched the correct answer information. The evaluation unit 356 can then output, for example, the calculated correct answer rate as information indicating the risk. The evaluation unit 356 may also be configured to evaluate the risk according to, for example, whether the calculated correct answer rate exceeds a predetermined threshold and to output the evaluation result.
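The correct answer rate and the threshold check can be sketched as follows. The function names and the sample values are hypothetical, and the threshold is left as a parameter because no specific value is given.

```python
from typing import Sequence


def correct_answer_rate(estimates: Sequence, truths: Sequence) -> float:
    """Fraction of estimates that match the correct answer information."""
    if len(estimates) != len(truths):
        raise ValueError("estimates and truths must have the same length")
    if not estimates:
        raise ValueError("at least one estimate is required")
    matches = sum(1 for e, t in zip(estimates, truths) if e == t)
    return matches / len(estimates)


def is_high_risk(rate: float, threshold: float) -> bool:
    """Treat the risk as high when the rate exceeds the given threshold."""
    return rate > threshold


# Hypothetical example: three of four estimates match the correct answers.
rate = correct_answer_rate(["a", "b", "c", "a"], ["a", "b", "a", "a"])
```

A higher rate means that more concealed attribute values could be recovered from the model output, which is why it can serve as the risk indicator.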
The output unit 357 outputs information indicating the candidate estimated by the estimation unit 355, information indicating the evaluation result of the evaluation unit 356, and the like. For example, the output unit 357 displays this information on the screen display unit 320 or transmits it to an external device via the communication I/F unit 330.
The above is a configuration example of the risk evaluation device 300. Next, an operation example of the risk evaluation device 300 will be described with reference to FIGS. 6 and 7.
First, an operation example of the risk evaluation device 300 when estimating an unknown attribute will be described with reference to FIG. 6. FIG. 6 is a flowchart showing an operation example of the risk evaluation device 300 when estimating an unknown attribute. Referring to FIG. 6, the candidate data creation unit 351 creates candidate data based on the prior information 341 (step S101). For example, the candidate data creation unit 351 creates a number of candidate data corresponding to the number of candidates indicated by the missing attribute information.
The candidate data transmission unit 352 transmits each candidate data created by the candidate data creation unit 351 to the model storage device 200 (step S102).
The inference result acquisition unit 353 acquires, from the model storage device 200, an inference label for each candidate data as the result of inference based on that candidate data (step S103).
Based on the inference labels acquired by the inference result acquisition unit 353, the distance calculation unit 354 calculates the distance between each inference label and the training-time label indicated by the corresponding partial training data information (step S104). For example, the distance calculation unit 354 calculates, as the distance, the residual between each received inference label and the label.
Based on the result calculated by the distance calculation unit 354, the estimation unit 355 estimates which candidate is a plausible value of the unknown attribute (step S105). For example, the estimation unit 355 identifies, based on the distance information 343, the candidate with the smallest distance and estimates the value corresponding to the identified candidate.
The above is an operation example of the risk evaluation device 300 at the time of attribute estimation. For example, the risk evaluation device 300 can perform the processing from step S101 to step S105 for each estimation target.
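Steps S101 to S105 can be sketched end to end as follows. This is a sketch under stated assumptions, not the device's implementation: the learning model is represented by a local callable `model` standing in for the query to the model storage device 200, the residual is assumed to be an absolute difference, ties are resolved by taking the first minimum, and `toy_model` is a made-up function used only to make the example runnable.

```python
from typing import Callable, Sequence, Tuple


def estimate_missing(known: Sequence[float],
                     label: float,
                     candidates: Sequence[float],
                     model: Callable[[Tuple], float]) -> float:
    """Estimate the missing attribute value for one training record."""
    # S101: create one candidate record per candidate value.
    data = [(v, *known) for v in candidates]
    # S102-S103: send each record to the model and collect inference labels.
    inferred = [model(x) for x in data]
    # S104: distance between each inference label and the known label.
    dists = [abs(label - y_hat) for y_hat in inferred]
    # S105: pick the candidate with the smallest distance (first one on ties).
    best = min(range(len(dists)), key=dists.__getitem__)
    return candidates[best]


def toy_model(x: Tuple) -> float:
    """Hypothetical stand-in for the learning model: y = 2 * x_1 + x_2."""
    return 2.0 * x[0] + x[1]


estimated = estimate_missing(known=(1.0,), label=5.0,
                             candidates=(0, 1, 2, 3), model=toy_model)
```

With these made-up values the inference labels are 1, 3, 5, and 7, so the candidate value 2 reproduces the label exactly and is selected.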
Next, an operation example of the risk evaluation device 300 during risk evaluation will be described with reference to FIG. 7. FIG. 7 is a flowchart showing an operation example of the risk evaluation device 300 during risk evaluation. Referring to FIG. 7, the risk evaluation device 300 performs the process of estimating an unknown attribute described with reference to FIG. 6 (step S201).
When estimation targets remain in the prior information 341 (step S202, No), the risk evaluation device 300 returns to step S201 and performs the estimation process. When no estimation targets remain in the prior information 341 (step S202, Yes), the risk evaluation device 300 performs a risk evaluation according to the results of the estimations (step S203). For example, the risk evaluation device 300 can calculate the correct answer rate based on the results of comparing each estimation result with the correct answer information corresponding to that estimation and can produce an output according to the calculated correct answer rate.
The above is an operation example of the risk evaluation device 300 during risk evaluation. Note that the process of step S203 does not necessarily have to be performed immediately after the processes of steps S201 and S202. For example, the process of step S203 may be performed at any timing after the processes of steps S201 and S202.
As described above, the risk evaluation device 300 has the distance calculation unit 354 and the estimation unit 355. With this configuration, the estimation unit 355 can estimate which candidate is a plausible value of the unknown attribute based on the distance, calculated by the distance calculation unit 354, between the inference label and the label. That is, with the above configuration, the unknown attribute value can be estimated without assuming an error function or presupposing knowledge of the marginal probabilities. As a result, data can be estimated more accurately even when such knowledge is unavailable or such assumptions are not made, that is, even when no prior knowledge is assumed.
Note that the present embodiment has illustrated the case where there is a single unknown attribute, x_1. However, the present invention can be applied without any problem even when there are a plurality of unknown attributes.
For example, FIG. 8 shows an example of the prior information 341 when there are a plurality of unknown attributes, x_1 to x_n. FIG. 8 illustrates a case where the attributes (x_{n+1}, …, x_d) and the label y are known and the attributes (x_1, …, x_n) are missing. In this case, the missing attribute information indicates information about the value of each missing attribute. As illustrated in FIG. 5, the prior information 341 may include information indicating the marginal probability of each candidate even when there are a plurality of unknown attributes.
When there are a plurality of unknown attributes as shown in FIG. 8, the candidate data creation unit 351 assumes that each unknown attribute takes one of its candidates and creates a number of candidate data corresponding to the combinations of candidates for the unknown attributes. From the candidate data transmission unit 352 onward, processing can be performed in the same manner as when there is one unknown attribute. For example, when there is more than one i', the estimation unit 355 may select one of them at random or according to the marginal probabilities, as in the case of one unknown attribute. In this way, even when there are a plurality of unknown attributes, the unknown attribute values can be estimated by the same processing as in the case of one unknown attribute, except that the number of candidate data created by the candidate data creation unit 351 increases.
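For several unknown attributes, the candidate data can be enumerated as the Cartesian product of the per-attribute candidate sets. The sketch below uses `itertools.product` from the Python standard library; the function name `make_candidates_multi`, the record layout (unknown attributes first, then known ones), and the sample values are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple


def make_candidates_multi(known: Sequence,
                          candidate_sets: Sequence[Sequence]) -> List[Tuple]:
    """Return one record per combination of candidates for x_1, ..., x_n."""
    return [(*combo, *known) for combo in product(*candidate_sets)]


# Hypothetical example: x_1 has 2 candidates, x_2 has 3, and x_3 is known.
records = make_candidates_multi(known=(9,),
                                candidate_sets=[(0, 1), ("a", "b", "c")])
```

The number of records is the product of the sizes of the candidate sets (here 2 × 3 = 6), so the number of model queries grows multiplicatively with each additional unknown attribute.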
Note that the present embodiment has illustrated the case where the risk evaluation system 100 has the model storage device 200 and the risk evaluation device 300. However, the risk evaluation system 100 may instead be configured as, for example, a single information processing device having the functions of both the model storage device 200 and the risk evaluation device 300 described in this embodiment. The risk evaluation system 100 may also adopt other known modifications.
[Second embodiment]
Next, a second embodiment of the present disclosure will be described with reference to FIGS. 9 and 10. FIG. 9 is a diagram showing a hardware configuration example of the estimation device 400. FIG. 10 is a block diagram showing a configuration example of the estimation device 400.
The second embodiment of the present disclosure describes a configuration example of the estimation device 400, which is an information processing device that estimates an unknown attribute value based on, for example, information about known attributes. FIG. 9 shows a hardware configuration example of the estimation device 400. Referring to FIG. 9, the estimation device 400 has, as an example, the following hardware configuration.
- CPU (Central Processing Unit) 401 (arithmetic device)
- ROM (Read Only Memory) 402 (storage device)
- RAM (Random Access Memory) 403 (storage device)
- Program group 404 loaded into the RAM 403
- Storage device 405 that stores the program group 404
- Drive device 406 that reads from and writes to a recording medium 410 outside the information processing device
- Communication interface 407 that connects to a communication network 411 outside the information processing device
- Input/output interface 408 that inputs and outputs data
- Bus 409 that connects the components
The estimation device 400 can realize the functions of the acquisition unit 421, the calculation unit 422, and the estimation unit 423 shown in FIG. 10 by the CPU 401 acquiring and executing the program group 404. The program group 404 is, for example, stored in advance in the storage device 405 or the ROM 402, and the CPU 401 loads it into the RAM 403 or the like and executes it as necessary. The program group 404 may also be supplied to the CPU 401 via the communication network 411, or it may be stored in advance in the recording medium 410 and read out by the drive device 406 and supplied to the CPU 401.
Note that FIG. 9 shows only one example of the hardware configuration of the estimation device 400. The hardware configuration of the estimation device 400 is not limited to the one described above. For example, the estimation device 400 may consist of only part of the configuration described above, such as a configuration without the drive device 406.
The acquisition unit 421 acquires a plurality of inference results, each inferred as a result of inputting to a learning model one of a plurality of candidate data created based on information indicating candidates for an unknown attribute.
The calculation unit 422 calculates, for each inference result, the distance between the inference result acquired by the acquisition unit 421 and the label corresponding to the candidate data. For example, the calculation unit 422 calculates a residual as the distance.
The estimation unit 423 estimates the value of the unknown attribute according to the result calculated by the calculation unit 422.
As described above, the estimation device 400 has the calculation unit 422 and the estimation unit 423. With this configuration, the estimation unit 423 can estimate the value of the unknown attribute based on the distances calculated by the calculation unit 422. That is, with the above configuration, the unknown attribute value can be estimated without assuming an error function or presupposing knowledge of the marginal probabilities. As a result, data can be estimated more accurately even when such knowledge is unavailable or such assumptions are not made, that is, even when no prior knowledge is assumed.
The estimation device 400 described above can be realized by installing a predetermined program in an information processing device such as the estimation device 400. Specifically, a program according to another aspect of the present invention is a program for causing an information processing device such as the estimation device 400 to realize processing of: acquiring a plurality of inference results, each inferred as a result of inputting to a learning model one of a plurality of candidate data created based on information indicating candidates for an unknown attribute; calculating, for each inference result, the distance between the acquired inference result and the label corresponding to the candidate data; and estimating the value of the unknown attribute according to the calculated result.
Likewise, an estimation method executed by an information processing device such as the estimation device 400 described above is a method in which the information processing device: acquires a plurality of inference results, each inferred as a result of inputting to a learning model one of a plurality of candidate data created based on information indicating candidates for an unknown attribute; calculates, for each inference result, the distance between the acquired inference result and the label corresponding to the candidate data; and estimates the value of the unknown attribute according to the calculated result.
An invention of a program, a computer-readable recording medium on which the program is recorded, or an estimation method having the configuration described above has the same actions and effects as the estimation device 400 described above and can therefore achieve the object of the present invention described above.
<Supplementary Notes>
Some or all of the above embodiments may also be described as in the following supplementary notes. An outline of the estimation device and the like according to the present invention is given below. However, the present invention is not limited to the following configurations.
(Appendix 1)
An estimation device comprising:
an acquisition unit that acquires a plurality of inference results, each inferred as a result of inputting a respective one of a plurality of candidate data, created based on information indicating candidates for an unknown attribute, into a learning model;
a calculation unit that calculates, for each inference result, a distance between the inference result acquired by the acquisition unit and a label corresponding to the candidate data; and
an estimation unit that estimates a value of the unknown attribute according to the result calculated by the calculation unit.
(Appendix 2)
The estimation device according to Appendix 1, wherein
the calculation unit calculates, as the distance between the inference result and the label, a residual between the inference result and the label, and
the estimation unit estimates the value of the unknown attribute according to the residual calculated by the calculation unit.
(Appendix 3)
The estimation device according to Appendix 1 or Appendix 2, wherein
the estimation unit identifies, based on the result calculated by the calculation unit, the candidate with the smallest distance, and estimates a value according to the identified result.
(Appendix 4)
The estimation device according to any one of Appendices 1 to 3, wherein
the estimation unit estimates the value of the unknown attribute using information indicating a marginal probability of each unknown attribute candidate.
(Appendix 5)
The estimation device according to any one of Appendices 1 to 4, further comprising
a creation unit that creates candidate data corresponding to each unknown attribute candidate based on information about known attributes and information indicating the unknown attribute candidates, wherein
the acquisition unit acquires the inference results inferred as a result of inputting the plurality of candidate data created by the creation unit into the learning model.
(Appendix 6)
The estimation device according to Appendix 5, wherein,
when there are a plurality of unknown attributes, the creation unit creates candidate data according to combinations of the candidates of the plurality of unknown attributes.
(Appendix 7)
The estimation device according to any one of Appendices 1 to 6, further comprising
an evaluation unit that performs a predetermined evaluation based on the result of the estimation by the estimation unit.
(Appendix 8)
An estimation method in which an information processing device:
acquires a plurality of inference results, each inferred as a result of inputting a respective one of a plurality of candidate data, created based on information indicating candidates for an unknown attribute, into a learning model;
calculates, for each inference result, a distance between the acquired inference result and a label corresponding to the candidate data; and
estimates a value of the unknown attribute according to the calculated result.
(Appendix 9)
The estimation method according to Appendix 8, comprising:
calculating, as the distance between the inference result and the label, a residual between the inference result and the label; and
estimating the value of the unknown attribute according to the calculated residual.
(Appendix 10)
A computer-readable recording medium recording a program for causing an information processing device to execute processing of:
acquiring a plurality of inference results, each inferred as a result of inputting a respective one of a plurality of candidate data, created based on information indicating candidates for an unknown attribute, into a learning model;
calculating, for each inference result, a distance between the acquired inference result and a label corresponding to the candidate data; and
estimating a value of the unknown attribute according to the calculated result.
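For illustration only, the flow of Appendices 1 to 3 and 5 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation described in the embodiments: the function names, the dictionary record format, the absolute residual used as the distance, and the toy model are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Record = Dict[str, float]


def create_candidates(known: Record, unknown_name: str,
                      candidates: Sequence[float]) -> List[Record]:
    """Creation unit (assumed form): one record per candidate value."""
    return [{**known, unknown_name: value} for value in candidates]


def estimate_unknown(model: Callable[[Record], float], known: Record,
                     unknown_name: str, candidates: Sequence[float],
                     label: float) -> Tuple[float, List[float]]:
    """Return the candidate whose inference result is closest to the label."""
    records = create_candidates(known, unknown_name, candidates)
    # Acquisition unit: one inference result per candidate record.
    inferences = [model(record) for record in records]
    # Calculation unit: absolute residual used as the distance (assumption).
    distances = [abs(y - label) for y in inferences]
    # Estimation unit: pick the candidate with the smallest distance.
    best = min(range(len(candidates)), key=distances.__getitem__)
    return candidates[best], distances


def toy_model(record: Record) -> float:
    """Hypothetical stand-in for the learning model under evaluation."""
    return 2.0 * record["age"] + 10.0 * record["smoker"]


estimate, distances = estimate_unknown(
    toy_model, known={"age": 30.0}, unknown_name="smoker",
    candidates=[0.0, 1.0], label=70.0)
```

With this toy model, the candidate `smoker = 1.0` reproduces the label exactly, so it has zero residual and is returned as the estimate.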
Although the present invention has been described with reference to the above embodiments, it is not limited to those embodiments. Various changes that those skilled in the art can understand may be made to the configuration and details of the present invention within its scope.
100 Risk evaluation system
200 Model storage device
210 Reception unit
220 Inference unit
230 Output unit
240 Storage unit
241 Learning model
300 Risk evaluation device
310 Operation input unit
320 Screen display unit
330 Communication I/F unit
340 Storage unit
341 Prior information
342 Inference result information
343 Distance information
344 Estimation information
350 Calculation processing unit
351 Candidate data creation unit
352 Candidate data transmission unit
353 Inference result acquisition unit
354 Distance calculation unit
355 Estimation unit
356 Evaluation unit
357 Output unit
400 Estimation device
401 CPU
402 ROM
403 RAM
404 Program group
405 Storage device
406 Drive device
407 Communication interface
408 Input/output interface
409 Bus
410 Recording medium
411 Communication network
421 Acquisition unit
422 Calculation unit
423 Estimation unit

Claims (10)

  1.  An estimation device comprising:
      an acquisition unit that acquires a plurality of inference results, each inferred as a result of inputting a respective one of a plurality of candidate data, created based on information indicating candidates for an unknown attribute, into a learning model;
      a calculation unit that calculates, for each inference result, a distance between the inference result acquired by the acquisition unit and a label corresponding to the candidate data; and
      an estimation unit that estimates a value of the unknown attribute according to the result calculated by the calculation unit.
  2.  The estimation device according to claim 1, wherein
      the calculation unit calculates, as the distance between the inference result and the label, a residual between the inference result and the label, and
      the estimation unit estimates the value of the unknown attribute according to the residual calculated by the calculation unit.
  3.  The estimation device according to claim 1 or claim 2, wherein
      the estimation unit identifies, based on the result calculated by the calculation unit, the candidate with the smallest distance, and estimates a value according to the identified result.
  4.  The estimation device according to any one of claims 1 to 3, wherein
      the estimation unit estimates the value of the unknown attribute using information indicating a marginal probability of each unknown attribute candidate.
  5.  The estimation device according to any one of claims 1 to 4, further comprising
      a creation unit that creates candidate data corresponding to each unknown attribute candidate based on information about known attributes and information indicating the unknown attribute candidates, wherein
      the acquisition unit acquires the inference results inferred as a result of inputting the plurality of candidate data created by the creation unit into the learning model.
  6.  The estimation device according to claim 5, wherein,
      when there are a plurality of unknown attributes, the creation unit creates candidate data according to combinations of the candidates of the plurality of unknown attributes.
  7.  The estimation device according to any one of claims 1 to 6, further comprising
      an evaluation unit that performs a predetermined evaluation based on the result of the estimation by the estimation unit.
  8.  An estimation method in which an information processing device:
      acquires a plurality of inference results, each inferred as a result of inputting a respective one of a plurality of candidate data, created based on information indicating candidates for an unknown attribute, into a learning model;
      calculates, for each inference result, a distance between the acquired inference result and a label corresponding to the candidate data; and
      estimates a value of the unknown attribute according to the calculated result.
  9.  The estimation method according to claim 8, comprising:
      calculating, as the distance between the inference result and the label, a residual between the inference result and the label; and
      estimating the value of the unknown attribute according to the calculated residual.
  10.  A computer-readable recording medium recording a program for causing an information processing device to execute processing of:
      acquiring a plurality of inference results, each inferred as a result of inputting a respective one of a plurality of candidate data, created based on information indicating candidates for an unknown attribute, into a learning model;
      calculating, for each inference result, a distance between the acquired inference result and a label corresponding to the candidate data; and
      estimating a value of the unknown attribute according to the calculated result.
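As a purely illustrative aid to claims 4 and 6, the following Python sketch enumerates every combination of candidates for several unknown attributes and weights each combination by marginal probabilities. The claims do not state how the marginal probability and the distance are combined, so the scoring rule here (product of independent marginals multiplied by `exp(-distance)`) is an assumption, as are all function names, the record format, and the toy model.

```python
import math
from itertools import product
from typing import Callable, Dict, Mapping, Sequence

Record = Dict[str, float]


def estimate_with_marginals(
        model: Callable[[Record], float],
        known: Record,
        unknown_candidates: Mapping[str, Sequence[float]],
        marginals: Mapping[str, Mapping[float, float]],
        label: float) -> Record:
    """Score every combination of unknown-attribute candidates; return the best."""
    names = list(unknown_candidates)
    best_combo, best_score = None, -math.inf
    # One candidate record per combination of candidate values (claim 6).
    for combo in product(*(unknown_candidates[name] for name in names)):
        record = {**known, **dict(zip(names, combo))}
        distance = abs(model(record) - label)
        # ASSUMPTION: attributes are treated as independent, so the prior
        # of a combination is the product of the marginal probabilities.
        prior = math.prod(marginals[n][v] for n, v in zip(names, combo))
        # ASSUMPTION: smaller distance gives larger weight via exp(-distance).
        score = prior * math.exp(-distance)
        if score > best_score:
            best_combo, best_score = combo, score
    if best_combo is None:
        raise ValueError("every unknown attribute needs at least one candidate")
    return dict(zip(names, best_combo))


def toy_model(record: Record) -> float:
    """Hypothetical stand-in for the learning model under evaluation."""
    return 2.0 * record["age"] + 10.0 * record["smoker"] + 5.0 * record["diabetic"]


result = estimate_with_marginals(
    toy_model,
    known={"age": 30.0},
    unknown_candidates={"smoker": [0.0, 1.0], "diabetic": [0.0, 1.0]},
    marginals={"smoker": {0.0: 0.8, 1.0: 0.2}, "diabetic": {0.0: 0.9, 1.0: 0.1}},
    label=70.0)
```

In this toy setting the combination `smoker = 1.0`, `diabetic = 0.0` matches the label exactly, and its zero distance outweighs its lower prior, so it is selected.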


PCT/JP2022/008617 2022-03-01 2022-03-01 Estimation device WO2023166564A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/008617 WO2023166564A1 (en) 2022-03-01 2022-03-01 Estimation device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/008617 WO2023166564A1 (en) 2022-03-01 2022-03-01 Estimation device

Publications (1)

Publication Number Publication Date
WO2023166564A1 true WO2023166564A1 (en) 2023-09-07

Family

ID=87883155

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/008617 WO2023166564A1 (en) 2022-03-01 2022-03-01 Estimation device

Country Status (1)

Country Link
WO (1) WO2023166564A1 (en)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JUN YAJIMA AND OTHERS: "A study on analysis methods for AI security vulnerabilities hidden in machine learning systems", PROCEEDINGS OF THE 2021 CRYPTOGRAPHY AND INFORMATION SECURITY SYMPOSIUM (SCIS 2021); JANUARY 19-22, 2021, IEICE, JP, 19 January 2021 (2021-01-19), JP, pages 1 - 8, XP009549404 *
UNE MASASHI: "Research trends and issues regarding security of machine learning systems", FINANCIAL RESEARCH, vol. 38, no. 1, 22 January 2019 (2019-01-22), pages 97 - 123, XP093088996 *
YUJI HIGUCHI AND OTHERS: "Training data estimation attack using VAE against classification model", PROCEEDINGS OF THE 2020 CRYPTOGRAPHY AND INFORMATION SECURITY SYMPOSIUM (SCIS2020); JANUARY 28-31, 2020, IEICE, JP, 21 January 2020 (2020-01-21) - 31 January 2020 (2020-01-31), JP, pages 1 - 8, XP009549403 *
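KAZUAKI NAKAMURA: "Keywords You Should Know, No. 141: Model Inversion Attack", THE JOURNAL OF THE INSTITUTE OF IMAGE INFORMATION AND TELEVISION ENGINEERS, vol. 75, no. 3, 1 May 2021 (2021-05-01), pages 384 - 386 *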

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22929712

Country of ref document: EP

Kind code of ref document: A1