WO2023166565A1

WO2023166565A1 - Estimation device

Info

Publication number: WO2023166565A1
Application number: PCT/JP2022/008618
Authority: WO
Inventors: 邦大伊東
Original assignee: 日本電気株式会社
Priority date: 2022-03-01
Filing date: 2022-03-01
Publication date: 2023-09-07

Abstract

An estimation device 400 includes: a weight calculation unit 421 that calculates a predetermined weight on the basis of information indicating a candidate of an unknown attribute and information regarding known attributes; a conditional-marginal-distribution calculation unit 422 that calculates, on the basis of the information in a decision tree, a value corresponding to the conditional marginal distribution of the unknown attribute under the condition that the values of some attributes are known; and an estimation unit 423 that estimates the value of the unknown attribute on the basis of the weight calculated by the weight calculation unit 421 and the value corresponding to the conditional marginal distribution calculated by the conditional-marginal-distribution calculation unit 422.

Description

Estimation device

The present invention relates to an estimation device, an estimation method, and a recording medium.

A known technique, used for the risk assessment of a learning model trained by machine learning, estimates the data used during training based on the output of the learning model.

For example, Non-Patent Document 1 describes a method of outputting a plausible attribute value by executing a predetermined process with the known attributes and the true label of target data as inputs. According to Non-Patent Document 1, an estimated label to be output from a decision tree is calculated by fixing the unknown attribute to be estimated at a certain value. After that, an assumed error function is used to calculate the deviation between the true label and the estimated label, and the marginal probability is evaluated using the calculated deviation as a weight. According to Non-Patent Document 1, a plausible attribute value is specified as a result of the above processing.

Non-Patent Document 2 is a related document. In Non-Patent Document 2, instead of calculating the deviation using the error function, the ratio of the decision tree training data assigned to the same divided region as the target data is calculated, and the calculated ratio is used as a weight to evaluate the marginal probability.

Patent Document 1 is an example of a document describing machine learning. Patent Document 1 describes giving acquired data to a trained machine learning model, causing the trained machine learning model to perform predetermined inference, and thereby obtaining an inference result for the data.

Patent Document 1: International Publication No. 2021/014878

The techniques described in Non-Patent Document 1 and Non-Patent Document 2 use unconditional, that is, averaged, marginal probabilities as the marginal probabilities. Such a probability does not necessarily represent an accurate marginal distribution for the estimation target, and as a result the data may not be estimated accurately.

Therefore, an object of the present invention is to provide an estimating device, an estimating method, and a recording medium that can solve the above-described problems.

In order to achieve such an object, an estimation device according to one aspect of the present disclosure includes:
a weight calculation unit that calculates a predetermined weight based on information indicating candidates for an unknown attribute and information about known attributes;
a conditional marginal distribution calculation unit that calculates, based on decision tree information, a value according to the conditional marginal distribution of the unknown attribute under the condition that the values of some attributes are known; and
an estimation unit that estimates the value of the unknown attribute based on the weight calculated by the weight calculation unit and the value according to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit.

In addition, in an estimation method according to another aspect of the present disclosure, an information processing device:
calculates a predetermined weight based on information indicating candidates for an unknown attribute and information about known attributes;
calculates, based on decision tree information, a value according to the conditional marginal distribution of the unknown attribute under the condition that the values of some attributes are known; and
estimates the value of the unknown attribute based on the calculated weight and the calculated value according to the conditional marginal distribution.

In addition, a recording medium according to another aspect of the present disclosure is a computer-readable recording medium recording a program for causing an information processing device to realize processes of:
calculating a predetermined weight based on information indicating candidates for an unknown attribute and information about known attributes;
calculating, based on decision tree information, a value according to the conditional marginal distribution of the unknown attribute under the condition that the values of some attributes are known; and
estimating the value of the unknown attribute based on the calculated weight and the calculated value according to the conditional marginal distribution.

According to each configuration as described above, it is possible to provide an estimation device, an estimation method, and a recording medium capable of more accurately estimating data.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram for explaining the outline of the present disclosure.
FIG. 2 is a diagram showing a configuration example of a risk evaluation system according to a first embodiment of the present disclosure.
FIG. 3 is a block diagram showing a configuration example of a model storage device.
FIG. 4 is a diagram showing an example of a model stored in the model storage device.
FIG. 5 is a block diagram showing a configuration example of a risk evaluation device.
FIG. 6 is a diagram showing an example of prior information.
FIG. 7 is a diagram for explaining a processing example of a conditional marginal distribution calculation unit.
FIG. 8 is a diagram for explaining a processing example of the conditional marginal distribution calculation unit.
FIG. 9 is a flowchart showing an operation example of the risk evaluation device during attribute estimation.
FIG. 10 is a flowchart showing an operation example of the risk evaluation device during risk evaluation.
FIG. 11 is a diagram showing another example of prior information.
FIG. 12 is a diagram showing a hardware configuration example of an estimation device according to a second embodiment of the present disclosure.
FIG. 13 is a block diagram showing a configuration example of the estimation device.

[First embodiment]
A first embodiment of the present disclosure will be described with reference to FIGS. 1 to 11. FIG. 1 is a diagram for explaining the outline of the present disclosure. FIG. 2 is a diagram showing a configuration example of the risk assessment system 100 according to the first embodiment of the present disclosure. FIG. 3 is a block diagram showing a configuration example of the model storage device 200. FIG. 4 is a diagram showing an example of a model stored in the model storage device 200. FIG. 5 is a block diagram showing a configuration example of the risk evaluation device 300. FIG. 6 is a diagram showing an example of the prior information 341. FIGS. 7 and 8 are diagrams for explaining a processing example of the conditional marginal distribution calculator 356. FIG. 9 is a flowchart showing an operation example of the risk evaluation device 300 during attribute estimation. FIG. 10 is a flowchart showing an operation example of the risk evaluation device 300 during risk evaluation. FIG. 11 is a diagram showing another example of the prior information 341.

The first embodiment of the present disclosure describes a risk assessment system 100 that, when some of the attributes making up the training data used during training of the learning model 241 are missing for reasons such as being concealed, estimates the values of the missing attributes using known attributes. In the present disclosure, the risk assessment system 100 knows the values (x_2, ..., x_d) of some of the attributes (x_1, x_2, ..., x_d) that make up the training data, and knows that the unknown attribute x_1 can take any of k values (v_11, ..., v_1k). For example, the risk evaluation system 100 uses knowledge about known attributes and knowledge about unknown attributes to calculate predetermined weights. In addition, the risk evaluation system 100 calculates a value according to the conditional marginal distribution of the unknown attribute under the condition that the values (x_2, ..., x_d) of some of the attributes (x_1, x_2, ..., x_d) are known. The risk evaluation system 100 then uses the calculated weights and the values according to the conditional marginal distribution to estimate the unknown attribute. In this way, the risk assessment system 100 described in the present embodiment uses the decision tree information 343 to calculate a value according to the conditional marginal distribution, which indicates the probability of occurrence of each unknown attribute value given the known attributes. The risk evaluation system 100 then estimates the unknown attribute value based on the calculation results. In addition, the risk evaluation system 100 can perform risk evaluation according to the possibility of leakage of training data based on the attribute value estimation result.

For example, FIG. 1 shows a data set in which a value y is inferred from two values z_1 and z_2; the higher the peak, the more data the data set contains at that point. In FIG. 1, assume that z_1 is an unknown attribute and z_2 is a known attribute. Under such circumstances, when the unconditional marginal probability P(z_1 = v) described in Non-Patent Document 1 and Non-Patent Document 2 is used, the marginal distribution of z_1 is an averaged distribution, for example a gentle peak intermediate between the case where z_2 is v(1) and the case where z_2 is v(2). On the other hand, using knowledge about the known attribute z_2 shows that, for example, when the value of z_2 is v(0), the marginal distribution of z_1 is nearly flat rather than a gentle peak. Also, for example, when the value of z_2 is v(3), the marginal distribution of z_1 has a larger peak. Thus, a conditional marginal distribution can represent the distribution for the estimation target better than an unconditional marginal distribution. However, it is not always possible to obtain the conditional marginal distribution directly. Therefore, in the present embodiment, the decision tree information 343 is used to empirically calculate a value corresponding to the conditional marginal distribution, and estimation is performed based on the calculation result.
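The difference between the two kinds of marginal distribution can be illustrated with a small numeric sketch. The records below are hypothetical toy data, not taken from FIG. 1, and the function names are chosen only for this illustration.

```python
from collections import Counter

# Hypothetical toy records (z1, z2); z1 is the unknown attribute, z2 the known one.
records = [
    ("a", 0), ("b", 0),
    ("a", 1), ("a", 1), ("b", 1),
    ("a", 3), ("a", 3), ("a", 3), ("b", 3),
]


def marginal(records):
    """Unconditional marginal P(z1 = v), averaged over every value of z2."""
    counts = Counter(z1 for z1, _ in records)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}


def conditional_marginal(records, z2_value):
    """Conditional marginal P(z1 = v | z2 = z2_value)."""
    counts = Counter(z1 for z1, z2 in records if z2 == z2_value)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}


uncond = marginal(records)                   # one averaged distribution
cond_flat = conditional_marginal(records, 0)  # flat when z2 == 0
cond_peak = conditional_marginal(records, 3)  # peaked when z2 == 3
```

In this toy data the unconditional distribution sits between the flat case and the peaked case, which is the effect the paragraph above attributes to averaging.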

In addition, in this embodiment, the learning model 241 is generated by supervised learning using a plurality of training data. For example, the learning model 241 is trained using a plurality of training data, each including a plurality of attributes and a label, so that it outputs a label indicating whether or not a patient is ill in response to the input of a plurality of attributes such as gender, age, height, and weight. Note that specific examples of attributes and labels are not limited to the above examples, and may be set arbitrarily. Also, in this embodiment, the model trained using the training data is a decision tree. A decision tree is a model trained by repeatedly splitting the input data in a binary-tree manner, through conditional branching on its attributes, until the ability to explain the label is sufficiently improved. An attribute can also be called an explanatory variable or a feature amount. A label can also be called an objective variable.

Also, the risk evaluation system 100 described in this embodiment estimates unknown attributes when, for example, the learning model 241 is provided in a white-box setting. A model generated by machine learning may be provided in a black-box setting, in which only the output for an input is disclosed to the user, or in a white-box setting, in which model information such as the model structure and branching conditions is also disclosed. As will be described later, the risk assessment system 100 in this embodiment uses the decision tree information 343, which is information disclosed in the white-box setting, to calculate values according to conditional marginal distributions. The white-box setting is used, for example, when performing federated learning, in which model training is performed while clients exchange information.

FIG. 2 shows a configuration example of the risk assessment system 100 in this embodiment. Referring to FIG. 2, the risk assessment system 100 has, for example, a risk assessment device 300 and a model storage device 200. As shown in FIG. 2, the risk evaluation device 300 and the model storage device 200 are connected, for example, via a network or the like so that they can communicate with each other.

The model storage device 200 is an information processing device that stores a learning model 241 trained using training data. FIG. 3 shows a configuration example of the model storage device 200. For example, referring to FIG. 3, the model storage device 200 has a storage unit 240 in which the learning model 241 is stored, a receiving unit 210, an inference unit 220, and an output unit 230. For example, the model storage device 200 has an arithmetic device such as a CPU (Central Processing Unit) and a storage device. The model storage device 200 can realize each of the above-described processing units by having the arithmetic device execute a program stored in the storage device.

Note that, as shown in FIG. 3, the learning model 241 stored in the storage unit 240 is trained in advance using a plurality of training data including a plurality of attributes and labels. The learning model 241 may be trained within the model storage device 200 or outside the model storage device 200. Also, as shown in FIG. 4, the learning model 241 is a decision tree. In the learning model 241, which is a decision tree, inference is performed by, for example, routing the input data according to its attributes, which are the explanatory variables, and outputting the value (label) of the single leaf node that the input data reaches.
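Inference with a decision tree of this kind can be sketched as follows. This is a minimal illustration only: the `Node` class, the `infer` function, and the toy tree are hypothetical and are not the representation used for the learning model 241.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a hypothetical binary decision tree."""
    feature: Optional[int] = None      # index of the attribute tested (internal nodes)
    threshold: Optional[float] = None  # branch condition: x[feature] <= threshold
    left: Optional["Node"] = None      # followed when the condition holds
    right: Optional["Node"] = None     # followed otherwise
    label: Optional[int] = None        # value output at a leaf node

    def is_leaf(self) -> bool:
        return self.label is not None


def infer(node: Node, x: Sequence[float]) -> int:
    """Route x from the root to a single leaf and return that leaf's label."""
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Toy tree: split on attribute 0, then on attribute 1.
tree = Node(feature=0, threshold=0.5,
            left=Node(label=0),
            right=Node(feature=1, threshold=40.0,
                       left=Node(label=0), right=Node(label=1)))

label = infer(tree, [1.0, 52.0])
```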

The receiving unit 210 receives candidate data, which will be described later, from the risk evaluation device 300. For example, the receiving unit 210 receives candidate data, such as (v_11, x_2, ..., x_d) and (v_12, x_2, ..., x_d), that contains the values of the attributes known to the risk assessment apparatus 300 and a candidate for the unknown attribute. As an example, the receiving unit 210 receives from the risk assessment device 300 as many pieces of candidate data as there are candidates for the attribute unknown to the risk assessment device 300. The receiving unit 210 may receive information other than the above examples, such as identification information, together with the candidate data.

The inference unit 220 inputs each candidate data received by the receiving unit 210 to the learning model 241. As a result of the input, the inference unit 220 acquires an inference label, which is an inference result corresponding to each candidate data.

The output unit 230 transmits the inference label acquired by the inference unit 220 to the risk evaluation device 300. For example, the output unit 230 may transmit the inference label to the risk assessment apparatus 300 together with the identification information of the candidate data so that it can be determined which candidate data each inference label is based on.

Also, the output unit 230 can transmit information about the learning model 241 to the risk evaluation device 300. For example, the output unit 230 transmits to the risk evaluation device 300, as information about the learning model 241, information such as the model structure (for example, the binary-tree split structure), the model branching conditions indicating that an attribute is larger or smaller than a threshold, and the number of training data assigned to each leaf node. The output unit 230 may transmit information about the learning model 241 other than the above examples to the risk evaluation device 300. Note that the output unit 230 may transmit the information about the learning model 241 to the risk evaluation device 300 at an arbitrary timing, such as when transmitting an inference label to the risk evaluation device 300 or when receiving, from the risk evaluation device 300, an instruction to transmit information about the learning model 241.

For example, as described above, the model storage device 200 has a learning model 241 trained using training data. Also, upon receiving candidate data from the risk evaluation device 300, the model storage device 200 obtains an inference label corresponding to the candidate data by performing inference using the learning model 241 based on the received candidate data. The model storage device 200 then transmits the acquired inference label to the risk evaluation device 300. The model storage device 200 also transmits information about the learning model 241 to the risk assessment device 300.

The risk evaluation device 300 is an information processing device that estimates the values of hidden attributes using information about known attributes, information about the learning model 241, and the like. Also, the risk assessment device 300 can perform risk assessment based on the estimation results.

FIG. 5 shows a configuration example of the risk assessment device 300. Referring to FIG. 5, the risk assessment device 300 includes, as main components, an operation input unit 310, a screen display unit 320, a communication I/F unit 330, a storage unit 340, and an arithmetic processing unit 350.

Note that FIG. 5 illustrates a case where the function of the risk evaluation device 300 is realized using one information processing device. However, the risk evaluation device 300 may be implemented using a plurality of information processing devices, such as being implemented on a cloud. For example, the functions of the risk evaluation device 300 may be implemented by two information processing devices: an estimation device functioning as the candidate data generation unit 351, the candidate data transmission unit 352, the inference result acquisition unit 353, the decision tree information reception unit 354, the weight calculation unit 355, the conditional marginal distribution calculation unit 356, and the estimation unit 357, and an evaluation device functioning as the evaluation unit 358 and the output unit 359. Moreover, the risk assessment device 300 may omit part of the above-exemplified configuration, for example by having no operation input unit or screen display unit, or may have a configuration other than the above-exemplified configuration.

The operation input unit 310 consists of operation input devices such as a keyboard and a mouse. The operation input unit 310 detects the operation of the operator who operates the risk evaluation device 300 and outputs it to the arithmetic processing unit 350.

The screen display unit 320 consists of a screen display device such as an LCD (Liquid Crystal Display). The screen display unit 320 can display various information stored in the storage unit 340 on the screen in accordance with instructions from the arithmetic processing unit 350.

The communication I/F unit 330 consists of a data communication circuit and the like. The communication I/F unit 330 performs data communication with an external device such as the model storage device 200 connected via a communication line.

The storage unit 340 is a storage device such as a hard disk or memory. The storage unit 340 stores processing information and a program 346 required for various processes in the arithmetic processing unit 350. The program 346 realizes various processing units by being read and executed by the arithmetic processing unit 350. The program 346 is read in advance from an external device or recording medium via a data input/output function such as the communication I/F unit 330 and stored in the storage unit 340. Main information stored in the storage unit 340 includes, for example, prior information 341, inference result information 342, decision tree information 343, weight information 344, and estimation information 345. The storage unit 340 may store only part of the information exemplified above, for example by not storing the inference result information 342.

The prior information 341 includes previously known information about training data used during training of the learning model 241 stored in the model storage device 200. For example, the prior information 341 is acquired in advance by a method such as acquisition from an external device via the communication I/F unit 330 or input using the operation input unit 310, and is stored in the storage unit 340.

FIG. 6 shows an example of the prior information 341. Referring to FIG. 6, the prior information 341 includes partial training data information and missing attribute information. For example, as shown in FIG. 6, the prior information 341 includes multiple pieces of information in which partial training data information and missing attribute information are associated.

Here, the partial training data information indicates known attribute values and corresponding labels in a state in which some attributes of the training data used for training the learning model 241 are concealed (deleted). For example, FIG. 6 illustrates a case where attributes (x_2, ..., x_d) and label y are known and attribute x_1 is missing. Missing attribute information indicates information about the value of the missing attribute. For example, FIG. 6 shows that the missing attribute x_1 takes one of k values (v_11, ..., v_1k). Note that in the present embodiment, missing attributes are, for example, categorical variables (discrete variables). Note that the prior information 341 may include information other than the above examples.

The inference result information 342 includes information indicating an inference label obtained by inputting, to the learning model 241, candidate data created based on the prior information 341 by the candidate data creation unit 351, which will be described later. For example, the inference result information 342 may include information indicating as many inference labels as there are candidates for the missing attribute. For example, the inference result information 342 is generated and updated when an inference result acquisition unit 353 (to be described later) acquires an inference label from the model storage device 200.

The decision tree information 343 includes information about the learning model 241 acquired from the model storage device 200. In other words, the decision tree information 343 includes information about the decision tree. For example, the decision tree information 343 includes information about the learning model 241, such as the model structure, the model branching conditions, and the number of training data assigned to each leaf node. The decision tree information 343 is updated, for example, when the decision tree information receiving unit 354 receives information about the learning model 241 from the model storage device 200.

The weight information 344 includes information indicating the weight calculated by the weight calculator 355, which will be described later. For example, weight information 344 may include information indicating a weight according to the number of candidates for missing attributes. For example, the weight information 344 is generated and updated as the weight calculator 355 calculates the weight.

The estimation information 345 includes information indicating the result estimated by the estimation unit 357, described later, based on the weight information 344 and the calculation result of the conditional marginal distribution calculation unit 356. For example, the estimation information 345 may include information indicating the attribute value that the estimation unit 357 estimated from among the unknown attribute candidates. For example, the estimation information 345 is generated and updated according to the results of the evaluation of conditional marginal probabilities by the estimation unit 357 using the weights.

The arithmetic processing unit 350 has an arithmetic device such as a CPU and its peripheral circuits. The arithmetic processing unit 350 reads the program 346 from the storage unit 340 and executes it, so that the hardware and the program 346 work together to realize various processing units. Main processing units realized by the arithmetic processing unit 350 include, for example, a candidate data generation unit 351, a candidate data transmission unit 352, an inference result acquisition unit 353, a decision tree information reception unit 354, a weight calculation unit 355, a conditional marginal distribution calculation unit 356, an estimation unit 357, an evaluation unit 358, and an output unit 359.

Note that the risk evaluation device 300 may have only the configuration necessary for the weight calculation unit 355 to calculate the weight among the configurations illustrated above. For example, as will be described later, the weight calculator 355 can calculate the weight by the method described in Non-Patent Document 1 or the method described in Non-Patent Document 2. Here, in the case of the method described in Non-Patent Document 2, inference labels are not necessarily required. Therefore, the risk evaluation apparatus 300 may not have the configuration of the candidate data transmission unit 352 and the inference result acquisition unit 353 depending on the weight calculation method of the weight calculation unit 355 or the like.

The candidate data creation unit 351 creates candidate data based on the prior information 341. For example, the candidate data creation unit 351 creates candidate data according to the number of candidates indicated by the missing attribute information. The candidate data creation unit 351 may create candidate data at any timing.

Specifically, for example, suppose that partial training data information (x_2, ..., x_d) and missing attribute information indicating that x_1 takes one of (v_11, ..., v_1k) are stored as the prior information 341. In this case, the candidate data generating unit 351 creates candidate data on the assumption that the unknown attribute x_1 takes each of the values (v_11, ..., v_1k). That is, the candidate data creating unit 351 creates candidate data (v_11, x_2, ..., x_d), ..., (v_1k, x_2, ..., x_d).
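The creation of candidate data described above can be sketched as follows. The function name `create_candidates` and the sample values are hypothetical, and the sketch assumes the missing attribute occupies the first position of each record.

```python
from typing import List, Sequence, Tuple


def create_candidates(known: Sequence[float],
                      candidate_values: Sequence[float]) -> List[Tuple[float, ...]]:
    """Return one candidate record (v_1j, x_2, ..., x_d) per candidate value v_1j."""
    return [(v, *known) for v in candidate_values]


# Hypothetical prior information: known attributes and candidates for x_1.
known_attributes = (52.0, 170.0)      # x_2, x_3
missing_candidates = (0.0, 1.0, 2.0)  # v_11, v_12, v_13

candidates = create_candidates(known_attributes, missing_candidates)
```

The number of candidate records equals k, the number of candidate values, which matches the number of inference labels later returned by the model storage device 200.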

As described above, the prior information 341 can include a plurality of pieces of information in which partial training data information and missing attribute information are associated with each other. The candidate data creation unit 351 may create candidate data using the method described above for each of the associated information.

The candidate data transmission unit 352 transmits the candidate data created by the candidate data creation unit 351 to the model storage device 200. The candidate data transmission unit 352 may transmit, together with the candidate data, identification information of the candidate data according to the partial training data information used when creating the candidate data.

The inference result acquisition unit 353 receives and acquires an inference label from the model storage device 200 as a result of inference based on candidate data. For example, the inference result acquisition unit 353 acquires the inference label from the model storage device 200 together with the identification information so that the candidate data subject to inference can be identified. The inference result acquisition unit 353 also stores the received inference label as the inference result information 342 in the storage unit 340. The inference result acquisition unit 353 may store the inference label in the storage unit 340 together with the identification information of the corresponding candidate data.

The decision tree information receiving unit 354 receives information about the learning model 241 from the model storage device 200. For example, the decision tree information receiving unit 354 receives information about the learning model 241 from the model storage device 200, such as the model structure, the branching conditions of the model, and the number of training data assigned to each leaf node. Also, the decision tree information receiving unit 354 stores the received information about the learning model 241 as the decision tree information 343 in the storage unit 340.

Note that the decision tree information receiving unit 354 may instruct the model storage device 200 to transmit information about the learning model 241 at any timing. For example, the decision tree information receiving unit 354 may be configured to receive information about the learning model 241 transmitted in response to the instruction.

The weight calculation unit 355 calculates a predetermined weight using information about known attributes and information about unknown attributes. Also, the weight calculation unit 355 stores the calculated weight in the storage unit 340 as weight information 344. The weight calculation unit 355 may store the calculation result in the storage unit 340 together with the identification information of the corresponding candidate data.

For example, the weight calculator 355 calculates the weight using a method similar to the method described in Non-Patent Document 1. Specifically, for example, based on the prior information 341 and the inference result information 342, the weight calculation unit 355 uses a predetermined error function err( ), as shown in Equation 1, to calculate the deviation between an inference label and the label included in the partial training data information from which the candidate data subject to inference was created. In other words, the weight calculator 355 calculates the weight by calculating the deviation between the label and the inference label inferred based on the information about the known attribute and the information about the unknown attribute.
Note that y is the label and f(x') is the inference label.
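A per-candidate weight calculation of this kind might look like the sketch below. Equation 1 is not reproduced in this text, so the concrete `err` used here is a hypothetical stand-in: it is assumed to return 1.0 when y and f(x') agree and a small value otherwise, so that a larger weight marks a more plausible candidate. The names `calculate_weights` and `mismatch_weight` are likewise assumptions for illustration.

```python
from typing import Callable, Dict, Hashable


def err(true_label: int, inferred_label: int, mismatch_weight: float = 0.1) -> float:
    """Hypothetical stand-in for err(): 1.0 on agreement, a small weight otherwise."""
    return 1.0 if true_label == inferred_label else mismatch_weight


def calculate_weights(true_label: int,
                      inferred_labels: Dict[Hashable, int],
                      error_fn: Callable[[int, int], float] = err) -> Dict[Hashable, float]:
    """Map each candidate value of the unknown attribute to err(y, f(x'))."""
    return {cand: error_fn(true_label, y_hat)
            for cand, y_hat in inferred_labels.items()}


# Inference labels f(x') returned for three hypothetical candidates of x_1.
inferred = {"v_11": 1, "v_12": 0, "v_13": 1}
weights = calculate_weights(true_label=1, inferred_labels=inferred)
```

Any other error function with the same two-argument signature can be passed as `error_fn`; the direction and scale of the weight depend on that choice.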

Further, for example, instead of the above method, the weight calculation unit 355 may be configured to calculate a ratio by a method described in Non-Patent Document 2 and use the calculated ratio as a weight. For example, the weight calculator 355 can calculate the ratio by solving the equation shown in Equation 2.
Note that, as shown in Non-Patent Document 2, φ_i( ) is a predetermined indicator function for the value candidate v of the missing attribute. Also, p_i = n_i / N for the total number of training data N. Note that n_i indicates the number of training data assigned to leaf node i. Also, S = (s_i)_{i=1,...,m} = (φ_i, n_i)_{i=1,...,m} denotes the set of all paths of the decision tree. Also, s_1, ...
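The quantities defined above can be computed as in the following sketch. Equation 2 itself is not reproduced in this text, so the sketch stops at its ingredients: the path set S, the proportions p_i, and the indicator functions φ_i. The `Path` class and the toy values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Path:
    """One root-to-leaf path s_i = (phi_i, n_i) of a hypothetical decision tree."""
    phi: Callable[[int], bool]  # indicator: is candidate v consistent with this path?
    n: int                      # number of training data assigned to the leaf


def leaf_proportions(paths: List[Path]) -> List[float]:
    """Return p_i = n_i / N, where N is the total number of training data."""
    total = sum(path.n for path in paths)
    return [path.n / total for path in paths]


# Toy path set S with m = 3 leaves; x_1 is a categorical attribute in {0, 1, 2}.
paths = [
    Path(phi=lambda v: v == 0, n=20),
    Path(phi=lambda v: v in (1, 2), n=50),
    Path(phi=lambda v: v == 2, n=30),
]

p = leaf_proportions(paths)
# Indices of the paths whose indicator accepts the candidate value v = 2.
consistent_with_2 = [i for i, path in enumerate(paths) if path.phi(2)]
```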

Also, the weight calculation unit 355 may calculate the weight by a method other than the above example. For example, the weight calculator 355 may calculate the weight by adjusting the weight initial value determined by any method with the value of the known attribute.

For example, as described above, the weight calculator 355 calculates a predetermined weight using information about known attributes and information about unknown attributes. Note that the weight calculator 355 may be configured to calculate the weight using any one of the methods exemplified above.

The conditional marginal distribution calculation unit 356 calculates a value according to the conditional marginal distribution by calculating the number of training data falling on the target leaf node using the decision tree information 343 . For example, the conditional marginal distribution calculator 356 can calculate the conditional marginal distribution for each unknown attribute candidate.

For example, FIG. 7 shows an example of regions produced by a binary decision tree that divides the space of feature values into rectangles. In FIG. 7, for example, z_2 is a known attribute and z_1 is an unknown attribute. In this case, by using the decision tree information 343 to check the amount of data that the decision tree assigns to each region, it is possible to indirectly calculate a value corresponding to the true conditional marginal distribution, as illustrated in the figure. Although this value differs from the true distribution, it can be expected to be a better approximation than using the average.

Specifically, for example, the conditional marginal distribution calculation unit 356 empirically calculates a value according to the conditional marginal distribution by solving the equation shown in Equation 3 below.

In other words, for example, the conditional marginal distribution calculation unit 356 refers to the decision tree information 343 to identify the leaf node reached when the unknown attribute is set to a certain candidate. Then, the conditional marginal distribution calculation unit 356 calculates a first value by counting, among the entire training data D, the number of pieces of training data that fall on the identified leaf node. In addition, the conditional marginal distribution calculation unit 356 identifies, for each candidate, the leaf node reached when that candidate is taken as the unknown attribute. Then, the conditional marginal distribution calculation unit 356 calculates a second value by summing, over the identified leaf nodes, the number of pieces of training data in D that fall on each of them. After that, the conditional marginal distribution calculation unit 356 divides the first value by the second value to calculate a value according to the conditional marginal distribution.
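The counting procedure described above can be sketched as follows. The nested-dictionary tree layout, the key names (`feature`, `threshold`, `left`, `right`, `count`), and the function names are assumptions made for illustration; the format of the decision tree information 343 is not specified in this text.

```python
def leaf_count(node, x):
    """Follow the splits for feature vector x and return the number of
    training data assigned to the leaf node that is reached."""
    while "count" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["count"]


def conditional_marginal(tree, known, unknown_feature, candidates):
    """First value (count for one candidate) divided by the second value
    (sum of the counts over all candidates), for every candidate."""
    counts = [leaf_count(tree, {**known, unknown_feature: v}) for v in candidates]
    total = sum(counts)
    if total == 0:
        raise ValueError("no training data falls on any candidate leaf")
    return [c / total for c in counts]


# Hypothetical tree: feature 1 plays the role of the known z_2,
# feature 0 the role of the unknown z_1.
tree = {
    "feature": 1, "threshold": 0.5,
    "left": {"count": 40},
    "right": {
        "feature": 0, "threshold": 0.5,
        "left": {"count": 10},
        "right": {"count": 30},
    },
}
values = conditional_marginal(tree, known={1: 1.0}, unknown_feature=0,
                              candidates=[0.0, 1.0])
```

With the known attribute fixed, only the leaves on the right-hand side of the root are reachable, so the 40 training data in the left leaf do not influence the result.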

Note that the decision tree does not divide the feature amount space finely enough to distinguish every feature value. Therefore, in practice, as shown in FIG. 8, the granularity of the divided regions is likely to be coarser than in the earlier illustration. Comparing FIG. 7 and FIG. 8, for example, shows that with coarser granularity more training data falls on each leaf node (divided region) than in the finer-grained case, so the counts may deviate from the values they are meant to represent. Therefore, the conditional marginal distribution calculation unit 356 can be configured to correct the number of training data falling on a leaf node to a number per unit area, as indicated by Equation 4 below. In general, the distribution hardly changes within a neighborhood in the feature amount space. Therefore, performance improvement can be expected by performing the above correction.

As described above, in the equation shown in Equation 4, the correction process described above is performed by dividing the numerator in Equation 3 by the area occupied in the feature amount space. Here, the area can be calculated, for example, as follows.

For example, when attribute z_j (j = 1, ..., d) takes discrete values, the conditional marginal distribution calculator 356 calculates N_j by counting the number of values that z_j can take. For example, if the attribute z_j can take the three values {A, B, C}, then N_j is 3. The conditional marginal distribution calculator 356 also calculates n_{j,i} by counting the number of values that z_j can take in the training data assigned to leaf node i. For example, n_{j,i} is 2 if the path to leaf node i allows the two values {A, C}. Then, the conditional marginal distribution calculator 356 calculates the width w_j by dividing n_{j,i} by N_j. After calculating the width w_j in this way, the conditional marginal distribution calculator 356 can calculate the area by solving the equation shown in Equation 5 below.

In the above processing, normalization is performed by dividing n_{j,i} by N_j. For example, the scale may differ for each attribute, such as centimeters for a height attribute and meters for a distance attribute. Therefore, a more appropriate value can be calculated by dividing by N_j instead of simply using n_{j,i}. When it is known in advance that normalization is unnecessary, the division by N_j may be omitted.

Also, for example, when attribute z_j (j = 1, ..., d) takes continuous values, the conditional marginal distribution calculator 356 calculates N_j by calculating the width of the range of values that z_j can take. For example, if the attribute z_j can take values from 1 to 10, then N_j is 10 − 1 = 9. The conditional marginal distribution calculator 356 also calculates n_{j,i} by calculating the width of the range of values that z_j can take in the training data assigned to leaf node i. For example, if the path to leaf node i allows values from 2 to 5, n_{j,i} is 5 − 2 = 3. Then, the conditional marginal distribution calculator 356 calculates the width w_j by dividing n_{j,i} by N_j. After that, the conditional marginal distribution calculation unit 356 can calculate the area by solving the equation shown in Equation 5 above. As in the case where the attributes are discrete values, the normalization process may be omitted when the attributes are continuous values.

For example, as described above, the conditional marginal distribution calculation unit 356 may be configured to correct the number of training data falling on a leaf node by using the area in the feature amount space, calculated based on the number or range of values that z_j can take in the training data assigned to leaf node i.
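The width and area calculation can be illustrated with the numerical examples given above. Equation 5 itself is not reproduced in this text, so the sketch assumes that the area is the product of the per-attribute widths w_j; that assumption and all function names are hypothetical.

```python
from fractions import Fraction
from math import prod


def discrete_width(allowed_values, all_values):
    """w_j = n_{j,i} / N_j for a discrete attribute (counts of distinct values)."""
    return Fraction(len(set(allowed_values)), len(set(all_values)))


def continuous_width(leaf_range, full_range):
    """w_j = n_{j,i} / N_j for a continuous attribute (widths of value ranges)."""
    (lo, hi), (full_lo, full_hi) = leaf_range, full_range
    return Fraction(hi - lo) / Fraction(full_hi - full_lo)


def leaf_area(widths):
    """Assumed form of Equation 5: the product of the widths over all attributes."""
    return prod(widths, start=Fraction(1))


w_discrete = discrete_width({"A", "C"}, {"A", "B", "C"})  # n = 2, N = 3
w_continuous = continuous_width((2, 5), (1, 10))          # n = 3, N = 9
area = leaf_area([w_discrete, w_continuous])

# Correction: number of training data on the leaf per unit area.
corrected_count = Fraction(30) / area
```

Exact fractions are used here only to keep the arithmetic easy to check; floating-point values would serve equally well in practice.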

As will be described later, the values corresponding to the conditional marginal distributions calculated by the conditional marginal distribution calculation unit 356 are compared with one another when the estimation unit 357 performs estimation. In addition, the denominator portion of the formulas shown in Equations 3 and 4 has the same value for every candidate for the unknown attribute. Therefore, the conditional marginal distribution calculator 356 may be configured to calculate only the numerator portion of the equations shown in Equations 3 and 4 as the value according to the conditional marginal distribution. In other words, the conditional marginal distribution calculation unit 356 may calculate, as the value according to the conditional marginal distribution, the number of training data falling on the identified leaf node out of the entire training data D, or a value obtained by correcting that number with the area.
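The reason the denominator can be dropped can be written out as follows. The symbols here are introduced only for this illustration and are assumptions: w_i is the weight for candidate i, c_i is the numerator (the count for candidate i, optionally corrected by the area), and Z is the denominator shared by all candidates.

```latex
\[
  \operatorname*{arg\,max}_{i \in \{1,\dots,k\}} \; w_i \cdot \frac{c_i}{Z}
  \;=\;
  \operatorname*{arg\,max}_{i \in \{1,\dots,k\}} \; w_i \, c_i ,
  \qquad Z > 0 \ \text{independent of } i .
\]
```

Dividing every candidate's score by the same positive constant changes the scores but not their ordering, so the candidate selected by the estimation unit 357 is unaffected.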

Based on the weight information 344 and the value corresponding to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit 356, the estimation unit 357 estimates the value of the attribute that is likely to be the unknown attribute among the candidates. Also, the estimation unit 357 stores the estimated result in the storage unit 340 as estimation information 345 .

For example, the estimation unit 357 identifies the index i' that maximizes the product of the weight and the value according to the conditional marginal distribution, as shown in Equation 6 below. Then, v_{1i'} corresponding to the identified i' is output as the plausible attribute value. Note that i' takes a value from 1 to k.
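The selection step described above can be sketched as follows. The function name and the example numbers are hypothetical; only the rule itself (choose the candidate that maximizes the product of the weight and the value according to the conditional marginal distribution) comes from the description.

```python
def estimate_attribute(candidates, weights, marginal_values):
    """Return the candidate whose weight times marginal value is largest."""
    if not (len(candidates) == len(weights) == len(marginal_values)):
        raise ValueError("all inputs must have one entry per candidate")
    scores = [w * p for w, p in zip(weights, marginal_values)]
    # Index of the maximum score; ties resolve to the earliest candidate.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_index]


estimated = estimate_attribute(
    candidates=["v11", "v12", "v13"],
    weights=[0.1, 0.9, 0.9],
    marginal_values=[0.5, 0.2, 0.3],
)
```

In this example the first candidate has the largest marginal value but a small weight, so the third candidate is selected.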

Note that Equation 6 exemplifies the processing of the estimation unit 357 when the weight calculation unit 355 calculates the weight by the method described in Non-Patent Document 1. For example, in Equation 6, the portion of the estimation method in Non-Patent Document 1 that uses the unconditional marginal distribution is replaced with the value according to the conditional marginal distribution. In this manner, the estimation unit 357 may perform the estimation process by a method that matches the weight calculation method used by the weight calculation unit 355. For example, when the weight calculation unit 355 calculates the weights by the method described in Non-Patent Document 2, the estimation unit 357 may be configured to perform the estimation process by replacing the portion of the estimation method in Non-Patent Document 2 that uses the unconditional marginal distribution with the value according to the conditional marginal distribution.

The evaluation unit 358 performs evaluation based on the estimation information 345. In other words, the evaluation unit 358 performs risk evaluation based on the estimation results of the estimation unit 357.

For example, the evaluation unit 358 has correct answer information, which is information indicating the actual values of the unknown attributes indicated by the prior information 341. For example, in the case of FIG. 6, the evaluation unit 358 has correct answer information indicating which of the values (v_{11}, ..., v_{1k}) the attribute x_1 actually takes. The evaluation unit 358 can compare the result of estimation by the estimation unit 357 with the actual value indicated by the correct answer information, and perform risk evaluation based on the comparison result. For example, the evaluation unit 358 can evaluate that the risk is high when the result of estimation by the estimation unit 357 matches the actual value indicated by the correct answer information. On the other hand, when the result of estimation by the estimation unit 357 does not match the actual value indicated by the correct answer information, the evaluation unit 358 can evaluate that the risk is low.

As described above, the prior information 341 includes a plurality of pieces of information in which partial training data information and missing attribute information are associated with each other. Therefore, the estimation unit 357 can estimate a candidate for each piece of the associated information. Accordingly, for example, the evaluation unit 358 may perform risk evaluation based on the results of comparing a plurality of estimation results by the estimation unit 357 with the correct answer information corresponding to each estimation. Specifically, for example, the evaluation unit 358 calculates, from the results of the plurality of comparisons, the percentage of correct answers, that is, the percentage of estimation results that match the correct answer information. Then, the evaluation unit 358 can output, for example, the calculated percentage of correct answers as the information indicating the risk. The evaluation unit 358 may be configured to evaluate the risk according to whether or not the calculated percentage of correct answers exceeds a predetermined threshold, and output the evaluation result.
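The evaluation described above can be sketched as follows. The function names, the labels "high" and "low", and the default threshold of 0.5 are assumptions for illustration; the description only states that the percentage of correct answers may be compared with a predetermined threshold.

```python
def correct_answer_rate(estimates, correct_answers):
    """Fraction of estimation results that match the correct answer information."""
    if not estimates or len(estimates) != len(correct_answers):
        raise ValueError("need one correct answer per estimation result")
    matches = sum(e == c for e, c in zip(estimates, correct_answers))
    return matches / len(estimates)


def evaluate_risk(rate, threshold=0.5):
    """Hypothetical rule: the risk is high when the rate exceeds the threshold."""
    return "high" if rate > threshold else "low"


rate = correct_answer_rate(["A", "B", "C", "A"], ["A", "B", "A", "A"])
risk = evaluate_risk(rate, threshold=0.5)
```

A high rate means that the unknown attributes can often be recovered, which is why it is reported as a high risk.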

The output unit 359 outputs information indicating candidates estimated by the estimation unit 357, information indicating evaluation results by the evaluation unit 358, and the like. For example, the output unit 359 displays each of the above information on the screen display unit 320 or transmits the information to an external device via the communication I/F unit 330 .

The above is a configuration example of the risk evaluation device 300. Next, an operation example of the risk evaluation device 300 will be described with reference to FIGS. 9 and 10.

First, an operation example of the risk evaluation device 300 when estimating an unknown attribute will be described. FIG. 9 is a flowchart showing an operation example of the risk evaluation device 300 when estimating an unknown attribute in the case where weights are calculated by the method described in Non-Patent Document 1. Referring to FIG. 9, the candidate data creation unit 351 creates candidate data based on the prior information 341 (step S101). For example, the candidate data creation unit 351 creates as many pieces of candidate data as there are candidates indicated by the missing attribute information.
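Step S101 can be sketched as follows. The dictionary representation of the known attributes and the names used here are assumptions for illustration; the actual data format of the prior information 341 is not specified in this text.

```python
def create_candidate_data(known_attributes, unknown_name, candidate_values):
    """Create one piece of candidate data per candidate value by filling the
    unknown attribute into a copy of the known attributes."""
    return [{**known_attributes, unknown_name: v} for v in candidate_values]


known = {"x2": 3, "x3": "B"}
candidate_data = create_candidate_data(known, "x1", ["v11", "v12", "v13"])
```

Each resulting record is a complete input that can be sent to the model storage device 200 to obtain an inference label.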

The candidate data transmission unit 352 transmits each candidate data created by the candidate data creation unit 351 to the model storage device 200 (step S102).

The inference result acquisition unit 353 acquires an inference label for each candidate data from the model storage device 200 as an inference result based on the candidate data (step S103).

The decision tree information receiving unit 354 receives information about the learning model 241 from the model storage device 200 (step S104). Either step S103 or step S104 may be performed first, or may be performed in parallel.

The weight calculation unit 355 calculates weights using information about known attributes and information about unknown attributes (step S105). For example, the weight calculation unit 355 uses a predetermined error function, based on the prior information 341 and the inference result information 342, to calculate as a weight the deviation between the inference label and the label included in the partial training data information from which the inferred candidate data was created.

The conditional marginal distribution calculator 356 calculates a value according to the conditional marginal distribution using the decision tree information 343 (step S106). For example, the conditional marginal distribution calculation unit 356 calculates a value according to the conditional marginal distribution by calculating the number of training data falling on the identified leaf node among the entire training data D.

Based on the weight information 344 and the value corresponding to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit 356, the estimation unit 357 estimates the value of the attribute that is likely to be the unknown attribute among the candidates (step S107). For example, the estimation unit 357 estimates the unknown attribute value by identifying the i that maximizes the product of the weight and the value according to the conditional marginal distribution.

The above is an operation example of the risk evaluation device 300 at the time of attribute estimation. When calculating the weight by the method described in Non-Patent Document 2 instead of the method described in Non-Patent Document 1, the risk evaluation device 300 may omit the processes from step S101 to step S103. In this case, the risk evaluation device 300 may calculate the weight using the decision tree information 343 or the like in the process of step S105.

Next, an operation example of the risk evaluation device 300 during risk evaluation will be described. FIG. 10 is a flowchart showing an operation example of the risk evaluation device 300 during risk evaluation. Referring to FIG. 10, the risk evaluation device 300 performs the process of estimating the unknown attributes described with reference to FIG. 9 (step S201).

When the estimation target remains in the prior information 341 (step S202, No), the risk evaluation device 300 returns to the process of step S201 and performs the estimation process. On the other hand, when there is no estimation target in the prior information 341 (step S202, Yes), the risk evaluation device 300 performs risk evaluation according to each estimation result (step S203). For example, the risk assessment device 300 can calculate the percentage of correct answers based on the results of comparison between the results of each estimation and the correct answer information corresponding to each estimation, and output according to the calculated percentage of correct answers.

The above is an example of the operation of the risk evaluation device 300 during risk evaluation. Note that the process of step S203 does not necessarily have to be performed continuously after the processes of steps S201 and S202. For example, the process of step S203 may be performed at any timing after the processes of steps S201 and S202.

Thus, the risk evaluation device 300 has a weight calculator 355, a conditional marginal distribution calculator 356, and an estimator 357. With this configuration, the estimation unit 357 can estimate, from among the candidates, the attribute value that is most plausible as the unknown attribute, based on the weight calculated by the weight calculation unit 355 and the value corresponding to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit 356. As a result, data can be estimated more accurately.

Note that the present embodiment has exemplified the case where there is one unknown attribute x ₁ . However, the present invention can be applied without problems even when there are multiple unknown attributes.

For example, FIG. 11 shows an example of prior information 341 when there are multiple unknown attributes x_1 to x_n. For example, FIG. 11 illustrates a case where the attributes (x_{n+1}, ..., x_d) and the label y are known and the attributes (x_1, ..., x_n) are missing. In this case, the missing attribute information indicates information about the value of each missing attribute. Thus, even if there are multiple unknown attributes, the present invention can be applied without any problem.
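One conceivable way to handle several missing attributes is to enumerate every combination of their candidate values. This text does not state how candidates are formed in the multi-attribute case, so the enumeration below, together with all names in it, is an assumption made purely for illustration.

```python
from itertools import product


def create_joint_candidates(known_attributes, missing_attribute_values):
    """Enumerate every combination of candidate values for the missing
    attributes and merge each combination with the known attributes."""
    names = list(missing_attribute_values)
    value_lists = [missing_attribute_values[name] for name in names]
    return [
        {**known_attributes, **dict(zip(names, combo))}
        for combo in product(*value_lists)
    ]


joint = create_joint_candidates(
    known_attributes={"x3": 7},
    missing_attribute_values={"x1": ["a", "b"], "x2": [0, 1, 2]},
)
```

The number of joint candidates grows as the product of the per-attribute candidate counts, so this approach is practical only when that product stays small.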

In addition, in this embodiment, the case where the risk evaluation system 100 has the model storage device 200 and the risk evaluation device 300 is exemplified. However, the risk evaluation system 100 may be composed of, for example, one information processing device having the functions of the model storage device 200 and the risk evaluation device 300 described in this embodiment. Risk assessment system 100 may employ other known variations.

[Second embodiment]
Next, a second embodiment of the present disclosure will be described with reference to FIGS. 12 and 13. FIG. 12 is a diagram illustrating a hardware configuration example of the estimation device 400. FIG. 13 is a block diagram showing a configuration example of the estimation device 400.

In the second embodiment of the present disclosure, a configuration example of an estimation device 400, which is an information processing device that estimates unknown attribute values based on information about known attributes, will be described. FIG. 12 shows a hardware configuration example of the estimation device 400. Referring to FIG. 12, the estimation device 400 has the following hardware configuration as an example.
- CPU (Central Processing Unit) 401 (arithmetic unit)
- ROM (Read Only Memory) 402 (storage device)
- RAM (Random Access Memory) 403 (storage device)
- Program group 404 loaded into the RAM 403
- Storage device 405 that stores the program group 404
- Drive device 406 that reads from and writes to a recording medium 410 outside the information processing device
- Communication interface 407 that connects to a communication network 411 outside the information processing device
- Input/output interface 408 for inputting and outputting data
- Bus 409 connecting each component

Also, the estimation device 400 can realize the functions of the weight calculation unit 421, the conditional marginal distribution calculation unit 422, and the estimation unit 423 shown in the block diagram, for example, by the CPU 401 acquiring and executing the program group 404. The program group 404 is stored in the storage device 405 or the ROM 402 in advance, for example, and is loaded into the RAM 403 or the like by the CPU 401 as necessary and executed. The program group 404 may be supplied to the CPU 401 via the communication network 411, or may be stored in the recording medium 410 in advance so that the drive device 406 reads the program and supplies it to the CPU 401.

Note that FIG. 12 shows a hardware configuration example of the estimating device 400 . The hardware configuration of estimation device 400 is not limited to the case described above. For example, the estimating device 400 may be configured from part of the configuration described above, such as not having the drive device 406 .

The weight calculator 421 calculates a predetermined weight based on information indicating unknown attribute candidates and information about known attributes. For example, the weight calculator 421 can calculate the weight using the method described in Non-Patent Document 1 or Non-Patent Document 2.

The conditional marginal distribution calculation unit 422 calculates, based on the decision tree information, a value according to the conditional marginal distribution of the unknown attribute under the condition that the values of some attributes are known. For example, the conditional marginal distribution calculation unit 422 calculates a value corresponding to the conditional marginal distribution by calculating the number of training data that fall on the leaf node reached when the unknown attribute is set to a certain candidate.

The estimation unit 423 estimates the value of the unknown attribute based on the weight calculated by the weight calculation unit 421 and the value corresponding to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit 422 .

Thus, the estimation device 400 has a weight calculator 421, a conditional marginal distribution calculator 422, and an estimator 423. With this configuration, the estimation unit 423 can estimate the value of the unknown attribute based on the weight calculated by the weight calculation unit 421 and the value according to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit 422. As a result, estimation can be based on values corresponding to the conditional marginal distribution, which can be expected to be a more accurate marginal distribution, and more accurate estimation can therefore be performed.

Note that the estimation device 400 described above can be realized by installing a predetermined program in an information processing device such as the estimation device 400. Specifically, a program that is another aspect of the present invention is a program for causing an information processing device such as the estimation device 400 to realize processing of calculating a predetermined weight based on information indicating an unknown attribute candidate and information about a known attribute, calculating a value according to the conditional marginal distribution based on the decision tree information, and estimating the value of the unknown attribute based on the calculated weight and the calculated value according to the conditional marginal distribution.

Further, the estimation method executed by an information processing device such as the estimation device 400 described above is a method in which the information processing device calculates a predetermined weight based on information indicating unknown attribute candidates and information about known attributes, calculates a value according to the conditional marginal distribution based on the decision tree information, and estimates the value of the unknown attribute based on the calculated weight and the calculated value according to the conditional marginal distribution.

The program, the computer-readable recording medium recording the program, and the estimation method having the configurations described above also provide the same operations and effects as the estimation device 400 described above, and can therefore achieve the objectives of the present disclosure described above.

<Appendix>
Some or all of the above embodiments may also be described as the following appendices. An outline of the estimation device and the like according to the present invention will be described below. However, the present invention is not limited to the following configurations.

(Appendix 1)
An estimation device comprising:
a weight calculation unit that calculates a predetermined weight based on information indicating unknown attribute candidates and information about known attributes;
a conditional marginal distribution calculation unit that calculates, based on decision tree information, a value according to the conditional marginal distribution of an unknown attribute under the condition that the values of some attributes are known; and
an estimation unit that estimates the value of the unknown attribute based on the weight calculated by the weight calculation unit and the value according to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit.
(Appendix 2)
The estimation device according to Appendix 1, wherein
the conditional marginal distribution calculation unit calculates the value according to the conditional marginal distribution by calculating the number of training data that fall on the leaf node reached when the unknown attribute is set to a certain candidate.
(Appendix 3)
The estimation device according to Appendix 2, wherein
the conditional marginal distribution calculation unit calculates the value according to the conditional marginal distribution by dividing the number of training data that fall on the leaf node reached when the unknown attribute is set to a certain candidate by the sum, over the candidates, of the number of training data that fall on the leaf node reached when the unknown attribute is set to each candidate.
(Appendix 4)
The estimation device according to Appendix 2 or Appendix 3, wherein
the conditional marginal distribution calculation unit calculates the value according to the conditional marginal distribution by performing a predetermined correction process on the number of training data that fall on the leaf node reached when the unknown attribute is set to a certain candidate.
(Appendix 5)
The estimation device according to Appendix 4, wherein
the conditional marginal distribution calculation unit calculates the value according to the conditional marginal distribution by performing a correction process that corrects the number of training data that fall on the leaf node reached when the unknown attribute is set to a certain candidate to a number per unit area in the feature amount space.
(Appendix 6)
The estimation device according to Appendix 5, wherein
the area is calculated based on the number or range of values that an attribute can take in the training data assigned to the leaf node.
(Appendix 7)
The estimation device according to any one of Appendices 1 to 6, wherein
the weight calculation unit calculates the weight by using a predetermined error function to calculate the deviation between a true label and an estimated label estimated based on information indicating an unknown attribute candidate and information about known attributes, and
the estimation unit estimates the value of the unknown attribute based on the deviation calculated as the weight by the weight calculation unit and the value according to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit.
(Appendix 8)
The estimation device according to any one of Appendices 1 to 6, wherein
the weight calculation unit calculates the weight by calculating a predetermined ratio based on information indicating an unknown attribute candidate and information about known attributes, and
the estimation unit estimates the value of the unknown attribute based on the ratio calculated as the weight by the weight calculation unit and the value according to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit.
(Appendix 9)
An estimation method in which an information processing device:
calculates a predetermined weight based on information indicating unknown attribute candidates and information about known attributes;
calculates a value according to the conditional marginal distribution based on decision tree information; and
estimates the value of an unknown attribute based on the calculated weight and the calculated value according to the conditional marginal distribution.
(Appendix 10)
A computer-readable recording medium recording a program for causing an information processing device to realize processing of:
calculating a predetermined weight based on information indicating unknown attribute candidates and information about known attributes;
calculating a value according to the conditional marginal distribution based on decision tree information; and
estimating the value of an unknown attribute based on the calculated weight and the calculated value according to the conditional marginal distribution.

Although the present invention has been described with reference to the above-described embodiments, the present invention is not limited to the above-described embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.

100 Risk evaluation system
200 Model storage device
210 Reception unit
220 Inference unit
230 Output unit
240 Storage unit
241 Learning model
300 Risk evaluation device
310 Operation input unit
320 Screen display unit
330 Communication I/F unit
340 Storage unit
341 Prior information
342 Inference result information
343 Decision tree information
344 Weight information
345 Estimation information
346 Program
350 Operation processing unit
351 Candidate data creation unit
352 Candidate data transmission unit
353 Inference result acquisition unit
354 Decision tree information reception unit
355 Weight calculation unit
356 Conditional marginal distribution calculation unit
357 Estimation unit
358 Evaluation unit
359 Output unit
400 Estimation device
401 CPU
402 ROM
403 RAM
404 Program group
405 Storage device
406 Drive device
407 Communication interface
408 Input/output interface
409 Bus
410 Recording medium
411 Communication network
421 Weight calculation unit
422 Conditional marginal distribution calculation unit
423 Estimation unit

Claims

a weight calculator that calculates a predetermined weight based on information indicating unknown attribute candidates and information about known attributes;
a conditional marginal distribution calculation unit that calculates a value according to the conditional marginal distribution of unknown attributes under the condition that the values of some attributes are known, based on the decision tree information;
an estimating unit for estimating a value of an unknown attribute based on the weight calculated by the weight calculating unit and a value corresponding to the conditional marginal distribution calculated by the conditional marginal distribution calculating unit;
an estimator.
2. The estimation apparatus according to claim 1, wherein the conditional marginal distribution calculation unit calculates the number of training data that falls for a leaf node that falls when an unknown attribute is a candidate, and the conditional marginal distribution An estimating device that calculates a value according to
The estimating device according to claim 2,
The conditional marginal distribution calculation unit calculates the number of training data falling for each leaf node falling when an unknown attribute is a certain candidate, and the number of training data falling for each leaf node falling when each candidate is an unknown attribute. An estimating device that calculates a value according to the conditional marginal distribution by dividing by the sum of the number of data.
The estimating device according to claim 2 or 3,
The conditional marginal distribution calculation unit performs a predetermined correction process on the number of falling training data for a leaf node that falls when an unknown attribute is a certain candidate, thereby obtaining a value corresponding to the conditional marginal distribution. An estimator that calculates
5. The estimation device according to claim 4,
wherein the conditional marginal distribution calculation unit calculates the value corresponding to the conditional marginal distribution by performing, as the correction process, a process of correcting the number of training data items that fall into the leaf node reached when the unknown attribute takes a given candidate to a number per unit area in the feature space.
6. The estimation device according to claim 5,
wherein the area is calculated based on the number or range of values that an attribute can take in the training data assigned to the leaf node.
7. The estimation device according to any one of claims 1 to 6,
wherein the weight calculation unit calculates the weight by using a predetermined error function to calculate a deviation between a true label and an estimated label that is estimated based on the information indicating the candidate of the unknown attribute and the information about the known attributes, and
the estimation unit estimates the value of the unknown attribute based on the deviation calculated as the weight by the weight calculation unit and the value corresponding to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit.
8. The estimation device according to any one of claims 1 to 6,
wherein the weight calculation unit calculates the weight by calculating a predetermined ratio based on the information indicating the candidate of the unknown attribute and the information about the known attributes, and
the estimation unit estimates the value of the unknown attribute based on the ratio calculated as the weight by the weight calculation unit and the value corresponding to the conditional marginal distribution calculated by the conditional marginal distribution calculation unit.
9. An estimation method in which an information processing device:
calculates a predetermined weight based on information indicating a candidate of an unknown attribute and information about known attributes;
calculates, based on information of a decision tree, a value corresponding to the conditional marginal distribution; and
estimates the value of the unknown attribute based on the calculated weight and the calculated value corresponding to the conditional marginal distribution.
10. A computer-readable recording medium recording a program for causing an information processing device to execute processes of:
calculating a predetermined weight based on information indicating a candidate of an unknown attribute and information about known attributes;
calculating, based on information of a decision tree, a value corresponding to the conditional marginal distribution; and
estimating the value of an unknown attribute based on the calculated weight and the calculated value corresponding to the conditional marginal distribution.
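
For illustration only, the processing recited in the claims above can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the `Leaf` record, the function names, the Gaussian-shaped error function, and the combination of the weight and the conditional marginal value by simple multiplication are all hypothetical choices made for the example. It assumes that, for each candidate of the unknown attribute, the leaf node reached with the known attributes fixed has already been looked up in the decision tree.

```python
import math
from dataclasses import dataclass


@dataclass
class Leaf:
    """Hypothetical record for the leaf reached for one candidate value."""
    n_train: int        # number of training data items that fell into the leaf
    label: float        # label the decision tree outputs at this leaf
    area: float = 1.0   # assumed area of the leaf's region in feature space


def conditional_marginal(leaves, per_unit_area=False):
    """Value corresponding to the conditional marginal distribution.

    Divides the count for each candidate's leaf by the sum over all
    candidates. With per_unit_area=True the counts are first corrected
    to a number per unit area, as one possible correction process.
    """
    counts = {
        cand: (leaf.n_train / leaf.area if per_unit_area else leaf.n_train)
        for cand, leaf in leaves.items()
    }
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no training data fell into any candidate leaf")
    return {cand: value / total for cand, value in counts.items()}


def error_weights(leaves, true_label, sigma=1.0):
    """Weight from the deviation between the true and estimated labels.

    The Gaussian form is an assumption; the claims only require a
    predetermined error function.
    """
    return {
        cand: math.exp(-((true_label - leaf.label) ** 2) / (2 * sigma ** 2))
        for cand, leaf in leaves.items()
    }


def estimate(leaves, true_label, per_unit_area=False):
    """Return the candidate with the largest weight times marginal value."""
    weights = error_weights(leaves, true_label)
    marginal = conditional_marginal(leaves, per_unit_area)
    scores = {cand: weights[cand] * marginal[cand] for cand in leaves}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical candidates "A", "B", "C" for the unknown attribute.
    example = {
        "A": Leaf(n_train=30, label=1.0, area=1.0),
        "B": Leaf(n_train=10, label=0.0, area=1.0),
        "C": Leaf(n_train=60, label=0.0, area=4.0),
    }
    print(estimate(example, true_label=0.0))
    print(estimate(example, true_label=0.0, per_unit_area=True))
```

In this sketch the per-unit-area correction can change which candidate is selected, because a leaf that covers a large region of the feature space no longer dominates merely by holding many training data items.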