CN111242319A - Model prediction result interpretation method and device

Info

Publication number: CN111242319A
Application number: CN202010043115.4A
Authority: CN
Inventors: 方军鹏; 唐才智
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Current assignee: Alipay Hangzhou Information Technology Co Ltd
Priority date: 2020-01-15
Filing date: 2020-01-15
Publication date: 2020-06-05

Abstract

This specification discloses a method and a device for interpreting model prediction results. The method comprises the following steps: acquiring input data of a target model and a corresponding input data prediction result; performing perturbation processing on the input data to obtain a plurality of perturbation data; inputting the perturbation data into the target model respectively to obtain corresponding perturbation prediction results; screening out, as counter-example perturbation data, the perturbation data whose perturbation prediction result differs from the input data prediction result; determining a plurality of interpretation features for the input data based on the difference between the counter-example perturbation data and the input data; for each interpretation feature, judging whether the feature value of the input data matches the interpretation condition of the interpretation feature; and determining the interpretation features that match their interpretation conditions, together with their feature values, as the interpretation of the input data prediction result.

Description

Model prediction result interpretation method and device

Technical Field

The specification relates to the technical field of artificial intelligence, in particular to a model prediction result interpretation method and device.

Background

With the development of artificial intelligence technology, machine learning has been widely applied in retail, medical, financial, autonomous driving, and other fields. However, many machine learning models behave like a black box: they output a result after data is input, but the result is not interpretable, so users cannot understand the decision mechanism inside the model, and the requirements of some business scenarios cannot be met.

Disclosure of Invention

In view of the above, the present specification provides a method and an apparatus for interpreting a model prediction result.

Specifically, the description is realized by the following technical scheme:

a method of interpreting model predictions, comprising:

acquiring input data of a target model and a corresponding input data prediction result;

carrying out disturbance processing on the input data to obtain a plurality of disturbance data;

inputting the disturbance data into the target model respectively to obtain corresponding disturbance prediction results;

screening out disturbance data which generate a disturbance prediction result different from the input data prediction result as counterexample disturbance data;

determining a plurality of interpretation features for the input data based on the difference between the counter-example perturbation data and the input data;

for each interpretation feature, judging whether the feature value of the input data matches the interpretation condition of the interpretation feature;

and determining the interpretation characteristics and characteristic values thereof which match the interpretation conditions as the interpretation of the input data prediction result.

An apparatus for interpreting model predictions, comprising:

the data acquisition unit is used for acquiring input data of the target model and a corresponding input data prediction result;

the data disturbance unit is used for carrying out disturbance processing on the input data to obtain a plurality of disturbance data;

the disturbance prediction unit is used for respectively inputting the disturbance data into the target model to obtain corresponding disturbance prediction results;

a counter-example screening unit that screens out, as counter-example disturbance data, disturbance data that produces a disturbance prediction result different from the input data prediction result;

the characteristic determining unit is used for determining a plurality of interpretation characteristics for the input data according to the difference between the counter-example disturbance data and the input data;

a condition matching unit that judges, for each interpretation feature, whether or not a feature value of the input data matches an interpretation condition of the interpretation feature;

and a result interpretation unit determining interpretation characteristics and characteristic values thereof matching the interpretation conditions as an interpretation of the input data prediction result.

An apparatus for interpreting model predictions, comprising:

a processor;

a memory for storing machine executable instructions;

wherein, by reading and executing machine-executable instructions stored by the memory that correspond to interpretation logic of model prediction results, the processor is caused to:

One embodiment of the present description may perform perturbation processing on input data of a target model to obtain a plurality of perturbation data, then screen out perturbation data generating different prediction results as counter-example perturbation data, determine interpretation characteristics for the input data according to differences between the counter-example perturbation data and the input data, and further determine interpretation of a prediction result of the input data according to the interpretation characteristics and a characteristic value of the input data, thereby implementing interpretation of a prediction result of the target model.

Drawings

Fig. 1 is a flowchart illustrating a method for interpreting a model prediction result according to an exemplary embodiment of the present disclosure.

Fig. 2 is a flowchart illustrating a method for processing disturbance of input data according to an exemplary embodiment of the present disclosure.

Fig. 3 is a flowchart illustrating a method for determining an interpretation characteristic according to an exemplary embodiment of the present disclosure.

Fig. 4 is a flowchart illustrating a method for interpreting a prediction result of a risk prediction model according to an exemplary embodiment of the present disclosure.

Fig. 5 is a schematic structural diagram of an interpretation apparatus for model prediction results according to an exemplary embodiment of the present disclosure.

Fig. 6 is a block diagram of an apparatus for interpreting a model prediction result according to an exemplary embodiment of the present specification.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present specification. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the specification, as detailed in the appended claims.

The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the description. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information, without departing from the scope of the present specification. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.

The interpretation method of the model prediction result can be applied to an interpretation system for model prediction results, and the physical carrier of the system is usually a server or a server cluster.

Referring to fig. 1, the method for interpreting the model prediction result may include the following steps:

Step 102, obtaining input data of a target model and a corresponding input data prediction result.

In this embodiment, the input data may be feature data of an entity object such as a user and a business. The input data prediction result may be a classification result associated with the entity object.

For example, the input data may include attribute characteristics of the user, historical business behavior characteristics of the user, and the like, the target model is a risk prediction model, and the input data prediction result is whether the business request of the user has a risk.

For another example, the input data may include attribute features of the user, historical disease features of the user, current symptom features of the user, and the like, the target model is a disease diagnosis model, and the input data prediction result is whether the user has a certain disease.

In this embodiment, when a caller calls a target model to perform prediction, the input data may be input through a call interface of the target model, and a prediction result output by the target model may be obtained.

Step 104, performing perturbation processing on the input data to obtain a plurality of perturbation data.

In this embodiment, when interpreting the input data prediction result of the input data, the input data may be disturbed to obtain a plurality of disturbance data.

For example, the input data may be changed only slightly, so that the perturbation data retains part of the characteristics of the input data and is more reasonable.

In other examples, after the input data prediction result is obtained, it may further be determined whether the input data prediction result is one that needs to be interpreted.

If yes, the step can be executed to carry out disturbance processing on the input data.

If not, the process can be ended.

Step 106, inputting the perturbation data into the target model respectively to obtain corresponding perturbation prediction results.

Based on the foregoing step 104, the generated perturbation data may be respectively input into the target model, so as to obtain a prediction result corresponding to the perturbation data, which is referred to as a perturbation prediction result.

Step 108, screening out, as counter-example perturbation data, the perturbation data that produces a perturbation prediction result different from the input data prediction result.

In this step, it may be determined whether the disturbance prediction result is the same as the input data prediction result.

If they are the same, that is, the target model's prediction for the perturbation data is unchanged, this indicates that the perturbation applied to the input data is not enough to make the target model change its prediction, and that the perturbation has no substantial influence on the prediction result.

If they are different, that is, the target model's prediction for the perturbation data has changed, this indicates that the perturbation applied to the input data makes the target model change its prediction, that the perturbation has a substantial influence on the prediction result, and that the prediction result of the target model can be interpreted on the basis of this perturbation.

In this embodiment, for the convenience of distinction, the perturbation data generating different prediction results may be referred to as counter-example perturbation data.
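As a minimal sketch of steps 106 and 108, the screening can be written as a filter over the perturbation data. The function `screen_counter_examples` and the stand-in `toy_model` are hypothetical names introduced for illustration; the specification does not prescribe any particular model or interface.

```python
from typing import Callable, List, Sequence


def screen_counter_examples(
    predict: Callable[[Sequence[float]], str],
    input_data: Sequence[float],
    perturbations: List[Sequence[float]],
) -> List[Sequence[float]]:
    """Keep the perturbation data whose prediction differs from the input's."""
    original = predict(input_data)
    return [p for p in perturbations if predict(p) != original]


# Hypothetical stand-in for the target model (an assumption for this sketch):
# it outputs "risky" when the first feature exceeds 5.
def toy_model(x: Sequence[float]) -> str:
    return "risky" if x[0] > 5 else "risk-free"


input_data = [6, 2, 3]
perturbations = [[7, 2, 3], [4, 2, 3], [6, 9, 3], [1, 2, 8]]
counter_examples = screen_counter_examples(toy_model, input_data, perturbations)
print(counter_examples)  # [[4, 2, 3], [1, 2, 8]]
```

Only the two perturbation data that flip the toy prediction from "risky" to "risk-free" are kept as counter-example perturbation data.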

Step 110, determining a plurality of interpretation features for the input data according to the difference between the counter-example perturbation data and the input data.

In this embodiment, one or more features that affect the target model prediction result may be determined as the interpretation features according to the feature difference between the counter-example disturbance data and the input data.

Step 112, for each interpretation feature, judging whether the feature value of the input data matches the interpretation condition of the interpretation feature.

Step 114, determining the interpretation features that match their interpretation conditions, together with their feature values, as the interpretation of the input data prediction result.

In this embodiment, interpretation conditions may be set for each feature in advance, and for the input data, it may be determined whether a feature value of an interpretation feature satisfies the interpretation conditions, and if the interpretation conditions are satisfied, the interpretation feature and the feature value thereof may be returned to the caller as an interpretation of the prediction result of the input data.

For example, assume that there are 2 interpretation features determined for the input data in the foregoing step 110, i.e., feature 1 and feature 3, respectively, and the interpretation condition of feature 1 is that the feature value is greater than 5, and the interpretation condition of feature 3 is that the feature value is less than 7.

If the feature value of feature 1 of the input data is 6, the corresponding interpretation condition is satisfied; if the feature value of feature 3 is 8, the corresponding interpretation condition is not satisfied. The feature value 6 of feature 1 can then be determined as the interpretation of the input data prediction result, i.e., the reason that the target model outputs this prediction result is that the feature value of feature 1 of the input data is 6.
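The example above can be sketched as follows. The dictionary layout used for the interpretation conditions and the name `match_interpretations` are assumptions made for illustration; the specification only states that conditions are set in advance for each feature.

```python
import operator
from typing import Callable, Dict, Tuple

# Interpretation conditions from the example: feature 1 > 5, feature 3 < 7.
# Storing them as (comparison, threshold) pairs is an assumed representation.
conditions: Dict[str, Tuple[Callable[[float, float], bool], float]] = {
    "feature_1": (operator.gt, 5),
    "feature_3": (operator.lt, 7),
}

# Feature values of the input data for the two interpretation features.
input_values = {"feature_1": 6, "feature_3": 8}


def match_interpretations(values, conditions):
    """Return the interpretation features (and values) whose condition holds."""
    return {
        name: values[name]
        for name, (compare, threshold) in conditions.items()
        if compare(values[name], threshold)
    }


explanation = match_interpretations(input_values, conditions)
print(explanation)  # {'feature_1': 6}
```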

It can be seen from the above description that, in this embodiment, input data of a target model can be subjected to perturbation processing to obtain a plurality of perturbation data, then, perturbation data generating different prediction results are screened out to be used as counter-example perturbation data, interpretation characteristics are determined for the input data according to differences between the counter-example perturbation data and the input data, interpretation of a prediction result of the input data is determined according to the interpretation characteristics and a characteristic value of the input data, and interpretation of a prediction result of the target model is achieved.

The following is a detailed description of two aspects of perturbation processing of input data and determination of interpretation characteristics, respectively.

First, disturbance processing of input data

Referring to fig. 2, the method for processing the disturbance of the input data may include the following steps:

Step 202, obtaining samples used for training the target model.

In this embodiment, samples for training the target model may be obtained, which may include training samples, test samples, and the like.

Step 204, generating a plurality of pseudo samples conforming to the sample data distribution.

In this embodiment, a generative model may be used to learn the data distribution of the samples, and generate several samples, referred to as pseudo samples, that conform to the data distribution.

For example, a GAN (Generative Adversarial Network) model, a SMOTE (Synthetic Minority Over-sampling Technique) model, or the like may be used.

The pseudo sample generated by the generative model better conforms to the data distribution characteristics of the target model sample and is closer to the real situation.

In this embodiment, as many pseudo samples as possible can be generated, subject to the available computing power.

In another example, a plurality of pseudo samples may be generated in advance for the target model and retrieved when a prediction result is interpreted, which is not particularly limited in the present specification.
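One way to picture step 204 is a simplified interpolation in the spirit of SMOTE, sketched here with the standard library only. This is an illustrative assumption rather than the generative model of the specification: it interpolates between any sample and one of its nearest neighbours, without SMOTE's focus on a minority class, and the function name, `k`, and `seed` are hypothetical.

```python
import random
from typing import List


def smote_like_samples(samples: List[List[float]], n_new: int,
                       k: int = 2, seed: int = 0) -> List[List[float]]:
    """Create pseudo samples by interpolating between a sample and a near neighbour."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    pseudo = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # The k nearest neighbours of the base sample, excluding the base itself.
        neighbours = sorted((s for s in samples if s is not base),
                            key=lambda s: sq_dist(base, s))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        pseudo.append([b + gap * (o - b) for b, o in zip(base, other)])
    return pseudo


# Hypothetical training samples with two features each.
training_samples = [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0], [8.0, 9.0]]
pseudo_samples = smote_like_samples(training_samples, n_new=5)
print(len(pseudo_samples))  # 5
```

Because each pseudo sample lies on a segment between two real samples, it stays within the range of the observed feature values, which is one simple reading of "conforming to the data distribution".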

Step 206, determining a plurality of groups of perturbation features for the input data, each group of perturbation features including one or more perturbation features.

In this embodiment, the perturbation number, that is, the number of features to be perturbed in each group, may be preset according to information such as the number of sample features and the application scenario of the target model.

For example, assuming that the number of sample features is 100, the number of perturbations may be set to 8, 10, etc.

Setting a perturbation number smaller than the number of features allows the subsequently generated perturbation data to retain part of the features of the original input data, so that the perturbation is more reasonable and the accuracy of the subsequent interpretation of the prediction result can be improved.

After the perturbation number is obtained, every feature combination containing that number of features can be traversed, and each such combination can be determined as one group of perturbation features.

Serial number	Perturbation feature group
1	x₁, x₂
2	x₁, x₃
3	x₁, x₄
4	x₁, x₅
5	x₂, x₃
6	x₂, x₄
7	x₂, x₅
8	x₃, x₄
9	x₃, x₅
10	x₄, x₅

TABLE 1

Assume that the number of sample features is 5 and the perturbation number is 2, where a feature is denoted by x and xₖ denotes the k-th feature. Referring to the example of Table 1, 10 feature combinations of 2 features each may be generated, that is, 10 perturbation feature groups with 2 perturbation features in each group.
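The traversal behind Table 1 can be sketched with the standard library function `itertools.combinations`. The feature names are written as plain strings here purely for illustration.

```python
from itertools import combinations

features = ["x1", "x2", "x3", "x4", "x5"]
perturbation_number = 2

# Every combination of 2 out of the 5 features is one perturbation feature group.
groups = list(combinations(features, perturbation_number))
for serial, group in enumerate(groups, start=1):
    print(serial, group)

print(len(groups))  # 10
```

The groups come out in the same order as the rows of Table 1, from (x1, x2) to (x4, x5).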

Of course, in other examples, the set of perturbation characteristics may be determined in other ways.

For example, the features may be filtered first, some features that are determined not to have a substantial effect on the prediction result are filtered, and the perturbation feature group is determined by using the above traversal method for the filtered features.

For another example, the number of perturbations may not be limited, and the number of perturbations may be set to 1, 2, and 3 … in sequence, up to the number of sample features, and the like, which is not particularly limited in this specification.

Step 208, for each group of perturbation features, replacing the corresponding feature values of the input data with the feature values of each pseudo sample respectively, to obtain the perturbation data of the input data.

Based on the foregoing step 206, after the disturbance feature group is determined, for each group of disturbance features, the feature values of the input data are respectively and correspondingly replaced with the feature values of each pseudo sample, so as to obtain disturbance data.

Continuing with the example of Table 1 and taking the perturbation feature group (x₁, x₂) as an example, the feature values of feature x₁ and feature x₂ of each pseudo sample may be used to replace the feature values of feature x₁ and feature x₂ of the input data, so as to obtain the perturbation data corresponding to that pseudo sample.

Assuming that the feature values of the 5 features of the input data are (1, 2, 3, 4, 5) and the feature values of the 5 features of a certain pseudo sample are (6, 7, 8, 9, 10), then for the perturbation feature group (x₁, x₂) the feature values of the perturbation data obtained after the replacement are (6, 7, 3, 4, 5).

In this embodiment, if M pseudo samples exist, M pieces of disturbance data may be generated for each group of disturbance features, and if N groups of disturbance features exist, N × M pieces of disturbance data may be generated.

Thus, several disturbance data of the input data can be obtained.
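The replacement in step 208 can be sketched as a double loop over perturbation feature groups and pseudo samples. The function name `perturb` and the second pseudo sample are hypothetical; features are addressed by zero-based index, so the group (x₁, x₂) is `(0, 1)`.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def perturb(input_data: Sequence[float],
            pseudo_samples: List[Sequence[float]],
            groups: List[Tuple[int, ...]]) -> List[List[float]]:
    """For each group and each pseudo sample, copy the pseudo sample's values
    into the input data at the positions listed in the group."""
    perturbed = []
    for group in groups:
        for pseudo in pseudo_samples:
            row = list(input_data)
            for idx in group:
                row[idx] = pseudo[idx]
            perturbed.append(row)
    return perturbed


input_data = [1, 2, 3, 4, 5]
# The first pseudo sample is the one from the text; the second is hypothetical.
pseudo_samples = [[6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
groups = list(combinations(range(5), 2))  # N = 10 groups, as in Table 1

result = perturb(input_data, pseudo_samples, groups)
print(result[0])    # [6, 7, 3, 4, 5]
print(len(result))  # 20, i.e. N x M with N = 10 and M = 2
```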

Second, determining the interpretation characteristics

Referring to fig. 3, the method for determining the interpretation characteristics may include the following steps:

Step 302, for each feature, calculating a feature difference between the counter-example perturbation data and the input data.

In the present embodiment, for each feature of the input data, the feature difference between the counter-example disturbance data and the input data under the feature is calculated respectively.

The feature difference can be measured along dimensions such as the value difference and the number difference.

(1) Value difference

For each feature, the value difference between the feature value of the counter-example perturbation data and the feature value of the input data is calculated.

In the present embodiment, taking feature x₁ as an example, the difference between the feature value of each counter-example perturbation data and the feature value of the input data can be calculated, and the differences between all the counter-example perturbation data and the input data are then aggregated to obtain the value difference of feature x₁.

For example, still taking feature x₁ as an example, the difference between the feature value of each counter-example perturbation data and the feature value of the input data can be calculated, the absolute value of each difference is taken so that it is non-negative, and the sum of the absolute values over all counter-example perturbation data is then used as the value difference of feature x₁.

TABLE 2

Referring to the example of Table 2, assume that there are 5 counter-example perturbation data whose feature values of feature x₁ are shown in Table 2, and that the feature value of feature x₁ of the input data is 10. The differences shown in Table 2 can then be calculated, and in this example their absolute values can be added, i.e., 4+2+2+14+5, to obtain the value difference 27 between the counter-example perturbation data and the input data under feature x₁.

Similarly, the value difference of the counterexample disturbance data and the input data under each characteristic can be calculated.
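The value difference for a single feature can be sketched as a sum of absolute differences. The contents of Table 2 are not reproduced in this text, so the five counter-example values below are hypothetical, chosen only so that their absolute differences from the input value 10 are 4, 2, 2, 14, and 5 as in the example.

```python
from typing import Sequence


def value_difference(counter_values: Sequence[float], input_value: float) -> float:
    """Sum of absolute differences between counter-example values and the input value."""
    return sum(abs(v - input_value) for v in counter_values)


# Hypothetical x1 values of 5 counter-example perturbation data (not from Table 2).
counter_x1 = [14, 8, 12, 24, 5]
input_x1 = 10

print(value_difference(counter_x1, input_x1))  # 27
```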

(2) Number difference

For each feature, the number of counter-example perturbation data whose feature value differs from that of the input data may be counted as the number difference of the feature.

Continuing with the example of Table 2, Table 2 shows 5 counter-example perturbation data, and the feature values of feature x₁ of these 5 counter-example perturbation data all differ from the feature value of feature x₁ of the input data, so the number difference between the counter-example perturbation data and the input data under feature x₁ is 5.

Similarly, the number difference between the counter-example perturbation data and the input data under each feature can be obtained by counting.

In this embodiment, the feature difference under each feature can be calculated according to the value difference and the number difference.

For example, for each feature, the value difference and the number difference can be normalized, and the feature difference between the counter-example perturbation data and the input data under the feature is then calculated by means such as a weighted average.

Of course, in other examples, the value difference and the number difference may be calculated in other manners, and other difference dimensions may also be added, which is not limited in this specification.

Based on this step, the feature difference between the counter-example perturbation data and the input data under each feature can be obtained.
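A minimal sketch of step 302 follows, combining the two dimensions into one feature difference per feature. The specification does not fix the normalization or the weights, so dividing by the per-dimension maximum and using equal weights of 0.5 are assumptions, as are the function name and the toy data.

```python
from typing import List, Sequence


def feature_differences(counter_examples: List[Sequence[float]],
                        input_data: Sequence[float],
                        w_value: float = 0.5,
                        w_count: float = 0.5) -> List[float]:
    """Weighted average of the normalized value difference and number difference."""
    n = len(input_data)
    value_diff = [sum(abs(c[j] - input_data[j]) for c in counter_examples)
                  for j in range(n)]
    count_diff = [sum(1 for c in counter_examples if c[j] != input_data[j])
                  for j in range(n)]

    def scale(xs):
        # Assumed normalization: divide by the largest value across features.
        top = max(xs)
        return [x / top if top else 0.0 for x in xs]

    v, c = scale(value_diff), scale(count_diff)
    return [w_value * a + w_count * b for a, b in zip(v, c)]


# Hypothetical data: three features, three counter-example perturbation data.
input_data = [10, 3, 7]
counter_examples = [[14, 3, 7], [8, 3, 9], [12, 3, 7]]

diffs = feature_differences(counter_examples, input_data)
print([round(d, 3) for d in diffs])  # [1.0, 0.0, 0.292]
```

Here the first feature changes in every counter-example and by the largest amount, so it receives the highest feature difference, while the second feature never changes and scores zero.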

Step 304, determining the features whose feature difference satisfies the difference condition as the interpretation features.

Based on the foregoing step 302, after the feature differences between the counter-example perturbation data and the input data under each feature are calculated, the features may be sorted by feature difference from high to low, and a preset number of top-ranked features may then be selected as the interpretation features. That is, the features with the largest feature differences are selected as the interpretation features.

The preset number may be set in advance, for example with reference to the total number of features.

In this embodiment, features for which the counter-example perturbation data differs greatly from the input data often play a more important role in the prediction result of the target model, so taking these features as the interpretation features of the prediction result can effectively improve the accuracy of the interpretation.
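The ranking in step 304 can be sketched as a sort followed by a slice. The feature difference values and the preset number `k` below are hypothetical.

```python
from typing import Dict, List


def select_interpretation_features(differences: Dict[str, float], k: int) -> List[str]:
    """Rank features by feature difference (high to low) and keep the top k."""
    ranked = sorted(differences, key=differences.get, reverse=True)
    return ranked[:k]


# Hypothetical feature differences for five features.
differences = {"x1": 1.0, "x2": 0.0, "x3": 0.29, "x4": 0.75, "x5": 0.1}
print(select_interpretation_features(differences, 2))  # ['x1', 'x4']
```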

The implementation process of the present specification will be described below by taking an example in which the target model is a risk prediction model and the risk prediction model is applied to cash-out risk prediction in the financial field.

Referring to fig. 4, the method for interpreting the prediction result of the risk prediction model may include the following steps:

Step 402, obtaining input data of a risk prediction model and a corresponding risk prediction result.

In this embodiment, taking a credit card application as an example, a user may submit a credit card application request online, and a credit card issuer may input data such as user data and application data as input data into a trained risk prediction model to obtain a prediction result output by the risk prediction model.

The user data may include multi-dimensional feature data such as the user's gender, age, location, occupation, and historical behavior data; the application data may include multi-dimensional feature data such as the requested credit limit, the identifier of the device used for the application, and the network environment of that device. For details, refer to the sample features of the risk prediction model.

The prediction results of the risk prediction model are generally of two types: risky and risk-free.

Step 404, determining whether the risk prediction result is "risky".

Based on the foregoing step 402, after the risk prediction result is obtained, it is determined whether the risk prediction result is "risky".

If yes, step 406 is executed so that the basis for the risky prediction can be given.

If not, the subsequent process of this embodiment may be ended without interpreting the risk-free result, and a business process corresponding to the risk-free result, such as issuing a credit card, may be executed.

Step 406, performing perturbation processing on the input data to obtain a plurality of perturbation data.

In this embodiment, the disturbance data generation scheme shown in the foregoing embodiment shown in fig. 2 may be adopted to generate a plurality of disturbance data for the input data, which is not described in detail herein.

Step 408, inputting the perturbation data into the risk prediction model respectively to obtain corresponding perturbation prediction results.

In this embodiment, the disturbance data generated in the step 406 may be respectively input into the risk prediction model to obtain a risk prediction result corresponding to each disturbance data, where the risk prediction result includes two types: risky and risk-free.

Step 410, screening out the perturbation data that produces a risk-free prediction result as counter-example perturbation data.

In this embodiment, the disturbance data that produces the risk-free prediction result may be determined as counterexample disturbance data.

Step 412, determining a plurality of interpretation features for the input data based on the difference between the counter-example perturbation data and the input data.

In this embodiment, the interpretation feature determination method illustrated in the foregoing embodiment shown in fig. 3 may be employed to determine several interpretation features for the input data.

Step 414, for each interpretation feature, determining whether the feature value of the input data matches the interpretation condition of the interpretation feature.

Step 416, determining the interpretation features that match their interpretation conditions, together with their feature values, as the interpretation of the risky prediction result.

In the present embodiment, an interpretation condition may be set in advance for each feature, and the interpretation condition generally matches the feature values of that feature that indicate risk.

For example, taking the user's credit score as the feature, a credit score lower than a certain threshold generally suggests that the user's credit condition is poor and that a cash-out risk may exist, so the interpretation condition of the credit score feature may be set as the credit score being lower than the threshold.

For another example, taking the user's consumption amount in the last half month as the feature, a consumption amount that exceeds the user's normal consumption amount may indicate a cash-out risk, so the interpretation condition of this feature may be set as the consumption amount being larger than the normal consumption amount.

The normal consumption amount of the user may be calculated by a method provided in the related art, for example from the user's historical consumption amounts while taking special circumstances such as holidays into account. The normal consumption amounts of different users generally differ, and the details are not described in this specification.

In the present embodiment, based on the interpretation characteristics determined in the foregoing step 412, it may be sequentially determined whether each interpretation characteristic value of the input data matches the interpretation condition of the corresponding interpretation characteristic.

If it matches, this indicates that the interpretation feature value plays an important role in the risky prediction result output by the risk prediction model. The interpretation feature that matches its interpretation condition, together with its feature value, can be returned to the credit card issuer, and the credit card issuer can further return them to the user as the reason for refusing to issue the credit card.

If it does not match, this indicates that the interpretation feature value plays no role in the risky prediction result output by the risk prediction model, and the interpretation feature and its feature value can be ignored.

For example, assume that the input data has 50 features and that 3 interpretation features, namely f₁, f₂, and f₃, are determined for the input data in the foregoing step 412.

Interpretation feature f₁ represents the user's credit score, and its interpretation condition is that the credit score is lower than 500 points;

interpretation feature f₂ represents the user's consumption amount in the last week, and its interpretation condition is that the consumption amount is more than 20000 yuan;

interpretation feature f₃ represents the total amount the user has not yet repaid across credit platforms, and its interpretation condition is that the amount is more than 50000 yuan.

Interpretation feature	Interpretation condition	Feature value of input data
f₁	<500	400
f₂	>20000	25000
f₃	>50000	48000

TABLE 3

Referring to the example of Table 3, for the user whose risk prediction result was output by the risk prediction model in the foregoing step 402, the credit score is 400 points, which matches the corresponding interpretation condition; the consumption amount in the last week is 25000 yuan, which matches the corresponding interpretation condition; the total amount currently not yet repaid across credit platforms is 48000 yuan, which does not match the corresponding interpretation condition.

In this example, the credit score of 400 points and the last-week consumption amount of 25000 yuan can be used as the interpretation of the risk prediction result and returned to the credit card seller. The credit card seller can then reject the user's credit card application and return these two points to the user as the explanation, so as to improve the user's business experience.
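As a non-limiting illustration, the condition matching of Table 3 may be sketched as follows. The dictionary layout and the names `CONDITIONS` and `matched_explanations` are assumptions made for illustration only and do not appear in this specification; only the thresholds and feature values are taken from Table 3.

```python
import operator
from typing import Callable, Dict, List, Tuple

# Interpretation conditions from Table 3: feature -> (comparison, threshold).
# The data layout is a hypothetical choice for this sketch.
CONDITIONS: Dict[str, Tuple[Callable[[float, float], bool], float]] = {
    "f1": (operator.lt, 500),    # credit score below 500 points
    "f2": (operator.gt, 20000),  # last-week consumption above 20000 yuan
    "f3": (operator.gt, 50000),  # amount not yet repaid above 50000 yuan
}


def matched_explanations(values: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (feature, value) pairs whose value matches its interpretation condition."""
    return [
        (name, values[name])
        for name, (compare, threshold) in CONDITIONS.items()
        if compare(values[name], threshold)
    ]


# Feature values of the input data from Table 3.
user = {"f1": 400, "f2": 25000, "f3": 48000}
explanation = matched_explanations(user)
print(explanation)
```

With the values of Table 3, only f₁ and f₂ are kept, and f₃ is ignored because 48000 is not greater than 50000.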

Corresponding to the embodiment of the interpretation method of the model prediction result, the present specification also provides an embodiment of an interpretation device of the model prediction result.

The embodiments of the device for interpreting model prediction results can be applied to a server. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the device, as a logical device, is formed by the processor of the server where it is located reading the corresponding computer program instructions from the non-volatile memory into the memory and running them. In terms of hardware, fig. 5 shows a hardware structure diagram of the server where the device for interpreting model prediction results of this specification is located. In addition to the processor, the memory, the network interface and the non-volatile memory shown in fig. 5, the server where the device is located may also include other hardware according to the actual functions of the server, which is not described again.

Referring to fig. 6, the apparatus 500 for interpreting the model prediction result can be applied to the server shown in fig. 5, and includes: a data acquisition unit 501, a data perturbation unit 502, a perturbation prediction unit 503, a counter example screening unit 504, a feature determination unit 505, a condition matching unit 506, and a result interpretation unit 507.

a data acquisition unit 501, which acquires input data of a target model and a corresponding input data prediction result;

a data perturbation unit 502, which performs perturbation processing on the input data to obtain a plurality of perturbation data;

a perturbation prediction unit 503, which inputs the perturbation data into the target model respectively to obtain corresponding perturbation prediction results;

a counter-example screening unit 504, which screens out, as counter-example perturbation data, the perturbation data that produce a perturbation prediction result different from the input data prediction result;

a feature determination unit 505, which determines a plurality of interpretation features for the input data according to the difference between the counter-example perturbation data and the input data;

a condition matching unit 506, which determines, for each interpretation feature, whether the feature value of the input data matches the interpretation condition of the interpretation feature;

a result interpretation unit 507, which determines the interpretation features matching their interpretation conditions, together with their feature values, as the interpretation of the input data prediction result.
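As a non-limiting illustration, the work of the perturbation prediction unit and the counter-example screening unit may be sketched as follows. The function `screen_counterexamples` and the rule-based `toy_model` are hypothetical stand-ins; the target model in practice may be any trained model, and the sample values are made up for the sketch.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


def screen_counterexamples(
    model: Callable[[Row], int],
    input_row: Row,
    perturbed_rows: Sequence[Row],
) -> List[Row]:
    """Keep the perturbed rows whose prediction differs from that of the input row."""
    original_prediction = model(input_row)
    return [row for row in perturbed_rows if model(row) != original_prediction]


def toy_model(row: Row) -> int:
    # Hypothetical target model: predicts "at risk" (1) when the
    # credit score (feature 0) is below 500, otherwise "no risk" (0).
    return 1 if row[0] < 500 else 0


x = [400, 25000]                                   # input data, predicted as 1
perturbed = [[650, 25000], [400, 10000], [700, 8000]]
counterexamples = screen_counterexamples(toy_model, x, perturbed)
print(counterexamples)
```

Only the perturbed rows that flip the prediction are retained; the second row changes the consumption amount but leaves the prediction unchanged, so it is discarded.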

Optionally, the data perturbation unit 502:

obtaining samples for training the target model;

generating a plurality of pseudo samples which accord with the sample data distribution;

determining a plurality of sets of perturbation characteristics for the input data, each set of perturbation characteristics comprising one or more perturbation characteristics;

and for each group of disturbance features, correspondingly replacing the feature value of the input data with the feature value of each pseudo sample respectively to obtain the disturbance data of the input data.
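As a non-limiting illustration, the replacement of feature values by pseudo-sample values may be sketched as follows. The function name `perturb` is an assumption, and the pseudo samples are hard-coded here in place of the output of a generative model, purely to keep the sketch self-contained.

```python
from typing import List, Sequence, Tuple


def perturb(
    input_row: Sequence[float],
    pseudo_samples: Sequence[Sequence[float]],
    feature_groups: Sequence[Tuple[int, ...]],
) -> List[List[float]]:
    """For each group of perturbation features and each pseudo sample, copy the
    input row and overwrite the features in the group with the pseudo sample's values."""
    perturbed: List[List[float]] = []
    for group in feature_groups:
        for sample in pseudo_samples:
            row = list(input_row)          # features outside the group stay unchanged
            for index in group:
                row[index] = sample[index]
            perturbed.append(row)
    return perturbed


x = [400, 25000, 48000]                            # input data (illustrative values)
pseudo = [[650, 8000, 10000], [720, 3000, 0]]      # hypothetical pseudo samples
groups = [(0,), (1, 2)]                            # two groups of perturbation features
result = perturb(x, pseudo, groups)
print(result)
```

Each group of perturbation features yields one perturbed row per pseudo sample, so two groups and two pseudo samples give four perturbation data.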

Optionally, the data perturbation unit 502:

and learning the data distribution of the samples by adopting a generative model, and generating the plurality of pseudo samples.

Optionally, the data perturbation unit 502:

acquiring the disturbance quantity of the features, that is, the number of features to be perturbed in each group;

and traversing every feature combination whose size equals the disturbance quantity, and determining each such feature combination as a group of disturbance features.

Optionally, the disturbance quantity is less than the number of features of the input data.
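As a non-limiting illustration, the traversal of feature combinations may be sketched with the standard-library function `itertools.combinations`. The function name `perturbation_groups` is an assumption, and enforcing the optional constraint (fewer perturbed features than total features) as an error is a design choice of this sketch only.

```python
from itertools import combinations
from typing import List, Tuple


def perturbation_groups(num_features: int, num_perturbed: int) -> List[Tuple[int, ...]]:
    """Enumerate every combination of feature indices of size num_perturbed."""
    if not 0 < num_perturbed < num_features:
        # Optional constraint: perturb fewer features than the input data has.
        raise ValueError("num_perturbed must be positive and less than num_features")
    return list(combinations(range(num_features), num_perturbed))


groups = perturbation_groups(num_features=4, num_perturbed=2)
print(groups)
```

The number of groups grows combinatorially with the number of features, so a small disturbance quantity keeps the number of perturbation data manageable.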

Optionally, the feature determining unit 505:

for each feature, calculating a feature difference of the counterexample disturbance data and the input data;

and determining the feature whose feature difference satisfies the difference condition as the interpretation feature.

Optionally, the feature determining unit 505:

for each feature, calculating the value difference between the feature value of the counter-example disturbance data and the feature value of the input data;

for each feature, counting the number of counter-example disturbance data whose feature value differs from the feature value of the input data;

and determining the feature difference according to the value difference and the number.
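As a non-limiting illustration, one possible way of combining the value difference and the number into a feature difference is sketched below. The text above does not fix the combination rule; multiplying the mean relative value difference by the count, as well as the function name `feature_differences` and the sample values, are assumptions of this sketch.

```python
from typing import List, Sequence


def feature_differences(
    input_row: Sequence[float],
    counterexamples: Sequence[Sequence[float]],
) -> List[float]:
    """Score each feature by (mean relative value difference) * (number of
    counter-examples whose value differs from the input). Assumed rule."""
    scores: List[float] = []
    for j, x_j in enumerate(input_row):
        scale = abs(x_j) if x_j != 0 else 1.0      # avoid division by zero
        rel_diffs = [abs(row[j] - x_j) / scale for row in counterexamples]
        count = sum(1 for d in rel_diffs if d > 0)  # counter-examples that changed feature j
        mean_diff = sum(rel_diffs) / len(rel_diffs) if rel_diffs else 0.0
        scores.append(mean_diff * count)
    return scores


x = [400, 25000]
counterexamples = [[650, 25000], [700, 8000]]
scores = feature_differences(x, counterexamples)
print(scores)
```

In this toy data the first feature differs in both counter-examples while the second differs in only one, so the first feature receives the larger feature difference.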

Optionally, the feature determining unit 505:

and sorting the features in descending order of feature difference, and selecting a preset number of top-ranked features as the interpretation features.
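As a non-limiting illustration, the ranking and selection may be sketched as follows. The function name `top_k_features` and the score values are hypothetical.

```python
from typing import Dict, List


def top_k_features(feature_difference: Dict[str, float], k: int) -> List[str]:
    """Sort features by feature difference, highest first, and keep the first k."""
    ranked = sorted(feature_difference.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical feature differences for four features.
differences = {"f1": 1.375, "f2": 0.34, "f3": 0.9, "f4": 0.05}
interpretation_features = top_k_features(differences, k=3)
print(interpretation_features)
```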

Optionally, the data perturbation unit 502:

judging whether the input data prediction result is a preset prediction result to be explained;

and if so, executing a step of performing disturbance processing on the input data.

Optionally,

the input data is characteristic data of the entity object;

the input data prediction result is a classification result related to the entity object.

The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.

For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in the specification. One of ordinary skill in the art can understand and implement it without inventive effort.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.

In correspondence with the foregoing embodiment of the method for interpreting a model prediction result, the present specification also provides an apparatus for interpreting a model prediction result, the apparatus including: a processor and a memory for storing machine executable instructions. Wherein the processor and the memory are typically interconnected by means of an internal bus. In other possible implementations, the device may also include an external interface to enable communication with other devices or components.

In this embodiment, the processor is caused to:

Optionally, when the input data is subjected to perturbation processing to obtain a plurality of perturbation data, the processor is caused to:

obtaining samples for training the target model;

Optionally, when generating a number of pseudo samples conforming to a sample data distribution, the processor is caused to:

Optionally, in determining sets of perturbation characteristics for the input data, the processor is caused to:

acquiring the disturbance quantity of the features;

Optionally, when determining a number of interpretation features for the input data based on the difference between the counter-example perturbation data and the input data, the processor is caused to:

Optionally, in calculating the feature difference between the counterexample disturbance data and the input data, the processor is caused to:

Optionally, when determining a feature whose feature difference satisfies a difference condition as the interpretation feature, the processor is caused to:

Optionally, the processor is further caused to:

The input data is characteristic data of the entity object;

In correspondence with the aforementioned embodiment of the interpretation method of the model prediction result, the present specification also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of:

a method of interpreting model predictions, comprising:

Optionally, the performing disturbance processing on the input data to obtain a plurality of disturbance data includes:

obtaining samples for training the target model;

Optionally, the generating a plurality of pseudo samples conforming to the sample data distribution includes:

Optionally, the determining a plurality of sets of disturbance characteristics for the input data includes:

acquiring the disturbance quantity of the features;

Optionally, determining a plurality of interpretation features for the input data according to the difference between the counter-example perturbation data and the input data includes:

Optionally, the calculating the feature difference between the counterexample disturbance data and the input data includes:

Optionally, the determining, as the interpretation feature, the feature whose feature difference satisfies the difference condition includes:

Optionally, the method further includes:

Optionally, the input data is feature data of an entity object;

The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The above description is only a preferred embodiment of the present disclosure, and should not be taken as limiting the present disclosure, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims

1. A method of interpreting model predictions, comprising:

2. The method according to claim 1, wherein the perturbing the input data to obtain a plurality of perturbed data comprises:

obtaining samples for training the target model;

3. The method of claim 2, said generating a number of pseudo samples conforming to a sample data distribution, comprising:

4. The method of claim 2, the determining sets of perturbation characteristics for the input data comprising:

acquiring the disturbance quantity of the features;

5. The method of claim 4, wherein

the disturbance quantity is less than the number of features of the input data.

6. The method of claim 1, the determining, for the input data, a number of interpretation features based on the difference of the counter-example perturbation data and the input data, comprising:

7. The method of claim 6, the calculating a feature difference of the counter-example perturbation data and the input data, comprising:

8. The method of claim 7, the determining the feature whose feature difference satisfies a difference condition as the interpretation feature, comprising:

9. The method of claim 1, further comprising:

10. The method of claim 1, wherein

the input data is characteristic data of the entity object;

11. An apparatus for interpreting model predictions, comprising:

12. The apparatus of claim 11, the data perturbation unit to:

obtaining samples for training the target model;

13. The apparatus of claim 12, the data perturbation unit to:

14. The apparatus of claim 12, the data perturbation unit to:

acquiring the disturbance quantity of the features;

15. The apparatus of claim 14, wherein

16. The apparatus of claim 11, the feature determination unit to:

17. The apparatus of claim 16, the feature determination unit to:

18. The apparatus of claim 17, the feature determination unit to:

19. The apparatus of claim 11, the data perturbation unit to:

20. The apparatus of claim 11, wherein

the input data is characteristic data of the entity object;

21. An apparatus for interpreting model predictions, comprising:

a processor;

a memory for storing machine executable instructions;