WO2023208383A1

WO2023208383A1 - Collective feedback-based decision on model prediction quality

Info

Publication number: WO2023208383A1
Application number: PCT/EP2022/061605
Authority: WO
Inventors: Divyasheel SHARMA; Gayathri GOPALAKRISHNAN; Benjamin KLOEPPER; Joakim ASTROM; Benedikt Schmidt; Yemao MAN; Dawid ZIOBRO; Arzam Muzaffar Kotriwala; Marcel Dix
Original assignee: Abb Schweiz Ag
Priority date: 2022-04-29
Filing date: 2022-04-29
Publication date: 2023-11-02
Also published as: WO2023209185A1

Abstract

A method for deciding on a model result quality based on a collective feedback of a plurality of users, comprising: providing a result of the model to the plurality of users; receiving feedback on the result of the model from the plurality of users; determining a collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of users; wherein, when the collective feedback indicates a positive consensus on the received feedback of the result of the model from the plurality of users, classifying the model as satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback of the result of the model from the plurality of users, classifying the model as non-satisfactory; wherein, when the collective feedback on the result of the model from the plurality of users indicates a non-consensus on the received feedback of the result of the model from the plurality of users, progressively providing further parameters instigating the plurality of users to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users.

Description

COLLECTIVE FEEDBACK-BASED DECISION ON MODEL PREDICTION QUALITY

TECHNICAL FIELD

The present disclosure relates to a method for deciding on a model result quality based on a collective feedback of a plurality of users and to a system for deciding on a model result quality based on a collective feedback of a plurality of users.

TECHNICAL BACKGROUND

The general background of this disclosure is support for reaching a decision on model result quality based on collective feedback. To improve trust in a result of a model, the model users must agree with the result of the model. Hence, model scientists should incorporate the opinions of the model users to train robust models that deliver improved model results.

Typically, none or only a small number of the widely differing opinions of the model users are or can be used by the model scientists for improving the model, because only a small number of these opinions are precise and detailed enough. Further, the opinions of the model users mostly differ completely from one another, such that it is difficult for the model scientist to recognize how to improve the model and which opinion is more decisive, and in particular to find a consensus among all opinions. Therefore, model scientists face the problem that the few received opinions are mostly not usable, understandable or comparable, such that a model can only be optimized based on a few received opinions that the model scientist has to select in a complex manner. This significantly weakens and prevents a comprehensive and sustainable improvement of the models.

Hence, there is a need for an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions such that a model scientist can improve the models significantly.

SUMMARY OF THE INVENTION

In one aspect of the invention, a method for deciding on a model result quality based on a collective feedback of a plurality of users is presented, comprising: providing a result of the model to the plurality of users; receiving feedback on the result of the model from the plurality of users; determining a collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of users; wherein, when the collective feedback indicates a positive consensus on the received feedback of the result of the model from the plurality of users, classifying the model as satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback of the result of the model from the plurality of users, classifying the model as non-satisfactory; wherein, when the collective feedback on the result of the model from the plurality of users indicates a non-consensus on the received feedback of the result of the model from the plurality of users, progressively providing further parameters instigating the plurality of users to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users.

The term model as used herein is to be understood broadly and represents any system or algorithm that calculates, determines and/or modulates data. The model may be a machine learning model. The model may be a model in a training phase or a model deployed in production.

The term user as used herein is to be understood broadly and represents any person who uses, programs, and/or provides the model. In particular, the user may be a domain expert, a plant operator, a field engineer but is not limited thereto.

The term feedback as used herein is to be understood broadly and represents any opinion, belief or view of the user with respect to the model, but is not limited thereto. Further, the term feedback may be the transmission of evaluative or corrective information about a model, action, event, or process to the original or controlling source. The feedback may be positive or negative, wherein a positive feedback indicates an agreement and is described by “1” and a negative feedback indicates a disagreement and is described by “0”.

The term result of the model as used herein is to be understood broadly and represents any result provided by a model. For example, the result of the model is a prediction, but is not limited thereto.

The term receiving as used herein is to be understood broadly and represents any action for providing, collecting, and/or recording the feedback from the user.

The term providing as used herein is to be understood broadly and represents any action for showing, depicting or presenting results from a model to the plurality of users. The results can be provided via a user interface, a monitor, a display or a touchscreen, but the providing is not limited thereto.

The term model result quality as used herein is to be understood broadly and represents any information or data indicating the validity and the correctness of the model results with respect to a real occurrence or a predefined state.

The term collective feedback as used herein is to be understood broadly and represents any summary or aggregation of the feedback on the result of the model from the plurality of users. The collective feedback may be provided as a single output, or as a plurality of outputs.

The term determining as used herein is to be understood broadly and represents any calculation or determination of the collective feedback. The determination may include allocating, averaging or statistical analysis, but is not limited thereto. Alternatively, the determination may be a calculation with a tuple, in particular a tuple <a, d, u> (a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement; satisfying a + d + u = 1 and a, d, u ∈ [0, 1]). When using a tuple, the determining uses a mathematical operator described by the following formula for determining the collective feedback:

The term consensus as used herein is to be understood broadly and represents the output of the collective feedback determination. Therefore, the consensus may be the judgment arrived at by the plurality of users or the collective opinion of the plurality of users. The consensus may be positive or negative. A positive consensus indicates that the plurality of users agree with the result of the model. A negative consensus indicates that the plurality of users disagree with the result of the model. The term non-consensus indicates the opposite of the term consensus.

The term satisfactory as used herein is to be understood broadly and represents that the results of the model are good such that the quality of the results of the model is satisfactory. The term non-satisfactory represents the opposite of the term satisfactory, i.e. the results of the model are bad such that the quality of the results of the model is non-satisfactory.

The term parameter as used herein is to be understood broadly and represents any further explanation instigating the plurality of users to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users.

The determination of the collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of users leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions. Further, the progressive providing of further parameters leads to an easy, less complex and time-saving way to find or provide a consensus on the received feedback of the result of the model from the plurality of users such that a model scientist can easily process all of these opinions to significantly improve the models.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the feedback on the result of the model from the plurality of users comprises a positive feedback indicating an agreement with the result of the model and a negative feedback indicating a disagreement with the result of the model.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, when the collective feedback on the result of the model from the plurality of users indicates a non-consensus on the received feedback of the result of the model from the plurality of users, progressively providing a further parameter comprises the following steps: providing at least one first model explanation to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the first model explanation based on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users.

The progressive providing of at least one first model explanation, i.e. a further parameter, the receiving of feedback on the first model explanation and the determination of the collective feedback on the first model explanation provide a further progressive step which leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions, in particular to find a consensus, such that a model scientist can improve the models significantly.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, progressively providing a further parameter comprises the following steps: providing at least one second model explanation to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the second model explanation based on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users.

The progressive providing of at least one second model explanation, the receiving of feedback on the second model explanation and the determination of the collective feedback on the second model explanation provide a further progressive step which leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions, i.e. to find a consensus, such that a model scientist can improve the models significantly.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the method further comprises: when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, repeating the following steps until the consensus on the result of the model is provided: providing at least one other model explanation to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users.

The term repeating as used herein is to be understood broadly and represents any replication of one or a plurality of method steps in the same or in an alternative order. The repeating is not an infinite cycle, because there is only a finite set of explanations available for the model, such that the system, in particular the user interface, is only able to show a limited number of explanations to provide a possibility for achieving consensus.
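The repetition described above can be illustrated by a minimal sketch that walks through a finite list of explanations and stops at the first one on which the users reach a consensus. The function name `seek_consensus` and the two callbacks `collect_feedback` and `has_consensus` are hypothetical placeholders and are not taken from the disclosure; how feedback is collected and how consensus is judged is left to the caller.

```python
from typing import Callable, Optional, Sequence, TypeVar

E = TypeVar("E")  # type of an explanation (assumption: any object)
F = TypeVar("F")  # type of the collective feedback (assumption: any object)


def seek_consensus(
    explanations: Sequence[E],
    collect_feedback: Callable[[E], F],
    has_consensus: Callable[[F], bool],
) -> Optional[E]:
    """Show explanations one after another until consensus is reached.

    Returns the explanation that produced a consensus, or None when the
    finite list is exhausted. The loop always terminates because the
    list of explanations is finite.
    """
    for explanation in explanations:
        # Hypothetical callback: present the explanation and gather feedback.
        feedback = collect_feedback(explanation)
        # Hypothetical callback: decide whether the feedback is a consensus.
        if has_consensus(feedback):
            return explanation
    return None
```

Returning `None` leaves the follow-up decision, for example a check of the explanations themselves, to the calling code.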

The progressive providing of at least one other model explanation, the receiving of feedback on the other model explanation and the determination of the collective feedback on the other model explanation provide a further progressive step which leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions such that a model scientist can improve the models significantly.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the method further comprises: when the plurality of users are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, checking for a lack in the first model explanation, the second model explanation and/or the other model explanations, wherein, when a lack is identified, classifying the result of the model as non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model, wherein, when a lack is not identified, repeating the following steps: providing at least one other model explanation to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users.

The term domain authority as used herein is to be understood broadly and represents any person or company which owns the model or is responsible for the model. For example, the domain authority may be the provider of the model or the programmer or software engineer of the model, but is not limited thereto. Further, the domain authority may be, and preferably is, one or more subject matter experts who understand the result, e.g. the predictions, of the model.

The term lack as used herein is to be understood broadly and represents any mistake, error, inconsistency or lack of knowledge identified in the first model explanation, the second model explanation and/or the other model explanation that makes it impossible for the plurality of users to reach a consensus on the first model explanation, the second model explanation and/or the other model explanation.
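The branch taken when the users cannot agree on the explanations shown so far can be sketched as follows. This is a minimal sketch only: `NextStep`, `after_failed_consensus` and the `has_lack` callback are hypothetical names, and the disclosure leaves open how a lack is actually detected.

```python
from enum import Enum
from typing import Callable, Iterable


class NextStep(Enum):
    # A lack was found: the result is classified as non-satisfactory and the
    # model is rejected or referred back to the domain authority.
    CLASSIFY_NON_SATISFACTORY = "classify as non-satisfactory"
    # No lack was found: another model explanation is provided.
    PROVIDE_OTHER_EXPLANATION = "provide another model explanation"


def after_failed_consensus(
    shown_explanations: Iterable[str],
    has_lack: Callable[[str], bool],
) -> NextStep:
    """Decide how to continue after no consensus was reached.

    `has_lack` is an assumed callback that reports whether an explanation
    contains a mistake, error, inconsistency or lack of knowledge.
    """
    if any(has_lack(explanation) for explanation in shown_explanations):
        return NextStep.CLASSIFY_NON_SATISFACTORY
    return NextStep.PROVIDE_OTHER_EXPLANATION
```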

The checking for a lack in the first model explanation, the second model explanation and/or the other model explanation enables an easy, less complex and time-saving way to classify results of a model as non-satisfactory.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the first model explanation, the second model explanation, and/or the other model explanation comprises at least one variable of the model.

The term variable of the model as used herein is to be understood broadly and represents any variable or parameter being able to control, influence, amend, manipulate the behavior, respectively the results, of the model.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the first model explanation, the second model explanation, and/or the other model explanation further comprises a relevance-indicating-element for each one of the at least one variable of the model and/or a request-element for requesting feedback, in particular feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation and/or feedback on the other model explanation, from another user of the plurality of users.

The term relevance-indicating-element as used herein is to be understood broadly and represents any element indicating the importance/relevance of each one of the at least one variable for the model. Further the relevance-indicating-element may indicate the impact of a variable of the model on the model.

The term request-element as used herein is to be understood broadly and represents any element by which a user requests the system to show the feedback of other users. In other words, the system shares the feedback of one user with other users.
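One possible way to represent such an explanation in software is sketched below. The class and field names (`VariableRelevance`, `ModelExplanation`, `peer_feedback_requested`) and the numeric relevance scale are assumptions made for illustration; the disclosure does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class VariableRelevance:
    """A model variable together with its relevance-indicating-element."""
    variable: str
    relevance: float  # assumed scale: a larger value means more relevant


@dataclass
class ModelExplanation:
    """An explanation made of model variables plus a request-element flag."""
    variables: List[VariableRelevance] = field(default_factory=list)
    peer_feedback_requested: bool = False  # state of the request-element

    def ranked(self) -> List[VariableRelevance]:
        """Return the variables ordered from most to least relevant."""
        return sorted(self.variables, key=lambda v: v.relevance, reverse=True)

    def request_peer_feedback(self) -> None:
        """Record that the user asked to see the feedback of other users."""
        self.peer_feedback_requested = True
```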

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the method further comprises: when the plurality of users update their feedback such that a consensus on the result of the model is provided, storing the metadata for the first model explanation, the second model explanation and/or the other model explanation.

The term metadata as used herein is to be understood broadly and represents any data that is essential for the model. For example, the metadata may include data with respect to the variables of the model as described above.

The term storing as used herein is to be understood broadly and represents any saving, exporting or transmitting of data or metadata to a storage means such as a memory, a cloud or a database, but is not limited thereto.

The storing of the metadata for the first model explanation, the second model explanation and/or the other model explanation leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions, in particular to provide the users in progressive steps with the parameters that most effectively instigate them to update their given feedback so that a consensus is found easily. Hence, the model scientist can easily process all of these opinions to significantly improve the models.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the feedback on the result of the model, the feedback on the first model explanation, the feedback on the second model explanation, and/or the feedback on the other model explanation comprises a tuple, in particular a 3-tuple, including the degree of agreement, the degree of disagreement, and the uncertainty in assigning an agreement or a disagreement.

The term degree of agreement as used herein is to be understood broadly and represents the degree of agreement or accordance between the opinion of the plurality of users and the results of the model. The degree of agreement may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto.

The term degree of disagreement as used herein is to be understood broadly and represents the degree of disagreement or discordance between the opinion of the plurality of users and the results of the model. The degree of disagreement may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto.

The term uncertainty as used herein is to be understood broadly and represents the degree of uncertainty of the plurality of users in assigning their opinions to an agreement or a disagreement. The uncertainty may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto.

The detailed indication of the degree of agreement, the degree of disagreement and the uncertainty leads to an easy, less complex and fast way to instigate the plurality of users to update their given feedback for finding a consensus such that a model scientist can significantly improve the model.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the collective feedback on the result of the model is determined by averaging the feedback on the result of the model from the plurality of users.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the collective feedback on the first model explanation, the second model explanation, and/or the other model explanation is determined by averaging the feedback or by a mathematical operator.

The term mathematical operator as used herein is to be understood broadly and represents a determination of the collective feedback by a tuple, in particular a tuple <a, d, u> (a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement; satisfying a + d + u = 1 and a, d, u ∈ [0, 1]). When using a tuple, the determining uses a mathematical operator described by the following formula for determining the collective feedback:

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the plurality of users is at least two users.

In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of users, the model is a machine learning model.

In a further aspect, a system for deciding on a model result quality based on collective feedback of a plurality of users is presented, the system comprising: a user interface for providing a result of the model, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of users, and for receiving a feedback on the result of the model, a feedback on the first model explanation, a feedback on the second model explanation, and/or a feedback on the other model explanation from the plurality of users; and a processor for executing the above-described method.

Any disclosure and embodiments described herein relate to the method and the system outlined above, and vice versa. Advantageously, the benefits provided by any of the embodiments and examples equally apply to all other embodiments and examples, and vice versa.

As used herein, “determining” also includes “initiating or causing to determine”, “generating” also includes “initiating or causing to generate”, and “providing” also includes “initiating or causing to determine, generate, select, send or receive”. “Initiating or causing to perform an action” includes any processing signal that triggers a computing device to perform the respective action.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following, the present disclosure is further described with reference to the enclosed figures:

Fig. 1 illustrates a flow diagram of a method for deciding on a model result quality based on a collective feedback of a plurality of users;

Fig. 2 illustrates an example embodiment of a system for deciding on a model result quality based on collective feedback of a plurality of users.

DETAILED DESCRIPTION OF EMBODIMENT

The following embodiments are mere examples for the method and the system disclosed herein and shall not be considered limiting.

Fig. 1 illustrates a flow diagram of a method for deciding on a model result quality based on a collective feedback of a plurality of users. In a first step, a result of the model, in particular a prediction of the model, is provided to the plurality of users. The result is provided to the plurality of users via a user interface. In a second step, feedback on the result of the model, i.e. the prediction, is received from the plurality of users. The feedback is received from the plurality of users via the same or another user interface. The feedback of each user is described as “0” for a disagreement of the user with the result of the model and as “1” for an agreement of the user with the result of the model.

In a third step, a collective feedback of the result of the model, i.e. the prediction, is determined based on the received feedback on the result of the model from the plurality of users. The collective feedback is determined by averaging the received feedback, i.e. the feedback values “0” and “1”.
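As a minimal sketch of this averaging step, the binary feedback values can be averaged and the average mapped to one of three outcomes: an average of 1 as positive consensus, an average of 0 as negative consensus, and any value in between as non-consensus. The names `Outcome`, `collective_feedback` and `classify` are illustrative assumptions and do not come from the disclosure.

```python
from enum import Enum
from typing import Sequence


class Outcome(Enum):
    SATISFACTORY = "positive consensus"
    NON_SATISFACTORY = "negative consensus"
    NON_CONSENSUS = "non-consensus"


def collective_feedback(votes: Sequence[int]) -> float:
    """Average binary feedback values (1 = agreement, 0 = disagreement)."""
    if len(votes) < 2:
        raise ValueError("a plurality of users means at least two votes")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("each vote must be 0 or 1")
    return sum(votes) / len(votes)


def classify(votes: Sequence[int]) -> Outcome:
    """Map the averaged feedback to one of the three outcomes."""
    average = collective_feedback(votes)
    if average == 1:
        # All users agree with the result of the model.
        return Outcome.SATISFACTORY
    if average == 0:
        # All users disagree with the result of the model.
        return Outcome.NON_SATISFACTORY
    # Mixed feedback: further parameters would be provided.
    return Outcome.NON_CONSENSUS
```

Comparing the average with exactly 0 and 1 is safe here because the sum of integer votes divided by their count yields exactly 0.0 or 1.0 when all votes are identical.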

When the collective feedback indicates a positive consensus, i.e. f(average prediction) = 1, on the received feedback of the result of the model from the plurality of users, the model is classified as satisfactory, i.e. the quality of the result of the model is good because all users agree with the result of the model. In this case, no further parameters have to be provided.

When the collective feedback indicates a negative consensus, i.e. f(average prediction) = 0, on the received feedback of the result of the model from the plurality of users, the model is classified as non-satisfactory, i.e. the quality of the result of the model is bad because all users disagree with the result of the model. In this case, the model classified as non-satisfactory can be sent to the domain authority for a countercheck or can be discarded due to poor results or poor model quality.

When the collective feedback on the result of the model from the plurality of users indicates a non-consensus, i.e. 0 < f(average prediction) < 1, on the received feedback of the result of the model from the plurality of users, further parameters are progressively provided for instigating the plurality of users to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback of the result of the model from the plurality of users. The progressive providing of further parameters comprises the following sub steps.

In a first sub step, at least one first model explanation, i.e. a further parameter, is provided to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model. The first model explanation is provided to the plurality of users via a user interface.

In a second sub step, feedback on the first model explanation is received from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model. The feedback is received from the plurality of users via the same or another user interface. The feedback on the first model explanation is a tuple, in particular a 3-tuple or triple, including the degree of agreement, the degree of disagreement, and the uncertainty.

Feedback on explanation (η) = a tuple <a, d, u>, where a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement, satisfying a + d + u = 1 and a, d, u ∈ [0, 1].
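The three-way decision on the averaged prediction feedback and the constraints on the tuple <a, d, u> can be illustrated by a minimal Python sketch. It assumes that each user's feedback on the result is a binary vote (1 for agreement, 0 for disagreement) and that f(average prediction) is the plain arithmetic mean of these votes. The names `Consensus`, `classify_prediction_feedback` and `ExplanationFeedback` are hypothetical and not part of the disclosure.

```python
from dataclasses import dataclass
from enum import Enum
from math import isclose
from typing import Sequence


class Consensus(Enum):
    POSITIVE = "satisfactory"
    NEGATIVE = "non-satisfactory"
    NONE = "non-consensus"


def classify_prediction_feedback(votes: Sequence[int]) -> Consensus:
    """Average binary votes (assumed: 1 = agree, 0 = disagree) and classify."""
    if len(votes) < 2:
        raise ValueError("a plurality of users requires at least two votes")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("each vote must be 0 or 1")
    average = sum(votes) / len(votes)
    if average == 1:
        return Consensus.POSITIVE  # all users agree with the result
    if average == 0:
        return Consensus.NEGATIVE  # all users disagree with the result
    return Consensus.NONE          # strictly between 0 and 1


@dataclass(frozen=True)
class ExplanationFeedback:
    a: float  # degree of agreement with the explanation
    d: float  # degree of disagreement with the explanation
    u: float  # uncertainty in assigning agreement or disagreement

    def __post_init__(self) -> None:
        for name, value in (("a", self.a), ("d", self.d), ("u", self.u)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        # Tolerance absorbs floating-point rounding in a + d + u.
        if not isclose(self.a + self.d + self.u, 1.0, abs_tol=1e-9):
            raise ValueError("a + d + u must equal 1")
```

Only the non-consensus outcome would trigger the progressive providing of explanations; the two consensus outcomes end the procedure directly.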

Alternatively, instead of using a 3-tuple/triple for providing feedback, the feedback can also be provided by using probability theory and uncertainty. In a third sub step, a collective feedback on the first model explanation is determined based on the received feedback on the first model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The collective feedback on the first model explanation is determined by a mathematical operator. The operator may be:

As an example, if a first user agrees with the result of the model (prediction) and largely with the first model explanation, but is not sure about the explanation because of a lack of visibility into the complete workings of the model, the first user might provide the feedback F_first user = {1, <0.7, 0.2, 0.1>}. As a further example, if a second user also agrees with the prediction but not quite with the first model explanation, the second user (the engineer) provides the feedback F_second user = {1, <0.5, 0.4, 0.1>}. When the collective feedback is determined by using the above-described mathematical operator on the feedback of the first user and the feedback of the second user, the collective feedback from the first user and the second user on predictions is 1, and the collective feedback on explanations is η_first user ∧ η_second user = <0.35, 0.52, 0.13>. Therefore, the collective feedback is interpreted as a disagreement, because the degree of disagreement on the explanation has the largest value, i.e., 0.52. The collective feedback may be calculated once feedback has been received from everyone, or once one or more feedbacks have been received within an allocated time frame.

When the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update the feedback that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
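The operator itself is not reproduced in this text. As an illustration only, the following Python sketch shows one conjunction-style operator that happens to reproduce the worked example of the first and second user: agreement is multiplied, disagreement is combined as d1 + d2 - d1·d2, and uncertainty is a1·u2 + u1·a2 + u1·u2. This form is an assumption inferred from the example numbers, not a statement of the operator used in the disclosure; the names `combine` and `interpret` are hypothetical.

```python
from typing import Tuple

Feedback = Tuple[float, float, float]  # (a, d, u) with a + d + u = 1


def combine(x: Feedback, y: Feedback) -> Feedback:
    """Assumed conjunction of two feedback tuples (inferred from the example)."""
    a1, d1, u1 = x
    a2, d2, u2 = y
    a = a1 * a2                      # both users must agree
    d = d1 + d2 - d1 * d2            # at least one user disagrees
    u = a1 * u2 + u1 * a2 + u1 * u2  # remaining mass, so a + d + u stays 1
    return a, d, u


def interpret(feedback: Feedback) -> str:
    """Label the tuple by its largest component (ties resolve to the first)."""
    labels = ("agreement", "disagreement", "uncertainty")
    return labels[max(range(3), key=lambda i: feedback[i])]


first_user = (0.7, 0.2, 0.1)
second_user = (0.5, 0.4, 0.1)
collective = combine(first_user, second_user)
print(tuple(round(v, 2) for v in collective), interpret(collective))
```

Under this assumed operator the components of the result still sum to 1, since (a1 + u1)(a2 + u2) = (1 - d1)(1 - d2).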

Optionally, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, a further parameter is progressively provided, wherein the providing comprises the following sub steps. In a first sub step, at least one second model explanation is provided to the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The at least one second model explanation is provided to the plurality of users via a user interface. In a second sub step, feedback on the second model explanation is received from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The feedback is received from the plurality of users via the same or another user interface. In a third sub step, a collective feedback on the second model explanation is determined based on the received feedback on the second model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The collective feedback is determined by the above-described mathematical operator. When the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update the feedback that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.

Optionally, when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the following sub steps are repeated until the consensus on the result of the model is reached. In a first sub step, at least one other model explanation is provided to the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The at least one other model explanation is provided to the plurality of users via a user interface. In a second sub step, feedback on the other model explanation is received from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The feedback is received from the plurality of users via the same or another user interface. In a third sub step, a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The collective feedback is determined by the above-described mathematical operator. When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update the feedback that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users. This cycle may repeat until agreement is achieved. However, this is not an infinite cycle, because only a finite set of explanations is available for a model, or only a limited number of explanations is shown to provide a possibility for achieving consensus.

Alternatively or additionally, apart from a user’s own professional experience, other further parameters can be used as explanations. Exemplary further parameters serving as explanations may be other modes that users may use to try to become convinced of the model prediction. Such further parameters are: the user may ask for the importance of a variable other than the one for which importance is shown, wherein the method may show the impact of that variable in a visualization (e.g., a SHAP values plot); a scenario evaluation may be performed, wherein the user chooses multiple input variables and the system/method generates visualizations (e.g., partial dependence plots and SHAP summary plots) for them; the users may request the system/method to show other users’ feedback (in this case, the system/method shares each user’s feedback with the others) such that the users may change their feedback on the prediction or explanation after reviewing the others’ feedback; or the users may contest each other’s feedback and together come to a consensus on the feedback.
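The repeat-until-consensus cycle over a finite set of explanations described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: `collect`, `combine` and `is_consensus` are hypothetical callbacks standing in for the user interface, the mathematical operator and the consensus criterion, none of which is fixed by this sketch.

```python
from functools import reduce
from typing import Callable, Iterable, Optional, Sequence, Tuple

Feedback = Tuple[float, float, float]  # (a, d, u)


def seek_consensus(
    explanations: Sequence[str],
    collect: Callable[[str], Iterable[Feedback]],
    combine: Callable[[Feedback, Feedback], Feedback],
    is_consensus: Callable[[Feedback], bool],
) -> Optional[Tuple[str, Feedback]]:
    """Show explanations one by one until the collective feedback is a consensus.

    Returns the explanation that produced consensus together with the
    collective tuple, or None when the finite list is exhausted.
    """
    for explanation in explanations:
        feedbacks = list(collect(explanation))
        if not feedbacks:
            continue  # no feedback arrived for this explanation
        collective = reduce(combine, feedbacks)
        if is_consensus(collective):
            return explanation, collective
    return None  # finite list exhausted, so the loop always terminates
```

Because the loop iterates over a finite sequence, termination is guaranteed, which mirrors the statement that the cycle is not infinite. A `None` result would correspond to the case in which the explanations are checked for a lack.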

Optionally, when the plurality of users are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, the method comprises the step of checking for a lack in the first model explanation, the second model explanation and/or the other model explanations. Users might be unable to reach a consensus on explanations (i.e., the method is unable to gain an agreement on explanations) for the following reasons: the users might be undecided, there might be a lack of knowledge, the explanation did not increase the collective confidence of the users, or the explanation did not help at all. The users being undecided is indicated when the collective feedback tuple may be <0.5, 0.0, 0.5>, <0.0, 0.5, 0.5>, <0.5, 0.5, 0.0> or <0.0, 0.0, 1.0>. A lack of knowledge is indicated when the collective feedback tuple may be <0.5, 0.0, 0.5> or <0.0, 0.5, 0.5>. That the explanations did not increase the collective confidence of the users in the result of the model is indicated when the collective feedback tuple may be <0.5, 0.5, 0.0>. That the explanations did not help at all is indicated when the collective feedback tuple may be <0.0, 0.0, 1.0>. When a lack is identified, the result of the model is classified as non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model. When a lack is not identified, the following sub steps are progressively repeated. In a first sub step, at least one other model explanation is provided to the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. In a second sub step, feedback on the other model explanation is received from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. In a third sub step, a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The collective feedback is determined by the above-described mathematical operator. When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model such that the plurality of users update the feedback that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
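The check for a lack described above can be sketched as a lookup from the listed collective feedback tuples to the reasons they indicate. The sketch assumes that a lack is identified only when the collective tuple matches one of the four listed tuples within a small tolerance; the text gives these tuples as examples and does not specify a matching rule. `LACK_PATTERNS` and `identify_lack` are hypothetical names.

```python
from math import isclose
from typing import Dict, FrozenSet, Tuple

Feedback = Tuple[float, float, float]  # (a, d, u)

# Tuples and reasons as listed in the description; every tuple also counts
# as "undecided".
LACK_PATTERNS: Dict[Feedback, FrozenSet[str]] = {
    (0.5, 0.0, 0.5): frozenset({"undecided", "lack of knowledge"}),
    (0.0, 0.5, 0.5): frozenset({"undecided", "lack of knowledge"}),
    (0.5, 0.5, 0.0): frozenset({"undecided", "no gain in collective confidence"}),
    (0.0, 0.0, 1.0): frozenset({"undecided", "explanation did not help"}),
}


def identify_lack(collective: Feedback, tol: float = 1e-9) -> FrozenSet[str]:
    """Return the reasons indicated by the collective tuple, or an empty set."""
    for pattern, reasons in LACK_PATTERNS.items():
        if all(isclose(x, p, abs_tol=tol) for x, p in zip(collective, pattern)):
            return reasons
    return frozenset()  # no lack identified: continue with another explanation
```

A non-empty result would lead to the model being classified as non-satisfactory, while an empty result would lead to a further explanation being provided.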

Alternatively or additionally, when feedback from the users turns out to be “harmful” for the model, i.e., the result of the model degrades, this harmful feedback may be discarded and transparently shared with the users who provided it for root-cause analysis and agreement. Optionally, when the plurality of users update their feedback such that a consensus on the result of the model is reached, the metadata for the first model explanation, the second model explanation and/or the other model explanation are stored in a memory.

Fig. 2 illustrates an example embodiment of a system for deciding on a model result quality based on collective feedback of a plurality of users. The system 10 comprises a user interface 11 configured for providing a result of the model, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of users. Further, the user interface 11 is configured for receiving a feedback on the result of the model, a feedback on the first model explanation, a feedback on the second model explanation, and/or a feedback on the other model explanation from the plurality of users. The user interface 11 is a touchscreen able to present information to a user and to receive information from a user. The system 10 further comprises a processor 12 for executing the method described with reference to Figure 1. The processor 12 and the user interface 11 are connected to each other wirelessly or by wire in order to provide a data exchange between the user interface 11 and the processor 12. Further, the system 10 comprises a memory, not depicted, for storing the metadata for the first model explanation, the second model explanation and/or the other model explanation when the plurality of users update their feedback such that a consensus on the result of the model is reached.

The present disclosure has been described in conjunction with preferred embodiments as examples. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed invention, from a study of the drawings, this disclosure and the claims. Notably, the steps presented can be performed in any order, i.e. the present invention is not limited to a specific order of these steps. Moreover, it is also not required that the different steps are performed at a certain place or at one node of a distributed system, i.e. each of the steps may be performed at a different node using different equipment/data processing units.

In the claims as well as in the description, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Claims

1. A method for deciding on a model result quality based on a collective feedback of a plurality of user, comprising: providing a result of the model to the plurality of user; receiving feedback on the result of the model from the plurality of user; determining a collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of user; wherein, when the collective feedback indicates a positive consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be non-satisfactory, wherein, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively further parameter instigating the plurality of user to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.

2. The method according to claim 1, wherein the feedback on the result of the model from the plurality of user comprises a positive feedback indicating an agreement to the result of the model and a negative feedback indicating a disagreement to the result of the model.

3. The method according to claim 2, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively a further parameter comprises the following steps: providing at least one first model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the first model explanation based on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.

4. The method according to claim 3, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, providing progressively a further parameter comprises the following steps: providing at least one second model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the second model explanation based on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.

5. The method according to claim 4, further comprising: when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, repeating the following steps until the consensus on the result of the model is provided: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.

6. The method according to claims 3 to 5, further comprising: when the plurality of user are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, checking for a lack on the first model explanation, the second model explanation and/or the other model explanations, wherein, when a lack is identified, classifying the result of the model as to be non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model, wherein, when a lack is not identified, progressively repeating the following steps: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.

7. The method according to one of the claims 3 to 6, wherein the first model explanation, the second model explanation, and/or the other model explanation comprises at least one variable of the model.

8. The method according to claim 7, wherein the first model explanation, the second model explanation, and/or the other model explanation further comprises a relevance-indicating-element for each one of the at least one variable of the model and/or a request-element for requesting feedback, in particular feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation and/or feedback on the other model explanation, from another user of the plurality of user.

9. The method according to claims 3 to 7, further comprising: when the plurality of user update their feedback such that consensus on the result of the model is provided, storing the metadata for the first model explanation, second model explanation and/or other model explanation.

10. The method according to any one of the previous claims, wherein the feedback on the result of the model, the feedback on the first model explanation, the feedback on the second model explanation, and/or the feedback on the other model explanation comprises a tuple, in particular a 3-tuple, including the degree of agreement, the degree of disagreement, and the uncertainty in assigning an agreement or a disagreement.

11. The method according to any one of the previous claims, wherein the collective feedback on the result of the model is determined by averaging the feedback on the result of the model from the plurality of user.

12. The method according to any one of the previous claims, wherein the collective feedback on the first model explanation, the second model explanation, and/or the other model explanation is determined by averaging the feedback or by a mathematical operator.

13. The method according to any one of the previous claims, wherein the plurality of user is at least two users.

14. The method according to any one of the previous claims, wherein the model is a machine learning model.

15. A system (10) for deciding on a model result quality based on collective feedback of a plurality of user, the system comprising: a user interface (11) for providing a result of the model to the plurality of user, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of user, and for receiving a feedback on the result of the model, a feedback on the first model explanation from the plurality of user, a feedback on the second model explanation from the plurality of user, and/or a feedback on the other model explanation from the plurality of user; a processor (12) for executing the method according to claims 1 to 14.