WO2023208383A1 - Collective feedback-based decision on model prediction quality - Google Patents

Collective feedback-based decision on model prediction quality

Info

Publication number
WO2023208383A1
WO2023208383A1 PCT/EP2022/061605 EP2022061605W WO2023208383A1 WO 2023208383 A1 WO2023208383 A1 WO 2023208383A1 EP 2022061605 W EP2022061605 W EP 2022061605W WO 2023208383 A1 WO2023208383 A1 WO 2023208383A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
feedback
result
user
consensus
Prior art date
Application number
PCT/EP2022/061605
Other languages
French (fr)
Inventor
Divyasheel SHARMA
Gayathri GOPALAKRISHNAN
Benjamin KLOEPPER
Joakim ASTROM
Benedikt Schmidt
Yemao MAN
Dawid ZIOBRO
Arzam Muzaffar Kotriwala
Marcel Dix
Original Assignee
Abb Schweiz Ag
Priority date
Filing date
Publication date
Application filed by Abb Schweiz Ag filed Critical Abb Schweiz Ag
Priority to PCT/EP2022/061605 priority Critical patent/WO2023208383A1/en
Priority to PCT/EP2023/061313 priority patent/WO2023209185A1/en
Publication of WO2023208383A1 publication Critical patent/WO2023208383A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q99/00 Subject matter not provided for in other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • the present disclosure relates to a method for deciding on a model result quality based on a collective feedback of a plurality of users and to a system for deciding on a model result quality based on a collective feedback of a plurality of users.
  • the general background of this disclosure is support for reaching a decision on model result quality based on collective feedback.
  • In order to improve trust in a result of a model, the model users must agree with the result of the model.
  • model scientists should incorporate the opinions of the model users in order to train robust models that deliver improved model results.
  • a method for deciding on a model result quality based on a collective feedback of a plurality of user comprising: providing a result of the model to the plurality of user; receiving feedback on the result of the model from the plurality of user; determining a collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of user; wherein, when the collective feedback indicates a positive consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be non- satisfactory, wherein, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively further parameter instigating the plurality of user to update their given feedback on the result of the model
  • model as used herein is to be understood broadly and represents any system/algorithm that calculates, determines and/or modulates data.
  • the model may be a machine learning model.
  • the model may be a model in training phase or a model being deployed on production.
  • the term user as used herein is to be understood broadly and represents any person who uses, programs, and/or provides the model.
  • the user may be a domain expert, a plant operator, a field engineer but is not limited thereto.
  • feedback as used herein is to be understood broadly and represents any opinion, belief or view of the user with respect to the model, but is not limited thereto. Further, the term feedback may be the transmission of evaluative or corrective information about a model, action, event, or process to the original or controlling source. The feedback may be positive or negative, wherein a positive feedback indicates an agreement and is described by “1” and a negative feedback indicates a disagreement and is described by “0”.
  • result of the model as used herein is to be understood broadly and represents any results being provided by a model. Exemplary, the result of the model is a prediction, but is not limited thereto.
  • receiving is to be understood broadly and represents any action for providing, collecting, and/or recording the feedback from the user.
  • the term providing as used herein is to be understood broadly and represents any action for showing, depicting, presenting results from a model to the plurality of the user.
  • the providing can be provided by a user interface, a monitor, a display, a touchscreen but is not limited thereto.
  • model result quality as used herein is to be understood broadly and represents any information, data indicating the validity and the correctness of the model results with respect to a real occurrence or a predefined state.
  • collective feedback as used herein is to be understood broadly and represents any summary/aggregation of the feedback on the result of the model of the plurality of user.
  • the collective feedback may be provided as a single output, or as a plurality of outputs.
  • the term determining as used herein is to be understood broadly and represents any calculation/determination of the collective feedback.
  • the determination may include allocating, averaging, statistical analyzing but is not limited thereto.
  • the determining uses a mathematical operator being described as the following formula for determining the collective feedback:
  • consensus as used herein is to be understood broadly and represents the output of the collective feedback determination. Therefore, the consensus may be the judgment arrived by the plurality of users or the collective opinions of the plurality of users.
  • the consensus may be positive or negative. Positive consensus indicates that the plurality of user agree with the result of the model. Negative consensus indicates that the plurality of user disagree with the result of the model.
  • non-consensus indicates the opposite of the term consensus.
  • the term satisfactory as used herein is to be understood broadly and represents that the result of the model are good such that the quality of the results of the model are satisfactory.
  • the term non-satisfactory represents the opposite of the term satisfactory, i.e. the result of the model are bad such that the quality of the results of the model are non-satisfactory.
  • parameter as used herein is to be understood broadly and represents any further explanation instigating the plurality of user to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
  • the determination of the collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of users leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions. Further, progressively providing further parameters leads to an easy, less complex and time-saving way to find a consensus on the received feedback of the result of the model from the plurality of users, such that a model scientist can easily process all of these opinions to significantly improve the models.
  • the feedback on the result of the model from the plurality of user comprises a positive feedback indicating an agreement to the result of the model and a negative feedback indicating a disagreement to the result of the model.
  • providing progressively a further parameter comprises the following steps: providing at least one first model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the first model explanation based on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of user who gave a feedback leading to the non- consensus
  • the progressively providing of at least one first model explanation, i.e. a further parameter, the receiving of feedback on the first model explanation and the determination of the collective feedback on the first model explanation provides a further progressive step which leads to an easy, less complex and time saving way to receive and to collect a plurality opinions of the model user and process, in particular find a consensus, all of these opinions such that a model scientist can improve the models in a significant manner.
  • providing progressively a further parameter comprises the following steps: providing at least one second model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the second model explanation based on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from
  • the progressively providing of at least one second model explanation, the receiving of feedback on the second model explanation and the determination of the collective feedback on the second model explanation provides a further progressive step which leads to an easy, less complex and time saving way to receive and to collect a plurality opinions of the model user and process, i.e. find a consensus, all of these opinions such that a model scientist can improve the models in a significant manner.
  • the method further comprises: when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, repeating the following steps until the consensus on the result of the model is provided: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a
  • repeating as used herein is to be understood broadly and represents any replication of one or a plurality of method steps in the same or in an alternative order.
  • the repeating is not an infinite cycle, because there are only a finite set of explanations available for the model, such that the system, in particular the user interface, is only able to show a limited number of explanations to provide a possibility for achieving consensus.
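The bounded repetition described above can be sketched as a loop over a finite list of explanations. This is only an illustrative sketch under stated assumptions: the function name `seek_consensus`, the callback `collect_feedback` (assumed to return the collective feedback as an agreement/disagreement/uncertainty tuple) and the `threshold` value are hypothetical, and the threshold test stands in for the consensus criterion, which the text here does not spell out.

```python
def seek_consensus(explanations, collect_feedback, threshold=0.8):
    """Show explanations one by one until a consensus is reached.

    explanations: finite list of explanation identifiers.
    collect_feedback: hypothetical callback returning the collective
        feedback (agreement, disagreement, uncertainty) for an explanation.
    threshold: hypothetical cut-off used here as the consensus test.
    Returns (explanation, number_shown), or (None, number_shown) when the
    finite set is exhausted without consensus.
    """
    for shown, explanation in enumerate(explanations, start=1):
        agreement, disagreement, _uncertainty = collect_feedback(explanation)
        # Assumed rule: a consensus may be positive or negative.
        if agreement >= threshold or disagreement >= threshold:
            return explanation, shown
    # The loop always terminates because the explanation set is finite.
    return None, len(explanations)


# Scripted collective feedback for two explanation types (made-up values).
scripted = {
    "SHAP values plot": (0.5, 0.5, 0.0),
    "partial dependency plot": (0.9, 0.0, 0.1),
}
result = seek_consensus(list(scripted), scripted.get)
```

Because the loop iterates over a finite list, it cannot run forever. When `None` is returned, the explanations have been exhausted, which corresponds to the situation in which the explanations are checked for a lack.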
  • the progressively providing of at least one other model explanation, the receiving of feedback on the other model explanation and the determination of the collective feedback on the other model explanation provides a further progressive step which leads to an easy, less complex and time saving way to receive and to collect a plurality opinions of the model user and process all of these opinions such that a model scientist can improve the models in a significant manner.
  • the method further comprises: when the plurality of user are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, checking for a lack on the first model explanation, the second model explanation and/or the other model explanations, wherein, when a lack is identified, classifying the result of the model as to be non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model, wherein, when a lack is not identified, repeating the following steps: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received
  • domain authority used herein is to be understood broadly and represents any person or company which owns the model or is responsible for the model.
  • the domain authority may be the provider of the model or the programmer/software engineer of the model, but is not limited thereto.
  • the domain authority may be, in particular preferably are, subject matter experts who would understand the result, e.g. predictions, of the model.
  • lack used herein is to be understood broadly and represents any mistake, error, inconsistency, lack of knowledge being identified in the first model explanation, second model explanation and/or other model explanation leading to an impossibility that the plurality of user are able to reach a consensus on the first model explanation, second model explanation and/or other model explanation.
  • the checking for a lack on the first model explanation, the second model explanation and/or the other model explanation enables an easy, less complex and time saving way to classify results of a model as to be non-satisfactory.
  • the first model explanation, the second model explanation, and/or the other model explanation comprises at least one variable of the model.
  • variable of the model as used herein is to be understood broadly and represents any variable or parameter being able to control, influence, amend, manipulate the behavior, respectively the results, of the model.
  • the first model explanation, the second model explanation, and/or the other model explanation further comprises a relevance-indicating-element for each one of the at least one variable of the model and/or a request-element for requesting feedback, in particular feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation and/or feedback on the other model explanation, from another user of the plurality of users.
  • relevance-indicating-element as used herein is to be understood broadly and represents any element indicating the importance/relevance of each one of the at least one variable for the model. Further the relevance-indicating-element may indicate the impact of a variable of the model on the model.
  • request-element as used herein is to be understood broadly and represents any element requesting the system to see other user feedback. In other words the system shares the user feedback with other users.
  • the method further comprises: when the plurality of user updates his feedback such that consensus on the result of the model is provided, storing the metadata for the first model explanation, second model explanation and/or other model explanation.
  • metadata as used herein is to be understood broadly and represents any data being essential for the model.
  • the metadata may include data with respect to the variables of the method as described above.
  • storing as used herein is to be understood broadly and represents any saving, exporting or transmitting of data/metadata to a storage means such as a memory, cloud or database, but is not limited thereto.
  • the storing of the metadata for the first model explanation, second model explanation and/or other model explanation leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model users and to process all of these opinions, in particular to provide the users in progressive steps with the parameters that most effectively instigate them to update their given feedback, so that a consensus is found easily.
  • the model scientist can easily process all of these opinions to significantly improve the models.
  • the feedback on the result of the model, the feedback on the first model explanation, the feedback on the second model explanation, and/or the feedback on the other model explanation comprises a tuple, in particular a 3-tuple, including the degree of agreement, the degree of disagreement, and the uncertainty in assigning an agreement or a disagreement.
  • degree of agreement used herein is to be understood broadly and represents the degree of agreement/accordance between the opinion of the plurality of user and the results of the model.
  • the degree of agreement may be described as a term in the range from 0 to 1 or may be described in percent, but is not limited thereto.
  • degree of disagreement as used herein is to be understood broadly and represents the degree of disagreement/discordance between the opinion of the plurality of users and the results of the model.
  • the degree of disagreement may be described as a term in the range from 0 to 1 or may be described in percent, but is not limited thereto.
  • uncertainty used herein is to be understood broadly and represents the degree of uncertainty of the plurality of user in assigning their opinions to an agreement or a disagreement.
  • the uncertainty may be described as a term in the range from 0 to 1 or may be described in percent, but is not limited thereto.
  • the collective feedback on the result of the model is determined by averaging the feedback on the result of the model from the plurality of users.
  • the collective feedback on the first model explanation, the second model explanation, and/or the other model explanation is determined by averaging the feedback or by a mathematical operator.
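The averaging option for tuple feedback can be sketched as follows. This is a minimal illustration of the averaging alternative only, not of the mathematical operator. The class name `Feedback`, the function name `average_feedback` and the range check are hypothetical; the sketch assumes each component is expressed in the range from 0 to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feedback:
    """One user's 3-tuple feedback; each component assumed to lie in [0, 1]."""
    agreement: float
    disagreement: float
    uncertainty: float

    def __post_init__(self):
        for value in (self.agreement, self.disagreement, self.uncertainty):
            if not 0.0 <= value <= 1.0:
                raise ValueError("each component must lie in [0, 1]")


def average_feedback(items):
    """Component-wise average of the feedback tuples of several users."""
    if not items:
        raise ValueError("no feedback received")
    n = len(items)
    return Feedback(
        sum(f.agreement for f in items) / n,
        sum(f.disagreement for f in items) / n,
        sum(f.uncertainty for f in items) / n,
    )


# Two users: one fully agrees, the other is fully uncertain.
collective = average_feedback([Feedback(1.0, 0.0, 0.0), Feedback(0.0, 0.0, 1.0)])
```

Here `collective` is the tuple with agreement 0.5, disagreement 0.0 and uncertainty 0.5.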
  • the determining uses a mathematical operator being described as the following formula for determining the collective feedback:
  • the plurality of user is at least two user.
  • the model is a machine learning model.
  • a system for deciding on a model result quality based on collective feedback of a plurality of users, comprising: a user interface for providing a result of the model, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of users, and for receiving a feedback on the result of the model, a feedback on the first model explanation, a feedback on the second model explanation, and/or a feedback on the other model explanation from the plurality of users; and a processor for executing the above described method.
  • “determining” also includes “initiating or causing to determine”.
  • “generating” also includes “initiating or causing to generate”.
  • “providing” also includes “initiating or causing to determine, generate, select, send or receive”.
  • “Initiating or causing to perform an action” includes any processing signal that triggers a computing device to perform the respective action.
  • Fig. 1 illustrates a flow diagram of a method for deciding on a model result quality based on a collective feedback of a plurality of user
  • Fig. 2 illustrates an example embodiment of a system for deciding on a model result quality based on collective feedback of a plurality of user
  • Fig. 1 illustrates a flow diagram of a method for deciding on a model result quality based on a collective feedback of a plurality of user.
  • a result of the model in particular a prediction of the model
  • the result of the model is provided to the plurality of users via a user interface.
  • feedback on the result of the model i.e. prediction
  • the feedback is received from the plurality of user via the same or another user interface.
  • the feedback of the plurality of users is described as “0” for a disagreement of the user with the result of the model and as “1” for an agreement of the user with the result of the model.
  • a collective feedback of the result of the model i.e. the prediction, is determined based on the received feedback on the result of the model from the plurality of user.
  • the determination of the collective feedback is provided by averaging the received feedback, i.e. the feedback values “0” and “1”.
  • a model classified as non-satisfactory can be sent to the model authority for a countercheck or can be discarded due to poor results/model quality.
  • a non-consensus, i.e. f(average prediction) > 0 and < 1
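The binary decision rule described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `classify_binary_feedback` and the returned labels are hypothetical, and the sketch assumes that a positive consensus corresponds to an average of exactly 1 and a negative consensus to an average of exactly 0, as implied by the stated non-consensus range (average above 0 and below 1).

```python
from statistics import mean


def classify_binary_feedback(votes):
    """Classify a model result from binary votes (1 = agree, 0 = disagree)."""
    if len(votes) < 2:
        # The plurality of users is at least two users.
        raise ValueError("at least two users are required")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("votes must be 0 or 1")
    average = mean(votes)
    if average == 1:
        return "satisfactory"        # positive consensus
    if average == 0:
        return "non-satisfactory"    # negative consensus
    return "non-consensus"           # 0 < average < 1


outcome = classify_binary_feedback([1, 0, 1])
```

With the votes shown, the average lies strictly between 0 and 1, so `outcome` is the non-consensus label and the progressive explanation steps would follow.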
  • further parameters are progressively provided for instigating the plurality of users to update their given feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
  • the progressive providing of further parameter comprises the following sub steps.
  • At least one first model explanation i.e. further parameter, is provided to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the first model explanation is provided to the plurality of users via a user interface.
  • feedback on the first model explanation is received from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the feedback is received from the plurality of user via the same or another user interface.
  • the feedback on the first model explanation is a tuple, in particular a 3-tuple/triple, including the degree of agreement, the degree of disagreement, and the uncertainty.
  • a collective feedback on the first model explanation is determined based on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the determination of the collective feedback on the first model explanation is provided by mathematical operators. The operator may be:
  • the determination of the collective feedback may be calculated once the feedback is received from everyone or one or more feedbacks are received within an allocated time-frame.
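The two triggering conditions for the calculation can be sketched as a small predicate. This is one possible reading, stated as an assumption: aggregation starts either when every expected user has answered, or when the allocated time-frame has elapsed and at least one feedback has arrived. The function name `ready_to_aggregate` and its parameters are hypothetical.

```python
def ready_to_aggregate(received, expected, elapsed_s, window_s):
    """Decide whether the collective feedback may be calculated now.

    received: number of feedback items collected so far.
    expected: number of users who were asked for feedback.
    elapsed_s: seconds since the explanation was shown.
    window_s: allocated time-frame in seconds (assumed configuration value).
    """
    if received >= expected:
        return True  # feedback has been received from everyone
    # Otherwise wait for the time-frame to end with at least one answer.
    return elapsed_s >= window_s and received >= 1


can_start = ready_to_aggregate(received=2, expected=5, elapsed_s=600, window_s=300)
```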
  • When the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of users in the result of the model, such that the plurality of users update their given feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
  • the providing comprises the following sub steps.
  • In a first sub step, at least one second model explanation is provided to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the providing of the at least one second model explanation is provided to the plurality of user via a user interface.
  • feedback on the second model explanation is received from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the feedback is received from the plurality of user via the same or another user interface.
  • a collective feedback on the second model explanation is determined based on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the determination of the collective feedback is provided by the above described mathematical operator.
  • When the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of users in the result of the model, such that the plurality of users update their given feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
  • the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model
  • repeating the following sub steps until the consensus on the result of the model is provided.
  • at least one other model explanation is provided to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the providing of the at least one other model explanation is provided to the plurality of user via a user interface.
  • In a second sub step, feedback on the other model explanation is received from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • the feedback is received from the plurality of user via the same or another user interface.
  • a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model. The determination of the collective feedback is provided by the above described mathematical operator.
  • When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model, such that the plurality of users update their given feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users. This cycle may repeat until agreement is achieved.
  • the user may ask for the importance of a variable other than the one for which importance is shown, in which case the method may show the impact of that variable in a visualization (e.g., a SHAP values plot); the user may perform a scenario evaluation, in which the user chooses multiple input variables and the system/method generates visualizations (e.g., partial dependency plots and SHAP summary plots) for them; the users may request the system/method to show other users' feedback (in this case, the system/method shares each user's feedback with the others), such that the users may change their feedback on the prediction or explanation after reviewing the others' feedback; or the users may contest each other's feedback and together come to a consensus on the feedback.
  • the method comprises the step of checking for a lack on the first model explanation, the second model explanation and/or the other model explanations.
  • Users might be unable to reach a consensus on explanations (i.e., the method is unable to gain an agreement on explanations) for the following reasons: the users might be undecided, there might be a lack of knowledge, the explanation did not increase the collective confidence of the users, or the explanation did not help at all.
  • the reason that the users might be undecided is indicated when the collective feedback tuple is <0.5, 0.0, 0.5>, <0.0, 0.5, 0.5>, <0.5, 0.5, 0.0> or <0.0, 0.0, 1.0>.
  • the reason that there is a lack of knowledge is indicated when the collective feedback tuple is <0.5, 0.0, 0.5> or <0.0, 0.5, 0.5>.
  • the reason that explanations did not increase the collective confidence of the users in the result of the model is indicated when the collective feedback tuple is <0.5, 0.5, 0.0>.
  • the reason that explanations did not help at all is indicated when the collective feedback tuple is <0.0, 0.0, 1.0>.
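The mapping from example tuples to reasons can be sketched as a lookup. This is an illustrative sketch only: the function name `diagnose_non_consensus`, the returned strings and the tolerance-based comparison are hypothetical, and the text only lists these four example tuples, so any other tuple is left undiagnosed here. The tuple order is assumed to be agreement, disagreement, uncertainty.

```python
import math


def diagnose_non_consensus(collective, tol=1e-9):
    """Map a collective (agreement, disagreement, uncertainty) tuple to a reason.

    Returns None for tuples other than the four listed examples.
    """
    a, d, u = collective

    def close(x, y):
        return math.isclose(x, y, abs_tol=tol)

    if close(a, 0.0) and close(d, 0.0) and close(u, 1.0):
        return "explanations did not help at all"
    if close(a, 0.5) and close(d, 0.5) and close(u, 0.0):
        return "explanations did not increase the collective confidence"
    half_uncertain = close(u, 0.5)
    if half_uncertain and close(a, 0.5) and close(d, 0.0):
        return "lack of knowledge"
    if half_uncertain and close(a, 0.0) and close(d, 0.5):
        return "lack of knowledge"
    return None


reason = diagnose_non_consensus((0.5, 0.0, 0.5))
```

All four listed tuples also count as the users being undecided, so a non-`None` result can be read as "undecided, for the stated reason".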
  • the result of the model is classified as to be non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model.
  • the following progressively sub steps are repeated.
  • In a first sub step, at least one other model explanation is provided to the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • In a second sub step, feedback on the other model explanation is received from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model.
  • a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model. The determination of the collective feedback is carried out by the above described mathematical operator.
  • When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model, such that the plurality of users update their given feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
  • If the feedback from the users turns out to be “harmful” for the model, i.e. the result of the model degrades, this harmful feedback may be discarded and transparently shared with the users who provided it for root-cause analysis and agreement.
  • the metadata for the first model explanation, second model explanation and/or other model explanation are stored in a memory.
  • Fig. 2 illustrates an example embodiment of a system for deciding on a model result quality based on collective feedback of a plurality of user.
  • the system 10 comprises a user interface 11 being configured for providing a result of the model to the plurality of user, an at least one first model explanation, an at least one second model explanation, and/or an at least one other model explanation to the plurality of user. Further, the user interface 11 is configured for receiving a feedback on the result of the model, a feedback on the first model explanation from the plurality of user, a feedback on the second model explanation from the plurality of user, and/or a feedback on the other model explanation from the plurality of user.
  • the user interface 11 is a touchscreen being able to present information to a user and to receive information from a user.
  • the system 10 further comprises a processor 12 for executing the method described in Figure 1.
  • the processor 12 and the user interface 11 are connected to each other wirelessly or by wire in order to provide a data exchange between the user interface 11 and the processor 12.
  • the system 10 comprises a memory, not depicted, for storing the metadata for the first model explanation, second model explanation and/or the other model explanation, when the plurality of user update their feedback such that a consensus on the result of the model is provided.

Abstract

A method for deciding on a model result quality based on a collective feedback of a plurality of user, comprising: providing a result of the model to the plurality of user; receiving feedback on the result of the model from the plurality of user; determining a collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of user; wherein, when the collective feedback indicates a positive consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be non- satisfactory, wherein, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively further parameter instigating the plurality of user to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.

Description

COLLECTIVE FEEDBACK-BASED DECISION ON MODEL PREDICTION QUALITY
TECHNICAL FIELD
The present disclosure relates to a method for deciding on a model result quality based on a collective feedback of a plurality of user and to a system for deciding on a model result quality based on a collective feedback of a plurality of user.
TECHNICAL BACKGROUND
The general background of this disclosure is aiding a decision on model result quality based on collective feedback. In order to improve the trust in a result of a model, model users must agree with the result of the model. Hence, model scientists should incorporate the opinions of the model users to train robust models that deliver improved model results.
Typically, none or only a small number of the widely differing opinions of the model users can be used by the model scientists for improving the model, because only a minor share of these opinions is precise and detailed enough. Further, the opinions of the model users mostly differ completely, such that it is difficult for the model scientist to recognize how to improve the model and which opinion is more decisive, in particular to find a consensus of all opinions. Therefore, model scientists typically face the problem that the few received opinions are mostly not usable, understandable or comparable, such that an optimization of a model can only be done based on a few received opinions chosen by the model scientist in a complex manner. This significantly weakens and prevents a comprehensive and sustainable improvement of the models.
Hence, there is a need to enable an easy, less complex and time saving way to receive and to collect a plurality of opinions of the model users and to process all of these opinions such that a model scientist can improve the models in a significant manner.
SUMMARY OF THE INVENTION
In one aspect of the invention a method for deciding on a model result quality based on a collective feedback of a plurality of user is presented, comprising: providing a result of the model to the plurality of user; receiving feedback on the result of the model from the plurality of user; determining a collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of user; wherein, when the collective feedback indicates a positive consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be non-satisfactory, wherein, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively further parameter instigating the plurality of user to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
The term model as used herein is to be understood broadly and represents any system/algorithm that calculates, determines and/or modulates data. The model may be a machine learning model. The model may be a model in the training phase or a model deployed in production.
The term user as used herein is to be understood broadly and represents any person who uses, programs, and/or provides the model. In particular, the user may be a domain expert, a plant operator, a field engineer but is not limited thereto.
The term feedback as used herein is to be understood broadly and represents any opinion, belief or view of the user with respect to the model, but is not limited thereto. Further, the term feedback may be the transmission of evaluative or corrective information about a model, action, event, or process to the original or controlling source. The feedback may be positive or negative, wherein a positive feedback indicates an agreement and is described by “1” and a negative feedback indicates a disagreement and is described by “0”.
The term result of the model as used herein is to be understood broadly and represents any result provided by a model. By way of example, the result of the model is a prediction, but is not limited thereto.
The term receiving as used herein is to be understood broadly and represents any action for providing, collecting, and/or recording the feedback from the user.
The term providing as used herein is to be understood broadly and represents any action for showing, depicting, presenting results from a model to the plurality of the user. The providing can be provided by a user interface, a monitor, a display, a touchscreen but is not limited thereto.
The term model result quality as used herein is to be understood broadly and represents any information, data indicating the validity and the correctness of the model results with respect to a real occurrence or a predefined state.
The term collective feedback as used herein is to be understood broadly and represents any summary/aggregation of the feedback on the result of the model of the plurality of user. The collective feedback may be provided as a single output, or as a plurality of outputs.
The term determining as used herein is to be understood broadly and represents any calculation/determination of the collective feedback. The determination may include allocating, averaging, statistical analysis but is not limited thereto. Alternatively, the determination may be a calculation with a tuple, in particular a tuple <a, d, u> (a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement; satisfying a + d + u = 1 and a, d, u ∈ [0, 1]). When using a tuple, the determining uses a mathematical operator described by the following formula for determining the collective feedback:
Figure imgf000005_0001
Figure imgf000006_0001
The term consensus as used herein is to be understood broadly and represents the output of the collective feedback determination. Therefore, the consensus may be the judgment arrived at by the plurality of users or the collective opinions of the plurality of users. The consensus may be positive or negative. Positive consensus indicates that the plurality of user agree with the result of the model. Negative consensus indicates that the plurality of user disagree with the result of the model. The term non-consensus indicates the opposite of the term consensus.
The term satisfactory as used herein is to be understood broadly and represents that the result of the model is good such that the quality of the result of the model is satisfactory. The term non-satisfactory represents the opposite of the term satisfactory, i.e. the result of the model is bad such that the quality of the result of the model is non-satisfactory.
The term parameter as used herein is to be understood broadly and represents any further explanation instigating the plurality of user to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
The determination of the collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of user leads to an easy, less complex and time saving way to receive and to collect a plurality of opinions of the model user and to process all of these opinions. Further, the providing of progressively further parameters leads to an easy, less complex and time saving way to find/provide a consensus on the received feedback of the result of the model from the plurality of user such that a model scientist can easily process all of these opinions for significantly improving the models.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the feedback on the result of the model from the plurality of user comprises a positive feedback indicating an agreement to the result of the model and a negative feedback indicating a disagreement to the result of the model.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively a further parameter comprises the following steps: providing at least one first model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the first model explanation based on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of user who gave a feedback leading to the non- consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
The progressive providing of at least one first model explanation, i.e. a further parameter, the receiving of feedback on the first model explanation and the determination of the collective feedback on the first model explanation provide a further progressive step which leads to an easy, less complex and time saving way to receive and to collect a plurality of opinions of the model user and to process all of these opinions, in particular to find a consensus, such that a model scientist can improve the models in a significant manner.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, providing progressively a further parameter comprises the following steps: providing at least one second model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the second model explanation based on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
The progressive providing of at least one second model explanation, the receiving of feedback on the second model explanation and the determination of the collective feedback on the second model explanation provide a further progressive step which leads to an easy, less complex and time saving way to receive and to collect a plurality of opinions of the model user and to process all of these opinions, i.e. to find a consensus, such that a model scientist can improve the models in a significant manner.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the method further comprises: when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, repeating the following steps until the consensus on the result of the model is provided: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
The term repeating used herein is to be understood broadly and represents any replication of one or a plurality of method steps in the same or in an alternative order. The repeating is not an infinite cycle, because there is only a finite set of explanations available for the model, such that the system, in particular the user interface, is only able to show a limited number of explanations to provide a possibility for achieving consensus.
The progressive providing of at least one other model explanation, the receiving of feedback on the other model explanation and the determination of the collective feedback on the other model explanation provide a further progressive step which leads to an easy, less complex and time saving way to receive and to collect a plurality of opinions of the model user and to process all of these opinions such that a model scientist can improve the models in a significant manner.
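The repetition over a finite set of explanations can be sketched as a simple loop. This is a minimal illustration only: the function name `seek_consensus`, the callback parameters, the explanation names and the 0.8 agreement threshold are all assumptions introduced here, not part of the disclosure. Averaging is used to aggregate the tuples because the description names averaging as one admissible way of determining the collective feedback.

```python
def seek_consensus(explanations, collect_feedback, combine, is_consensus):
    """Show explanations one at a time until the collective feedback is a consensus.

    All four parameters are hypothetical stand-ins: `explanations` is the finite
    set of available explanations, `collect_feedback` returns the <a, d, u> tuples
    given by the users for one explanation, `combine` aggregates them, and
    `is_consensus` decides whether the aggregate counts as a consensus.
    Returns (explanation, collective feedback), or None when the set is exhausted.
    """
    for explanation in explanations:
        feedback = collect_feedback(explanation)
        collective = combine(feedback)
        if is_consensus(collective):
            return explanation, collective
    return None  # finite set exhausted, so the loop cannot run forever


# Hypothetical canned feedback of two users per explanation.
canned_feedback = {
    "feature importance": [(0.4, 0.4, 0.2), (0.3, 0.5, 0.2)],
    "counterfactual": [(0.9, 0.0, 0.1), (0.8, 0.1, 0.1)],
}


def average_tuples(feedback):
    """Component-wise average of <a, d, u> tuples."""
    n = len(feedback)
    return tuple(sum(t[i] for t in feedback) / n for i in range(3))


def mostly_agree(collective):
    """Assumed consensus criterion: collective agreement of at least 0.8."""
    return collective[0] >= 0.8


result = seek_consensus(
    list(canned_feedback), canned_feedback.get, average_tuples, mostly_agree
)
```

With this canned data the first explanation does not reach the assumed threshold and the loop stops at the second one. When no explanation reaches it, the function returns `None`, which corresponds to the case in which the explanations are checked for a lack.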
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the method further comprises: when the plurality of user are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, checking for a lack on the first model explanation, the second model explanation and/or the other model explanations, wherein, when a lack is identified, classifying the result of the model as to be non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model, wherein, when a lack is not identified, repeating the following steps: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from 
the plurality of user.
The term domain authority used herein is to be understood broadly and represents any person or company which owns the model or is responsible for the model. Exemplary, the domain authority may be the provider of the model or the programmer/software engineer of the model, but is not limited thereto. Further, the domain authority may be, in particular preferably are, subject matter experts who would understand the result, e.g. predictions, of the model.
The term lack used herein is to be understood broadly and represents any mistake, error, inconsistency, lack of knowledge being identified in the first model explanation, second model explanation and/or other model explanation leading to an impossibility that the plurality of user are able to reach a consensus on the first model explanation, second model explanation and/or other model explanation.
The checking for a lack in the first model explanation, the second model explanation and/or the other model explanation enables an easy, less complex and time saving way to classify results of a model as to be non-satisfactory.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the first model explanation, the second model explanation, and/or the other model explanation comprises at least one variable of the model.
The term variable of the model as used herein is to be understood broadly and represents any variable or parameter being able to control, influence, amend, manipulate the behavior, respectively the results, of the model.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the first model explanation, the second model explanation, and/or the other model explanation further comprises a relevance-indicating-element for each one of the at least one variable of the model and/or a request-element for requesting feedback, in particular feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation and/or feedback on the other model explanation, from another user of the plurality of user.
The term relevance-indicating-element as used herein is to be understood broadly and represents any element indicating the importance/relevance of each one of the at least one variable for the model. Further the relevance-indicating-element may indicate the impact of a variable of the model on the model.
The term request-element as used herein is to be understood broadly and represents any element requesting the system to see other user feedback. In other words the system shares the user feedback with other users.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the method further comprises: when the plurality of user update their feedback such that consensus on the result of the model is provided, storing the metadata for the first model explanation, second model explanation and/or other model explanation.
The term metadata as used herein is to be understood broadly and represents any data being essential for the model. By way of example, the metadata may include data with respect to the variables of the model as described above.
The term storing as to be used herein is to be understood broadly and represents any saving, exporting or transmitting data/metadata to a storage means like a memory, cloud, database but is not limited thereto.
The storing of the metadata for the first model explanation, second model explanation and/or other model explanation leads to an easy, less complex and time saving way to receive and to collect a plurality of opinions of the model user and to process all of these opinions, in particular to provide the user in progressive steps with the parameter that most effectively instigates the user to update the given feedback, so that a consensus is found in an easy manner. Hence, the model scientist can easily process all of these opinions for significantly improving the models.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the feedback on the result of the model, the feedback on the first model explanation, the feedback on the second model explanation, and/or the feedback on the other model explanation comprises a tuple, in particular a 3-tuple, including the degree of agreement, the degree of disagreement, and the uncertainty in assigning an agreement or a disagreement.
The term degree of agreement used herein is to be understood broadly and represents the degree of agreement/accordance between the opinion of the plurality of user and the results of the model. The degree of agreement may be described as a term in the range from 0 to 1 or may be described in percent, but is not limited thereto.
The term degree of disagreement used herein is to be understood broadly and represents the degree of disagreement/discordance between the opinion of the plurality of user and the results of the model. The degree of disagreement may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto.
The term uncertainty used herein is to be understood broadly and represents the degree of uncertainty of the plurality of user in assigning their opinions to an agreement or a disagreement. The uncertainty may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto.
The detailed indication of the degree of agreement, the degree of disagreement and the uncertainty leads to an easy, less complex and fast way to instigate the plurality of user to update their given feedback for finding a consensus such that a model scientist can significantly improve the model.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the collective feedback on the result of the model is determined by averaging the feedback on the result of the model from the plurality of user.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the collective feedback on the first model explanation, the second model explanation, and/or the other model explanation is determined by averaging the feedback or by a mathematical operator.
The term mathematical operator as used herein is to be understood broadly and represents a determination of the collective feedback by a tuple, in particular a tuple <a, d, u> (a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement; satisfying a + d + u = 1 and a, d, u ∈ [0, 1]). When using a tuple, the determining uses a mathematical operator described by the following formula for determining the collective feedback:
Figure imgf000013_0001
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the plurality of user is at least two user.
In an embodiment of the method for deciding on a model result quality based on a collective feedback of a plurality of user, the model is a machine learning model.
In a further aspect a system for deciding on a model result quality based on collective feedback of a plurality of user is presented, the system comprising: a user interface for providing a result of the model to the plurality of user, an at least one first model explanation, an at least one second model explanation, and/or an at least one other model explanation to the plurality of user, and for receiving a feedback on the result of the model, a feedback on the first model explanation from the plurality of user, a feedback on the second model explanation from the plurality of user, and/or a feedback on the other model explanation from the plurality of user; a processor for executing the above described method.
Any disclosure and embodiments described herein relate to the method and the system outlined above, and vice versa. Advantageously, the benefits provided by any of the embodiments and examples equally apply to all other embodiments and examples and vice versa.
As used herein, “determining” also includes “initiating or causing to determine”, “generating” also includes “initiating or causing to generate” and “providing” also includes “initiating or causing to determine, generate, select, send or receive”. “Initiating or causing to perform an action” includes any processing signal that triggers a computing device to perform the respective action.
BRIEF DESCRIPTION OF THE DRAWINGS
In the following, the present disclosure is further described with reference to the enclosed figures:
Fig. 1 illustrates a flow diagram of a method for deciding on a model result quality based on a collective feedback of a plurality of user;
Fig. 2 illustrates an example embodiment of a system for deciding on a model result quality based on collective feedback of a plurality of user;
DETAILED DESCRIPTION OF EMBODIMENT
The following embodiments are mere examples for the method and the system disclosed herein and shall not be considered limiting.
Fig. 1 illustrates a flow diagram of a method for deciding on a model result quality based on a collective feedback of a plurality of user. In a first step, a result of the model, in particular a prediction of the model, is provided to the plurality of user. The result of the model is provided to the plurality of user via a user interface. In a second step, feedback on the result of the model, i.e. the prediction, is received from the plurality of user. The feedback is received from the plurality of user via the same or another user interface. The feedback of the plurality of user is described as “0” for a disagreement of the user with the result of the model and as “1” for an agreement of the user with the result of the model.
Figure imgf000015_0002
In a third step, a collective feedback of the result of the model, i.e. the prediction, is determined based on the received feedback on the result of the model from the plurality of user. The collective feedback is determined by averaging the received feedback, i.e. the feedback values “0” and “1”.
Figure imgf000015_0001
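The averaging of the third step and the three possible outcomes can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `collective_prediction_feedback` and `classify` and the returned labels are hypothetical; the thresholds follow the description, where an average of 1 is a positive consensus, an average of 0 is a negative consensus, and any value strictly between 0 and 1 is a non-consensus.

```python
def collective_prediction_feedback(votes):
    """Average binary feedback values (1 = agreement, 0 = disagreement)."""
    if len(votes) < 2:
        raise ValueError("a plurality of user means at least two feedback values")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("each feedback value must be 0 or 1")
    return sum(votes) / len(votes)


def classify(votes):
    """Map the averaged feedback onto the three outcomes of the method."""
    average = collective_prediction_feedback(votes)
    if average == 1:
        return "satisfactory"       # positive consensus: all users agree
    if average == 0:
        return "non-satisfactory"   # negative consensus: all users disagree
    return "non-consensus"          # mixed feedback: provide further parameters
```

For instance, `classify([1, 0, 1])` yields the non-consensus outcome, which is the case that triggers the progressive providing of model explanations.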
When the collective feedback indicates a positive consensus, i.e. f(average prediction) = 1, on the received feedback of the result of the model from the plurality of user, the model is classified as to be satisfactory, i.e. the quality of the result of the model is good because all users agree with the result of the model. In this case, no further parameters have to be provided.
When the collective feedback indicates a negative consensus, i.e. f(average prediction) = 0, on the received feedback of the result of the model from the plurality of user, the model is classified as to be non-satisfactory, i.e. the quality of the result of the model is bad because all users disagree with the result of the model. In this case, the model classified as non-satisfactory can be sent for a countercheck to the model authority or can be discarded due to poor results/model quality.
When the collective feedback on the result of the model from the plurality of user indicates a non-consensus, i.e. 0 < f(average prediction) < 1, on the received feedback of the result of the model from the plurality of user, progressively further parameters are provided for instigating the plurality of user to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
The progressive providing of further parameters comprises the following sub steps. In a first sub step, at least one first model explanation, i.e. a further parameter, is provided to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model. The first model explanation is provided to the plurality of user via a user interface.
In a second sub step, feedback on the first model explanation is received from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model. The feedback is received from the plurality of user via the same or another user interface. The feedback on the first model explanation is a tuple, in particular a 3-tuple/triple, including the degree of agreement, the degree of disagreement, and the uncertainty.
Feedback on explanation (η) = a tuple < a, d, u > where: a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement satisfying a + d + u = 1 and a, d, u ∈ [0, 1]
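A feedback triple of this kind can be represented as a small validated data structure. The following is a minimal sketch, assuming a Python representation; the class name `ExplanationFeedback` and the tolerance used for the sum check are illustrative assumptions, not part of the disclosure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExplanationFeedback:
    """Hypothetical container for the triple <a, d, u>."""
    a: float  # degree of agreement with the explanation
    d: float  # degree of disagreement with the explanation
    u: float  # uncertainty in assigning an agreement or a disagreement

    def __post_init__(self):
        # Each component must lie in [0, 1].
        for name in ("a", "d", "u"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        # The components must sum to 1 (checked with a small tolerance
        # because of floating-point rounding).
        if not math.isclose(self.a + self.d + self.u, 1.0, abs_tol=1e-9):
            raise ValueError("a + d + u must equal 1")


feedback = ExplanationFeedback(0.7, 0.2, 0.1)
```

Rejecting malformed triples at construction time keeps later aggregation steps free of range checks.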
Alternatively, instead of using a 3-tuple/triple for providing feedback, the feedback can also be provided by using probability theory and uncertainty. In a third sub step, a collective feedback on the first model explanation is determined based on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model. The determination of the collective feedback on the first model explanation is provided by mathematical operators. The operator may be:
Figure imgf000017_0001
By way of example, if a first user agrees with the result of the model (prediction) and largely with the first model explanation, but is not sure about the explanation because of a lack of visibility into the complete workings of the model, the first user might provide the feedback F_first_user = {1, <0.7, 0.2, 0.1>}. Further, by way of example, if a second user (the engineer) also agrees with the prediction but not quite with the first model explanation, the second user provides the feedback F_second_user = {1, <0.5, 0.4, 0.1>}. When the collective feedback is determined by applying the above described mathematical operator to the feedback of the first user and the feedback of the second user, the collective feedback of the first user and the second user on the prediction is 1, and the collective feedback on the explanation is η_first_user ∧ η_second_user = <0.35, 0.52, 0.13>. The collective feedback is therefore interpreted as a disagreement with the explanation, because the degree of disagreement has the largest value, i.e. 0.52. The collective feedback may be calculated once feedback has been received from everyone, or once one or more feedbacks have been received within an allocated time frame. When the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of users in the result of the model, such that these users update the feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
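The operator itself is only given as a figure that is not reproduced in the text. The sketch below therefore uses a rule that is merely inferred from the single worked example (product of the agreements, probabilistic sum of the disagreements, uncertainty as the remainder); it reproduces <0.35, 0.52, 0.13> but may differ from the operator in the figure. The function names are assumptions.

```python
from functools import reduce


def combine(first, second):
    """Combine two triples <a, d, u>; the rule is inferred from the worked example."""
    a1, d1, _ = first
    a2, d2, _ = second
    a = a1 * a2                    # both users must agree
    d = d1 + d2 - d1 * d2          # at least one user disagrees
    u = max(0.0, 1.0 - a - d)      # remainder, clamped against rounding error
    return (a, d, u)


def combine_all(triples):
    """Fold the pairwise rule over any number of users."""
    return reduce(combine, triples)


def interpret(collective):
    """Return the label of the largest component, as in the worked example."""
    labels = ("agreement", "disagreement", "uncertainty")
    return labels[max(range(3), key=lambda i: collective[i])]


collective = combine_all([(0.7, 0.2, 0.1), (0.5, 0.4, 0.1)])
print([round(x, 2) for x in collective], interpret(collective))
```

For the two users of the example this yields approximately <0.35, 0.52, 0.13>, which is interpreted as a disagreement.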
Optionally, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, a further parameter is progressively provided, wherein the providing comprises the following sub steps. In a first sub step, at least one second model explanation is provided to the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The at least one second model explanation is provided to the plurality of users via a user interface. In a second sub step, feedback on the second model explanation is received from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The feedback is received from the plurality of users via the same or another user interface. In a third sub step, a collective feedback on the second model explanation is determined based on the received feedback on the second model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The collective feedback is determined by the above described mathematical operator.
When the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of users in the result of the model, such that these users update the feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
Optionally, when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the following sub steps are repeated until the consensus on the result of the model is provided. In a first sub step, at least one other model explanation is provided to the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The at least one other model explanation is provided to the plurality of users via a user interface. In a second sub step, feedback on the other model explanation is received from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The feedback is received from the plurality of users via the same or another user interface. In a third sub step, a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The collective feedback is determined by the above described mathematical operator.
When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model, such that these users update the feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users. This cycle may repeat until agreement is achieved. However, this is not an infinite cycle, because only a finite set of explanations is available for a model, or only a limited number of explanations is shown to provide a possibility for achieving consensus. Alternatively or additionally, apart from a user’s own professional experience, other further parameters can be used as explanations. Exemplary further parameters serving as explanations may be other modes which users may use to try to become convinced of the model prediction. Such further parameters are: the user may ask for the importance of a variable other than the one for which the importance is shown, wherein the method may show the impact of that variable in a visualization (e.g., a SHAP values plot); a scenario evaluation may be performed, wherein the user chooses multiple input variables and the system/method generates visualizations (e.g., partial dependency plots and SHAP summary plots) for them; the users may request the system/method to show other users’ feedback (in this case, the system/method shares each user’s feedback with the others), such that the users may change their feedback on the prediction or the explanation after reviewing the others’ feedback; or the users may contest each other’s feedback and together come to a consensus on the feedback.
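The bounded cycle described above (show an explanation, collect feedback, aggregate, stop on consensus or when the finite set of explanations is exhausted) can be sketched as a simple loop. Everything named here is a hypothetical placeholder: the callbacks `collect_feedback` and `is_consensus`, the averaging aggregator, the canned feedback values and the 0.5 agreement threshold are assumptions chosen only to make the sketch runnable.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Triple = Tuple[float, float, float]  # <a, d, u>


def average_triples(feedback: List[Triple]) -> Triple:
    """Component-wise average; one possible aggregator (assumed here)."""
    n = len(feedback)
    return tuple(sum(t[i] for t in feedback) / n for i in range(3))


def seek_consensus(
    explanations: Iterable[str],
    collect_feedback: Callable[[str], List[Triple]],
    is_consensus: Callable[[Triple], bool],
    combine: Callable[[List[Triple]], Triple] = average_triples,
) -> Optional[Tuple[str, Triple, int]]:
    """Show explanations one by one until consensus or until none are left."""
    for shown, explanation in enumerate(explanations, start=1):
        collective = combine(collect_feedback(explanation))
        if is_consensus(collective):
            return explanation, collective, shown
    return None  # finite set exhausted without consensus


# Canned feedback standing in for real user input (illustrative values only).
canned = {
    "feature importance": [(0.3, 0.5, 0.2), (0.4, 0.4, 0.2)],
    "partial dependency plot": [(0.8, 0.1, 0.1), (0.7, 0.1, 0.2)],
}
result = seek_consensus(list(canned), canned.__getitem__, lambda t: t[0] > 0.5)
print(result)
```

Because the loop iterates over a finite collection, it terminates even if no explanation ever produces a consensus, which mirrors the statement that the cycle is not infinite.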
Optionally, when the plurality of users are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, the method comprises the step of checking for a lack in the first model explanation, the second model explanation and/or the other model explanations. Users might be unable to reach a consensus on explanations (i.e., the method is unable to gain an agreement on explanations) for the following reasons: a) the users might be undecided, b) there might be a lack of knowledge, c) the explanation did not increase the collective confidence of the users, or d) the explanation did not help at all. The reason that the users might be undecided is present when the collective feedback tuple is <0.5, 0.0, 0.5>, <0.0, 0.5, 0.5>, <0.5, 0.5, 0.0> or <0.0, 0.0, 1.0>. The reason that there is a lack of knowledge is present when the collective feedback tuple is <0.5, 0.0, 0.5> or <0.0, 0.5, 0.5>. The reason that the explanations did not increase the collective confidence of the users in the result of the model is present when the collective feedback tuple is <0.5, 0.5, 0.0>. The reason that the explanations did not help at all is present when the collective feedback tuple is <0.0, 0.0, 1.0>. When a lack is identified, the result of the model is classified as non-satisfactory, such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model. When a lack is not identified, the following progressive sub steps are repeated. In a first sub step, at least one other model explanation is provided to the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. In a second sub step, feedback on the other model explanation is received from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model.
In a third sub step, a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model. The collective feedback is determined by the above described mathematical operator. When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of users who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of users in the result of the model, such that these users update the feedback on the result of the model that led to the non-consensus to a feedback on the result of the model that conforms with the consensus on the received feedback of the result of the model from the plurality of users.
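The check for a lack can be illustrated as a lookup of the collective feedback tuple against the four tuples listed in the description. This is a hedged sketch only: the dictionary `LACK_PATTERNS`, the function name, the reason strings and the matching tolerance are assumptions, and the disclosure does not state how tuples that are merely close to these patterns should be treated.

```python
import math

# Tuples <a, d, u> named in the description, with the reasons given for each.
LACK_PATTERNS = {
    (0.5, 0.0, 0.5): ["users undecided", "lack of knowledge"],
    (0.0, 0.5, 0.5): ["users undecided", "lack of knowledge"],
    (0.5, 0.5, 0.0): ["users undecided", "explanation did not increase collective confidence"],
    (0.0, 0.0, 1.0): ["users undecided", "explanation did not help at all"],
}


def lack_reasons(collective, tol=1e-9):
    """Return the reasons matching the collective tuple, or an empty list."""
    for pattern, reasons in LACK_PATTERNS.items():
        if all(math.isclose(x, p, abs_tol=tol) for x, p in zip(collective, pattern)):
            return list(reasons)
    return []  # no lack identified: continue with another explanation


print(lack_reasons((0.0, 0.0, 1.0)))
print(lack_reasons((0.35, 0.52, 0.13)))
```

An empty result corresponds to the case in which no lack is identified and a further explanation is provided.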
Alternatively or additionally, when the feedback from the users turns out to be “harmful” for the model, i.e., the result of the model degrades, the harmful feedback may be discarded and transparently shared with the users who provided it for root-cause analysis and agreement. Optionally, when the plurality of users update their feedback such that a consensus on the result of the model is provided, the metadata for the first model explanation, the second model explanation and/or the other model explanation are stored in a memory.
Fig. 2 illustrates an example embodiment of a system for deciding on a model result quality based on collective feedback of a plurality of users. The system 10 comprises a user interface 11 configured for providing a result of the model, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of users. Further, the user interface 11 is configured for receiving a feedback on the result of the model, a feedback on the first model explanation, a feedback on the second model explanation, and/or a feedback on the other model explanation from the plurality of users. The user interface 11 is a touchscreen able to present information to a user and to receive information from a user. The system 10 further comprises a processor 12 for executing the method described with reference to Figure 1. The processor 12 and the user interface 11 are connected to each other wirelessly or by wire in order to provide a data exchange between the user interface 11 and the processor 12. Further, the system 10 comprises a memory, not depicted, for storing the metadata for the first model explanation, the second model explanation and/or the other model explanation when the plurality of users update their feedback such that a consensus on the result of the model is provided.
The present disclosure has been described in conjunction with preferred embodiments as examples. However, other variations can be understood and effected by persons skilled in the art when practicing the claimed invention, from a study of the drawings, this disclosure and the claims. Notably, the steps presented can be performed in any order, i.e. the present invention is not limited to a specific order of these steps. Moreover, it is also not required that the different steps are performed at a certain place or at one node of a distributed system, i.e. each of the steps may be performed at a different node using different equipment/data processing units.
In the claims as well as in the description, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Claims

1. A method for deciding on a model result quality based on a collective feedback of a plurality of user, comprising: providing a result of the model to the plurality of user; receiving feedback on the result of the model from the plurality of user; determining a collective feedback of the result of the model based on the received feedback on the result of the model from the plurality of user; wherein, when the collective feedback indicates a positive consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback of the result of the model from the plurality of user, classifying the model as to be non-satisfactory, wherein, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively further parameter instigating the plurality of user to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
2. The method according to claim 1, wherein the feedback on the result of the model from the plurality of user comprises a positive feedback indicating an agreement to the result of the model and a negative feedback indicating a disagreement to the result of the model.
3. The method according to claim 2, when the collective feedback on the result of the model from the plurality of user indicates a non-consensus on the received feedback of the result of the model from the plurality of user, providing progressively a further parameter comprises the following steps: providing at least one first model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the first model explanation based on the received feedback on the first model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
4. The method according to claim 3, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, providing progressively a further parameter comprises the following steps: providing at least one second model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the second model explanation based on the received feedback on the second model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
5. The method according to claim 4, further comprising: when the collective feedback on the second model explanation still indicates a nonconsensus on the received feedback on the second model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, repeating the following steps until the consensus on the result of the model is provided: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
6. The method according to claims 3 to 5, further comprising: when the plurality of user are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, checking for a lack on the first model explanation, the second model explanation and/or the other model explanations, wherein, when a lack is identified, classifying the result of the model as to be non- satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model, wherein, when a lack is not identified, repeating the following progressively steps: providing at least one other model explanation to the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of users who gave a feedback leading to the non-consensus on the received feedback of the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of user who gave a feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of user in the result of the model such that the plurality of user update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model being in conformity with the consensus on the received feedback of the result of the model from the plurality of user.
7. The method according to one of the claims 3 to 6, wherein the first model explanation, the second model explanation, and/or the other model explanation comprises at least one variable of the model.
8. The method according to claim 7, wherein the first model explanation, the second model explanation, and/or the other model explanation further comprises a relevance-indicating-element for each one of the at least one variable of the model and/or a request-element for requesting feedback, in particular feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation and/or feedback on the other model explanation, from another user of the plurality of user.
9. The method according to claims 3 to 7, further comprising: when the plurality of user update their feedback such that consensus on the result of the model is provided, storing the metadata for the first model explanation, second model explanation and/or other model explanation.
10. The method according to any one of the previous claims, wherein the feedback on the result of the model, the feedback on the first model explanation, the feedback on the second model explanation, and/or the feedback on the other model explanation comprises a tuple, in particular a 3-tuple, including the degree of agreement, the degree of disagreement, and the uncertainty in assigning an agreement or a disagreement.
11. The method according to any one of the previous claims, wherein the collective feedback on the result of the model is determined by averaging the feedback on the result of the model from the plurality of user.
12. The method according to any one of the previous claims, wherein the collective feedback on the first model explanation, the second model explanation, and/or the other model explanation is determined by averaging the feedback or by a mathematical operator.
13. The method according to any one of the previous claims, wherein the plurality of user is at least two user.
14. The method according to any one of the previous claims, wherein the model is a machine learning model.
15. A system (10) for deciding on a model result quality based on collective feedback of a plurality of user, the system comprising: a user interface (11) for providing a result of the model to the plurality of user, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of user, and for receiving a feedback on the result of the model, a feedback on the first model explanation from the plurality of user, a feedback on the second model explanation from the plurality of user, and/or a feedback on the other model explanation from the plurality of user; a processor (12) for executing the method according to claims 1 to 14.
PCT/EP2022/061605 2022-04-29 2022-04-29 Collective feedback-based decision on model prediction quality WO2023208383A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2022/061605 WO2023208383A1 (en) 2022-04-29 2022-04-29 Collective feedback-based decision on model prediction quality
PCT/EP2023/061313 WO2023209185A1 (en) 2022-04-29 2023-04-28 Improved model based on collective feedback-based decision on model prediction quality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/061605 WO2023208383A1 (en) 2022-04-29 2022-04-29 Collective feedback-based decision on model prediction quality

Publications (1)

Publication Number Publication Date
WO2023208383A1 true WO2023208383A1 (en) 2023-11-02

Family

ID=86328957


Country Status (1)

Country Link
WO (2) WO2023208383A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150089399A1 (en) * 2013-09-26 2015-03-26 Polis Technology Inc. System and methods for real-time formation of groups and decentralized decision making
US20190130904A1 (en) * 2017-10-26 2019-05-02 Hitachi, Ltd. Dialog system with self-learning natural language understanding


Also Published As

Publication number Publication date
WO2023209185A1 (en) 2023-11-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22726471

Country of ref document: EP

Kind code of ref document: A1