AU2013224752A1 - Critical decision support system and method - Google Patents

Critical decision support system and method

Info

Publication number
AU2013224752A1
Authority
AU
Australia
Prior art keywords
decision
confidence
outcome
making
confidence level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2013224752A
Inventor
Simon Anthony Jackson
Sabina Kleitman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Sydney
Original Assignee
University of Sydney
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Sydney
Priority to AU2013224752A
Publication of AU2013224752A1
Legal status: Abandoned

Landscapes

  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

Critical Decision Support System and Method

A method of predicting a current decision outcome, the method including the steps of: (a) for a large collection of decision participants, measuring prior decision outcome predictions and a current estimated confidence level for the decision participant; (b) utilising the collection of outcome predictions and estimated confidence levels to form a regression analysis predictor of the certainty of actual outcome; (c) for a current decision, utilising the regression analysis predictor to predict the current decision outcome based on a participant's predicted outcome and current confidence level of outcome. Fig. 1

Description

Critical Decision Support System and Method

FIELD OF THE INVENTION

[0001] The present invention relates to the field of decision support systems and, in particular, discloses a system for highlighting and utilising confidence measures in decisions.

BACKGROUND

[0002] Any discussion of the background art throughout the specification should in no way be considered as an admission that such art is widely known or forms part of common general knowledge in the field.

[0003] Decision-making describes the process by which we consciously interact with and make choices in the world around us. Decision making is often a highly non-linear process where a party needs to weigh up a large number of competing options.

[0004] The valid and reliable measurement of the decision making process therefore holds considerable value. Such measurement can detect and be predictive of individual characteristics and determinative of the level of decision-making competence (typically excellent or impoverished) or enable intervention of components susceptible to change.

[0005] Different decisions have different weighted consequences. For example, in surgery planning, decision making can have extreme consequences.

[0006] For example, in one scenario, in an initial visit, a patient is diagnosed with a condition requiring difficult surgery. If the diagnosis is wrong, surgery could have severe implications. The doctor might therefore normally consider acquiring more information, perhaps via X-ray or blood tests. Such tests strain the available facilities and cost the patient time and money. If the diagnosis is correct, these are unnecessary costs, and sometimes detrimental. This scenario characterises most decision-making contexts in which the outcome of the decision, to conduct surgery or not, varies as a function of judgement accuracy and diagnosis.

[0007] Decisions are often based on judgements constructed from variable degrees of inherently uncertain information.
The more accurate information a decision-maker has, the more likely it is that their decisions will lead to desired outcomes. It is therefore important for decision makers to assess the accuracy of the judgements on which their decisions are based. When judgements are derived via cognitive processes, their accuracy depends on the accuracy of that cognition. Metacognition describes the psychological process of the assessment, regulation, and control of cognition needed to maximise the accuracy of judgements and guide optimal decision making.

[0008] Confidence is the conscious and quintessential experience of the metacognitive system assessing judgement accuracy. As an individual's judgement confidence increases, the more certain they will be of the outcomes their decisions will lead to. When complete confidence is attained, the uncertainty present to the decision-maker exists in the situation. For example, a doctor may be completely confident of the patient's condition and need for surgery, but, with any surgery, something may still go wrong. The probability that the doctor will immediately treat the patient, rather than acquire more information, should therefore increase as his/her confidence in that diagnosis increases. This outlines a basic assumption, henceforth referred to as the Confidence Assumption: as confidence in one's judgement increases, so too does the probability of making a decision congruent with that judgement. Measurement of confidence with minimal error can be of the utmost importance.
SUMMARY OF THE INVENTION

[0076] It is an object of the invention, in its preferred form, to provide an improved form of decision support system utilising subject confidence measures.
[0077] In accordance with a first aspect of the present invention, there is provided a method of predicting a likely decision outcome for future decisions for an individual person, the method including the steps of: collecting a series of candidate predicted decision outcomes for the individual person and a corresponding relative confidence level measure for the individual person in making the candidate predicted decision; correlating the actual decision outcome with the candidate predicted decision outcome and the corresponding relative confidence level measure for the individual person; and where a positive correlation exists, utilising the confidence measure as a predictor of the decision outcome for future decisions of the individual.

[0078] Preferably, the method can also include the steps of: for each of a series of individuals within a group, measuring the individual's subjective confidence level in their relative confidence level measures and corresponding outcomes; determining an average across the individuals within a group; determining a relative average of the confidence of the decision outcomes for the individual person, relative to the group; and adjusting the individual's subjective confidence level measure by an amount proportional to the difference between the group average and the relative average.
[0079] In accordance with a further aspect of the present invention, there is provided a method of predicting a current decision outcome, the method including the steps of: (a) for a large collection of decision participants, measuring prior decision outcome predictions and a current estimated confidence level for the decision participant; (b) utilising the collection of outcome predictions and estimated confidence levels to form a regression analysis predictor of the certainty of actual outcome; (c) for a current decision, utilising the regression analysis predictor to predict the current decision outcome based on a participant's predicted outcome and current confidence level of outcome.

[0080] In step (b) the regression analysis predictor preferably can include adjustment of the estimated confidence level of participants to account for systemic errors in confidence level predictions relative to the collection of decision participants. The systemic errors are preferably also measured relative to the participant's previous decisions.

[0081] In some example embodiments, the decision has a binary outcome and the regression analysis predictor can be formed utilising binary logistic regression of the prior decision outcome predictions and the current estimated confidence levels for decision participants. In other examples, the regression analysis predictor can be formed utilising multinomial logistic regression of the prior decision outcome predictions and the current estimated confidence levels for decision participants.
[0082] In accordance with a further aspect of the present invention, there is provided a system for predicting a likely decision outcome for future decisions for an individual person, the system including: collection means for collecting a series of candidate predicted decision outcomes for the individual person and a corresponding relative confidence level measure for the individual person in making the candidate predicted decision; and correlating means for correlating the actual decision outcome with the candidate predicted decision outcome and the corresponding relative confidence level measure for the individual person; and where a positive correlation exists, utilising the confidence measure as a predictor of the decision outcome for future decisions of the individual.

BRIEF DESCRIPTION OF THE DRAWINGS

[0083] Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:

[0084] Fig. 1 illustrates a plot of example confidence and treatment decisions of one sample space;

[0085] Fig. 2 illustrates an alternative example question set for a second embodiment example; and

[0086] Fig. 3 illustrates a plot of regression equations for ideal and adequate decision tendencies.

DETAILED DESCRIPTION

[0087] The preferred embodiment provides a system that tests, constructs and validates measures of decision-making and its components. It is particularly directed to measures of confidence in the decision.

[0088] Measurement Validation: Binary Logistic Regression

[0089] A necessary condition for validating a confidence measure is that it satisfy the aforementioned Confidence Assumption. That is, higher confidence scores, both within and between individuals, should predict more congruent decision-making. Statistically, the preferred embodiments are directed to testing whether confidence, a continuous variable, is a significant predictor of decision-making, a categorical variable (i.e. the decision is made (1) or not (0)), in the direction implied by the Confidence Assumption. The preferred embodiments utilise Binary Logistic Regression (BLR) as applied to the Confidence Assumption.
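The direction test described in [0089] can be sketched as a one-predictor BLR fit. The following is a minimal illustration only: the helper name `fit_blr`, the plain Newton-Raphson fitting loop, and the twelve confidence and decision values are all assumptions made for this sketch, not the procedure or data of the specification, and the significance test is omitted.

```python
import math

def fit_blr(confidence, decision, iterations=25):
    """Fit logit(p) = a + b * confidence by Newton-Raphson; returns (a, b).

    Hypothetical helper for illustration; it assumes the data are not
    perfectly separable, so that finite estimates exist.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        # Gradient (g_*) and information matrix (h_*) at the current (a, b).
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(confidence, decision):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += y - p
            g_b += (y - p) * x
            h_aa += w
            h_ab += w * x
            h_bb += w * x * x
        # Newton step: solve the 2x2 system H * step = g.
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Made-up example data: confidence in percent, decision 1 = made, 0 = not made.
confidence = [30, 40, 45, 50, 55, 60, 60, 65, 70, 75, 80, 90]
decision = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]

a, b = fit_blr(confidence, decision)
# A positive slope is the direction implied by the Confidence Assumption.
assumption_direction_ok = b > 0
print(f"a = {a:.3f}, b = {b:.3f}, Exp(b) = {math.exp(b):.3f}")
```

In practice a statistics package would normally be used so that the significance of the coefficient is reported alongside its sign.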
[0090] To elaborate on the application and utility of BLR, a first example application considers data regarding the diagnostic accuracy, diagnostic confidence, and decision making of five doctors, recorded for a total of 210 patients, 42 each. Diagnoses were scored 1 for correct and 0 for incorrect. Confidence was measured after each diagnosis by asking the clinician to indicate how confident they were that their diagnosis was correct, from 0% (not at all confident) to 100% (completely confident). Decisions were scored as 1 for immediate treatment and 0 for acquiring more information. Table 1 shows the decision frequency, totalled across all five clinicians.

[0091] Table 1.

Decision    Frequency    Percentage
0           90           43%
1           120          57%

[0092] The best prediction for any clinician decision would be to treat, as this forms the majority (57%). That is, without any independent variable, we could accurately predict 57% of decisions.

[0093] It is possible to regress all 210 decisions on their respective diagnostic confidence in a BLR. Like traditional linear regression, BLR generates a model that produces predicted values of the dependent variable based on weightings of the independent variables. This model will be further discussed below. Table 2 shows the frequency of decisions predicted by our model against their true observed value. The cells where the model has accurately predicted a 0 (top left) or a 1 (bottom right) lie on the diagonal.

Table 2.

Observed Decision    Model Predicted Decision    Percentage Correct
                     0          1
0                    48         42               53%
1                    10         110              92%

[0094] Now, based on confidence, 48 of the 90 information gathering decisions (53%) and 110 of the 120 treatment decisions (92%) have been correctly predicted by the model. That is, the model is accurately predicting 158 (48+110) of the 210 decisions, or 75%. Utilising confidence, the predictive ability has been increased from 57% to 75%.
While these results suggest that confidence, as measured here, is an important predictor of decision-making, more information is needed to demonstrate that the Confidence Assumption has been satisfied.
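The arithmetic behind the 57% baseline and the 75% model accuracy can be reproduced from the Table 2 counts. This is a small illustrative sketch; the variable names are assumptions, and only the four cell counts come from the specification.

```python
# Observed decision -> (count predicted 0, count predicted 1), from Table 2.
table2 = {0: (48, 42), 1: (10, 110)}

total = sum(sum(row) for row in table2.values())

# Baseline: always predict the most frequent observed decision (treat).
baseline = max(sum(row) for row in table2.values()) / total

# Model accuracy: correct predictions lie on the diagonal of the table.
accuracy = (table2[0][0] + table2[1][1]) / total

# Accuracy within each observed class (the "Percentage Correct" column).
per_class = {obs: row[obs] / sum(row) for obs, row in table2.items()}

print(f"baseline {baseline:.0%}, model {accuracy:.0%}")
```

The same calculation applies to any later classification table of the same layout.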
[0095] This information becomes available when considering the regression equation described by the model. Unlike linear regression, the BLR equation describes a logistic, rather than linear, function. However, the same basic principles apply: a positive coefficient implies that an increase in confidence is associated with an increasing probability of decisions becoming congruent (an increase from 0 to 1). The relevant coefficient output, shown in Table 3 below, confirms that this is indeed the case.

[0096] Table 3.

              b         Exp(b)
Confidence    0.151     1.164*
Constant      -9.059
*p < .05

[0097] Fig. 1 illustrates a scatter plot of one clinician's confidence and decision pairings, which shows this relationship. The magnitude of the unstandardized coefficient b is difficult to interpret due to the logarithmic scale. Rather, its exponential, Exp(b), is an easily interpreted odds ratio: for every one percentage point increase in diagnostic confidence, the odds of a clinician treating their patient rather than seeking more information are multiplied by 1.164, on average.

[0098] Both the unstandardized coefficient and its exponential are further indicative of the consistency with which congruent decisions have been made at higher values than incongruent decisions. That is, the greater the magnitude of these values, the steeper is the slope defining the shift in probability from incongruent (0) to congruent (1). Combined, as a more readily interpretable metric, the sign and magnitude of Exp(b) are indicative of the slope's direction and definition respectively (hereinafter referred to as the SLIDE). A positive SLIDE indicates that the odds of a congruent decision increase with increasing confidence and that the Confidence Assumption has been satisfied. The greater the SLIDE magnitude, the more consistently congruent decisions have been made above the POST, and incongruent decisions below it. Hence, examination of the SLIDE should always be conducted prior to any further analyses.
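The odds-ratio reading of Exp(b) can be checked numerically from the Table 3 coefficients. The sketch below is illustrative only: the function name `odds_of_treating` is an assumption, and the model form (log-odds equal to the constant plus b times confidence) is the standard BLR form rather than something quoted from the text above. Note that exp(0.151) is roughly 1.163, which agrees with the tabulated 1.164 only to within what is presumably rounding of b.

```python
import math

a, b = -9.059, 0.151  # Constant and Confidence coefficients from Table 3

def odds_of_treating(confidence):
    """Odds of treating (1) versus seeking information (0), confidence in percent."""
    return math.exp(a + b * confidence)

# One extra percentage point multiplies the odds by exp(b),
# whatever the starting confidence.
ratio_low = odds_of_treating(41) / odds_of_treating(40)
ratio_high = odds_of_treating(81) / odds_of_treating(80)

# Ten extra points multiply the odds by exp(10 * b), not by 10 * exp(b).
ratio_ten = odds_of_treating(70) / odds_of_treating(60)

print(f"per point: {ratio_low:.3f}, per ten points: {ratio_ten:.3f}")
```

Because the effect is multiplicative on the odds, the change in probability for a given change in confidence depends on where along the scale it occurs.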
[0099] The results of our five clinicians indicate that the Confidence Assumption has been satisfied. In turn, this supports the validity of measuring confidence with a percentage scale. However, this basic assumption and analysis have much more to offer than simple measurement validation within individuals.

Psychological Equivalence and Partitioning Error Variance

[00100] Further benefits arise following the application of classical test theory: that observed confidence is a function of true confidence and error. One potential source of error is that individuals do not share the same psychological experience for the same point on the confidence scale (Assumption of psychological equivalence). That is, it is unlikely that humans reliably quantify their experiences such that any two individuals indicating 60% confidence, or any other value, truly share the same psychological experience. This induces systematic error in the measurement as a result of how individuals vary in their perception of confidence and their use of the scale. Error might therefore be partitioned into (i) systematic error resulting from individual use of the scale and (ii) residual, unsystematic, error. If variance associated with this systematic error can be partialled out from observed scores, the remaining scores should more accurately represent true confidence. This reasoning is represented in the equations below.

$$\mathrm{Confidence}_{observed} = \mathrm{Confidence}_{true} + \varepsilon_{total}$$

$$\mathrm{Confidence}_{observed} = \mathrm{Confidence}_{true} + \varepsilon_{individualUse} + \varepsilon_{residual}$$

$$\mathrm{Confidence}_{observed} - \varepsilon_{individualUse} = \mathrm{Confidence}_{true} + \varepsilon_{residual}$$

[00101] A simple extension of the BLR analysis can help check whether such systematic error exists and, if so, correct for it.

The Point of Sufficient Certainty

[00102] To continue along this path requires the identification of some point at which individuals indeed share the same psychological experience. Examination of Fig. 1 suggests that after about 60-65% confidence, the clinician becomes far more likely to treat their patients. This plot confirms what common sense tells us: at some point along the confidence spectrum, decision-makers become sufficiently certain in their judgements to switch from incongruent to congruent decisions. Let us call this the Point Of Sufficient cerTainty (POST). It is feasible to consider that individuals share the same psychological experience at the POST.

[00103] If individual POSTs vary in measurement, it is likely that systematic error exists due to individual uses of the confidence scale. Removing variance associated with POST differences from the observed scores should therefore correct for this systematic error. Adjustment then requires shifting observed values by the deviation of each individual POST from the mean sample POST.
This effectively shifts observed confidence ratings so that the POST of each individual will be at an equivalent point.

$$\mathrm{Confidence}_{observed} - \varepsilon_{individualUse} = \mathrm{Confidence}_{true} + \varepsilon_{residual}$$

$$\mathrm{Confidence}_{observed} - (\mathrm{POST}_{i} - \mathrm{POST}_{sample}) = \mathrm{Confidence}_{true} + \varepsilon_{residual}$$

$$\mathrm{Confidence}_{observed} - \mathrm{POST}_{i} + \mathrm{POST}_{sample} = \mathrm{Confidence}_{true} + \varepsilon_{residual}$$

[00104] However, the values obtained on the left side of the above equation are uninterpretable without reference to the POST. Thus, further subtracting the mean sample POST from these values will position the POST at 0, with positive and negative values falling above and below the POST respectively. These POST Adjusted values will be referred to as POSTA Confidence, and their calculation for the ith individual can be seen below:

$$\mathrm{Confidence}_{POSTA} = \mathrm{Confidence}_{observed} - \mathrm{POST}_{i} + \mathrm{POST}_{sample} - \mathrm{POST}_{sample}$$

$$\mathrm{Confidence}_{POSTA} = \mathrm{Confidence}_{observed} - \mathrm{POST}_{i}$$

[00105] This adjustment reduces simply to the deviation of one's confidence from one's own POST. This, conveniently, removes any association of the score with sample based statistics. For example, say the POST of clinician 1 is 50%, and of clinician 2 is 90%. They each diagnose a patient with 60% and 80% confidence respectively. These observed values can be adjusted as follows:

$$\mathrm{Confidence}_{POSTA,1} = 60 - 50 = 10$$

$$\mathrm{Confidence}_{POSTA,2} = 80 - 90 = -10$$

[00106] Clinician 1 has exceeded his/her POST, being more likely to make a congruent decision. Clinician 2 has indicated confidence 10 below his/her POST, being more likely to make an incongruent decision. Furthermore, clinician 1's confidence has exceeded clinician 2's confidence by 20, in relation to the POST. If the POST is a psychologically equivalent point, POSTA Confidence corrects for violations of Psychological Equivalence, reduces systematic error, and is readily interpreted in terms of decision likelihood.

Calculating the POST

[00107] Calculating the POST requires a more thorough examination of the regression equation described by BLR.
As mentioned previously, because the dependent variable is categorical, the BLR regression equation defines a logit, or sigmoid, function determined by an intercept (a) and the unstandardized regression coefficient (b), as shown below.

$$\mathrm{logit} = a + b \times \mathrm{Confidence}$$

[00108] Logit values are converted into predicted probabilities of the dependent variable as follows:

$$P = \frac{e^{\mathrm{logit}}}{1 + e^{\mathrm{logit}}}$$

[00109] That is, each confidence value is associated with a predicted probability, ranging from 0 to 1. The predicted decision is incongruent (0) or congruent (1) if the predicted probability is less than or greater than .5, respectively. As the POST defines the point at which a decision switches from incongruent to congruent, a predicted probability of .5 is best associated with the confidence value of the POST: the point at which incongruent and congruent decisions are equally probable. To calculate the POST therefore requires substituting a predicted probability of .5 into the regression equation and working backwards to solve for confidence. This process is shown below.
$$.5\,(1 + e^{\mathrm{logit}}) = e^{\mathrm{logit}}$$

$$.5 + .5\,e^{\mathrm{logit}} = e^{\mathrm{logit}}$$

$$.5 = e^{\mathrm{logit}} - .5\,e^{\mathrm{logit}}$$

$$.5 = .5\,e^{\mathrm{logit}}$$

$$e^{\mathrm{logit}} = 1$$

$$\mathrm{logit} = \ln 1$$

[00110] Expanding the logit:

$$a + b \times \mathrm{POST} = \ln 1$$

$$\mathrm{POST} = \frac{\ln 1 - a}{b}$$

[00111] This will apply for the case in which all decisions are regressed on confidence. So this equation would be suitable if the data for each individual were analysed separately. However, simply including a categorical id variable in the analysis provides us with the additional information necessary to calculate the POST for each individual within a single analysis.

[00112] Returning to the five clinicians, it is possible to regress decisions on observed confidence and a categorical id variable. Table 4 shows the frequency of decisions predicted by our model against their true observed value.

[00113] Table 4

| Observed Decision | Predicted Decision: 0 | Predicted Decision: 1 | Percentage Correct |
|---|---|---|---|
| 0 | 79 | 11 | 88% |
| 1 | 11 | 109 | 91% |

[00114] The model is now correctly predicting 188 (79+109) of the 210 decisions, or 90%. This model has further increased our predictive ability from 75% to 90%, suggesting that meaningful between-individual variance exists in the use of the confidence scale. To compute the POST for each clinician requires investigation of the coefficients table shown below (Table 5).

Table 5

| Variable | b | Exp(b) |
|---|---|---|
| Confidence | 0.312 | 1.366* |
| id(1) | 2.155 | 8.632* |
| id(2) | -3.508 | 0.030* |
| id(3) | -2.879 | 0.056* |
| id(4) | -4.595 | 0.010* |
| Constant | -17.324 | |

*p < .05

[00115] These coefficients form separate regression equations for each individual, with the final id category (clinician 5) being a reference class. The POST of the reference clinician (5) is calculated with the formula provided earlier. For the remaining clinicians, the logit equation expands by adding the coefficient associated with their id. This is shown below.

$$\mathrm{logit} = a + b \times \mathrm{Confidence} + b_{id}$$

[00116] Substituting in the manner described earlier:

$$\mathrm{POST}_i = \frac{\ln 1 - a - b_{id}}{b}$$

[00117] Hence, for each clinician:

$$\mathrm{POST}_1 = (\ln 1 - (-17.324) - 2.155)\,/\,.312 = 48.619$$
$$\mathrm{POST}_2 = (\ln 1 - (-17.324) - (-3.508))\,/\,.312 = 66.769$$

$$\mathrm{POST}_3 = (\ln 1 - (-17.324) - (-2.879))\,/\,.312 = 64.753$$

$$\mathrm{POST}_4 = (\ln 1 - (-17.324) - (-4.595))\,/\,.312 = 70.253$$

$$\mathrm{POST}_5 = (\ln 1 - (-17.324))\,/\,.312 = 55.526$$

[00118] Table 6 below shows the diagnostic accuracy, average diagnostic confidence, POST, POSTA Confidence, SLIDE, and percentage of congruent (treatment) decisions for each clinician. The SLIDE value comes directly from Table 5.

[00119] Table 6 (*p < .05)

| Clinician | Accuracy (% correct) | Confidence (Average %) | POST (%) | POSTA | SLIDE | Congruent Decisions (%) |
|---|---|---|---|---|---|---|
| 1 | 76.19 | 77.62 | 48.61 | 29.01 | 8.63* | 81 |
| 2 | 90.48 | 59.76 | 66.77 | -7.01 | 0.03* | 29 |
| 3 | 97.62 | 63.57 | 64.75 | -1.18 | 0.06* | 45 |
| 4 | 80.95 | 80.24 | 70.25 | 9.99 | 0.01* | 79 |
| 5 | 76.19 | 56.19 | 55.53 | 0.66 | 1.37* | 52 |

[00120] Each clinician has a positive and statistically significant SLIDE, indicating that they have each satisfied the Confidence Assumption. Additionally, their magnitudes show that clinician 1 has initiated congruent decisions above his/her POST, and incongruent decisions below his/her POST, far more consistently than the other clinicians. Examination of the 95% confidence intervals for Exp(B) suggests that this difference is statistically significant. Still, what a SLIDE of increasing magnitude means from a psychological perspective is an empirical question: it could be more refined cognitive control, a method effect resulting in a conscious effort to separate decisions above and below a particular point, etc.

[00121] POST scores vary considerably, ranging from 48.61 to 70.25. Combined with the percentage increase in predictions made by our model, it appears that the Assumption of Psychological Equivalence has been violated, and observed average confidence includes error associated with individual use of the confidence scale. It therefore seems appropriate to calculate POSTA Confidence and make further conclusions about each clinician. For example, it appears that, on average, clinician 1 is 29.01% more confident than his/her POST, and more likely to make congruent decisions.
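The POST and POSTA calculations above can be reproduced with a short script. The following is a minimal sketch only, not part of the described embodiment: it assumes the coefficients of Table 5 and the average confidence values of Table 6, and the function names (`post`, `predicted_probability`, `posta`) are hypothetical names introduced here for illustration. Small differences from the POSTA column of Table 6 can arise from rounding of the POST values.

```python
import math

# Coefficients copied from Table 5; clinician 5 is the reference class,
# so its id coefficient is taken to be 0.
INTERCEPT = -17.324
B_CONFIDENCE = 0.312
ID_COEFFICIENTS = {1: 2.155, 2: -3.508, 3: -2.879, 4: -4.595, 5: 0.0}

# Average observed confidence (%) per clinician, copied from Table 6.
MEAN_CONFIDENCE = {1: 77.62, 2: 59.76, 3: 63.57, 4: 80.24, 5: 56.19}


def post(clinician):
    """Confidence at which congruent and incongruent decisions are equally likely."""
    # At the POST the logit equals ln(1) = 0, so solve
    # a + b * POST + b_id = ln(1) for POST.
    return (math.log(1) - INTERCEPT - ID_COEFFICIENTS[clinician]) / B_CONFIDENCE


def predicted_probability(clinician, confidence):
    """Predicted probability of a congruent decision at a given confidence (%)."""
    logit = INTERCEPT + B_CONFIDENCE * confidence + ID_COEFFICIENTS[clinician]
    return math.exp(logit) / (1 + math.exp(logit))


def posta(clinician, confidence):
    """POSTA Confidence: observed confidence minus the individual's own POST."""
    return confidence - post(clinician)


for clinician in sorted(ID_COEFFICIENTS):
    print(clinician,
          round(post(clinician), 3),
          round(posta(clinician, MEAN_CONFIDENCE[clinician]), 2))
```

Evaluating `predicted_probability` at a clinician's own POST returns .5, which is a quick check that the rearranged formula matches the logit equation.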
[00122] One potential caveat is that Psychological Equivalence is not violated and observed POST differences are meaningful. That is, a lower POST might imply that an individual is sufficiently certain to initiate a congruent decision at a true experience of less confidence than another. Whether the POST, POSTA Confidence, neither, or perhaps both, should be interpreted as meaningful metrics remains an empirical question. [00123] The SLIDE and POST offer significant avenues for development. They potentially provide considerable advantages over the myriad of confidence indices that exist, and can be used for measurement validation and as meaningful individual differences variables. Comparing the pattern of predictive validity and pattern of relationships between traditional indices, the SLIDE and POST indices, and other variables of interest can provide insight into their utility. Potentially, the SLIDE and POST have application to validate confidence measures and reveal novel ways to partition variance captured by on-tasks, on-line confidence estimates, thus improving their predictive validity. [00124] Embodiment Two [00125] Further background research in the basis for the utilisation of confidence in decision prediction was undertaken. [00126] Identifying the psychological processes that lead to optimal decision-making holds great promise for a vast range of domains. This embodiment expands the consideration given to the analysis of decision-making processes and the role played by the metacognitive judgement of confidence and its calibration (bias and discrimination). The first aim was to investigate whether broad Confidence and Calibration factors would emerge when a novel medical decision-making paradigm was included within a battery of cognitive ability tests. A further aim was to establish the existence and consistency of the novel decision-making tendencies (ideal, adequate, wasteful, fatal and congruent) within the decision making task. 
Finally, the predictive validity of confidence and its calibration indices for these decision-making tendencies is examined. 193 undergraduate students completed a medical decision-making test, three cognitive ability tests, a personality questionnaire, and the Need for Closure questionnaire. Broad Confidence, Bias and Discrimination factors emerged across the decision-making and cognitive domains. The medical test yielded theoretically and psychometrically sound decision-making tendencies. Furthermore, confidence and calibration indices were strong incremental predictors of these tendencies. The results provide preliminary support that habitual patterns in decision-making tendencies might generalise across decision-making contexts as a result of stable Confidence and Calibration constructs.

[00127] The goal of decision-making is to make choices that will maximise the likelihood of desired outcomes. Identifying the psychological processes that enhance these outcomes holds great promise for a vast range of domains. This embodiment focuses on a medical decision-making paradigm, where the desired outcomes involve maximising patient survival rates, while minimising strain on the medical facilities.

[00128] Consider a patient complaining of stabbing chest pains. Unknown to the doctor, the patient has aortic dissection (AD): bleeding into and along the walls of the aorta. AD can be identified by hearing a 'blowing' murmur over the aorta, and confirmed via X-ray. If appropriate treatment is not administered immediately, the likelihood of mortality increases rapidly. Indeed, 50% of AD patients whose aorta ruptures will die travelling to hospital. Unfortunately, AD is commonly misdiagnosed and treated as acute coronary syndrome (ACS). Treating ACS involves administering antithrombotic agents, which, for AD patients, delays appropriate diagnosis and causes increases in bleeding and mortality rates (Hansen, Nogareda, & Hutchison, 2007).
When diagnosing and treating an AD patient, four outcomes might occur. Ideally, the doctor will correctly diagnose the patient with AD, and opt to conduct the appropriate treatment (outcome A, see Table 7 below). The doctor might correctly diagnose the patient with AD, but choose to risk waiting for confirmation from an X-ray (outcome B, see Table 7). Alternatively, the doctor could incorrectly diagnose the patient with ACS. In this case, the patient's outlook becomes significantly worse if the doctor decides to administer antithrombotic agents (outcome C, see Table 7), or the better outcome will occur if s/he chooses to request an X-ray (outcome D, see Table 7).

[00129] This scenario frames a typical decision-making process (Edwards, 1954, 1961; Harvey, 2001; Mellers, Schwartz, & Cooke, 1998). An initial judgement is formed (diagnosis) and is followed by the final decision: select treatment, or acquire additional information to confirm the diagnosis. The outcomes of the final decision are considered in light of that initial judgement being correct or incorrect. It is necessary to consider four possible outcomes that follow correct or incorrect judgements. Table 7 below describes these outcomes for a patient with AD as well as for patients in the medical paradigm discussed later. For now, just consider the AD scenario; the distinction between congruent and incongruent decisions will be made shortly.

[00130] Table 7

[00131] AD and MDMT Patient Outcomes Considering Diagnostic Accuracy and Decision

| Judgement Accuracy (Diagnosis) | Scenario | Decision: Congruent (treat) | Decision: Incongruent (test) |
|---|---|---|---|
| Correct | | (A) Ideal outcome | (B) Wasteful outcome |
| | AD | Patient treated appropriately. | Unnecessarily request X-ray. Patient condition may deteriorate. |
| | MDMT | Patient survives. | Unnecessarily request blood test; ill patients have a 50% mortality rate, strain on facilities, & waste of resources. |
| Incorrect | | (C) Fatal outcome | (D) Adequate outcome |
| | AD | Patient administered antithrombotic agents, increasing bleeding and mortality rates. | Patient may get worse, but X-ray will make correct diagnosis. |
| | MDMT | Patient dies. | Ill patients have a 50% mortality rate, strain on facilities, & waste of resources. |

[00132] For the MDMT, treatment was administered if the diagnosis was ill, or the patient was released if the diagnosis was paralymphnal free.

[00133] In this case, if the diagnosis is correct, immediate treatment will lead to an ideal outcome (A). Here, sending a patient for further tests rather than treatment could result in a waste of resources and time, compromising survival rates (a wasteful error, outcome B). In contrast, if the diagnosis had been incorrect, the results of immediate treatment could be deadly (a fatal error, outcome C). Here, acquiring further information would be the most appropriate choice, as it would rule out the misdiagnosis, and eventually lead to appropriate treatment (adequate outcome; D). Thus, the accuracy of the original judgement defines the outcome value of the final decision. This embodiment focuses on determining the psychological processes that optimise the outcome value of the final decision. Specifically, a consideration is undertaken of individual differences in confidence judgements as a part of the diagnostic decision-making process.

[00134] There are both cognitive and metacognitive processes that inform judgements and direct subsequent decision-making. Cognitive processes primarily inform our judgements. For example, if a patient presents with AD symptoms, the doctor might retrieve from memory that those symptoms indicate AD. Metacognitive processes monitor and control these cognitive processes (see Azevedo, 2009; Efklides, 2008; Nelson, 1996; Stankov, 1999, for reviews).
In this embodiment, we focus on one of the key metacognitive experiences, judgement confidence, which reflects one's certainty that one's judgement is accurate (Allwood, Granhag & Jonsson, 2006; DeMarree & Petty, 2007; Efklides, 2008; Stankov, 1999), thus guiding decision behaviour.

[00135] Confidence levels hold a strong position in decision-making research. For example, Knight (1921, p. 100) claimed that "the action which follows an opinion depends as much upon the amount of confidence in that opinion as it does upon the favourableness of the opinion itself." Swann and Gill (1997, p. 747) stated that "confidence serves as a psychological gatekeeper of sorts, systematically determining whether people translate their beliefs into action." Even some of the biggest names in contemporary decision research have asserted that "confidence controls action" (Gilovich, Griffin & Kahneman, 2002, p. 248).

[00136] Although these claims were made in reference to different theories, each views the relationship between confidence and decision-making in a similar vein. This similarity stems from a shared assumption that as confidence in one's judgement, and thus subjective certainty, increases, so too does the likelihood of translating that judgement into a decision congruent with the judgement (DeMarree & Petty, 2007; Slovic, Fischhoff, & Lichtenstein, 1977). Here, a congruent decision is defined as the decision an individual would be expected to make, given that their judgement is true. This does not necessarily mean that the judgement and/or the decision have been correct. Rather, the decision is in line with the belief that the judgement is accurate. For example, administering antithrombotic agents to a patient diagnosed with ACS would be congruent with this diagnosis, whether the patient has ACS or not.
In contrast, an incongruent decision describes a decision made on the basis of a judgement considered to be possibly inaccurate: A doctor who does not hold sufficient certainty/confidence in the accuracy of his/her diagnosis of AD will opt to request an X-ray. This is referred to as the Confidence hypothesis. [00137] Indeed, the Confidence hypothesis is so engrained that researchers often make direct behavioural interpretations of judgement confidence acquired independently of any decision (e.g., Chelminski & Coulter, 2007; Desmarais, Nicholls, Read, & Brink, 2010; Dory, Degryse, Roex, & Vanpee, 2010; Hahn & Kim, 2009; Ibabe & Sporer, 2004; Leippe, Eisenstadt, & Rauch, 2009; Tsai, Klayman, & Hastie, 2008). These represent just a selection of such studies spanning domains such as consumer preference, medical education and eyewitness memory. However, limited research has been conducted to explicitly investigate links between judgements, confidence, and decisions. Hence, the validity of the Confidence hypothesis, although implicitly appealing, lacks empirical evidence. [00138] Where support does exist, research focus is typically domain specific, and analyses do not consider the entire decision-making process, inclusive of judgement accuracy, confidence, and decision.
For example, in one empirical investigation, Cofrin (1999) had medical students diagnose vignettes, indicate their confidence in their diagnoses, and choose whether to initiate treatment or request more information. She found that confidence was a significant positive predictor of the decision to initiate treatment when the vignette condition was considered non-urgent. This prediction was significant incrementally over the Need for Closure, a measure of cognitive style. That is, confidence predicted the decision to initiate treatment after the contribution made by Need for Closure was accounted for. However, the decision to initiate treatment only considers overall congruent decision-making. Furthermore, Cofrin did not include diagnostic accuracy within this analysis, nor did she consider varying decision optimality as a function of correct or incorrect diagnoses. Thus, while this study provides preliminary support for the Confidence hypothesis, it is representative of the available research that can accommodate an analysis of a complete decision-making process. The major concern is that predicting the frequency of congruent decisions is of little use without considering the outcomes of those decisions. The present embodiment provides a mechanism to assess the predictive validity of confidence for decision behaviour, taking judgement accuracy, as well as other potentially important variables (e.g., intelligence, cognitive styles, personality), into account.

[00139] To achieve a more complete perspective of decision-making, the outcomes described in Table 7 are used to consider various tendencies that arise through congruent and incongruent decision-making, following correct or incorrect judgements. Table 7 highlights that simply assessing the frequency of congruent decisions muddles the outcome value of, and fails to distinguish between, decisions that lead to ideal (A) or fatal (C) outcomes.
It might be of greater benefit to also consider an individual's tendency to make decisions that successfully lead to ideal and adequate outcomes. [00140] Further consideration might provide unique information with respect to the types of errors individuals tend to make. Perhaps an individual tends to make incongruent errors following correct judgements, leading to wasteful outcomes (B). Alternatively, they might make congruent errors following incorrect judgements, leading to fatal outcomes (C). An individual could be high or low in their tendency to commit either one, or both, of these errors. It is unknown whether people are consistent in such tendencies across several decision acts. [00141] Utilising this approach, it is possible to first examine the validity of the Confidence hypothesis taking the full range of these individual tendencies (including examination of their reliability) into consideration. Specifically, it is possible to examine whether individuals who are more confident in their judgements, prior to making a decision, tend to make more congruent decisions, leading to more fatal and fewer wasteful outcomes within a medical decision-making paradigm, after controlling for their judgement (diagnostic) accuracy, intelligence, cognitive styles, personality, gender and age.
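The mapping from judgement accuracy and decision type to the four outcome cells of Table 7 can be illustrated in code. This is a minimal sketch under stated assumptions: the function names `outcome` and `outcome_counts` and the example trials are hypothetical and are not taken from the study data.

```python
def outcome(diagnosis_correct, decision_congruent):
    """Return the Table 7 outcome cell for one diagnosis and decision pair.

    A = ideal (correct, congruent), B = wasteful (correct, incongruent),
    C = fatal (incorrect, congruent), D = adequate (incorrect, incongruent).
    """
    if diagnosis_correct:
        return "A" if decision_congruent else "B"
    return "C" if decision_congruent else "D"


def outcome_counts(trials):
    """Count how many (diagnosis_correct, decision_congruent) trials fall in each cell."""
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for diagnosis_correct, decision_congruent in trials:
        counts[outcome(diagnosis_correct, decision_congruent)] += 1
    return counts


# Hypothetical example: five patients seen by one participant.
example_trials = [(True, True), (True, False), (False, True),
                  (False, False), (True, True)]
print(outcome_counts(example_trials))
```

Keeping the four cells separate, rather than counting congruent decisions alone, is what allows ideal (A) and fatal (C) outcomes to be distinguished.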
[00142] McKenzie (1997; 1998; 1999) developed the original framework of the medical decision-making test that can be used to investigate the Confidence hypothesis. His test required participants to learn how to diagnose two fictitious illnesses. Participants then diagnosed patients with either one or none of these illnesses, and immediately provided a confidence rating in the accuracy of each diagnosis. In one of his experiments in the 1998 study, participants were required to indicate whether they would administer treatment to each patient after undergoing a lengthy learning phase. The focus of his research was to consider whether contrasting overlapping information (symptoms in this case) during learning affects judgement accuracy. Thus, he did not examine whether higher confidence estimates were associated with being more likely to administer treatment, nor did he examine the decision-making tendencies described. Still, his test scenario provides a convenient platform to measure a complete decision-making process including judgement accuracy, confidence, and subsequent decision behaviour. Thus a modified version of McKenzie's (1998) test (described in detail later) can be employed in the present study to scrutinise the Confidence hypothesis.

Measuring Confidence

[00143] As previously performed by McKenzie, this research focuses on the measurement of judgement confidence ratings that immediately follow each cognitive act and reflect the assessment of one's performance. The methods for measuring confidence in this way can vary considerably across domains (see Moore & Healy, 2008, for a review). One popular approach is to ask an individual how confident they are in the accuracy of their judgement as a percentage (e.g., Allwood, Granhag & Jonsson, 2006; Costermans, Lories, & Ansay, 1992; Efklides, 2008; Flavell, 1979; Schraw & Denison, 1994; Stankov, 1999; 2000).
Specifically, individuals indicate confidence in their judgement from 0%, being absolutely sure they are incorrect, to 100%, being absolutely sure they are correct.

[00144] Fig. 2 illustrates a typical cognitive knowledge question with the respective confidence rating. This method of assessment has been demonstrated to be well understood by adults (Williams & Gilovich, 2008) and children (Kleitman & Gibson, 2011; Kleitman, Stankov, Allwood, Young, & Mak, 2012), and to possess excellent psychometric properties (see Stankov & Kleitman, 2008, for a review). Hence, the present study adopts the percentage scale approach for measuring confidence.

[00145] Calibration

[00146] Measuring confidence in this way allows for a multitude of calibration indices to be calculated (Boekaerts & Rozendaal, 2010; Harvey, 1997; Schraw, 2009; Yates, 1990). Calibration in this context is broadly defined as a metacognitive phenomenon relating to the adaptiveness and effectiveness of the monitoring process (Nelson, 1996; Stankov, 1999). Hence, good calibration is assumed to be necessary in order to execute optimal decision-making behaviour. The two calibration indices considered here, and of the utmost theoretical importance to the outlined hypotheses, are bias and discrimination.

[00147] Bias. The most widely used and investigated calibration index is bias, also referred to as over/underconfidence. The bias score indicates whether, on average, an individual has been able to match or calibrate their confidence levels with their actual levels of accuracy. Bias is calculated across a number of judgements as the difference between average subjective confidence estimates and objective accuracy:

$$\mathrm{Bias} = \frac{1}{n}\sum_{i=1}^{n}\left(c_i - a_i\right)$$

[00148] where n is the total number of items; c_i is the confidence assigned to the ith item; a_i is the accuracy of the ith item, scored 1 for correct and 0 for incorrect. Bias scores can range from +1 to -1.
High (above zero) and low (below zero) scores, indicative of poor confidence calibration, are described as over- and underconfidence respectively (Lichtenstein & Fischhoff, 1977). Furthermore, superior test-retest and split-half reliability estimates for bias have been demonstrated relative to other calibration indices (Stankov & Crawford, 1996). Stankov and colleagues refer to bias as a measure of whether people are able to detect judgements that are easy or difficult, on average (Stankov, Morony, Lee, Luo, & Hogan, 2011). Overconfidence suggests that the person has felt the judgements s/he made were easier than they actually were for him/her. Underconfidence suggests that the person has felt the judgements s/he made were harder than they actually were for him/her. Scores close to zero indicate good calibration: the person has found the judgements s/he made to be about as difficult as they actually were.

[00149] Differential predictions, with respect to decision-making, are made regarding the direction of the bias score. For example, in the area of self-regulated learning, Efklides (2009, p. 81) postulates that overconfidence can make an individual "less perceptive of situational demands," and underconfidence may increase anxiety, resulting in task avoidance. Others have suggested that underconfident students might devote an unnecessary amount of time to studying learned material (Hacker, Bol, & Keener, 2008). Likewise, financial decision models incorporating the confidence calibration model predict that overconfident investors will trade more stock than well-calibrated investors (Glaser & Weber, 2007). While only limited support is available for these predictions, across domains, they can be largely reduced into simple decision terms for general investigation: increasingly overconfident individuals tend to make more errors following incorrect judgements as a result of congruent tendencies (fatal tendencies).
Similarly, increasingly underconfident individuals tend to make errors following correct judgements as a result of incongruent tendencies (wasteful tendencies).

[00150] Further predictions are made on the basis of the magnitude of the bias score (increasing deviation from zero) in either direction, reflecting the raw degree of miscalibration. The larger the magnitude, the larger the discrepancy between one's confidence in one's performance and one's actual performance. Normatively, a larger discrepancy represents poorer self-monitoring (see Stankov, 1999, for a review), and is assumed to lead to poorer decision-making. That is, as either over- or underconfidence increases, self-monitoring skills become more impaired, and the tendency to reduce errors overall (minimising detrimental effects) should decrease at an increasing rate.

[00151] This can best be represented by a non-linear (quadratic) relationship illustrated in Fig. 3, where higher scores on the vertical axis represent fewer errors. Scarce research on this prediction has been conducted. Instead, research typically considers a linear relationship between bias and different outcomes (e.g., Yang & Thompson, 2010; Glaser & Weber, 2007). In this embodiment, both the linear and quadratic terms are considered in order to investigate the Bias hypothesis: as over- or underconfidence increases, the tendency to minimise decision errors arising from congruent or incongruent error tendencies, respectively, will diminish in the form of a quadratic trend. Hence, bias scores will be calculated and related to decision-making behaviour in order to scrutinise this hypothesis.

[00152] Discrimination. A second index of calibration is discrimination, referring to the ability to discriminate between correct and incorrect judgements. Discrimination is typically computed simply as the difference between average confidence assigned to correct and incorrect items.
However, to account for individual variation in confidence ratings, we will use a comparable measure: the confidence judgement accuracy quotient (CAQ; Shaughnessy, 1979; Schraw, 2008), calculated as the difference between average confidence assigned to correct and incorrect items, divided by the standard deviation of all confidence ratings. Formally:

$$\mathrm{CAQ} = \frac{\dfrac{1}{p}\sum_{i=1}^{p} c_{i,\mathrm{correct}} - \dfrac{1}{q}\sum_{i=1}^{q} c_{i,\mathrm{incorrect}}}{\sigma}$$

where c_{i,correct} is the confidence assigned to the ith correct item; p is the number of correct items; c_{i,incorrect} is the confidence assigned to the ith incorrect item; q is the number of incorrect items; σ is the standard deviation of all the confidence ratings, and adjusts for how tightly an individual uses the confidence scale. CAQ scores can range from negative to positive values, with higher scores indicating higher confidence for correct rather than incorrect judgements. Thus, increasingly positive CAQ indicates better discrimination.

[00153] Applying the Confidence hypothesis here gives rise to the Discrimination hypothesis: that better discrimination will lead to better decision-making overall. That is, individuals who discriminate well will tend to make congruent decisions when their judgements are correct, as a result of higher certainty/confidence. Furthermore, they will tend to make incongruent decisions when their judgements are incorrect, as a result of lower certainty/confidence. We therefore expect that better discrimination (indexed by higher CAQ) will predict increasing tendencies to make decisions that lead to ideal outcomes and fewer errors overall, due to decreases in congruent (fatal) and incongruent (wasteful) error tendencies. Given the nature of this hypothesis, only a linear trend is expected.
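The two calibration indices defined above, bias and CAQ, can be computed as follows. This is a minimal sketch rather than the procedure used in the study. It assumes that confidence ratings are expressed as proportions (percentage divided by 100) so that bias falls between -1 and +1, and that σ is the population standard deviation; the text does not state which form of the standard deviation was used. The function names and the example ratings are hypothetical.

```python
from statistics import mean, pstdev


def bias(confidence, accuracy):
    """Mean confidence minus mean accuracy (positive values = overconfidence).

    confidence: ratings as proportions in [0, 1] (assumption: percent / 100).
    accuracy:   1 for a correct judgement, 0 for an incorrect one.
    """
    return mean([c - a for c, a in zip(confidence, accuracy)])


def caq(confidence, accuracy):
    """Confidence judgement accuracy quotient (discrimination index)."""
    correct = [c for c, a in zip(confidence, accuracy) if a == 1]
    incorrect = [c for c, a in zip(confidence, accuracy) if a == 0]
    if not correct or not incorrect:
        raise ValueError("CAQ needs at least one correct and one incorrect item")
    # Assumption: population standard deviation of all confidence ratings.
    sigma = pstdev(confidence)
    if sigma == 0:
        raise ValueError("CAQ is undefined when all confidence ratings are equal")
    return (mean(correct) - mean(incorrect)) / sigma


# Hypothetical ratings for four judgements.
example_confidence = [0.9, 0.6, 0.8, 0.5]
example_accuracy = [1, 0, 1, 0]
print(bias(example_confidence, example_accuracy))
print(caq(example_confidence, example_accuracy))
```

In this hypothetical example the participant is overconfident on average (positive bias) yet discriminates well (positive CAQ), which shows that the two indices capture different aspects of calibration.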
Individual differences in confidence

[00154] A branch of decision-making research concerned with judgement confidence has found stable individual differences in confidence levels within the cognitive domain (e.g., Kleitman & Mascrop, 2010; Kleitman & Stankov, 2001; 2007; Kleitman, et al., 2011; Mengelkamp & Bannert, 2010; Pallier et al., 2002; Schraw, Dunkle, Bendixen, & Roedel, 1995; Stankov, 1999; Stankov & Crawford, 1996; 1997; Stankov & Lee, 2008). In particular, these studies found that confidence levels acquired across a diverse battery of cognitive tests demonstrated higher intercorrelations than the correlations between them and the relevant test accuracy scores. A robust Confidence factor emerged when both exploratory and confirmatory factor analytic models were employed. This factor was positively related to, yet distinct from, the relevant Accuracy factors (see Kleitman, 2008; Kleitman, et al., 2011; Pallier, et al., 2002; Stankov, 1999 for reviews). Recently replicated in a large cross-cultural study (Stankov, Lee, Luo, & Hogan, 2012), these findings support a broad Confidence factor present in the cognitive domain that is distinct from, yet related to, ability. Similarly, scores obtained on different cognitive tests converged on a broad Bias factor (Kleitman, 2008; Schraw, 1997; Schraw, et al., 1995; Stankov & Crawford, 1996; West & Stanovich, 1997). That is, irrespective of the nature of the cognitive tasks and their difficulty, people who tend to be overconfident on one type of task tend to be overconfident on other types of tasks relative to others. Likewise, people who tend to be underconfident on one type of task tend to be underconfident on other types of tasks. These findings support a broad Bias factor.

[00155] The present embodiment aims to examine whether confidence, bias, and CAQ scores, within the medical paradigm, converge respectively with these scores acquired across a battery of cognitive
Given the robust and replicable nature of the broad Confidence and Bias factors, it is expected that these metacognitive indices will converge across cognitive and decision-making domains (metacognitive generality hypothesis). If Confidence, Bias, and Discrimination factors generalise to decision-making scenarios, this holds significant implications. Medical Decision-making Test [00156] McKenzie's (1998) original paradigm provides an excellent framework to test these hypotheses, provided certain modifications are made. Mentioned previously, McKenzie's participants learned about symptoms associated with two fictitious illnesses, made subsequent diagnoses of new patient profiles, and recommended whether to send patients to treatment. To suit the present embodiment, modifications were made for two major reasons: so that participants would be somewhat uncertain about their diagnostic accuracy; and so they were fully aware of the outcomes treatment might lead to. Following is a detailed description of the final modified design of the medical paradigm used in the present study (also see Method section), hereafter referred to as the Medical Decision-Making Test (MDMT). [00157] First, participants were instructed to adopt the role of a specialist in deadly paralymphnal illnesses, of which there are only two kinds (puneria and zymosis), and that different treatments were available for each. However, administering a treatment to an individual without the illness it is intended for would be fatal, whether they had the alternate illness or were paralymphnal free. Participants also learned that a blood test was available to correctly identify whether an individual had a paralymphnal illness and, if so, which one. However, due to the severity of paralymphnal illnesses, ill individuals had only a 50% chance to survive waiting for the test results. Furthermore, blood tests were costly and placed strained on the available medical facilities. 
Thus, paralleling the AD scenario, four outcomes might occur for each patient, as a result of treatment or testing, following a correct or incorrect diagnosis (see Table 1).

[00158] Participants were told that they would be required to diagnose 42 patients, indicate their confidence in each diagnosis, and deal with them by treating patients in accordance with their diagnosis or requesting a blood test if they felt uncertain about it. They were then shown an example question (Fig. 3) before learning how to diagnose potential patients with the three possible illness states: puneria, zymosis, or paralymphnal free.

Decision-making variables

[00159] The design of the MDMT allows for an analysis of the various tendencies discussed throughout. For each participant, it is possible to compute the frequency of their patients that end up in each of the four outcome cells described in Table 7. These frequencies were used to compute decision tendency variables outlined in Table 8:

MDMT Decision-making Variables

| Decision tendency | Index | Calculation | Desired score |
|---|---|---|---|
| Ideal decision tendencies | Optimal decision-making | A/(A+B+C+D) | 1 |
| Adequate decision tendencies | Overall error reduction (optimal) | (A+D)/(A+B+C+D) | 1 |
| Wasteful decision tendencies | Incongruent errors | B/(A+B) | 0 |
| Fatal decision tendencies | Congruent errors | C/(C+D) | 0 |
| Congruent decision tendencies | Overall congruent decisions | (A+C)/(A+B+C+D) | n/a |

Note. Each variable is calculated as a proportion ranging from 0 to 1. Also note the distinction between these variables, whose names are followed by "decision tendencies", and the frequencies from which they are derived in Table 1, whose names are followed by "outcomes". (A+B+C+D) is for illustrative purposes, and equals 42 in the case of the MDMT.

Table 8

[00160] Each variable provides unique information about decision-making tendencies within the test. Ideal decision tendencies is the primary index of optimal decision-making and is computed as the total number of ideal outcomes divided by the number of decisions.
Adequate decision tendencies is the secondary optimal variable, measuring an individual's tendency to reduce decision errors overall. It is computed as the sum of ideal and adequate outcomes, divided by the number of decisions. Fatal decision tendencies is the most critical error variable and is associated with congruent tendencies following incorrect diagnostic judgments. It is computed as the frequency of fatal outcomes divided by the number of incorrect diagnoses. Wasteful decision tendencies is also an error variable and is associated with incongruent tendencies following correct diagnostic judgments. It is computed as the frequency of wasteful outcomes divided by the frequency of correct diagnoses. Congruent decision tendencies is computed as the sum of ideal and fatal outcomes, divided by the total number of decisions. The purpose of these variables is to more completely consider patterns of decision-making that typically go unobserved.

[00161] Utilising the MDMT and these variables, the present embodiment combines three related branches of psychological research: decision-making, metacognition, and differential psychology. The first aim is to determine the generality of confidence and its calibration across decision and cognitive domains using an individual differences methodology. It is hypothesised that confidence, bias, and CAQ scores, derived from the MDMT and three cognitive tests, will converge on single factors when submitted to respective exploratory factor analyses (metacognitive generality hypothesis). A further aim is to establish the existence and consistency of the novel decision-making tendencies within the decision-making task (ideal, adequate, wasteful, fatal and congruent). Finally, we aim to examine the predictive validity of confidence and its calibration indices on decision-making behaviour as outlined by the Confidence, Bias, and Discrimination hypotheses.
To achieve this, MDMT decision tendencies will be regressed on MDMT metacognitive scores, controlling for diagnostic accuracy, intelligence, personality, cognitive styles, gender, and age. With respect to confidence and its calibration, robust, albeit untested, decision-making assumptions coupled with replicable metacognitive findings provide a solid foundation, and justification, for these aims and hypotheses. [00162] Method: Participants: 202 (121 female, 81 male, Mage = 19 years, age range: 17-50 years) first year psychology students at the University of Sydney participated in return for partial course credit. Nine participants were excluded for reporting significant English or other (e.g. dyslexia) difficulties. As such, the final sample consisted of 193 participants (114 female, 79 male, Mage = 19.41 years, age range: 17-39 years). [00163] Materials: Six tasks were administered on Dell computers. Unless otherwise stated, tasks were programmed using the online questionnaire program 'Surveygizmo 3.0', and completed in an Internet Explorer 8 browser. In addition to the tasks described below, the battery also included the Big-6 Personality Inventory (Saucier, 2008) and the Need for Closure (Roets & Van Hiel, 2011). Neither contributed significantly to the results, and all of the presented results were demonstrated incrementally over and above these measures. This was also the case for gender and age, and thus all have been omitted from the following analyses. [00164] Medical Decision-making Test (MDMT). As described earlier, the test consisted of 42 items asking participants to diagnose and treat people potentially infected with two fictitious illnesses. After seeing an example question (Fig. 3), participants had ten minutes to learn from three tables, presented on a single A4 page, how eight symptoms were associated with three illness states: puneria, zymosis, and paralymphnal free. 
Each table presented the symptoms experienced by 20 patients for whom their illness state was known. Once ten minutes had elapsed, tables were removed and participants progressed to the test phase. [00165] For the test phase, participants completed all test profiles in a randomised order. No feedback was provided, and each test profile was completed in the same way depicted in the example (Fig. 3).
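The decision tendency variables described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's scoring code: the class and function names are hypothetical, and the mapping of the outcome cells (A = ideal, B = wasteful, C = fatal, D = adequate) is an assumption inferred from the prose definitions of each variable.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class OutcomeCounts:
    """Per-participant outcome frequencies (cell labels are assumed)."""
    ideal: int     # A: correct diagnosis, acted on (congruent decision)
    wasteful: int  # B: correct diagnosis, blood test requested
    fatal: int     # C: incorrect diagnosis, acted on (congruent decision)
    adequate: int  # D: incorrect diagnosis, blood test requested


def decision_tendencies(c: OutcomeCounts) -> Dict[str, Optional[float]]:
    """Return the five decision tendency proportions (each 0 to 1)."""
    total = c.ideal + c.wasteful + c.fatal + c.adequate
    correct = c.ideal + c.wasteful      # correct diagnoses
    incorrect = c.fatal + c.adequate    # incorrect diagnoses

    def ratio(num: int, den: int) -> Optional[float]:
        # Undefined (None) when the denominator is zero, e.g. a
        # participant with no incorrect diagnoses has no fatal tendency.
        return num / den if den else None

    return {
        "ideal": ratio(c.ideal, total),
        "adequate": ratio(c.ideal + c.adequate, total),
        "wasteful": ratio(c.wasteful, correct),
        "fatal": ratio(c.fatal, incorrect),
        "congruent": ratio(c.ideal + c.fatal, total),
    }


# Hypothetical participant who saw all 42 test profiles.
example = decision_tendencies(
    OutcomeCounts(ideal=21, wasteful=9, fatal=6, adequate=6)
)
```

Returning `None` for an undefined ratio is a design choice of this sketch; it would account for a variable such as fatal decision tendencies having fewer valid cases than the others.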
[00166] The following variables were calculated from the diagnosis and diagnostic confidence: diagnostic accuracy, confidence, bias, and CAQ. The five decision tendency variables were calculated from diagnostic accuracy and the final decision (as described in Table X2).

[00167] Raven's Advanced Progressive Matrices (RPM; Raven, 1938-65). This test included 20 items and was programmed using Macromedia Authorware 7. Each item presented a 3 x 3 display of abstract figures following a particular pattern both horizontally and vertically. The bottom right figure is left blank, and participants were required to choose which of 8 alternative figures would complete the display. After each question, participants indicated their confidence levels by selecting a value from 10% to 100%, in 10% increments. RPM is a gold-standard measure of fluid intelligence (Gf), and has been shown to possess good internal alpha reliability estimates, typically greater than .80 (Raven, 1938-65).

[00168] Esoteric Analogies Test (EAT; Stankov, 1997). This test required participants to complete 24 verbal analogies, by selecting one of four alternative words that share the same relationship with a target word as that of an original pair. For example, FIRE is to HOT as ICE is to: POLE, COLD*, CREAM, or WHITE. After each question, participants indicated how confident they were that their answer was correct by typing some value between 25% and 100%. This test requires both fluid and crystallised intelligence (Gf and Gc), and has been shown to possess internal alpha reliability estimates acceptable for research purposes of .66 to .76 (Kleitman, 2008; Kleitman & Stankov, 2007; Want & Kleitman, 2006).

[00169] Vocabulary Test (VT; Stankov, 1997). This test involved completing 18 items in which participants must select which one of five words or short phrases has the same meaning as a target word. For example, FEIGN is to: PRETEND*, PREFER, WEAR, BE CAUTIOUS, SURRENDER.
After each item, participants indicated their confidence that their answer was correct by typing some value between 20% and 100%. This test is a distinct marker of Gc, and has been shown to possess internal alpha reliability estimates acceptable for research purposes, ranging from .67 to .81 (Kleitman, 2008; Stankov & Crawford, 1997). Each participant had the following variables computed for all three cognitive tests: test accuracy, confidence, bias and CAQ.

[00170] Procedure: In groups of up to 10, participants received instructions upon arrival and completed basic demographic and English proficiency questions. The MDMT was then completed prior to the cognitive tests and questionnaires, which were counterbalanced. This ensured that confidence ratings in the MDMT were not influenced by performance or exposure to these ratings in the cognitive tests. Testing was self-paced and completed in approximately 60 to 90 minutes. Participants were encouraged to take breaks, provided with light refreshments, and thanked and debriefed upon completion.

Results: Preliminary Analysis

[00171] Missing values analysis. Other than the RPM, which had 10% of its data missing, no more than 5% of data was missing for any other variable of interest. All missing values were the result of software errors, and treated in a pairwise fashion for all forthcoming analyses.

[00172] Descriptive statistics and reliabilities. Descriptive statistics and Cronbach's alpha reliability estimates (where applicable) for all variables are presented in Table 9 below:

Descriptive Statistics and Reliabilities

| Variable | n | M | SD | α | Range |
|---|---|---|---|---|---|
| Decision Tendencies | | | | | |
| Optimal (ideal) | 186 | 0.51 | [illegible] | [illegible] | [illegible] |
| Realistic (adequate) | 186 | 0.62 | 0.13 | 0.74 | .24-.90 |
| Incompetent (fatal) | 181 | 0.50 | 0.29 | 0.83 | .00-1.00 |
| Hesitant (wasteful) | 186 | 0.32 | 0.18 | 0.81 | .03-[illegible] |
| Congruent | 186 | 0.65 | [illegible] | [illegible] | .12-.95 |
| Accuracy (% correct) | | | | | |
| MDMT | 186 | 75.67 | 15.90 | 0.86 | 26-100 |
| APM | 173 | 59.71 | [illegible] | [illegible] | 5-100 |
| EAT | 192 | 70.[illegible] | 14.93 | 0.70 | 17-100 |
| VT | 193 | 49.28 | 14.90 | 0.65 | 11-89 |
| Confidence (Average) | | | | | |
| MDMT | 186 | 70.18 | [illegible] | 0.93 | [illegible] |
| APM | 173 | 72.03 | 14.50 | 0.93 | 29-99 |
| EAT | 192 | 77.08 | 10.63 | 0.87 | 50-100 |
| VT | 193 | 63.02 | 13.74 | 0.89 | 20-98 |
| Bias | | | | | |
| MDMT | 186 | -0.05 | 0.16 | 0.86 | (-.43)-.49 |
| APM | 173 | 0.12 | 0.15 | [illegible] | [illegible] |
| EAT | 192 | 0.06 | 0.15 | 0.65 | (-.23)-.59 |
| VT | 193 | 0.14 | 0.12 | 0.38 | [illegible] |
| Discrimination (CAQ) | | | | | |
| MDMT | [illegible] | 0.52 | 0.62 | n/a | (-1.12)-[illegible] |
| APM | 172 | 1.22 | 0.62 | n/a | (-.43)-4.38 |
| EAT | [illegible] | 0.99 | 0.60 | n/a | (-.55)-3.77 |
| VT | 193 | 1.13 | 0.39 | n/a | (-.46)-1.82 |

Note. MDMT = Medical Decision-making Test; APM = Raven's Advanced Progressive Matrices; EAT = Esoteric Analogies Test; VT = Vocabulary Test. For decision variables, each item was scored as 1 if the outcome was in the numerator of the variable's equation, and 0 otherwise. Cells marked [illegible] could not be recovered from the source.

Table 9

[00173] The pattern of results for the decision variables suggested that participants generally demonstrated reasonable decision-making ability. The means of ideal and adequate decision tendencies were .51 and .62 (out of 1) respectively. On average, participants were therefore able to accurately diagnose and treat over half of the patients, and to appropriately test a further 11% of patients following a misdiagnosis. Moreover, participants were generally more likely to treat than test, with a mean of .65 for congruent decision tendencies. This was also evident in the higher mean of fatal than wasteful decision tendencies. Indeed, individuals killed 50% of their misdiagnosed patients by treating them, on average. Of note were the high reliability estimates calculated for all decision-making variables or their key components with respect to wasteful and fatal decision tendencies. This suggested that there were strong within-test consistencies in decision-making tendencies.
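The reliability estimates discussed above are Cronbach's alpha coefficients. As a minimal sketch, assuming the standard formula and a participants-by-items score matrix (for a decision variable, each item scored 1 or 0 as described in the note to Table 9), alpha could be computed as follows. The function name and the toy data are hypothetical.

```python
import numpy as np


def cronbach_alpha(items) -> float:
    """Cronbach's alpha for a (participants x items) score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_variances = items.var(axis=0, ddof=1)  # sample variance per item
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)


# Toy data: 4 participants x 3 binary-scored items (illustrative only).
scores = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
])
alpha = cronbach_alpha(scores)
```

The sketch assumes complete data; the pairwise treatment of missing values described in paragraph [00171] is not reproduced here.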
[00174] Mean accuracy scores were highest for the MDMT, followed by the EAT, RPM, and the VT, and their Cronbach's alpha reliability estimates were in an acceptable range. Mean confidence ratings for the cognitive tests were comparable with those in the studies cited above. Furthermore, mean MDMT diagnostic confidence fell within this range. Consistent with previous research, reliability estimates for all confidence ratings were high (see Stankov & Kleitman, 2008 for a review).

[00175] Also consistent with this research, a slight overconfidence bias was evident in the ability tests. Mean MDMT diagnostic bias was close to perfect calibration, but in the underconfidence region. However, this is to be anticipated when test accuracy approaches 80% (Lichtenstein & Fischhoff, 1977), as was the case for MDMT diagnostic accuracy. With the exception of the VT, internal reliability estimates of the bias scores were satisfactory. The low VT estimate is undoubtedly linked to its low accuracy reliability estimate (Kaplan & Saccuzzo, 2005). Again in support of its future utility, the MDMT yielded the greatest bias reliability estimate.

[00176] All mean CAQ scores were greater than 0, indicating that participants appropriately adjusted their confidence between correct and incorrect answers. CAQ for the cognitive tests converged around 1, but was lower for the MDMT, indicating that participants found it somewhat more difficult to discriminate between correct and incorrect answers in this test.

[00177] Additional analyses. To ensure task order did not have an effect, an analysis of variance was conducted on each variable comparing differences across counterbalanced conditions. No significant differences emerged. Furthermore, to allow for an analysis of the MDMT that combines the results across all test profiles (puneria, zymosis, and paralymphnal free), diagnosis and decision preferences were examined.
Consistent with the high reliability estimates for the derived scores, nothing of concern emerged and a combined analysis was considered appropriate.
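The calibration indices reported above can be illustrated with a short sketch. The definitions used here are assumptions, since this section does not restate them: bias is taken as mean confidence minus proportion correct (negative values indicating underconfidence), and CAQ as the difference between mean confidence on correct and on incorrect answers, divided by the standard deviation of all confidence ratings. The function name and example data are hypothetical.

```python
import numpy as np


def calibration_indices(confidence_pct, correct):
    """Accuracy, confidence, bias and CAQ for one participant on one test.

    confidence_pct: per-item confidence ratings in percent (0-100).
    correct: per-item flags, truthy where the answer was correct.
    """
    conf = np.asarray(confidence_pct, dtype=float) / 100.0
    hit = np.asarray(correct, dtype=bool)

    accuracy = hit.mean()
    mean_conf = conf.mean()
    bias = mean_conf - accuracy  # assumed definition: > 0 is overconfidence

    sd = conf.std(ddof=1)
    if hit.all() or (~hit).all() or sd == 0:
        caq = None  # undefined without both answer types or without spread
    else:
        # Assumed definition: > 0 means higher confidence when correct.
        caq = (conf[hit].mean() - conf[~hit].mean()) / sd

    return {"accuracy": accuracy, "confidence": mean_conf,
            "bias": bias, "caq": caq}


# Hypothetical four-item example.
indices = calibration_indices([90, 80, 60, 50], [1, 1, 0, 1])
```

Under these assumed definitions, a positive CAQ corresponds to the pattern described in paragraph [00176], where confidence is higher for correct than for incorrect answers.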
[00178] Generality: Exploratory Factor Analyses

[00179] Confidence. Table 10 summarises the correlation coefficients and the results of an Exploratory Factor Analysis (EFA; Principal Components [PC] with PROMAX-rotation), constrained to two factors, performed on accuracy and confidence scores.

Confidence and Accuracy Intercorrelations and EFA results (columns 2 to 8: Pearson r correlations; Factor 1, Factor 2, h²: factor loadings and communality)

| Score | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Factor 1 | Factor 2 | h² |
|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | | | | | | | | | | |
| 1 MDMT | .31** | .39* | .25** | .23** | .13 | .08 | .13 | .51 | -.04 | .24 |
| 2 RPM | | .51* | .36** | .26 | [illegible] | .35** | [illegible] | .31 | .53 | .54 |
| 3 EAT | | | .58* | .15* | .29** | .39* | .46** | .84 | -.00 | .70 |
| 4 VT | | | | .05 | .16* | .37* | .63** | .96 | -.20 | .78 |
| Confidence | | | | | | | | | | |
| 5 MDMT | | | | | .43* | .36** | .28* | -.25 | .82 | .55 |
| 6 RPM | | | | | | .57** | .43** | -.08 | .93 | .80 |
| 7 EAT | | | | | | | .52 | [illegible] | .64 | .60 |
| 8 VT | | | | | | | | .58 | .31 | .60 |

Note. Factor loadings > .30 are in boldface in the original. h² = communality; MDMT = Medical Decision-making Test; RPM = Raven's Advanced Progressive Matrices; EAT = Esoteric Analogies Test; VT = Vocabulary Test. Cells marked [illegible] could not be recovered from the source, and significance markers are given as far as they are legible. *p < .05. **p < .01. ***p < .001.

Table 10

[00180] These results were in support of broad Ability and Confidence factors. Accuracy scores were significantly and positively correlated with each other, as were confidence scores. Despite particularly strong correlations between the RPM and VT accuracy and confidence scores, the remaining correlations between the accuracy and confidence scores were positive, but weaker in general. In support of the MDMT's divergence from typical cognitive tests, diagnostic accuracy was not significantly correlated with any ability test confidence scores.

[00181] The EFA (PC) provided further support, with the two factors explaining 60.04% of the common variance. Communalities for all scores were high, except for a low MDMT accuracy communality, again indicative of the test's divergence from the typical cognitive tests. Notably, MDMT diagnostic confidence did not demonstrate such divergence. Using .30 as the cut-off criterion for a meaningful factor loading, Factor 1 was defined by all of the accuracy scores, as well as a meaningful loading from VT confidence.
In support of the postulated hypothesis, Factor 2 was defined by each of the confidence scores, but also a considerable loading from RPM accuracy. As expected, the MDMT diagnostic confidence had a high loading on this factor, and the two factors were positively correlated with each other (r = .47). Thus, even with the two cross-loadings, these results were in support of two broad and distinct, but related, factors: General Cognitive and Diagnostic Ability, and Confidence.

[00182] Calibration. Table 11 summarises the relevant correlation coefficients and results of two EFAs (PC) performed on (1) the bias scores, and (2) the CAQ scores.

Bias and CAQ Intercorrelations and EFA results (columns 1 to 4: Pearson r correlations)

| Score | 1 | 2 | 3 | 4 | Bias: factor loading | Bias: h² | CAQ: factor loading | CAQ: h² |
|---|---|---|---|---|---|---|---|---|
| 1 MDMT | - | .34** | .39** | .29** | .68 | .44 | .64 | .40 |
| 2 RPM | [illegible] | - | [illegible] | .41** | .77 | .59 | .54 | .31 |
| 3 EAT | .17* | .18 | - | .40** | [illegible] | .62 | .64 | .42 |
| 4 VT | .20** | [illegible] | .20** | - | .69 | .51 | .63 | .39 |

Note. Correlations above the diagonal are between bias scores. Correlations below the diagonal are between CAQ scores. Factor loadings > .30 are in boldface in the original. h² = communality; MDMT = Medical Decision-making Test; RPM = Raven's Advanced Progressive Matrices; EAT = Esoteric Analogies Test; VT = Vocabulary Test. Cells marked [illegible] could not be recovered from the source, and significance markers are given as far as they are legible. *p < .05. **p < .01. ***p < .001.

Table 11

[00183] The results were in support of broad Bias and Discrimination factors. All correlations between bias scores (above the diagonal) were positive and significant. The same was evident for all CAQ intercorrelations but one, between RPM and VT (below the diagonal). Submitting each set of scores to EFA (PC) clearly revealed single factors explaining 54.01% and 37.87% of the common variance for bias and discrimination respectively. Factor loadings and communalities were all moderately high for both calibration indices.

Predictive validity: Regression Analyses

[00184] Predictive validity was investigated via relevant sets of multiple regression analyses.
Set one investigated the Confidence hypothesis by regressing each decision variable in a hierarchical fashion on (1) diagnostic accuracy, (2) general cognitive ability (Intelligence; computed as the mean of the three cognitive ability tests) and (3) diagnostic confidence. Set two investigated the Bias hypothesis by regressing each decision variable on diagnostic bias. Ideal and adequate decision tendencies were then regressed on the square of bias in a subsequent step to investigate the non-linear component: whether the tendency to minimise decision errors would diminish with increasing bias in the form of a quadratic trend. Set three investigated the Discrimination hypothesis by regressing each decision variable on diagnostic CAQ. It was not possible to include diagnostic accuracy or Intelligence in the calibration sets due to these indices being derived from accuracy scores. To reiterate, all analyses were conducted controlling for personality, need for closure, gender, and age, none of which contributed statistically significantly to the results, and which have been omitted here. The results of these analyses can be seen in Table 12 below:

Multiple Regression Analyses of Decision Tendencies on Metacognitive Indices (MDMT decision tendencies)

| Set | Predictor | Ideal ΔR² | Ideal β | Adequate ΔR² | Adequate β | Wasteful ΔR² | Wasteful β | Fatal ΔR² | Fatal β | Congruent ΔR² | Congruent β |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Confidence | Diagnostic accuracy | .38* | .62 | [illegible] | .31 | [illegible] | [illegible] | [illegible] | -.17 | .01 | -.09 |
| Confidence | Intelligence | .02* | .14 | .03 | .17 | .03** | -.19 | .00 | [illegible] | .02* | .16 |
| Confidence | Diagnostic confidence | .10*** | .33 | .08*** | .28 | .17*** | -.43 | .09*** | .32 | .19*** | .44 |
| Bias | Diagnostic bias | .12** | -.35 | [illegible] | [illegible] | .04* | -.21 | .09** | .39 | .03* | .16 |
| Bias | Diagnostic bias² | .03* | -.18 | .06*** | -.26 | - | - | - | - | - | - |
| CAQ | Diagnostic CAQ | .06* | .25 | .12** | .35 | .02* | [illegible] | .43*** | [illegible] | .00 | -.01 |

Note. Predictors in each model set are entered as steps in the same order as displayed. Cells marked [illegible] could not be recovered from the source, and significance markers are given as far as they are legible.

Table 12

[00185] Confidence.
In line with the Confidence hypothesis, an increase in diagnostic confidence predicted a statistically significant incremental increase in congruent and fatal decision tendencies (19% and 9% respectively) and a decrease in wasteful tendencies (17%), over and above diagnostic accuracy and Intelligence. Somewhat unexpectedly, diagnostic confidence also predicted a statistically significant incremental increase in ideal and adequate decision tendencies (10% and 8%).

[00186] Bias. In line with the Bias hypothesis (linear trend), increasing bias predicted a statistically significant increase in congruent and fatal decision tendencies, and a decrease in wasteful decision tendencies (3%, 9% and 4% respectively). Unexpectedly, increasing bias also predicted a significant decrease in ideal tendencies (12%).

[00187] As predicted by the magnitude component of the Bias hypothesis (quadratic trend), the square of diagnostic bias incrementally and negatively predicted ideal and adequate decision tendencies (3% and 6% respectively). These regression equations were further examined to determine whether these relationships were described by functions turning on perfect calibration (bias = 0), as hypothesised (see Cohen, Cohen, West, & Aiken, 2003, for more details). While adequate decision tendencies were greatest only slightly below perfect calibration (bias = -.07), ideal decision tendencies were greatest considerably below perfect calibration (bias = -.25). Fig. 3 illustrates a plot of the curves described by these regression equations, within the range of observed bias.

[00188] Discrimination. In line with the Discrimination hypothesis, an increase in diagnostic CAQ predicted a statistically significant increase in both optimal decision tendencies (ideal: 6%; adequate: 12%) and a decrease in both error decision tendencies (wasteful: 2%; fatal: 43%).
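The quadratic step of the bias analysis can be sketched as follows: fit a linear model, add the squared bias term, and report the increment in explained variance together with the bias value at which the fitted curve turns, which is -b1 / (2 * b2) for a fit of the form b0 + b1 * bias + b2 * bias². This is a simplified sketch using ordinary least squares in NumPy. It omits the control variables used in the study, the function names are hypothetical, and the data are synthetic and chosen only to make the turning point easy to verify.

```python
import numpy as np


def ols_r_squared(predictors, y):
    """Fit OLS with an intercept; return (R^2, coefficients)."""
    X = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    ss_total = ((y - y.mean()) ** 2).sum()
    return 1.0 - (residuals @ residuals) / ss_total, coef


def quadratic_bias_step(bias, tendency):
    """Increment in R^2 from adding bias^2, and the curve's turning point."""
    bias = np.asarray(bias, dtype=float)
    tendency = np.asarray(tendency, dtype=float)

    r2_linear, _ = ols_r_squared(bias[:, None], tendency)
    r2_quadratic, coef = ols_r_squared(
        np.column_stack([bias, bias ** 2]), tendency
    )
    _, b1, b2 = coef
    turning_point = -b1 / (2.0 * b2) if b2 != 0 else None
    return {
        "delta_r2": r2_quadratic - r2_linear,
        "b2": b2,  # negative: tendency falls away on both sides of the peak
        "turning_point": turning_point,
    }


# Synthetic data (not from the study): a tendency peaking at bias = -0.07.
bias_values = np.linspace(-0.4, 0.5, 19)
tendency_values = 0.7 - 2.0 * (bias_values + 0.07) ** 2
step = quadratic_bias_step(bias_values, tendency_values)
```

A negative coefficient on the squared term with a turning point near zero would correspond to the hypothesised pattern, in which decision quality declines as bias moves away from perfect calibration in either direction.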
[00189] Metacognitive judgement confidence and its calibration are considered important underlying psychological constructs of decision-making. In the present analysis we examined the generality of confidence and its two key calibration indices (bias and discrimination) across decision-making and cognitive domains (metacognitive generality hypothesis). Moreover, until now, insufficient attention has been paid to the step-by-step decision-making process with a focus on confidence judgements. This has prevented consideration of decision-making tendencies that highlight the varying merit of decision outcomes. Individual differences in these constructs, however, may exist and underpin habitual decision-making behaviour. Thus, the present analysis sought evidence for the existence and consistency of these tendencies utilising the novel MDMT. We also aimed to examine the predictive validity of metacognitive constructs of confidence and its calibration on the decision-making behaviours.

[00190] Metacognitive Generality

[00191] Given robust and replicable findings within the cognitive domain (e.g. Kleitman, et al., 2011; Mengelkamp & Bannert, 2010; Pallier, et al., 2002; Schraw, et al., 1995; Stankov & Lee, 2008), it was hypothesised that indices of metacognitive confidence and its calibration would converge across cognitive and decision-making domains. Generality was assessed using EFA, and strong to moderate evidence was found for each index.

[00192] Confidence, bias, and CAQ scores clearly converged on three respective factors. Regarding confidence, despite two cross-loadings (undoubtedly resulting from the use of only one 'pure' marker for each cognitive domain; see Carroll, 1993 for a review), the results were in support of two distinct albeit related factors: general Confidence and Ability. Bias and CAQ scores clearly converged on a single Bias and Discrimination factor, respectively. However, aligning with the results of Schraw et al.
(1995), support for a broad Discrimination factor was noticeably weaker. Importantly, all three indices derived from the MDMT clearly converged on their intended factors.

[00193] These results provide support for some important implications: (i) Individuals who are more confident in the accuracy of their answers in typical cognitive tests tend to be more confident in their judgments within a typical decision-making context; (ii) Relative to others, individuals more over/underconfident in the accuracy of their answers in cognitive tests tend to be respectively over/underconfident in the accuracy of their judgements prior to decision-making; (iii) Individuals who better discriminate between correct and incorrect answers in cognitive tests tend to better discriminate between correct and incorrect judgments prior to making decisions. However, the extent of this third overlap clearly requires further scrutiny. Collectively, these results support the hypothesis that stable individual differences in metacognitive confidence and its calibration generalise beyond the cognitive domain to decision-making under conditions of subjective uncertainty.

Decision-making tendencies

[00194] Several modifications to McKenzie's (1998) original test were necessarily introduced to provide insight into patterns of decision-making behaviour and the relationships between decision judgements, metacognitive confidence, and decision-making tendencies. Our results provided strong support that the MDMT is a useful research tool that captures a variety of novel decision-making variables, which are both theoretically and psychometrically sound. Specifically, ideal and adequate tendencies captured individuals' ability to make optimal decisions. Wasteful and fatal decision tendencies captured individual tendencies to make incongruent or congruent decision errors. Finally, congruent tendencies captured individual tendencies to make decisions aligned with their judgements.
These variables capture real-life aspects of decision-making, which in this study were not predicted by popular constructs of personality and cognitive styles. Importantly, all MDMT derived indices, including the decision variables, demonstrated very high internal consistency. [00195] Furthermore, the accuracy of the test diverged from the typical cognitive tests, providing evidence for its divergent validity. Thus, the MDMT did not simply measure another cognitive ability, and yet the relative confidence and calibration scores clearly converged across domains. To our knowledge, the MDMT is the first test to explicitly and reliably capture individual differences in the link between judgements, metacognitive confidence and decision-making tendencies. The MDMT can guide future development of valid and reliable decision-making tests. The test framework (with minor modifications if required) is applicable to real world decision-making scenarios within and outside of the medical paradigm.
Predictive Validity

[00196] Another aim of the present study was to investigate the predictive validity of these metacognitive indices on decision-making tendencies in the manner predicted by the Confidence, Bias, and Discrimination hypotheses. Support existed for each hypothesis.

[00197] Confidence. The Confidence hypothesis predicts that the greater a person's confidence in the appraisal of their judgement accuracy, the more likely they are to engage in the congruent rather than incongruent decision-making act (e.g., DeMarree & Petty, 2007; Slovic, Fischhoff & Lichtenstein, 1977). The results of the hierarchical regression analyses strongly supported this hypothesis. As expected, individuals more confident in their diagnoses tended to make more congruent decisions overall, more fatal errors following incorrect diagnoses, and fewer wasteful errors following correct diagnoses. Additionally, more confident individuals made more ideal decisions and fewer decision errors overall (indexed by adequate decision tendencies). This might be because confidence is important in terms of optimal decision-making, at least as each is measured here. Alternatively, mean diagnostic accuracy was high (70%) and thus decisions generally followed correct diagnoses. Congruent decision-making would hence generally lead to improved outcomes, improving ideal and adequate decision tendencies. Either way, these results were demonstrated incrementally over diagnostic accuracy, personality, cognitive styles, gender, age, and even general cognitive ability, which itself is a powerful predictor of real-life outcomes (e.g. Hunter, 1986). This indicates that metacognitive confidence, a construct typically ignored in favour of Intelligence and/or other popular measures, is an important psychological construct to be included in a study of the decision-making process.

[00198] Bias.
The Bias hypothesis further predicts that as over- or underconfidence increases, the tendency to minimise decision errors resulting from congruent or incongruent error tendencies, respectively, will diminish in the form of a quadratic trend. The bias regression analyses supported this prediction. Increasingly overconfident (or decreasingly underconfident) individuals tended to commit more congruent decisions overall, more fatal errors, and fewer wasteful errors in a linear fashion. Furthermore, the frequency of decision errors overall (indexed by adequate decision tendencies) diminished as bias approached near perfect calibration in a quadratic fashion. However, increasing overconfidence was unexpectedly associated with making fewer decisions resulting in ideal outcomes. This was most likely the result of strong correlations between diagnostic accuracy and ideal tendencies (r = .62; p < .01), and diagnostic accuracy and diagnostic bias (r = -.85; p < .01). Both relationships were expected: the former the result of ideal tendencies being contingent on the frequency of correct diagnoses; the latter describing a persistent finding known as the hard-easy effect (Juslin, Winman, & Olsson, 2000; Lichtenstein & Fischhoff, 1977). The ideal decision tendencies were greatest considerably below perfect calibration. With this exception in mind, these results were in support of the Bias hypothesis.

[00199] However, these results call into question the utility of the bias score relative to its constituent components of confidence and accuracy, in this study. The proportion of variance in the decision variables accounted for by bias was considerably lower than that accounted for by accuracy and confidence. Even the novel predictions described by the quadratic trends had very little to add in this respect. However, these results may be due again to a restriction in diagnostic accuracy.

[00200] Discrimination.
The Discrimination hypothesis predicted that better discrimination between correct and incorrect judgements should lead to more optimal decision-making and a reduction in decision errors. The regression analyses provided support for this hypothesis. Better discriminators tended to commit more ideal decisions and fewer decision errors overall, resulting from fewer fatal and wasteful decision errors. Additionally, discrimination accounted for a far greater proportion of the variance in the primary error tendency, fatal tendencies (43%), than any other variable. This high percentage may be due to the fact that fatal tendencies are based on incorrect diagnoses only, and people low on the discrimination index had obvious difficulties discriminating between their correct and incorrect answers. Overall, these results provide support that discrimination, indexed by CAQ, is an important construct in terms of optimal decision-making and a useful individual differences measure of calibration.

[00201] A limitation involves the utilisation of only three distinct cognitive markers. Other than those already discussed, a further consequence of this was that the role of general cognitive ability (Intelligence), undeniably crucial to optimal decision-making in general, might not have been appropriately represented. Administering a broader selection of tests would be necessary to appropriately consider a fuller range of cognitive abilities. This would also allow for the use of confirmatory factor analytic techniques to more stringently examine the expected structure of the metacognitive factors across tests.

[00202] Additional studies into a variety of psychometrically sound individual tendencies might yield great benefit. MDMT-like paradigms could help examine such equivalence by establishing the point between and within individuals at which the switch from incongruent to congruent decision-making occurs.
Furthermore, a large collection of variables purported to underlie individual differences in decision-making presently exist and continue to emerge (see Appelt, Milch, Handgraaf, & Weber, 2011 for a review). It may be of future benefit to examine how these measures relate to confidence and with each other.
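As an illustration of the bias and discrimination measures discussed above, the following minimal Python sketch computes both from a set of confidence ratings and correctness flags. It assumes the conventional definitions, which are not stated in this passage: bias as mean confidence minus percentage accuracy, and discrimination as mean confidence on correct judgements minus mean confidence on incorrect judgements. The CAQ index named above may scale this difference; that scaling is not reproduced here. The function name and the sample ratings are hypothetical.

```python
from statistics import mean


def calibration_indices(confidence, correct):
    """Return accuracy, bias and discrimination for one participant.

    confidence: confidence ratings as percentages (0-100), one per item.
    correct:    booleans, True where the judgement was correct.
    Definitions are assumed, not taken from the specification.
    """
    if not confidence or len(confidence) != len(correct):
        raise ValueError("confidence and correct must be non-empty and equal in length")

    # Percentage of items answered correctly.
    accuracy = 100.0 * sum(correct) / len(correct)

    # Positive bias suggests overconfidence, negative bias underconfidence.
    bias = mean(confidence) - accuracy

    conf_correct = [c for c, ok in zip(confidence, correct) if ok]
    conf_incorrect = [c for c, ok in zip(confidence, correct) if not ok]

    # Discrimination is undefined if every answer was right or every answer wrong.
    if conf_correct and conf_incorrect:
        discrimination = mean(conf_correct) - mean(conf_incorrect)
    else:
        discrimination = None

    return {"accuracy": accuracy, "bias": bias, "discrimination": discrimination}


# Made-up ratings for a single hypothetical participant.
print(calibration_indices([90, 80, 60, 50], [True, True, False, False]))
```

Under these assumed definitions, a participant who is more confident when correct than when incorrect receives a positive discrimination score regardless of whether the overall bias is positive or negative, which is why the two indices can carry different information.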
[00203] Implications and conclusions
[00204] To this end, the present study provides a number of important implications for metacognitive and decision-making research. On a first level, this research extended knowledge about the generality and predictive validity of metacognitive judgement confidence and its calibration. Our results provide strong support that individual differences in metacognitive confidence and its calibration, so consistently observed within the cognitive domain, generalise to decision-making under conditions of subjective uncertainty. Furthermore, the present results revealed reliable decision-making tendencies, and that these tendencies share meaningful predictive relations with confidence and its calibration.
[00205] Regarding applied practice, it is clear that judgement confidence and its derived measures should be collected as important predictors of decision-making behaviour. For example, confidence estimates acquired within a practical medical scenario could provide useful insight into doctors' decision-making tendencies and ability. This could help train doctors where they demonstrate serious metacognitive deficits. Furthermore, confidence estimates may be a desirable predictor in high-stakes situations as good calibration may be particularly difficult to fake. The ease with which confidence estimates are collected could see their utility being effectively implemented across many domains: industrial/organisational, educational, clinical, financial, military, and political, to name a few.
[00206] Two further applications present themselves. Firstly, it is possible that interventions designed to develop and improve confidence and its calibration could have significant impacts on decision-making. Implementing strategies designed to develop diagnostic confidence and its calibration at an education level can lead to better future clinicians.
Secondly, given the generality observed here, confidence judgements acquired in any domain might still provide some useful indication of decision-making tendencies in another. Confidence estimates could be easily collected to help predict decision-making tendencies where cognitive tests are routinely administered, such as pre-employment selection.
[00207] We sought to investigate the role of confidence in decision-making by capturing a more complete perspective of the decision-making process. In doing so, our results indicated that reliable individual differences in decision-making tendencies exist and that confidence shares a significant and meaningful relationship with them. The implications of these results extend beyond the theoretical realm to possibly improving prediction and intervention strategies wherever decision-making is of interest.
Interpretation
[00208] Reference throughout this specification to "one embodiment", "some embodiments" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment", "in some embodiments" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to one of ordinary skill in the art from this disclosure, in one or more embodiments.
[00209] As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
[00210] In the claims below and the description herein, any one of the terms comprising, comprised of or which comprises is an open term that means including at least the elements/features that follow, but not excluding others. Thus, the term comprising, when used in the claims, should not be interpreted as being limitative to the means or elements or steps listed thereafter. For example, the scope of the expression a device comprising A and B should not be limited to devices consisting only of elements A and B. Any one of the terms including or which includes or that includes as used herein is also an open term that also means including at least the elements/features that follow the term, but not excluding others. Thus, including is synonymous with and means comprising.
[00211] As used herein, the term "exemplary" is used in the sense of providing examples, as opposed to indicating quality. That is, an "exemplary embodiment" is an embodiment provided as an example, as opposed to necessarily being an embodiment of exemplary quality.
[00212] It should be appreciated that in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, FIG., or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this invention.
[00213] Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention, and form different embodiments, as would be understood by those skilled in the art. For example, in the following claims, any of the claimed embodiments can be used in any combination.
[00214] Furthermore, some of the embodiments are described herein as a method or combination of elements of a method that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor with the necessary instructions for carrying out such a method or element of a method forms a means for carrying out the method or element of a method. Furthermore, an element described herein of an apparatus embodiment is an example of a means for carrying out the function performed by the element for the purpose of carrying out the invention.
[00215] In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
[00216] Similarly, it is to be noticed that the term coupled, when used in the claims, should not be interpreted as being limited to direct connections only. The terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Thus, the scope of the expression a device A coupled to a device B should not be limited to devices or systems wherein an output of device A is directly connected to an input of device B.
It means that there exists a path between an output of A and an input of B which may be a path including other devices or means. "Coupled" may mean that two or more elements are either in direct physical or electrical contact, or that two or more elements are not in direct contact with each other but yet still co-operate or interact with each other.
[00217] Thus, while there has been described what are believed to be the preferred embodiments of the invention, those skilled in the art will recognize that other and further modifications may be made thereto without departing from the spirit of the invention, and it is intended to claim all such changes and modifications as falling within the scope of the invention. For example, any formulas given above are merely representative of procedures that may be used. Functionality may be added or deleted from the block diagrams and operations may be interchanged among functional blocks. Steps may be added or deleted to methods described within the scope of the present invention.

Claims (8)

1. A method of predicting a likely decision outcome for future decisions for an individual person, the method including the steps of: (a) collecting a series of candidate predicted decision outcomes for the individual person and a corresponding relative confidence level measure for the individual person in making the candidate predicted decision; (b) correlating the actual decision outcome with the candidate predicted decision outcome and the corresponding relative confidence level measure for the individual person; and (c) where a positive correlation exists, utilising the confidence measure as a predictor of the decision outcome for future decisions of the individual.
2. A method as claimed in claim 1 further comprising the steps of: (d) for each of a series of individuals within a group, measuring the individual's subjective confidence level in their relative confidence level measures and corresponding outcomes; (e) determining an average across the individuals within a group; (f) determining a relative average of the confidence of the decision outcomes for the individual person, relative to the group; (g) adjusting the individual's subjective confidence level measure by an amount proportional to the difference between the group average and the relative average.
3. A method of predicting a current decision outcome, the method including the steps of: (a) for a large collection of decision participants, measuring prior decision outcome predictions and a current estimated confidence level for the decision participant; (b) utilising the collection of outcome predictions and estimated confidence level to form a regression analysis predictor of the certainty of actual outcome; (c) for a current decision, utilising the regression analysis predictor to predict the current decision outcome based on a participant's predicted outcome and current confidence level of outcome.
4. A method as claimed in claim 3 wherein, in said step (b), said regression analysis predictor includes adjustment of the estimated confidence level of participants to account for systemic errors in confidence level predictions relative to the collection of decision participants.
5. A method as claimed in claim 4 wherein said systemic errors are also relative to the participant's previous decisions.
6. A method as claimed in any one of claims 3 to 5 wherein said decision has a binary outcome and said regression analysis predictor is formed utilising binary logistic regression of the prior decision outcome predictions and the current estimated confidence levels for decision participants.
7. A method as claimed in any one of claims 3 to 5 wherein said regression analysis predictor is formed utilising multinomial logistic regression of the prior decision outcome predictions and the current estimated confidence levels for decision participants.
8. A system for predicting a likely decision outcome for future decisions for an individual person, the system including: collection means for collecting a series of candidate predicted decision outcomes for the individual person and a corresponding relative confidence level measure for the individual person in making the candidate predicted decision; correlating means for correlating the actual decision outcome with the candidate predicted decision outcome and the corresponding relative confidence level measure for the individual person; and where a positive correlation exists, utilising the confidence measure as a predictor of the decision outcome for future decisions of the individual.
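
The regression analysis predictor of claims 3 and 6 can be illustrated with a short Python sketch. It fits a binary logistic regression to prior predictions and confidence levels, then estimates the probability of the actual outcome for a current decision. This is an illustration under stated assumptions and not the claimed implementation: the feature encoding (a signed prediction and that sign multiplied by confidence), the gradient-descent fitting procedure, all function names, and the toy records are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def encode(predicted, confidence):
    """Hypothetical encoding: predicted is 0 or 1, confidence lies in [0, 1]."""
    sign = 1.0 if predicted == 1 else -1.0
    return [sign, sign * confidence]


def predict_probability(weights, features):
    """Estimated probability that the actual outcome is 1."""
    score = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(score)


def fit_logistic(rows, outcomes, learning_rate=0.1, epochs=3000):
    """Fit an intercept and weights by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)  # weights[0] is the intercept
    n = len(rows)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, actual in zip(rows, outcomes):
            error = predict_probability(weights, features) - actual
            gradient[0] += error
            for j, x in enumerate(features):
                gradient[j + 1] += error * x
        weights = [w - learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


# Made-up records: (predicted outcome, confidence, actual outcome).
records = [
    (1, 0.90, 1), (1, 0.80, 1), (1, 0.60, 0), (1, 0.55, 1),
    (1, 0.95, 1), (1, 0.50, 0), (0, 0.90, 0), (0, 0.85, 0),
    (0, 0.60, 1), (0, 0.50, 0), (0, 0.70, 0), (0, 0.55, 1),
]
rows = [encode(p, c) for p, c, _ in records]
outcomes = [a for _, _, a in records]

# Step (b): form the predictor from the collected predictions and confidence.
weights = fit_logistic(rows, outcomes)

# Step (c): estimate the outcome probability for a current decision.
print(predict_probability(weights, encode(1, 0.9)))
```

The signed encoding is one possible design choice: it lets a single confidence weight raise the estimated probability of outcome 1 when the participant predicts 1 and lower it when the participant predicts 0. A multinomial variant along the lines of claim 7 would need one set of weights per outcome category.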
AU2013224752A 2013-09-09 2013-09-09 Critical decision support system and method Abandoned AU2013224752A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2013224752A AU2013224752A1 (en) 2013-09-09 2013-09-09 Critical decision support system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2013224752A AU2013224752A1 (en) 2013-09-09 2013-09-09 Critical decision support system and method

Publications (1)

Publication Number Publication Date
AU2013224752A1 true AU2013224752A1 (en) 2015-03-26

Family

ID=52706441

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2013224752A Abandoned AU2013224752A1 (en) 2013-09-09 2013-09-09 Critical decision support system and method

Country Status (1)

Country Link
AU (1) AU2013224752A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695695A (en) * 2020-06-09 2020-09-22 北京百度网讯科技有限公司 Quantitative analysis method and device for user decision behaviors
CN111695695B (en) * 2020-06-09 2023-08-08 北京百度网讯科技有限公司 Quantitative analysis method and device for user decision behaviors

Similar Documents

Publication Publication Date Title
Kincade et al. Meta-analysis and common practice elements of universal approaches to improving student-teacher relationships
Neal et al. Psychological assessments in legal contexts: Are courts keeping “junk science” out of the courtroom?
Farmer et al. Externalizing and internalizing behavior problems, peer affiliations, and bullying involvement across the transition to middle school
Smith et al. Measuring chronic condition self-management in an Australian community: factor structure of the revised Partners in Health (PIH) scale
Jackson et al. Individual differences in decision-making and confidence: Capturing decision tendencies in a fictitious medical test
Dhami Psychological models of professional decision making
Duffy et al. Self-assessment in lifelong learning and improving performance in practice: physician know thyself
Schaefer et al. Evaluating immediate feedback via bug-in-ear as an evidence-based practice for professional development
Lindacher et al. Evaluation of empowerment in health promotion interventions: a systematic review
Hamilton et al. Designed to fit: The development and validation of the STRONG-R recidivism risk assessment
Papageorgiou et al. An investigation of the use of TOEFL® Junior™ Standard scores for ESL placement decisions in secondary education
Rufino et al. Scoring subjectivity and item performance on measures used to assess violence risk: The PCL-R and HCR-20 as exemplars
Chow et al. A systematic review and meta-analysis of the language skills of youth offenders
Arroyo et al. Maternal communication strategies that promote body image in daughters
Wu et al. Promoting self-regulation progress and knowledge construction in blended learning via ChatGPT-based learning aid
Aymans et al. Gender and career optimism—The effects of gender‐specific perceptions of lecturer support, career barriers and self‐efficacy on career optimism
Lin et al. Achievement Goal Orientations and Self–Regulated Learning Strategies of Adult and Traditional Learners
Flynn et al. The development and validation of the comprehensive counseling skills rubric
Olswang et al. Reliability issues and solutions for coding social communication performance in classroom settings
Sharma et al. Modelling the impact of emotional intelligence, career success and happiness on turnover intention among managerial-level employees in the information technology industry
Fama et al. The subjective experience of word-finding difficulties in people with aphasia: A thematic analysis of interview data
Holmes et al. Evaluating the longitudinal efficacy of SafeTALK suicide prevention gatekeeper training in a general community sample
Mei et al. Redesigning the central eight: Introducing the M-PACT Six
Vaughan et al. The general factor of self-control and cost consideration: A critical test of the general theory of crime
Reddy et al. The relationship between school administrator and teacher ratings of classroom practices and student achievement in high-poverty schools

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application