Mindware: Critical Thinking for the Information Age


Richard E. Nisbett (Theodore M. Newcomb Distinguished University Professor)

University of Michigan

Week 1

  • Lesson 1: Statistics
  • Lesson 2: The Law of Large Numbers
Week 2
  • Lesson 3: Correlation
  • Lesson 4: Experiments
Week 3
  • Lesson 5: Prediction
  • Lesson 6: Cognitive Biases
Week 4
  • Lesson 7: Choosing and Deciding
  • Lesson 8: Logic and Dialectical Reasoning
Course Overview: Apply basic concepts of statistics, probability theory, the scientific method, psychology, microeconomics, and logic to the judgments and decisions of everyday life. Critique scientific findings reported in the media. Learn about cognitive biases and inference procedures that are rapid and automatic but can be erroneous.

Course Introduction: You will be smarter because you will have rules of inference that let you learn about the world. Think logically, and improve your ability to think in hypothetical terms and about abstractions. The information age requires tools for dealing with data: collecting it, analyzing its validity, finding patterns, avoiding false patterns, and critiquing arguments based on data. The skills needed come from statistics, probability, scientific methodology, cost-benefit analysis, and cognitive psychology.

Lesson 1 is statistics: the concepts of variable, normal distribution, standard deviation, correlation, reliability, and validity.

Lesson 2 is the law of large numbers (sample values resemble population values as a function of sample size). It matters most when there is a lot of error in the sample, and we are not well calibrated about this. Recognize and estimate the size of the error.

Lesson 3 is correlation, the degree of association between two variables. Avoid false correlations.

Lesson 4 is experiments, what is a good experiment, why they are superior to correlational evidence, natural experiments, do experiments on yourself, cost of avoiding experiments.

Lesson 5 is prediction: regression to the mean (extreme values of any given variable are rare, so if you see an extreme value then the next value is likely to be less extreme), predictions and similar cases.

Lesson 6 is cognitive biases: errors in judgement caused by a lack of important concepts. Illusion of objectivity: sensory input seems to be understood in a direct and automatic way, but actually has an overlay of perceptual and cognitive processes. Fundamental attribution error: the behaviour of objects/people is assumed to be produced by their internal abilities/traits, when the real cause is the situation they are in. Heuristics: rules of thumb that can mislead when assessing probability and causality. Confirmation bias: looking for supportive, and not contradictory, evidence.

Lesson 7 is choosing and deciding: cost-benefit analysis and when to ignore the results, opportunity costs and how to avoid taking actions that make potentially more valuable actions impossible, sunk costs and how to avoid carrying out an action merely because of prior investment.

Lesson 8 is logic and dialectical reasoning: syllogisms (categories and quantities: some, all, none), conditional reasoning (P=Q=R), dialectical reasoning (find the truth: in what way may two opposing propositions both be correct).

Lesson 1: Statistics

Variables - Normal Distribution:
Variable: changing value (current temperature, height, iq, errors per week in factory) 
Constant: immutable value (freezing temperature)
Normal Distribution (bell curve, can be described in standard deviations from the mean): _./''\._ the middle is the mean (most common value); the further from the mean, the rarer the value.
Standard Deviation: | 13.59% (-2SD) | 34.13% (-1SD) | mean | 34.13% (+1SD) | 13.59% (+2SD) |, with 2.14% and 0.13% in the bands beyond that on each side. The percentile rank is the % of the population below a given score (1, 5, 10, 20, 30, 50, 70, 80, 90, 95, 99). In a normal distribution the mean is always the 50th percentile; +1 SD is the 84th, thus 84% are below and 16% are above.
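The link between standard deviations and percentile ranks can be checked with the normal cumulative distribution. This is a minimal sketch using only the standard library; the helper name percentile_from_z is made up for illustration.

```python
import math

def percentile_from_z(z: float) -> float:
    """Percent of a normally distributed population below a score z SDs from the mean."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-2, -1, 0, 1, 2):
    print(f"{z:+d} SD -> {percentile_from_z(z):.2f}th percentile")

# Share of the population between +1 SD and +2 SD (the 13.59% band).
band = percentile_from_z(2) - percentile_from_z(1)
print(f"between +1 and +2 SD: {band:.2f}%")
```

The bands quoted in the notes are differences between neighbouring percentiles computed this way.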
Average Deviation: mean eggs per week (6); hen A laid 8, deviation 2; hen B laid 5, deviation -1; average deviation = (abs(2) + abs(-1))/2 = 3/2 = 1.5 eggs.
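The hen example can be written as a small function. A sketch only: the function name is hypothetical, and the mean of 6 is taken as given (the flock average) rather than computed from the two hens.

```python
def average_deviation(values, mean):
    """Mean of the absolute deviations of each value from the given mean."""
    return sum(abs(v - mean) for v in values) / len(values)

# Hen A laid 8 eggs and hen B laid 5, against a mean of 6 eggs per week.
print(average_deviation([8, 5], mean=6))  # 1.5
```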
Effect Size: magnitude of a difference in standard deviation terms. Ex. old score 72, new score 78: if the standard deviation is small, say 6, the improvement is a full SD and takes an average user from the 50th to the 84th percentile; if the SD is large, say 20, the improvement of 6 is only 0.3 SD, well under half an SD.
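The score example can be worked through numerically. A sketch under stated assumptions: the SD of 20 comes from the notes, the SD of 6 is an assumed value for the "small SD" case (it is the value that makes the gain exactly one SD), and both function names are illustrative.

```python
import math

def effect_size(old_mean, new_mean, sd):
    """Difference between two means expressed in standard deviation units."""
    return (new_mean - old_mean) / sd

def percentile_from_z(z):
    """Percent of a normal population below a score z SDs from the mean."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))

for sd in (6, 20):  # 6 is an assumed "small" SD; 20 is the "large" SD from the notes
    d = effect_size(72, 78, sd)
    print(f"SD={sd}: effect size {d:.2f}, average person moves to "
          f"the {percentile_from_z(d):.0f}th percentile")
```

With the larger SD the same 6-point gain moves an average person only a little above the middle of the distribution.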
Personal Note: Look at the SD calculation with /n vs /(n-1) for populations vs samples (unbiased estimators): small samples need the denominator reduced by one, while reducing a large sample's denominator by one barely changes the result.
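The /n vs /(n-1) question can be explored with Python's statistics module, where pstdev divides by n and stdev divides by n-1. The data below are made up for illustration.

```python
import math
import statistics

small = [4, 8, 6, 5, 7]   # hypothetical sample of 5 values
large = small * 200       # same values repeated: 1000 values

for name, data in (("small", small), ("large", large)):
    pop_sd = statistics.pstdev(data)    # divides by n
    sample_sd = statistics.stdev(data)  # divides by n - 1
    print(f"{name} (n={len(data)}): /n -> {pop_sd:.4f}, "
          f"/(n-1) -> {sample_sd:.4f}, ratio {sample_sd / pop_sd:.4f}")

# The ratio of the two is sqrt(n / (n - 1)), which approaches 1 as n grows.
small_ratio = statistics.stdev(small) / statistics.pstdev(small)
large_ratio = statistics.stdev(large) / statistics.pstdev(large)
```

For n = 5 the two estimates differ by about 12%; for n = 1000 the difference is about 0.05%.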

Introduction to Correlation:
Correlation: measures the association between variables
Scatter Plots and Correlations: -1 is linear descending (\), +1 is linear ascending (/).
Example Correlations: SAT score & college GPA (.4), mother's height & daughter's height (.5), height & weight (.7), overweight & cardiovascular illness (.3).
Rank Order Correlation: convert each variable's values to ranks, pair each case's rank on one variable with its rank on the other, then calculate the correlation.
Actual Value Correlation: use the actual values, like height (cm), instead of ranks, pairing each case's value on one variable with its actual value on the other.
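The difference between the two kinds of correlation shows up when a relationship is consistent in direction but not a straight line. A sketch with made-up heights and weights; the function names are illustrative, and the ranking helper assumes there are no tied values.

```python
import math

def pearson(xs, ys):
    """Correlation computed on the actual values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Rank 1 = smallest value. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def rank_correlation(xs, ys):
    """Correlation computed on the ranks instead of the actual values."""
    return pearson(ranks(xs), ranks(ys))

heights_cm = [150, 160, 170, 180, 190]   # hypothetical data
weights_kg = [52, 58, 66, 80, 120]

print(f"actual values: {pearson(heights_cm, weights_kg):.2f}")
print(f"rank order:    {rank_correlation(heights_cm, weights_kg):.2f}")
```

Here every taller person is also heavier, so the rank order correlation is 1, while the actual value correlation is lower because the weights do not rise in a straight line.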
Reliability (1): the degree to which a measure of a particular variable gives the same value across occasions, i.e. the degree to which a measure correlates with itself. It also depends on the units of measure: reliability for height should be 1, but measured in microns height may vary during the day.
Reliability (2): degree to which two different measures which are supposed to measure the same thing give the same result. Ex. IQ test A vs B, if low then one test is unreliable.
Validity: degree to which a variable measures what it's supposed to. Ex. IQ with income. There can be no validity if there is no reliability.

Lesson 2: The Law of Large Numbers

The Law of Large Numbers: Part 1: With a larger data set, each entry has less effect on the outcome; small data sets may have a single entry misrepresenting the outcome.
We can ask, for a 50:50 ratio what is the likelihood of a 60:40 outcome.
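The 60:40 question has an exact answer from the binomial distribution. A minimal sketch assuming a fair 50:50 process such as coin flips; the function name is illustrative.

```python
from math import comb

def prob_at_least(k, n):
    """Probability of at least k 'heads' in n independent 50:50 trials."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

for n in (10, 100, 1000):
    k = 6 * n // 10   # 60% of the trials
    print(f"n={n}: P(at least {k} of {n}) = {prob_at_least(k, n):.3g}")
```

A 60:40 split or better is common with 10 trials (about 38% of the time), rare with 100, and practically impossible with 1000.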
Interviews and job performance require different skills.

The Law of Large Numbers: Part 2: The main driver of behaviour is the situation, thus behaviour on one day is a bad indicator of behaviour on another day unless you have a history to refer to. People underestimate the law of large numbers for abilities/performance, and overestimate for personality traits.
Fundamental Attribution Error: behaviour that is prompted by the situation someone is in is mistakenly attributed to personality traits.
Observation = true score + error
Humans tend to go from an observation to a generalization quickly.

Lesson 3: Correlation

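Determining an association requires all four numbers in a 2x2 grid, comparing two ratios (the proportion of people with the disease who have the symptom vs the proportion of people without the disease who have it). Ex. symptom YES: 50 with disease, 5 without; symptom NO: 10 with disease, 20 without; thus compare 50/60 vs 5/25.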
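The 2x2 comparison can be spelled out with the four counts from the example. The labels are an assumed reading of the notes (rows are symptom yes/no, columns are disease yes/no); only the counts and the two ratios come from the original.

```python
# Four cells of the 2x2 grid (labels assumed, counts from the example).
symptom_disease = 50
symptom_no_disease = 5
no_symptom_disease = 10
no_symptom_no_disease = 20

# Share of each disease group that shows the symptom.
rate_with_disease = symptom_disease / (symptom_disease + no_symptom_disease)
rate_without_disease = symptom_no_disease / (symptom_no_disease + no_symptom_no_disease)

print(f"with disease:    {rate_with_disease:.2f}")     # 50/60
print(f"without disease: {rate_without_disease:.2f}")  # 5/25
print("association present" if rate_with_disease != rate_without_disease
      else "no association")
```

Looking only at the 50 cell says nothing; the association appears only when the two proportions are compared.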

Illusory Correlation: if we are prepared to see a particular association, we're likely to see it.
Confirmation Bias: we rest content with confirming data, and don't think to look for data that might disconfirm our hypothesis.

Confounded Variables; Statistical Significance:   
Confounded Variable: a hidden variable C that may be causing both A and B. Ex. couples that spend more time on wedding preparations are less likely to get divorced (possibly because those people are less hurried, better off financially, and older).
Statistical Significance: probability that a result at least as extreme as the one obtained could have occurred given that there is in fact no relationship. Expressed as p < 0.XX. Ex. p < 0.05: the probability of a result this extreme occurring, given that there is no relationship, is less than 5 in 100.
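One way to make the definition concrete is an exact permutation test: if there is no relationship, group labels are arbitrary, so we can count how many regroupings of the data give a difference at least as large as the observed one. This is a sketch of one such test, not the course's method; the scores and function name are made up.

```python
from itertools import combinations

def exact_permutation_p(group_a, group_b):
    """Two-sided p-value: share of all regroupings whose gap in means
    is at least as large as the observed gap."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    as_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        gap = abs(sum_a / n_a - (total - sum_a) / n_b)
        if gap >= observed - 1e-12:  # small tolerance for floating-point ties
            as_extreme += 1
        n_splits += 1
    return as_extreme / n_splits

treated = [12, 14, 15, 16]   # hypothetical scores
control = [8, 9, 10, 11]
p = exact_permutation_p(treated, control)
print(f"p = {p:.4f}")
```

Only 2 of the 70 possible regroupings are as extreme as the observed one, so p = 2/70, which is below 0.05.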

Lesson 4: Experiments

The Superiority of Experiments over Correlations: Correlations cannot tell us about causality, we must conduct an experiment.
Experiment: scientific study where at least one variable is manipulated, and at least one variable is measured.
Gold Standard (Randomized Control Design): experimenter assigns people/things at random to experimental vs control condition, experimental condition receives some treatment, the treatment is the independent variable, control condition receives a different treatment or no treatment, things measured constitute the dependent variables.
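Random assignment is the step that separates this design from a correlational study. A minimal sketch; the function name, the seed parameter, and the participant IDs are illustrative.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into experimental and control groups."""
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

groups = randomly_assign(range(1, 21), seed=1)   # 20 hypothetical participant IDs
print("experimental:", sorted(groups["experimental"]))
print("control:     ", sorted(groups["control"]))
```

Because chance alone decides who lands in which group, self-selection cannot line up with the treatment.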
Many correlational studies exist (easy to do) vs fewer experiments (hard to do).
Self selection and healthy user bias: people who choose a treatment or behaviour tend to differ from those who don't (e.g. they are healthier to begin with), so an observed correlation may be produced by confounded variables rather than by the treatment.
Multiple Regression Analysis: examines the association of each of a number of variables with the target independent variable X and with the target dependent variable Y. The additional variables are controls: we look at the correlation of X and Y controlling for all variables that are correlated with both, by subtracting from the X-Y correlation the part attributable to each control variable's correlation with both X and Y.
Problems with Multiple Regression Analysis: you can't identify all possible variables that might be correlated with both the target independent variable and the target dependent variable; you may not be able to measure the variables you do identify, or you measure them with poor reliability or validity, which distorts results; it is meaningless to say you control for variables with missing values.
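The idea of "controlling for" a variable can be illustrated with the simplest case: a single control variable Z and the first-order partial correlation formula. This is a simplified stand-in for full multiple regression, and the correlation values are made up.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of X and Y after removing what each shares with Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical: X and Y correlate .5, but both correlate .7 with a third variable Z.
print(f"raw correlation:       {0.5:.2f}")
print(f"controlling for Z:     {partial_correlation(0.5, 0.7, 0.7):.2f}")

# If Z is unrelated to both X and Y, controlling for it changes nothing.
print(f"controlling for noise: {partial_correlation(0.5, 0.0, 0.0):.2f}")
```

In the first case nearly all of the X-Y association disappears once Z is controlled, which is what a confounded variable looks like. The formula only helps for control variables that were identified and measured well.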

A/B Testing: dividing subjects into groups, presenting a different variation of the input to each group, then measuring the outcome.
You can't just observe the world, you can't just be systematic and look at correlations, usually you must do an experiment to be sure about what is causing what.

Experimental Design and Natural Experiments:
Within vs. Between Designs: an experiment's conditions are arranged one way or the other.
Within design: all people/things studied participate in all conditions.
Between design: people/things studied participate in only one condition.
Within design is much more powerful because error variance is greatly reduced: the only things that differ across treatments are the treatments themselves. Between designs are noisy: the treatments differ, but so do the people.
Statistical Independence: Ex. you do an experiment with 40 students in one class via method A, and 60 in another class via method B, what is N (the number of cases)? Not 100, but rather 2. N is the number of cases for which the results are not influenced by the other cases. A disruption in one class affects all students in the class, whereas in a MOOC each student's experience is independent of the others. There is no way to establish statistical significance if case results are not independent of one another.
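The gain in power from a within design can be seen by analysing the same numbers two ways. A sketch with made-up scores: the paired t statistic treats the data as the same five people under both conditions, and the Welch t statistic treats them as two unrelated groups. The function names are illustrative.

```python
import math
import statistics

def paired_t(cond_1, cond_2):
    """Within design: t statistic on each person's own difference."""
    diffs = [b - a for a, b in zip(cond_1, cond_2)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

def between_t(group_1, group_2):
    """Between design: Welch t statistic for two independent groups."""
    se = math.sqrt(statistics.variance(group_1) / len(group_1)
                   + statistics.variance(group_2) / len(group_2))
    return (statistics.mean(group_2) - statistics.mean(group_1)) / se

control = [10, 20, 30, 40, 50]     # hypothetical scores
treatment = [12, 23, 33, 44, 53]   # each about 3 points higher

print(f"within (paired) t:  {paired_t(control, treatment):.2f}")
print(f"between (Welch) t:  {between_t(control, treatment):.2f}")
```

The average gain is 3 points either way, but the large differences between people swamp it in the between analysis, while the within analysis cancels them out.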
You need to know whether a given type of event is characterized by independence or dependence.
Natural Experiments: an empirical study in which individuals are exposed to the experimental and control conditions that are determined by nature or by other factors outside the control of the investigators.
Experiments that don't get done (or done right) may lead to incorrect assumptions, or to plans that have unknown effectiveness and could be costly depending on scale. Ex. grief counselors were provided after 9/11, but a study showed that counseling prolongs grief compared with people who were not counseled.
Personal Note: but we don't know whether the counseled group with prolonged grief gained other benefits (coping strategies for future events, less overall anxiety, fewer anxiety attacks, etc.), which could make counseling an overall positive even though grief lasted longer than for non-counseled people.

Lesson 5: Prediction

Regression to the Mean: