Mindware: Critical Thinking for the Information Age

Week 1
  • Lesson 1: Statistics
  • Lesson 2: The Law of Large Numbers
Week 2
  • Lesson 3: Correlation
  • Lesson 4: Experiments
Week 3
  • Lesson 5: Prediction
  • Lesson 6: Cognitive Biases
Week 4
  • Lesson 7: Choosing and Deciding
  • Lesson 8: Logic and Dialectical Reasoning
Course Overview: apply basic concepts of statistics, probability theory, scientific method, psychology, microeconomics, and logic to the judgments and decisions of everyday life. Critique scientific findings in the media. Learn about cognitive biases and inference procedures (rapid and automatic, but often erroneous).

Course Introduction: You will be smarter because you will have rules of inference that allow you to learn about the world. Think logically, and improve your ability to think in hypothetical terms and about abstractions. The information age requires tools for dealing with data: collect it, analyze its validity, find patterns, avoid false patterns, and critique arguments based on data. The skills needed come from statistics, probability, scientific methodology, cost-benefit analysis, and cognitive psychology.

Lesson 1 is statistics: the concepts of variable, normal distribution, standard deviation, correlation, reliability, validity.

Lesson 2 is the law of large numbers (sample values resemble population values as a function of sample size). Most useful when there is a lot of error in the sample, where our intuitions are not well calibrated. Recognize and estimate the size of the error.

Lesson 3 is correlation, the degree of association between two variables. Avoid false correlations.

Lesson 4 is experiments, what is a good experiment, why they are superior to correlational evidence, natural experiments, do experiments on yourself, cost of avoiding experiments.

Lesson 5 is prediction, regression to the mean (extreme values for any given variable are rare, so if you see a rare value the next value is likely to be less extreme), predictions and similar cases.

Lesson 6 is cognitive biases, errors in judgement caused by a lack of important concepts. Illusion of objectivity: sensory input seems to be understood in a direct and automatic way, but actually has an overlay of perceptual and cognitive processes. Fundamental attribution error: the behaviour of objects/people is assumed to be produced by their internal abilities/traits, when the real cause is often the situation the person is in. Heuristics: rules of thumb that can mislead when assessing probability and causality. Confirmation bias: looking for supportive, and not contradictory, evidence.

Lesson 7 is choosing and deciding, cost-benefit analysis and when to ignore the results, opportunity costs and how to avoid taking actions that make potentially more valuable actions impossible, sunk costs and avoiding carrying out an action because of prior investment.

Lesson 8 is logic and dialectical reasoning, syllogisms (categories and quantities-some, all, none), conditional reasoning (P=Q=R), dialectical reasoning (find the truth, in what way may two opposing propositions be correct).

Lesson 1: Statistics

Variables - Normal Distribution:
Variable: changing value (current temperature, height, iq, errors per week in factory) 
Constant: immutable value (freezing temperature)
Normal Distribution (bell curve, can be described in standard deviations from the mean): _./''\._ the middle is the mean (most common value), and values further away are rarer.
Standard Deviation: share of the population in each band: | 13.59% (-2SD to -1SD) | 34.13% (-1SD to mean) | mean | 34.13% (mean to +1SD) | 13.59% (+1SD to +2SD) |, then 2.14% between 2SD and 3SD and 0.13% beyond 3SD on each side. The percentile rank gives the % of the population at or below a given value (1, 5, 10, 20, 30, 50, 70, 80, 90, 95, 99). In a normal distribution the mean is always the 50th percentile and +1 SD is the 84th, thus 84% are below and 16% are above.
Average Deviation: eggs per week (6), hen A laid (8) deviation (2), hen B (5) deviation (-1), average deviation = (abs(2) + abs(-1))/2 = 3/2 = 1.5 eggs.
Effect Size: magnitude of a difference in standard deviation terms. Ex. old score 72, new score 78: if the standard deviation is small (6 here, so the gain is one full SD), it takes an average user from the 50th to the 84th percentile; if the SD is large, say 20, the improvement of 6 is only 0.3 of a SD, well under half a SD.
Personal Note: Look at the SD calculation with /n vs /(n-1) for populations and samples, and at unbiased estimators. Small samples need the denominator reduced by one (reducing a large sample's denominator by one doesn't affect it as much).
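Sketch of the calculations above, using only the Python standard library. The `scores` list and the SD of 6 for the "small SD" case are my own assumptions for illustration; the egg counts and the 72 to 78 improvement come from the notes. The percentile conversion assumes scores are normally distributed.

```python
from statistics import NormalDist, pstdev, stdev

# Average deviation: mean absolute distance from the flock mean (eggs example).
flock_mean = 6
eggs = [8, 5]  # hen A, hen B
average_deviation = sum(abs(x - flock_mean) for x in eggs) / len(eggs)

# Population SD divides by n; sample SD divides by n - 1.
scores = [70, 72, 74, 76, 78]  # made-up sample, not from the course
sd_population = pstdev(scores)  # sqrt(40 / 5)
sd_sample = stdev(scores)       # sqrt(40 / 4), slightly larger


def percentile_after_gain(gain, sd):
    """Percentile reached by an average (50th percentile) person who
    improves by `gain`, assuming a normal distribution with spread `sd`."""
    effect_size = gain / sd  # the gain expressed in SD units
    return 100 * NormalDist().cdf(effect_size)


small_sd_case = percentile_after_gain(78 - 72, 6)   # gain of one full SD
large_sd_case = percentile_after_gain(78 - 72, 20)  # gain of 0.3 SD
```

With an SD of 6 the average person lands near the 84th percentile; with an SD of 20 the same gain moves them only a little above the 60th.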

Introduction to Correlation:
Correlation: measures the association between variables
Scatter Plots and Correlations: -1 is linear descending (\), +1 is linear ascending (/).
Example Correlations: SAT score & college GPA (.4), mother's height & daughter's height (.5), height & weight (.7), overweight & cardiovascular illness (.3).
Rank Order Correlation: convert the values of each variable to ranks, pair each rank with the rank of its counterpart on the other variable, then calculate the correlation between the ranks.
Actual Value Correlation: use the actual values, like height (cm), instead of ranks, pairing each value with the actual value of its counterpart.
Reliability (1): degree to which a measure of a particular variable gives the same value across occasions. A degree to which a measure correlates with itself (also depends on units of measure), height should be 1 but if using microns it may vary during the day.
Reliability (2): degree to which two different measures which are supposed to measure the same thing give the same result. Ex. IQ test A vs B, if low then one test is unreliable.
Validity: degree to which a variable measures what it's supposed to. Ex. IQ with income. There can be no validity if there is no reliability.
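Sketch of the two kinds of correlation described above. The height data are made up, the function names are my own, and `ranks` assumes there are no tied values (ties would need averaged ranks, which this sketch does not handle).

```python
from math import sqrt


def pearson(xs, ys):
    """Correlation computed on the actual values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


def ranks(values):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def rank_correlation(xs, ys):
    """Correlation computed on the ranks instead of the actual values."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical heights in cm, not course data.
mothers = [155, 160, 165, 170, 175]
daughters = [158, 157, 168, 166, 180]

actual_value_r = pearson(mothers, daughters)
rank_order_r = rank_correlation(mothers, daughters)
```

The rank version only cares about ordering, so it is less sensitive to one unusually large or small value than the actual-value version.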

Lesson 2: The Law of Large Numbers

The Law of Large Numbers: Part 1: With a larger data set, each entry has less effect on the outcome; small data sets may have a single entry misrepresenting the outcome.
We can ask, for a 50:50 ratio what is the likelihood of a 60:40 outcome.
Interviews and job performance require different skills.
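The 50:50 vs 60:40 question can be worked out exactly. A minimal sketch, assuming independent fair coin flips; the sample sizes 10 and 100 are my own choice for illustration.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials,
    each with success probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance of seeing at least a 60:40 split from a truly 50:50 process.
small_sample = prob_at_least(6, 10)     # 6 or more out of 10
large_sample = prob_at_least(60, 100)   # 60 or more out of 100
```

A 60:40 split is unremarkable in 10 trials but rare in 100, which is the law of large numbers at work.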

The Law of Large Numbers: Part 2: The main driver of behaviour is the situation, thus behaviour on one day is a bad indicator of behaviour on another day unless you have a history to refer to. People underestimate the law of large numbers for abilities/performance, and overestimate for personality traits.
Fundamental Attribution Error: behaviour that is prompted by the situation someone is in is mistakenly attributed to personality traits.
Observation = true score + error
Humans tend to go from an observation to a generalization quickly.

Lesson 3: Correlation

Determining an association requires all four numbers in a 2x2 grid, comparing two ratios (the proportion of people with the disease among those who have the symptom and among those who don't). Ex. disease YES: 50 with symptom, 5 without; disease NO: 10 with symptom, 20 without; thus 50/60 vs 5/25.
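Sketch of the 2x2 comparison. It assumes my reading of the example counts: rows are disease yes/no and columns are symptom yes/no. The dictionary layout and function name are hypothetical.

```python
# Counts from the example, keyed by (disease status, symptom status).
table = {
    ("disease", "symptom"): 50,
    ("disease", "no symptom"): 5,
    ("no disease", "symptom"): 10,
    ("no disease", "no symptom"): 20,
}


def disease_rate(table, symptom_status):
    """Proportion of people with the disease among those sharing a symptom status."""
    with_disease = table[("disease", symptom_status)]
    without_disease = table[("no disease", symptom_status)]
    return with_disease / (with_disease + without_disease)


rate_with_symptom = disease_rate(table, "symptom")        # 50 / 60
rate_without_symptom = disease_rate(table, "no symptom")  # 5 / 25
```

Looking only at the 50 cell (symptom and disease together) says nothing on its own; the association shows up only when the two rates differ.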

Illusory Correlation: if we are prepared to see a particular association, we're likely to see it.
Confirmation Bias: we rest content with confirming data, and don't think to look for data that might disconfirm our hypothesis.

Confounded Variables; Statistical Significance:   
Confounded Variable: a hidden third variable C that may be causing both A and B. Ex. couples that spend more time on wedding preparations are less likely to get divorced (because those people are less hurried, better off financially, and older).
Statistical Significance: probability that a result at least as extreme as the one obtained could have occurred given that there is in fact no relationship. Expressed as p < 0.XX Ex. p < 0.05 probability of this result occurring given that there is no relationship is 5 in 100.
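One way to put a number on "at least as extreme, given no relationship" for a 2x2 table is a one-sided Fisher exact test. The course notes do not name this test, so treat it as my own choice of technique. The counts are the Lesson 3 example (50, 5, 10, 20) and the function name is hypothetical.

```python
from math import comb


def one_sided_p_value(a, b, c, d):
    """For the table [[a, b], [c, d]], the probability that the top-left
    cell is a or larger if rows and columns are unrelated, with all row
    and column totals held fixed (hypergeometric upper tail)."""
    row1, col1, total = a + b, a + c, a + b + c + d
    largest_possible = min(row1, col1)
    ways_total = comb(total, col1)
    ways_extreme = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, largest_possible + 1)
    )
    return ways_extreme / ways_total


p = one_sided_p_value(50, 5, 10, 20)
significant = p < 0.05
```

A result is reported as p < 0.05 when a table at least this lopsided would turn up fewer than 5 times in 100 if there were no relationship at all.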

Lesson 4: Experiments

The Superiority of Experiments over Correlations: Correlations cannot tell us about causality, we must conduct an experiment.
Experiment: scientific study where at least one variable is manipulated, and at least one variable is measured.
Gold Standard (Randomized Control Design): experimenter assigns people/things at random to experimental vs control condition, experimental condition receives some treatment, the treatment is the independent variable, control condition receives a different treatment or no treatment, things measured constitute the dependent variables.
There are many correlational studies (easy to run) vs fewer experimental studies (hard to run).
Self selection and healthy user bias: people who select themselves into a treatment tend to differ in other ways (for example, they are healthier to begin with), so an observed correlation may be due to those confounding variables rather than to the treatment.
Multiple Regression Analysis: examines the association of each of a number of variables with the target independent variable X and with the target dependent variable Y. The additional variables are controls: we look at the correlation of X and Y while controlling for all variables that are correlated with both, by subtracting from the correlation between X and Y the correlation of each of the other independent variables with both X and Y.
Problems with Multiple Regression Analysis: can't identify all possible variables that might be correlated with the target independent variable and the target dependent variable, can't measure the variables you identify, or measurement is with poor reliability or validity which distorts results, meaningless to say you control for variables with missing values.

A/B Testing: Dividing groups and presenting one different variation of input to each group then measuring the outcome.
You can't just observe the world, you can't just be systematic and look at correlations, usually you must do an experiment to be sure about what is causing what.

Experimental Design and Natural Experiments:
Within vs. Between Designs: the conditions of an experiment follow one design or the other.
Within design: all people/things studied participate in all conditions.
Between design: people/things studied participate in only one condition.
Within design is much more powerful because error variance is greatly reduced: the only thing that differs across treatments is the treatments themselves. Between designs are noisy: the treatments differ, but so do the people.
Statistical Independence: Ex. you do an experiment with 40 students in one class via method A, and 60 in another class via method B, what is N (the number of cases)? Not 100, but rather 2. N is the number of cases for which the results are not influenced by the other cases. A disruption in one class affects all students in the class, however a MOOC is unique to each student. There is no way to establish statistical significance if case results are not independent of one another.
You need to know whether a given type of event is characterized by independence or dependence.
Natural Experiments: an empirical study in which individuals are exposed to the experimental and control conditions that are determined by nature or by other factors outside the control of the investigators.
Experiments that don't get done (or done right) may lead to incorrect assumptions or plans that have unknown effectiveness and could be costly depending on scale. Ex. grief counselors were provided after 9-11, but a study showed that counseling prolonged grief compared with non-counseled people.
Personal Note: we don't know whether the prolonged grief group gained some other benefit (coping strategies for future events, less overall anxiety, fewer anxiety attacks, etc.) that would make counseling an overall positive despite the longer duration of grief.

Lesson 5: Prediction

Regression to the Mean: regression creates an illusion of causality; extreme scores on a variable don't predict equally extreme future scores on that variable.
Observation = True Score + (Error or Luck)
Statistical Regression: extreme events of a type that is distributed normally will be followed and preceded by less extreme events - to the extent that the events are subject to chance influences.
How to estimate one value from another: find or estimate the correlation between the two types of events, then go that fraction of the way from the mean toward the extreme value to get the estimate of the other value. If the correlation is .50, go halfway from the mean to the more extreme score; if .30, go .3 of the way from the mean to the more extreme score.
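The rule of thumb above as a small function. It assumes both variables are on the same scale (same mean and spread); the mean of 100 and the observed score of 140 are made-up numbers.

```python
def estimate_from_extreme(observed, mean, correlation):
    """Go `correlation` of the way from the mean toward the observed score.
    Assumes both variables share the same mean and standard deviation."""
    return mean + correlation * (observed - mean)


# Hypothetical scores: the first measurement is 140 on a scale with mean 100.
half_way = estimate_from_extreme(140, 100, 0.50)
three_tenths = estimate_from_extreme(140, 100, 0.30)
no_relationship = estimate_from_extreme(140, 100, 0.0)
```

With a correlation of 0 the best guess is simply the mean, and with a correlation of 1 it is the observed score itself; everything in between regresses part of the way.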

Base Rate: in probability, the overall frequency of something happening in the population (without intervention).
Ex. of having cancer: the base rate is 1/100, so of 1,000 people 10 have cancer. The test correctly diagnoses 80/100 of them, so 8 of the 10 test positive and 2 of the 10 are missed (have cancer but it doesn't show up). The other 990 don't have cancer, but 10/100 of them get false positives, so 99. Divide the number of people who have cancer and test positive by the total number of people who test positive: 0.008/(0.008+0.099) or 8/107 = 7.5%.
If you test positive for a disease, you must know: the percent of people who have the disease who correctly test positive, the percent of people who do not have the disease who will incorrectly test positive (false positive), the overall percent of people who have the disease (base rate), then you divide the number of positives who have the disease by the total number of positives.
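The three inputs listed above, combined in code. The rates are the ones from the cancer example; the function and parameter names are my own.

```python
def chance_of_disease_given_positive(base_rate, true_positive_rate, false_positive_rate):
    """Share of all positive tests that belong to people who have the disease."""
    true_positives = base_rate * true_positive_rate
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Cancer example: 1% base rate, 80% detected, 10% false positives.
posterior = chance_of_disease_given_positive(0.01, 0.80, 0.10)
```

Even with a fairly accurate test the answer stays under 10% because the 99 false positives swamp the 8 true positives; ignoring the base rate is what makes the intuitive answer far too high.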

Lesson 6: Cognitive Biases

The Illusion of Objectivity: belief that we understand the world by direct perception (but our understanding of even the simplest thing is guided by layers of cognitive processes) aka. naive realism.

Heuristics: informal cognitive procedures for solving a variety of everyday problems involving inference and judgement.
Availability heuristic: judging the frequency or probability of an event by how readily instances come to mind.
Representativeness heuristic: categorizes something by how similar it is to our conception of the typical member of the category, used for judging probability, used for inferring causality by assessing how similar an effect is to some possible cause.

Fundamental Attribution Error: the tendency to mistakenly regard dispositions of the object or person as the primary cause of behaviour, while ignoring important situational or context factors. Dispositions include traits, abilities, attitudes and motives.
Confirmation Bias: when testing hypotheses we tend to look only for evidence that could confirm the hypothesis and not for evidence that could disconfirm it. The search for evidence that might disconfirm a hypothesis is as important to testing a hypothesis as a search for confirming evidence.

Lesson 7: Choosing and Deciding

Cost-Benefit Analysis: note the possible costs and the possible benefits of each action, and you choose the action with the best pattern of benefits versus costs. 
Cognitive dissonance: if our beliefs don't fit our behaviour, this produces mental pain and we either change our beliefs or behaviour. Because we can't have direct control over our beliefs, we should control what we can, our behaviour.
Choose the action that has the best net benefit. 
Weighted decision matrix: a table of options (rows) and factors (columns), with a weight for each factor and a rating in each cell; multiply each rating by its factor's weight and add up each row for a single score per option.
Expected value: ex. two jobs where income depends on your ability to sell. For each job, assign a probability to meeting and to not meeting the sales target, multiply each probability by the income for that outcome, and add the products to get a final income value.
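Sketch of both tools. Every number here (weights, ratings, probabilities, incomes) is invented for illustration, and the factor names are hypothetical.

```python
# Weighted decision matrix: weights per factor, ratings per option.
weights = {"salary": 0.5, "commute": 0.2, "growth": 0.3}
ratings = {
    "job A": {"salary": 8, "commute": 4, "growth": 6},
    "job B": {"salary": 6, "commute": 9, "growth": 7},
}


def weighted_score(option_ratings, weights):
    """Multiply each rating by its factor weight and add up the row."""
    return sum(weights[factor] * option_ratings[factor] for factor in weights)


scores = {option: weighted_score(r, weights) for option, r in ratings.items()}
best_option = max(scores, key=scores.get)


def expected_value(outcomes):
    """outcomes: (probability, amount) pairs whose probabilities sum to 1."""
    return sum(probability * amount for probability, amount in outcomes)


# Each job: (chance of meeting target, income) and (chance of missing, income).
job_a_income = expected_value([(0.6, 90_000), (0.4, 50_000)])
job_b_income = expected_value([(0.9, 70_000), (0.1, 60_000)])
```

In this made-up case the matrix favours job B while the expected income favours job A, which is a reminder that the result depends on which factors and weights you put in.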

Sunk Costs: Only future benefits and costs should figure in your choices; time, energy and money spent on some activity no longer have relevance once expended, and the activity should be carried out only if it is valued in its own right. The rest of your life begins now.
Opportunity Cost Principle: every action has a cost, namely the net benefit (benefit minus cost) of the second best action that you give up. Check that the second best really is second best; if it is better than what you are planning to do, do the second best instead.

Loss Aversion: Logical Consistency doesn't prove practical utility. Loss is twice as painful as the same gain is pleasurable. This causes people to forego lots of opportunities they shouldn't forego.
The bottom line is, what am I losing to avoid this loss, how much do I really want this thing, am I buying this thing just because I got a discount?

Lesson 8: Logic and Dialectical Reasoning

Logical Reasoning: An educated person should know something about this achievement; it is needed for science and math, and useful for telling the difference between the truth and the validity of a conclusion. An argument is valid if the truth of the premises logically guarantees the truth of the conclusion. We can incorrectly conclude what we want to believe through faulty logic if we don't have a clear understanding of the logical structures. We may also incorrectly reject a proposition even though it follows from admittedly true premises.
Formal Logic: deductive reasoning, top down, given prior inputs you must accept the conclusion.
Syllogisms: categories and quantifications; a syllogism has a major premise, which is a generalization, and a minor premise, which is an example of the subject of the generalization. All A are B, C is an A, thus C is B.
Venn Diagrams: one of the most useful formalisms, a pictorial way of showing category membership.
Propositional Logic: If P then Q, must map onto a cogent argument form in order to be able to yield a valid conclusion.
Converse Error: form of reasoning that erroneously converts the premise if P then Q to if Q then P.
Inverse Error: invert the premise if P then Q into if not P then not Q.
Converse and inverse errors yield conclusions that are only deductively invalid; they can actually be pretty good inductive conclusions, since if the premises are true the conclusion is more likely to be true.
Inductive Reasoning: bottom up logical process in which multiple premises, all believed to be true most of the time, are combined to obtain a conclusion which is deemed probably true.
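Deductive validity for these argument forms can be checked by brute force over every true/false assignment of P and Q. A minimal sketch; the helper names are my own, and "modus ponens" is the standard name for the valid form (if P then Q, P, therefore Q).

```python
from itertools import product


def implies(p, q):
    """Truth value of 'if P then Q'."""
    return (not p) or q


def is_valid(premises, conclusion):
    """Valid means no assignment makes every premise true and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


# If P then Q; P; therefore Q.
modus_ponens = is_valid([implies, lambda p, q: p], lambda p, q: q)
# Converse error: if P then Q; Q; therefore P.
converse_error = is_valid([implies, lambda p, q: q], lambda p, q: p)
# Inverse error: if P then Q; not P; therefore not Q.
inverse_error = is_valid([implies, lambda p, q: not p], lambda p, q: not q)
```

Both errors fail on the same case, P false and Q true, where the premises hold but the conclusion does not.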

Dialectical Reasoning:
Socratic Dialog: in debate people offer propositions and argue for them, other people modify or contradict those propositions and the point is to reach the truth about some matter.
German Dialectical: concept of matters proceeding by a thesis, then an antithesis (a contradictory thesis), then a synthesis which reconciles the two theses.
Post-formalism: looking at reasoning patterns that we develop after formal rules; these patterns are not deductive and involve attention to relations and context. Anti-formalists believe that it is a mistake to separate form from content, and are concerned with identifying contradiction and transcending it, accepting it, or using it to learn something new; they are also concerned with change and uncertainty.
Foundations of Dialectical Reasoning: not propositions but a looser definition: principle of change (reality is a process of change, what is currently true will be false), principle of contradiction (contradiction is the dynamic underlying change, change is constant thus contradiction is constant), principle of relationship/holism (the whole is more than the sum of its parts, parts are meaningful only in relation to the whole).

Concluding Thoughts
Each use of a concept increases the range of problems that you can apply the concept to. You will still make errors frequently, ex. the fundamental attribution error. Build humility as well as confidence: we are at the mercy of experts, but errors are constant.